
Security News

Clawdbot / MoltBot / OpenClaw – TJS Security Briefing

Classification: Public
Date: February 6, 2026
Distribution: Security Operations, IT Leadership, Executive Team, Endpoint Management
Prepared By: Tech Jacks Solutions Security Intelligence

1. Executive Summary
An open-source AI agent called Clawdbot (rebranded to MoltBot on January 27, then to OpenClaw on January 29-30, 2026) represents one of the most significant shadow AI risks to emerge in […]

Daily AI News

SCALE: Self-uncertainty Conditioned Adaptive Looking and Execution for Vision-Language-Action Models (cs.AI updates on arXiv.org)

arXiv:2602.04208v1 Announce Type: cross
Abstract: Vision-Language-Action (VLA) models have emerged as a promising paradigm for general-purpose robotic control, with test-time scaling (TTS) gaining attention to enhance robustness beyond training. However, existing TTS methods for VLAs require additional training, verifiers, and multiple forward passes, making them impractical for deployment. Moreover, they intervene only at action decoding while keeping visual representations fixed, which is insufficient under perceptual ambiguity, where reconsidering how to perceive is as important as deciding what to do. To address these limitations, we propose SCALE, a simple inference strategy that jointly modulates visual perception and action based on ‘self-uncertainty’, inspired by uncertainty-driven exploration in Active Inference theory, requiring no additional training, no verifier, and only a single forward pass. SCALE broadens exploration in both perception and action under high uncertainty, while focusing on exploitation when confident, enabling adaptive execution across varying conditions. Experiments on simulated and real-world benchmarks demonstrate that SCALE improves state-of-the-art VLAs and outperforms existing TTS methods while maintaining single-pass efficiency.
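
The abstract describes the mechanism only at a high level, but it lends itself to a compact sketch: score the model's self-uncertainty from a single forward pass, then use that score to broaden or sharpen both perception and action. Everything below, from the entropy-based uncertainty to the temperature rule and the model interface, is an illustrative assumption rather than the paper's implementation.

```python
# Hedged sketch of uncertainty-conditioned inference in the spirit of SCALE.
# The modulation rule and the model interface are assumptions; the abstract
# only states that perception and action are jointly scaled by self-uncertainty.
import torch
import torch.nn.functional as F

def self_uncertainty(action_logits: torch.Tensor) -> torch.Tensor:
    """Normalized entropy of the action distribution, in [0, 1]."""
    probs = F.softmax(action_logits, dim=-1)
    entropy = -(probs * torch.log(probs + 1e-9)).sum(dim=-1)
    return entropy / torch.log(torch.tensor(float(action_logits.shape[-1])))

def scale_step(model, observation, base_temp: float = 1.0):
    # Single forward pass: no extra training, no verifier, no resampling.
    action_logits, visual_scores = model(observation)   # hypothetical interface
    u = self_uncertainty(action_logits)                 # high u = ambiguity
    temp = base_temp * (1.0 + u)                        # broaden when uncertain
    # Perception side: flatten visual attention under uncertainty (look around),
    # sharpen it when confident (exploit).
    attn = F.softmax(visual_scores / temp.unsqueeze(-1), dim=-1)
    action = torch.distributions.Categorical(
        logits=action_logits / temp.unsqueeze(-1)).sample()
    return action, attn
```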


Daily AI News

SkeletonGaussian: Editable 4D Generation through Gaussian Skeletonization (cs.AI updates on arXiv.org)

arXiv:2602.04271v1 Announce Type: cross
Abstract: 4D generation has made remarkable progress in synthesizing dynamic 3D objects from input text, images, or videos. However, existing methods often represent motion as an implicit deformation field, which limits direct control and editability. To address this issue, we propose SkeletonGaussian, a novel framework for generating editable dynamic 3D Gaussians from monocular video input. Our approach introduces a hierarchical articulated representation that decomposes motion into sparse rigid motion explicitly driven by a skeleton and fine-grained non-rigid motion. Concretely, we extract a robust skeleton and drive rigid motion via linear blend skinning, followed by a hexplane-based refinement for non-rigid deformations, enhancing interpretability and editability. Experimental results demonstrate that SkeletonGaussian surpasses existing methods in generation quality while enabling intuitive motion editing, establishing a new paradigm for editable 4D generation. Project page: https://wusar.github.io/projects/skeletongaussian/
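
The rigid stage the abstract names rests on linear blend skinning, a standard construction: each Gaussian center is moved by the weight-blended rigid transforms of the skeleton bones, x' = (sum_k w_k T_k) x, with the hexplane refinement layered on afterward for non-rigid detail. A minimal sketch, with illustrative names and shapes rather than the paper's implementation:

```python
# Minimal linear-blend-skinning (LBS) sketch for the skeleton-driven rigid
# stage. Shapes and names are illustrative, not the paper's code.
import numpy as np

def lbs(points: np.ndarray, weights: np.ndarray, bone_tfms: np.ndarray) -> np.ndarray:
    """
    points:    (N, 3)    Gaussian centers in the rest pose
    weights:   (N, K)    per-point skinning weights, rows summing to 1
    bone_tfms: (K, 4, 4) rigid transform of each bone for the current frame
    returns:   (N, 3)    deformed centers
    """
    homo = np.concatenate([points, np.ones((len(points), 1))], axis=1)  # (N, 4)
    blended = np.einsum("nk,kij->nij", weights, bone_tfms)  # blended transform per point
    out = np.einsum("nij,nj->ni", blended, homo)
    return out[:, :3]
```

Editing then reduces to changing bone_tfms: pose the skeleton differently and the Gaussians follow, which is exactly the direct control an implicit deformation field lacks.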


Daily AI News

DeFrame: Debiasing Large Language Models Against Framing Effects (cs.AI updates on arXiv.org)

arXiv:2602.04306v1 Announce Type: cross
Abstract: As large language models (LLMs) are increasingly deployed in real-world applications, ensuring their fair responses across demographics has become crucial. Despite many efforts, an ongoing challenge is hidden bias: LLMs appear fair under standard evaluations, but can produce biased responses outside those evaluation settings. In this paper, we identify framing — differences in how semantically equivalent prompts are expressed (e.g., “A is better than B” vs. “B is worse than A”) — as an underexplored contributor to this gap. We first introduce the concept of “framing disparity” to quantify the impact of framing on fairness evaluation. By augmenting fairness evaluation benchmarks with alternative framings, we find that (1) fairness scores vary significantly with framing and (2) existing debiasing methods improve overall (i.e., frame-averaged) fairness, but often fail to reduce framing-induced disparities. To address this, we propose a framing-aware debiasing method that encourages LLMs to be more consistent across framings. Experiments demonstrate that our approach reduces overall bias and improves robustness against framing disparities, enabling LLMs to produce fairer and more consistent responses.
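
The abstract introduces "framing disparity" without giving its formula. One plausible instantiation, sketched here to make the idea concrete, is the spread of a fairness score across semantically equivalent framings of the same benchmark; the metric choice and the toy numbers are assumptions, not the paper's.

```python
# Hedged sketch: fairness evaluated per framing, disparity as max-min spread.
# The paper's actual metric may differ; the scores below are illustrative only.
from statistics import mean

def framing_disparity(fairness_by_framing: dict[str, float]) -> float:
    scores = list(fairness_by_framing.values())
    return max(scores) - min(scores)

scores = {
    "A is better than B": 0.91,   # same comparison,
    "B is worse than A": 0.74,    # opposite framing
}
print(f"frame-averaged fairness: {mean(scores.values()):.2f}")     # 0.82 looks fine
print(f"framing disparity:       {framing_disparity(scores):.2f}")  # 0.17 reveals the gap
```

This is the paper's finding (1) in miniature: the frame-averaged score that standard evaluations report can look healthy while the per-framing gap stays large.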


Daily AI News

Structural shifts in institutional participation and collaboration within the AI arXiv preprint research ecosystem (cs.AI updates on arXiv.org)

arXiv:2602.03969v1 Announce Type: cross
Abstract: The emergence of large language models (LLMs) represents a significant technological shift within the scientific ecosystem, particularly within the field of artificial intelligence (AI). This paper examines structural changes in the AI research landscape using a dataset of arXiv preprints (cs.AI) from 2021 through 2025. Given the rapid pace of AI development, the preprint ecosystem has become a critical barometer for real-time scientific shifts, often preceding formal peer-reviewed publication by months or years. By employing a multi-stage data collection and enrichment pipeline in conjunction with LLM-based institution classification, we analyze the evolution of publication volumes, author team sizes, and academic–industry collaboration patterns. Our results reveal an unprecedented surge in publication output following the introduction of ChatGPT, with academic institutions continuing to provide the largest volume of research. However, we observe that academic–industry collaboration is still suppressed, as measured by a Normalized Collaboration Index (NCI) that remains significantly below the random-mixing baseline across all major subfields. These findings highlight a continuing institutional divide and suggest that the capital-intensive nature of generative AI research may be reshaping the boundaries of scientific collaboration.
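
The Normalized Collaboration Index is not defined in the abstract. A natural reading, and the assumption behind this sketch, is the observed rate of mixed academic-industry co-authorship pairs divided by the rate that random mixing of institutions would predict, so values below 1 indicate suppressed collaboration.

```python
# Hedged sketch of an NCI against a random-mixing baseline; the paper's exact
# definition is not given in the abstract, so this is one plausible form.
def nci(mixed_pairs: int, total_pairs: int, p_academic: float) -> float:
    """
    mixed_pairs:  co-authorship pairs linking an academic and an industry author
    total_pairs:  all co-authorship pairs in the corpus
    p_academic:   share of academic authors in the author pool
    """
    expected_rate = 2 * p_academic * (1 - p_academic)  # chance a random pair is mixed
    return (mixed_pairs / total_pairs) / expected_rate

# Illustrative numbers: with 70% academic authors, random mixing predicts 42%
# mixed pairs; observing only 12% gives NCI ~= 0.29, well below the baseline of 1.
print(nci(mixed_pairs=120, total_pairs=1000, p_academic=0.7))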


Daily AI News

When Chains of Thought Don’t Matter: Causal Bypass in Large Language Models (cs.AI updates on arXiv.org)

arXiv:2602.03994v1 Announce Type: cross
Abstract: Chain-of-thought (CoT) prompting is widely assumed to expose a model’s reasoning process and improve transparency. We attempted to enforce this assumption by penalizing unfaithful reasoning, but found that surface-level compliance does not guarantee causal reliance. Our central finding is negative: even when CoT is verbose, strategic, and flagged by surface-level manipulation detectors, model answers are often causally independent of the CoT content. We present a diagnostic framework for auditing this failure mode: it combines (i) an interpretable behavioral module that scores manipulation-relevant signals in CoT text and (ii) a causal probe that measures CoT-mediated influence (CMI) via hidden-state patching and reports a bypass score ($1-\mathrm{CMI}$), quantifying the degree to which the answer is produced by a bypass circuit independent of the rationale. In pilot evaluations, audit-aware prompting increases detectable manipulation signals (mean risk-score delta: $+5.10$), yet causal probes reveal task-dependent mediation: many QA items exhibit near-total bypass (CMI $\approx 0$), while some logic problems show stronger mediation (CMI up to $0.56$). Layer-wise analysis reveals narrow and task-dependent “reasoning windows” even when mean CMI is low.
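
Hidden-state patching of this kind can be made concrete, though the paper's exact recipe is not in the abstract: compare the model's answer distribution when its own CoT is present against the distribution when the CoT token positions are overwritten with neutral activations. The hook point, the answer_probs interface, and the use of total-variation distance for CMI are all assumptions in this sketch.

```python
# Hedged sketch of a bypass-score probe: ablate the CoT hidden states and see
# how far the answer distribution moves. Interfaces here are hypothetical.
import torch

def bypass_score(model, prompt_ids, cot_slice, neutral_hidden) -> float:
    base = model(prompt_ids).answer_probs              # answer dist. with real CoT

    def patch(module, inputs, output):
        output[:, cot_slice] = neutral_hidden          # overwrite CoT positions
        return output

    handle = model.mid_layer.register_forward_hook(patch)  # hypothetical hook site
    patched = model(prompt_ids).answer_probs
    handle.remove()

    # CMI as total-variation distance in [0, 1]: ~0 means the answer ignores
    # the rationale entirely (a pure bypass circuit), ~1 means full reliance.
    cmi = 0.5 * (base - patched).abs().sum(dim=-1).mean().item()
    return 1.0 - cmi                                   # bypass score = 1 - CMI
```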


Daily AI News

Prompt Fidelity: Measuring How Much of Your Intent an AI Agent Actually Executes (Towards Data Science)

How much of your AI agent’s output is real data versus confident guesswork?


Daily AI News
How separating logic and search boosts AI agent scalability (AI News)


Separating logic from inference improves AI agent scalability by decoupling core workflows from execution strategies. The transition from generative AI prototypes to production-grade agents introduces a specific engineering hurdle: reliability. LLMs are stochastic by nature. A prompt that works once may fail on the second attempt. To mitigate this, development teams often wrap core business […]
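
The pattern the excerpt points at fits in a few lines: keep the workflow declarative about what it needs, and push retries or best-of-n search into a separate execution strategy. In this sketch, call_llm and the validation rule are placeholders rather than the article's code.

```python
# Hedged sketch of separating business logic from the execution strategy.
# call_llm is a placeholder for any stochastic LLM call.
from typing import Callable, TypeVar

T = TypeVar("T")

def best_of_n(task: Callable[[], T], valid: Callable[[T], bool], n: int = 3) -> T:
    """Execution strategy: resample a stochastic step until it validates."""
    last = None
    for _ in range(n):
        last = task()
        if valid(last):
            return last
    raise RuntimeError(f"no valid result in {n} attempts: {last!r}")

def summarize_invoice(invoice_text: str) -> dict:
    """Business logic: says what it needs, not how many tries it takes."""
    return best_of_n(
        task=lambda: call_llm(f"Extract total and due date as JSON:\n{invoice_text}"),
        valid=lambda d: isinstance(d, dict) and "total" in d and "due_date" in d,
    )
```

Swapping best-of-n for beam search or self-consistency then touches only the strategy, never the workflow, which is the scalability point the article makes.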


Daily AI News

Intuit, Uber, and State Farm trial AI agents inside enterprise workflows (AI News)

The way large companies use artificial intelligence is changing. For years, AI in business meant experimenting with tools that could answer questions or help with small tasks. Now, some big enterprises are moving beyond tools to AI agents that can actually do practical work in systems and workflows. This week, OpenAI introduced a new platform […]


Daily AI News

When Good Sounds Go Adversarial: Jailbreaking Audio-Language Models with Benign Inputs (cs.AI updates on arXiv.org)

arXiv:2508.03365v3 Announce Type: replace-cross
Abstract: As large language models (LLMs) become increasingly integrated into daily life, audio has emerged as a key interface for human-AI interaction. However, this convenience also introduces new vulnerabilities, making audio a potential attack surface for adversaries. Our research introduces WhisperInject, a two-stage adversarial audio attack framework that manipulates state-of-the-art audio language models to generate harmful content. Our method embeds harmful payloads as subtle perturbations into audio inputs that remain intelligible to human listeners. The first stage uses a novel reward-based white-box optimization method, Reinforcement Learning with Projected Gradient Descent (RL-PGD), to jailbreak the target model and elicit harmful native responses. This native harmful response then serves as the target for Stage 2, Payload Injection, where we use gradient-based optimization to embed subtle perturbations into benign audio carriers, such as weather queries or greeting messages. Our method achieves average attack success rates of 60-78% across two benchmarks and five multimodal LLMs, validated by multiple evaluation frameworks. Our work demonstrates a new class of practical, audio-native threats, moving beyond theoretical exploits to reveal a feasible and covert method for manipulating multimodal AI systems.
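
The Stage 2 payload-injection step is, at its core, gradient descent on an input perturbation under a perceptibility bound, the textbook projected-gradient-descent shape. The sketch below shows only that generic shape; model.seq_loss, the budget, and the step sizes are assumptions, and the paper's reward-based RL-PGD first stage is omitted entirely.

```python
# Generic PGD sketch of optimizing a bounded perturbation on a benign audio
# carrier toward a target output. Interfaces are hypothetical; this is the
# textbook pattern, not the paper's RL-PGD procedure.
import torch

def pgd_audio(model, carrier: torch.Tensor, target_ids: torch.Tensor,
              eps: float = 0.002, lr: float = 1e-4, steps: int = 100) -> torch.Tensor:
    delta = torch.zeros_like(carrier, requires_grad=True)
    for _ in range(steps):
        loss = model.seq_loss(carrier + delta, target_ids)  # hypothetical loss API
        loss.backward()
        with torch.no_grad():
            delta -= lr * delta.grad.sign()   # step toward the target output
            delta.clamp_(-eps, eps)           # project back into the L-inf ball
            delta.grad.zero_()
    return (carrier + delta).detach()
```

The small eps budget is what keeps the carrier sounding like an ordinary weather query to a human listener, which is why detection-side defenses cannot rely on audible artifacts.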
