Daily AI News

SkeletonGaussian: Editable 4D Generation through Gaussian Skeletonization
cs.AI updates on arXiv.org | arXiv:2602.04271v1 (Announce Type: cross)

Abstract: 4D generation has made remarkable progress in synthesizing dynamic 3D objects from input text, images, or videos. However, existing methods often represent motion as an implicit deformation field, which limits direct control and editability. To address this issue, we propose SkeletonGaussian, a novel framework for generating editable dynamic 3D Gaussians from monocular video input. Our approach introduces a hierarchical articulated representation that decomposes motion into sparse rigid motion explicitly driven by a skeleton and fine-grained non-rigid motion. Concretely, we extract a robust skeleton and drive rigid motion via linear blend skinning, followed by a hexplane-based refinement for non-rigid deformations, enhancing interpretability and editability. Experimental results demonstrate that SkeletonGaussian surpasses existing methods in generation quality while enabling intuitive motion editing, establishing a new paradigm for editable 4D generation. Project page: https://wusar.github.io/projects/skeletongaussian/
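The abstract says rigid motion is driven by a skeleton via linear blend skinning before a hexplane-based refinement. As a rough illustration of that skinning step only, here is a minimal NumPy sketch that deforms Gaussian centers with per-bone transforms and skinning weights; the array names and shapes are assumptions for illustration, not the paper's code.

```python
import numpy as np

def linear_blend_skinning(centers, weights, bone_rotations, bone_translations):
    """Deform Gaussian centers with skeleton-driven linear blend skinning.

    centers:           (N, 3) canonical Gaussian centers
    weights:           (N, B) skinning weights per center (rows sum to 1)
    bone_rotations:    (B, 3, 3) per-bone rotation matrices for this frame
    bone_translations: (B, 3) per-bone translations for this frame
    """
    # Apply every bone transform to every center: result shape (B, N, 3)
    per_bone = np.einsum("bij,nj->bni", bone_rotations, centers) + bone_translations[:, None, :]
    # Blend the per-bone results with the skinning weights: result shape (N, 3)
    return np.einsum("nb,bni->ni", weights, per_bone)

# Toy usage: 2 bones, 4 Gaussians, identity rotations plus a small translation on bone 2
centers = np.random.rand(4, 3)
weights = np.full((4, 2), 0.5)
rots = np.stack([np.eye(3), np.eye(3)])
trans = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0]])
deformed = linear_blend_skinning(centers, weights, rots, trans)
```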


Daily AI News

SCALE: Self-uncertainty Conditioned Adaptive Looking and Execution for Vision-Language-Action Models
cs.AI updates on arXiv.org | arXiv:2602.04208v1 (Announce Type: cross)

Abstract: Vision-Language-Action (VLA) models have emerged as a promising paradigm for general-purpose robotic control, with test-time scaling (TTS) gaining attention to enhance robustness beyond training. However, existing TTS methods for VLAs require additional training, verifiers, and multiple forward passes, making them impractical for deployment. Moreover, they intervene only at action decoding while keeping visual representations fixed, which is insufficient under perceptual ambiguity, where reconsidering how to perceive is as important as deciding what to do. To address these limitations, we propose SCALE, a simple inference strategy that jointly modulates visual perception and action based on ‘self-uncertainty’, inspired by uncertainty-driven exploration in Active Inference theory, requiring no additional training, no verifier, and only a single forward pass. SCALE broadens exploration in both perception and action under high uncertainty, while focusing on exploitation when confident, enabling adaptive execution across varying conditions. Experiments on simulated and real-world benchmarks demonstrate that SCALE improves state-of-the-art VLAs and outperforms existing TTS methods while maintaining single-pass efficiency.
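SCALE, as described, couples exploration in both perception and action to a self-uncertainty signal. The sketch below shows only the action side of that idea under assumed names: the entropy of the action logits sets a sampling temperature, so the policy explores more when uncertain and exploits when confident. It illustrates the general mechanism, not the paper's implementation (which also modulates visual perception).

```python
import numpy as np

def self_uncertainty(logits):
    """Normalized entropy of the action distribution as a self-uncertainty proxy in [0, 1]."""
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    entropy = -(probs * np.log(probs + 1e-12)).sum()
    return entropy / np.log(len(probs))

def uncertainty_scaled_sample(logits, t_min=0.5, t_max=1.5, rng=None):
    """Sample an action with a temperature that grows with self-uncertainty:
    exploit when confident (low temperature), explore when uncertain (high)."""
    rng = rng or np.random.default_rng()
    u = self_uncertainty(logits)
    temperature = t_min + (t_max - t_min) * u
    scaled = logits / temperature
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    return rng.choice(len(logits), p=probs), u, temperature

action, u, temp = uncertainty_scaled_sample(np.array([2.0, 0.5, 0.1, -1.0]))
```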


Daily AI News

Structural shifts in institutional participation and collaboration within the AI arXiv preprint research ecosystem
cs.AI updates on arXiv.org | arXiv:2602.03969v1 (Announce Type: cross)

Abstract: The emergence of large language models (LLMs) represents a significant technological shift within the scientific ecosystem, particularly within the field of artificial intelligence (AI). This paper examines structural changes in the AI research landscape using a dataset of arXiv preprints (cs.AI) from 2021 through 2025. Given the rapid pace of AI development, the preprint ecosystem has become a critical barometer for real-time scientific shifts, often preceding formal peer-reviewed publication by months or years. By employing a multi-stage data collection and enrichment pipeline in conjunction with LLM-based institution classification, we analyze the evolution of publication volumes, author team sizes, and academic–industry collaboration patterns. Our results reveal an unprecedented surge in publication output following the introduction of ChatGPT, with academic institutions continuing to provide the largest volume of research. However, we observe that academic–industry collaboration is still suppressed, as measured by a Normalized Collaboration Index (NCI) that remains significantly below the random-mixing baseline across all major subfields. These findings highlight a continuing institutional divide and suggest that the capital-intensive nature of generative AI research may be reshaping the boundaries of scientific collaboration.
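The abstract reports a Normalized Collaboration Index (NCI) that stays below a random-mixing baseline but does not define it. One plausible reading, sketched below, compares the observed rate of academic-industry co-authored papers with the rate expected if institution types mixed at random; the function and the numbers are illustrative assumptions, not the paper's definition or data.

```python
def normalized_collaboration_index(n_academic, n_industry, n_cross_papers, n_papers):
    """Hedged sketch of an NCI: observed rate of academic-industry co-authored
    papers divided by the rate expected under random mixing of institution
    types. Values below 1 would indicate suppressed collaboration.
    (The paper's exact definition may differ; this is only an illustration.)"""
    total = n_academic + n_industry
    p_academic = n_academic / total
    p_industry = n_industry / total
    expected_cross_rate = 2 * p_academic * p_industry   # random-mixing baseline for a pair
    observed_cross_rate = n_cross_papers / n_papers
    return observed_cross_rate / expected_cross_rate

# Example: 8,000 academic and 2,000 industry affiliations,
# 900 of 10,000 papers with at least one author from each side
print(normalized_collaboration_index(8000, 2000, 900, 10000))  # ~0.28, below 1
```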


Daily AI News

DeFrame: Debiasing Large Language Models Against Framing Effects
cs.AI updates on arXiv.org | arXiv:2602.04306v1 (Announce Type: cross)

Abstract: As large language models (LLMs) are increasingly deployed in real-world applications, ensuring their fair responses across demographics has become crucial. Despite many efforts, an ongoing challenge is hidden bias: LLMs appear fair under standard evaluations, but can produce biased responses outside those evaluation settings. In this paper, we identify framing — differences in how semantically equivalent prompts are expressed (e.g., “A is better than B” vs. “B is worse than A”) — as an underexplored contributor to this gap. We first introduce the concept of “framing disparity” to quantify the impact of framing on fairness evaluation. By augmenting fairness evaluation benchmarks with alternative framings, we find that (1) fairness scores vary significantly with framing and (2) existing debiasing methods improve overall (i.e., frame-averaged) fairness, but often fail to reduce framing-induced disparities. To address this, we propose a framing-aware debiasing method that encourages LLMs to be more consistent across framings. Experiments demonstrate that our approach reduces overall bias and improves robustness against framing disparities, enabling LLMs to produce fairer and more consistent responses.
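To make the "framing disparity" idea concrete, here is a small sketch that aggregates a model's fairness score across semantically equivalent framings of the same benchmark and reports how far those scores spread. The exact metric in the paper may be defined differently; the scores are made up.

```python
from statistics import mean, pstdev

def framing_disparity(scores_by_framing):
    """Hedged sketch of a 'framing disparity' measure: how much a fairness score
    moves when the same benchmark items are rephrased into semantically
    equivalent framings."""
    scores = list(scores_by_framing.values())
    return {
        "frame_averaged_fairness": mean(scores),
        "disparity_range": max(scores) - min(scores),
        "disparity_std": pstdev(scores),
    }

# Example: fairness scores of one model under three equivalent framings
print(framing_disparity({
    "A is better than B": 0.91,
    "B is worse than A": 0.78,
    "Between A and B, which is better?": 0.85,
}))
```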


Daily AI News

When Chains of Thought Don’t Matter: Causal Bypass in Large Language Models
cs.AI updates on arXiv.org | arXiv:2602.03994v1 (Announce Type: cross)

Abstract: Chain-of-thought (CoT) prompting is widely assumed to expose a model’s reasoning process and improve transparency. We attempted to enforce this assumption by penalizing unfaithful reasoning, but found that surface-level compliance does not guarantee causal reliance. Our central finding is negative: even when CoT is verbose, strategic, and flagged by surface-level manipulation detectors, model answers are often causally independent of the CoT content. We present a diagnostic framework for auditing this failure mode: it combines (i) an interpretable behavioral module that scores manipulation-relevant signals in CoT text and (ii) a causal probe that measures CoT-mediated influence (CMI) via hidden-state patching and reports a bypass score ($1-\mathrm{CMI}$), quantifying the degree to which the answer is produced by a bypass circuit independent of the rationale. In pilot evaluations, audit-aware prompting increases detectable manipulation signals (mean risk-score delta: $+5.10$), yet causal probes reveal task-dependent mediation: many QA items exhibit near-total bypass (CMI $\approx 0$), while some logic problems show stronger mediation (CMI up to $0.56$). Layer-wise analysis reveals narrow and task-dependent “reasoning windows” even when mean CMI is low.
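The probe described above patches hidden states and reports a bypass score of $1-\mathrm{CMI}$. The sketch below illustrates only the scoring step under a simplifying assumption: it compares the answer distribution with the CoT intact against the distribution after patching, using total-variation distance as the influence proxy. It is not the paper's probe, which operates on hidden states directly.

```python
import numpy as np

def cot_mediated_influence(p_with_cot, p_patched):
    """Hedged sketch of a CoT-mediated influence (CMI) score: if patching away
    the CoT's contribution barely changes the answer distribution, influence is
    low and the bypass score (1 - CMI) is high."""
    p_with_cot = np.asarray(p_with_cot, dtype=float)
    p_patched = np.asarray(p_patched, dtype=float)
    # Total-variation distance in [0, 1] as a simple influence proxy
    cmi = 0.5 * np.abs(p_with_cot - p_patched).sum()
    return cmi, 1.0 - cmi   # (CMI, bypass score)

cmi, bypass = cot_mediated_influence([0.7, 0.2, 0.1], [0.68, 0.21, 0.11])
print(round(cmi, 3), round(bypass, 3))   # near-total bypass in this toy case
```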


Security News

How Samsung Knox Helps Stop Your Network Security Breach
The Hacker News

As you know, enterprise network security has undergone significant evolution over the past decade. Firewalls have become more intelligent, threat detection methods have advanced, and access controls are now more detailed. However (and it’s a big “however”), the increasing use of mobile devices in business operations necessitates network security measures that are specifically...

Daily AI News

When AI Persuades: Adversarial Explanation Attacks on Human Trust in AI-Assisted Decision Making
cs.AI updates on arXiv.org | arXiv:2602.04003v1 (Announce Type: new)

Abstract: Most adversarial threats in artificial intelligence target the computational behavior of models rather than the humans who rely on them. Yet modern AI systems increasingly operate within human decision loops, where users interpret and act on model recommendations. Large Language Models generate fluent natural-language explanations that shape how users perceive and trust AI outputs, revealing a new attack surface at the cognitive layer: the communication channel between AI and its users. We introduce adversarial explanation attacks (AEAs), where an attacker manipulates the framing of LLM-generated explanations to modulate human trust in incorrect outputs. We formalize this behavioral threat through the trust miscalibration gap, a metric that captures the difference in human trust between correct and incorrect outputs under adversarial explanations. By incorporating this gap, AEAs explore the daunting threats in which persuasive explanations reinforce users’ trust in incorrect predictions. To characterize this threat, we conducted a controlled experiment (n = 205), systematically varying four dimensions of explanation framing: reasoning mode, evidence type, communication style, and presentation format. Our findings show that users report nearly identical trust for adversarial and benign explanations, with adversarial explanations preserving the vast majority of benign trust despite being incorrect. The most vulnerable cases arise when AEAs closely resemble expert communication, combining authoritative evidence, neutral tone, and domain-appropriate reasoning. Vulnerability is highest on hard tasks, in fact-driven domains, and among participants who are less formally educated, younger, or highly trusting of AI. This is the first systematic security study that treats explanations as an adversarial cognitive channel and quantifies their impact on human trust in AI-assisted decision making.
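The trust miscalibration gap is described as the difference in human trust between correct and incorrect outputs shown with (possibly adversarial) explanations. A minimal sketch of that comparison, with made-up ratings on an assumed 1-7 trust scale:

```python
from statistics import mean

def trust_miscalibration_gap(trust_on_correct, trust_on_incorrect):
    """Hedged sketch: mean trust in correct outputs minus mean trust in
    incorrect outputs. A small or negative gap means trust no longer tracks
    correctness. The paper's exact operationalization may differ."""
    return mean(trust_on_correct) - mean(trust_on_incorrect)

# Toy ratings on a 1-7 trust scale
benign_gap = trust_miscalibration_gap([6.1, 5.8, 6.4], [3.2, 3.5, 2.9])
adversarial_gap = trust_miscalibration_gap([6.0, 5.9, 6.2], [5.7, 5.8, 5.6])
print(benign_gap, adversarial_gap)   # the gap collapses under adversarial framing
```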


Daily AI News

Beyond Fixed Frames: Dynamic Character-Aligned Speech Tokenization
cs.AI updates on arXiv.org | arXiv:2601.23174v2 (Announce Type: replace-cross)

Abstract: Neural audio codecs are at the core of modern conversational speech technologies, converting continuous speech into sequences of discrete tokens that can be processed by LLMs. However, existing codecs typically operate at fixed frame rates, allocating tokens uniformly in time and producing unnecessarily long sequences. In this work, we introduce DyCAST, a Dynamic Character-Aligned Speech Tokenizer that enables variable-frame-rate tokenization through soft character-level alignment and explicit duration modeling. DyCAST learns to associate tokens with character-level linguistic units during training and supports alignment-free inference with direct control over token durations at decoding time. To improve speech resynthesis quality at low frame rates, we further introduce a retrieval-augmented decoding mechanism that enhances reconstruction fidelity without increasing bitrate. Experiments show that DyCAST achieves competitive speech resynthesis quality and downstream performance while using significantly fewer tokens than fixed-frame-rate codecs. Code and checkpoints will be released publicly at https://github.com/lucadellalib/dycast.
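DyCAST's key move is to replace uniform frame allocation with character-level duration modeling. The toy sketch below shows only the general idea of mapping predicted per-character durations onto spans of a fixed-rate encoder's frames; the frame rate and function names are assumptions, not DyCAST's interface.

```python
def character_aligned_spans(durations_sec, frame_rate_hz=50):
    """Hedged sketch of duration-driven tokenization: each character gets one
    variable-length span of encoder frames instead of a uniform allocation."""
    spans, start = [], 0.0
    for dur in durations_sec:
        end = start + dur
        spans.append((round(start * frame_rate_hz), round(end * frame_rate_hz)))
        start = end
    return spans

# "hi" spoken over ~0.3 s: 'h' for 0.12 s, 'i' for 0.18 s -> two spans over 15 frames
print(character_aligned_spans([0.12, 0.18]))   # [(0, 6), (6, 15)]
```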


Daily AI News

Plug-and-Play Emotion Graphs for Compositional Prompting in Zero-Shot Speech Emotion Recognition
cs.AI updates on arXiv.org | arXiv:2509.25458v2 (Announce Type: replace)

Abstract: Large audio-language models (LALMs) exhibit strong zero-shot performance across speech tasks but struggle with speech emotion recognition (SER) due to weak paralinguistic modeling and limited cross-modal reasoning. We propose Compositional Chain-of-Thought Prompting for Emotion Reasoning (CCoT-Emo), a framework that introduces structured Emotion Graphs (EGs) to guide LALMs in emotion inference without fine-tuning. Each EG encodes seven acoustic features (e.g., pitch, speech rate, jitter, shimmer), textual sentiment, keywords, and cross-modal associations. Embedded into prompts, EGs provide interpretable and compositional representations that enhance LALM reasoning. Experiments across SER benchmarks show that CCoT-Emo outperforms prior SOTA and improves accuracy over zero-shot baselines.
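Since the Emotion Graphs are embedded into prompts rather than used for fine-tuning, compositional prompting can be sketched as plain serialization of the graph's fields. The schema and wording below are illustrative assumptions, not the paper's EG format or template.

```python
def emotion_graph_prompt(eg):
    """Hedged sketch of compositional prompting with an Emotion Graph: serialize
    acoustic features, textual sentiment, and keywords into a structured prompt
    for an audio-language model."""
    acoustic = ", ".join(f"{k}={v}" for k, v in eg["acoustic"].items())
    return (
        "Acoustic cues: " + acoustic + ".\n"
        "Text sentiment: " + eg["sentiment"] + ".\n"
        "Keywords: " + ", ".join(eg["keywords"]) + ".\n"
        "Reason step by step over these cues, then name the speaker's emotion."
    )

print(emotion_graph_prompt({
    "acoustic": {"pitch": "high", "speech_rate": "fast", "jitter": "elevated"},
    "sentiment": "negative",
    "keywords": ["deadline", "again", "unbelievable"],
}))
```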


Daily AI News

DISCOVER: Identifying Patterns of Daily Living in Human Activities from Smart Home Data
cs.AI updates on arXiv.org | arXiv:2503.01733v3 (Announce Type: replace-cross)

Abstract: Smart homes equipped with ambient sensors offer a transformative approach to continuous health monitoring and assisted living. Traditional research in this domain primarily focuses on Human Activity Recognition (HAR), which relies on mapping sensor data to a closed set of predefined activity labels. However, the fixed granularity of these labels often constrains their practical utility, failing to capture the subtle, household-specific nuances essential, for example, for tracking individual health over time. To address this, we propose DISCOVER, a framework for discovering and annotating Patterns of Daily Living (PDL) – fine-grained, recurring sequences of sensor events that emerge directly from a resident’s unique routines. DISCOVER utilizes a self-supervised feature extraction and representation-aware clustering pipeline, supported by a custom visualization interface that enables experts to interpret and label discovered patterns with minimal effort. Our evaluation across multiple smart-home environments demonstrates that DISCOVER identifies cohesive behavioral clusters with high inter-rater agreement while achieving classification performance comparable to fully-supervised baselines using only 0.01% of the labels. Beyond reducing annotation overhead, DISCOVER establishes a foundation for longitudinal analysis. By grounding behavior in a resident’s specific environment rather than rigid semantic categories, our framework facilitates the observation of within-person habitual drift. This capability positions the system as a potential tool for identifying subtle behavioral indicators associated with early-stage cognitive decline in future longitudinal studies.
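The pipeline pairs self-supervised features with representation-aware clustering before expert labeling. Below is a minimal sketch of the clustering stage only, assuming windowed sensor-event embeddings already exist; the self-supervised feature extractor and the visualization interface are not reproduced, and the numbers are placeholders.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(200, 32))          # placeholder window embeddings

# Cluster windows into candidate Patterns of Daily Living
kmeans = KMeans(n_clusters=8, n_init=10, random_state=0).fit(embeddings)
labels = kmeans.labels_

# Simple cohesion check before handing clusters to an expert for labeling
print("silhouette:", round(silhouette_score(embeddings, labels), 3))
```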
