Inside Standard Chartered’s approach to running AI under privacy rulesAI News For banks trying to put AI into real use, the hardest questions often come before any model is trained. Can the data be used at all? Where is it allowed to be stored? Who is responsible once the system goes live? At Standard Chartered, these privacy-driven questions now shape how AI systems are built, and
The post Inside Standard Chartered’s approach to running AI under privacy rules appeared first on AI News.
For banks trying to put AI into real use, the hardest questions often come before any model is trained. Can the data be used at all? Where is it allowed to be stored? Who is responsible once the system goes live? At Standard Chartered, these privacy-driven questions now shape how AI systems are built, and
The post Inside Standard Chartered’s approach to running AI under privacy rules appeared first on AI News. Read More
AI that talks to itself learns faster and smarterArtificial Intelligence News — ScienceDaily AI may learn better when it’s allowed to talk to itself. Researchers showed that internal “mumbling,” combined with short-term memory, helps AI adapt to new tasks, switch goals, and handle complex challenges more easily. This approach boosts learning efficiency while using far less training data. It could pave the way for more flexible, human-like AI systems.
AI may learn better when it’s allowed to talk to itself. Researchers showed that internal “mumbling,” combined with short-term memory, helps AI adapt to new tasks, switch goals, and handle complex challenges more easily. This approach boosts learning efficiency while using far less training data. It could pave the way for more flexible, human-like AI systems. Read More
AutoGameUI: Constructing High-Fidelity GameUI via Multimodal Correspondence Matchingcs.AI updates on arXiv.org arXiv:2411.03709v2 Announce Type: replace-cross
Abstract: Game UI development is essential to the game industry. However, the traditional workflow requires substantial manual effort to integrate pairwise UI and UX designs into a cohesive game user interface (GameUI). The inconsistency between the aesthetic UI design and the functional UX design typically results in mismatches and inefficiencies. To address the issue, we present an automatic system, AutoGameUI, for efficiently and accurately constructing GameUI. The system centers on a two-stage multimodal learning pipeline to obtain the optimal correspondences between UI and UX designs. The first stage learns the comprehensive representations of UI and UX designs from multimodal perspectives. The second stage incorporates grouped cross-attention modules with constrained integer programming to estimate the optimal correspondences through top-down hierarchical matching. The optimal correspondences enable the automatic GameUI construction. We create the GAMEUI dataset, comprising pairwise UI and UX designs from real-world games, to train and validate the proposed method. Besides, an interactive web tool is implemented to ensure high-fidelity effects and facilitate human-in-the-loop construction. Extensive experiments on the GAMEUI and RICO datasets demonstrate the effectiveness of our system in maintaining consistency between the constructed GameUI and the original designs. When deployed in the workflow of several mobile games, AutoGameUI achieves a 3$times$ improvement in time efficiency, conveying significant practical value for game UI development.
arXiv:2411.03709v2 Announce Type: replace-cross
Abstract: Game UI development is essential to the game industry. However, the traditional workflow requires substantial manual effort to integrate pairwise UI and UX designs into a cohesive game user interface (GameUI). The inconsistency between the aesthetic UI design and the functional UX design typically results in mismatches and inefficiencies. To address the issue, we present an automatic system, AutoGameUI, for efficiently and accurately constructing GameUI. The system centers on a two-stage multimodal learning pipeline to obtain the optimal correspondences between UI and UX designs. The first stage learns the comprehensive representations of UI and UX designs from multimodal perspectives. The second stage incorporates grouped cross-attention modules with constrained integer programming to estimate the optimal correspondences through top-down hierarchical matching. The optimal correspondences enable the automatic GameUI construction. We create the GAMEUI dataset, comprising pairwise UI and UX designs from real-world games, to train and validate the proposed method. Besides, an interactive web tool is implemented to ensure high-fidelity effects and facilitate human-in-the-loop construction. Extensive experiments on the GAMEUI and RICO datasets demonstrate the effectiveness of our system in maintaining consistency between the constructed GameUI and the original designs. When deployed in the workflow of several mobile games, AutoGameUI achieves a 3$times$ improvement in time efficiency, conveying significant practical value for game UI development. Read More
RIFT: Reordered Instruction Following Testbed To Evaluate Instruction Following in Singular Multistep Prompt Structurescs.AI updates on arXiv.org arXiv:2601.18924v1 Announce Type: new
Abstract: Large Language Models (LLMs) are increasingly relied upon for complex workflows, yet their ability to maintain flow of instructions remains underexplored. Existing benchmarks conflate task complexity with structural ordering, making it difficult to isolate the impact of prompt topology on performance. We introduce RIFT, Reordered Instruction Following Testbed, to assess instruction following by disentangling structure from content. Using rephrased Jeopardy! question-answer pairs, we test LLMs across two prompt structures: linear prompts, which progress sequentially, and jumping prompts, which preserve identical content but require non-sequential traversal. Across 10,000 evaluations spanning six state-of-the-art open-source LLMs, accuracy dropped by up to 72% under jumping conditions (compared to baseline), revealing a strong dependence on positional continuity. Error analysis shows that approximately 50% of failures stem from instruction-order violations and semantic drift, indicating that current architectures internalize instruction following as a sequential pattern rather than a reasoning skill. These results reveal structural sensitivity as a fundamental limitation in current architectures, with direct implications for applications requiring non-sequential control flow such as workflow automation and multi-agent systems.
arXiv:2601.18924v1 Announce Type: new
Abstract: Large Language Models (LLMs) are increasingly relied upon for complex workflows, yet their ability to maintain flow of instructions remains underexplored. Existing benchmarks conflate task complexity with structural ordering, making it difficult to isolate the impact of prompt topology on performance. We introduce RIFT, Reordered Instruction Following Testbed, to assess instruction following by disentangling structure from content. Using rephrased Jeopardy! question-answer pairs, we test LLMs across two prompt structures: linear prompts, which progress sequentially, and jumping prompts, which preserve identical content but require non-sequential traversal. Across 10,000 evaluations spanning six state-of-the-art open-source LLMs, accuracy dropped by up to 72% under jumping conditions (compared to baseline), revealing a strong dependence on positional continuity. Error analysis shows that approximately 50% of failures stem from instruction-order violations and semantic drift, indicating that current architectures internalize instruction following as a sequential pattern rather than a reasoning skill. These results reveal structural sensitivity as a fundamental limitation in current architectures, with direct implications for applications requiring non-sequential control flow such as workflow automation and multi-agent systems. Read More
A Systemic Evaluation of Multimodal RAG Privacycs.AI updates on arXiv.org arXiv:2601.17644v2 Announce Type: replace-cross
Abstract: The growing adoption of multimodal Retrieval-Augmented Generation (mRAG) pipelines for vision-centric tasks (e.g. visual QA) introduces important privacy challenges. In particular, while mRAG provides a practical capability to connect private datasets to improve model performance, it risks the leakage of private information from these datasets during inference. In this paper, we perform an empirical study to analyze the privacy risks inherent in the mRAG pipeline observed through standard model prompting. Specifically, we implement a case study that attempts to infer the inclusion of a visual asset, e.g. image, in the mRAG, and if present leak the metadata, e.g. caption, related to it. Our findings highlight the need for privacy-preserving mechanisms and motivate future research on mRAG privacy.
arXiv:2601.17644v2 Announce Type: replace-cross
Abstract: The growing adoption of multimodal Retrieval-Augmented Generation (mRAG) pipelines for vision-centric tasks (e.g. visual QA) introduces important privacy challenges. In particular, while mRAG provides a practical capability to connect private datasets to improve model performance, it risks the leakage of private information from these datasets during inference. In this paper, we perform an empirical study to analyze the privacy risks inherent in the mRAG pipeline observed through standard model prompting. Specifically, we implement a case study that attempts to infer the inclusion of a visual asset, e.g. image, in the mRAG, and if present leak the metadata, e.g. caption, related to it. Our findings highlight the need for privacy-preserving mechanisms and motivate future research on mRAG privacy. Read More
Gallup Workforce shows details of AI adoption in US workplacesAI News Artificial intelligence has moved into the US workplace, but its adoption remains uneven, fragmented, and tied to role, industry, and organisation. Findings from a Gallup Workforce survey covering the period to the end of December 2025 show how employees use AI, who benefits most from it, and where areas of uncertainty remain. The findings draw
The post Gallup Workforce shows details of AI adoption in US workplaces appeared first on AI News.
Artificial intelligence has moved into the US workplace, but its adoption remains uneven, fragmented, and tied to role, industry, and organisation. Findings from a Gallup Workforce survey covering the period to the end of December 2025 show how employees use AI, who benefits most from it, and where areas of uncertainty remain. The findings draw
The post Gallup Workforce shows details of AI adoption in US workplaces appeared first on AI News. Read More
More at Stake: How Payoff and Language Shape LLM Agent Strategies in Cooperation Dilemmascs.AI updates on arXiv.org arXiv:2601.19082v1 Announce Type: new
Abstract: As LLMs increasingly act as autonomous agents in interactive and multi-agent settings, understanding their strategic behavior is critical for safety, coordination, and AI-driven social and economic systems. We investigate how payoff magnitude and linguistic context shape LLM strategies in repeated social dilemmas, using a payoff-scaled Prisoner’s Dilemma to isolate sensitivity to incentive strength. Across models and languages, we observe consistent behavioral patterns, including incentive-sensitive conditional strategies and cross-linguistic divergence. To interpret these dynamics, we train supervised classifiers on canonical repeated-game strategies and apply them to LLM decisions, revealing systematic, model- and language-dependent behavioral intentions, with linguistic framing sometimes matching or exceeding architectural effects. Our results provide a unified framework for auditing LLMs as strategic agents and highlight cooperation biases with direct implications for AI governance and multi-agent system design.
arXiv:2601.19082v1 Announce Type: new
Abstract: As LLMs increasingly act as autonomous agents in interactive and multi-agent settings, understanding their strategic behavior is critical for safety, coordination, and AI-driven social and economic systems. We investigate how payoff magnitude and linguistic context shape LLM strategies in repeated social dilemmas, using a payoff-scaled Prisoner’s Dilemma to isolate sensitivity to incentive strength. Across models and languages, we observe consistent behavioral patterns, including incentive-sensitive conditional strategies and cross-linguistic divergence. To interpret these dynamics, we train supervised classifiers on canonical repeated-game strategies and apply them to LLM decisions, revealing systematic, model- and language-dependent behavioral intentions, with linguistic framing sometimes matching or exceeding architectural effects. Our results provide a unified framework for auditing LLMs as strategic agents and highlight cooperation biases with direct implications for AI governance and multi-agent system design. Read More
Can We Trust LLM Detectors?cs.AI updates on arXiv.org arXiv:2601.15301v2 Announce Type: replace-cross
Abstract: The rapid adoption of LLMs has increased the need for reliable AI text detection, yet existing detectors often fail outside controlled benchmarks. We systematically evaluate 2 dominant paradigms (training-free and supervised) and show that both are brittle under distribution shift, unseen generators, and simple stylistic perturbations. To address these limitations, we propose a supervised contrastive learning (SCL) framework that learns discriminative style embeddings. Experiments show that while supervised detectors excel in-domain, they degrade sharply out-of-domain, and training-free methods remain highly sensitive to proxy choice. Overall, our results expose fundamental challenges in building domain-agnostic detectors. Our code is available at: https://github.com/HARSHITJAIS14/DetectAI
arXiv:2601.15301v2 Announce Type: replace-cross
Abstract: The rapid adoption of LLMs has increased the need for reliable AI text detection, yet existing detectors often fail outside controlled benchmarks. We systematically evaluate 2 dominant paradigms (training-free and supervised) and show that both are brittle under distribution shift, unseen generators, and simple stylistic perturbations. To address these limitations, we propose a supervised contrastive learning (SCL) framework that learns discriminative style embeddings. Experiments show that while supervised detectors excel in-domain, they degrade sharply out-of-domain, and training-free methods remain highly sensitive to proxy choice. Overall, our results expose fundamental challenges in building domain-agnostic detectors. Our code is available at: https://github.com/HARSHITJAIS14/DetectAI Read More
Bridging Visual and Wireless Sensing: A Unified Radiation Field for 3D Radio Map Constructioncs.AI updates on arXiv.org arXiv:2601.19216v1 Announce Type: cross
Abstract: The emerging applications of next-generation wireless networks (e.g., immersive 3D communication, low-altitude networks, and integrated sensing and communication) necessitate high-fidelity environmental intelligence. 3D radio maps have emerged as a critical tool for this purpose, enabling spectrum-aware planning and environment-aware sensing by bridging the gap between physical environments and electromagnetic signal propagation. However, constructing accurate 3D radio maps requires fine-grained 3D geometric information and a profound understanding of electromagnetic wave propagation. Existing approaches typically treat optical and wireless knowledge as distinct modalities, failing to exploit the fundamental physical principles governing both light and electromagnetic propagation. To bridge this gap, we propose URF-GS, a unified radio-optical radiation field representation framework for accurate and generalizable 3D radio map construction based on 3D Gaussian splatting (3D-GS) and inverse rendering. By fusing visual and wireless sensing observations, URF-GS recovers scene geometry and material properties while accurately predicting radio signal behavior at arbitrary transmitter-receiver (Tx-Rx) configurations. Experimental results demonstrate that URF-GS achieves up to a 24.7% improvement in spatial spectrum prediction accuracy and a 10x increase in sample efficiency for 3D radio map construction compared with neural radiance field (NeRF)-based methods. This work establishes a foundation for next-generation wireless networks by integrating perception, interaction, and communication through holistic radiation field reconstruction.
arXiv:2601.19216v1 Announce Type: cross
Abstract: The emerging applications of next-generation wireless networks (e.g., immersive 3D communication, low-altitude networks, and integrated sensing and communication) necessitate high-fidelity environmental intelligence. 3D radio maps have emerged as a critical tool for this purpose, enabling spectrum-aware planning and environment-aware sensing by bridging the gap between physical environments and electromagnetic signal propagation. However, constructing accurate 3D radio maps requires fine-grained 3D geometric information and a profound understanding of electromagnetic wave propagation. Existing approaches typically treat optical and wireless knowledge as distinct modalities, failing to exploit the fundamental physical principles governing both light and electromagnetic propagation. To bridge this gap, we propose URF-GS, a unified radio-optical radiation field representation framework for accurate and generalizable 3D radio map construction based on 3D Gaussian splatting (3D-GS) and inverse rendering. By fusing visual and wireless sensing observations, URF-GS recovers scene geometry and material properties while accurately predicting radio signal behavior at arbitrary transmitter-receiver (Tx-Rx) configurations. Experimental results demonstrate that URF-GS achieves up to a 24.7% improvement in spatial spectrum prediction accuracy and a 10x increase in sample efficiency for 3D radio map construction compared with neural radiance field (NeRF)-based methods. This work establishes a foundation for next-generation wireless networks by integrating perception, interaction, and communication through holistic radiation field reconstruction. Read More
AgenticSCR: An Autonomous Agentic Secure Code Review for Immature Vulnerabilities Detectioncs.AI updates on arXiv.org arXiv:2601.19138v1 Announce Type: cross
Abstract: Secure code review is critical at the pre-commit stage, where vulnerabilities must be caught early under tight latency and limited-context constraints. Existing SAST-based checks are noisy and often miss immature, context-dependent vulnerabilities, while standalone Large Language Models (LLMs) are constrained by context windows and lack explicit tool use. Agentic AI, which combine LLMs with autonomous decision-making, tool invocation, and code navigation, offer a promising alternative, but their effectiveness for pre-commit secure code review is not yet well understood. In this work, we introduce AgenticSCR, an agentic AI for secure code review for detecting immature vulnerabilities during the pre-commit stage, augmented by security-focused semantic memories. Using our own curated benchmark of immature vulnerabilities, tailored to the pre-commit secure code review, we empirically evaluate how accurate is our AgenticSCR for localizing, detecting, and explaining immature vulnerabilities. Our results show that AgenticSCR achieves at least 153% relatively higher percentage of correct code review comments than the static LLM-based baseline, and also substantially surpasses SAST tools. Moreover, AgenticSCR generates more correct comments in four out of five vulnerability types, consistently and significantly outperforming all other baselines. These findings highlight the importance of Agentic Secure Code Review, paving the way towards an emerging research area of immature vulnerability detection.
arXiv:2601.19138v1 Announce Type: cross
Abstract: Secure code review is critical at the pre-commit stage, where vulnerabilities must be caught early under tight latency and limited-context constraints. Existing SAST-based checks are noisy and often miss immature, context-dependent vulnerabilities, while standalone Large Language Models (LLMs) are constrained by context windows and lack explicit tool use. Agentic AI, which combine LLMs with autonomous decision-making, tool invocation, and code navigation, offer a promising alternative, but their effectiveness for pre-commit secure code review is not yet well understood. In this work, we introduce AgenticSCR, an agentic AI for secure code review for detecting immature vulnerabilities during the pre-commit stage, augmented by security-focused semantic memories. Using our own curated benchmark of immature vulnerabilities, tailored to the pre-commit secure code review, we empirically evaluate how accurate is our AgenticSCR for localizing, detecting, and explaining immature vulnerabilities. Our results show that AgenticSCR achieves at least 153% relatively higher percentage of correct code review comments than the static LLM-based baseline, and also substantially surpasses SAST tools. Moreover, AgenticSCR generates more correct comments in four out of five vulnerability types, consistently and significantly outperforming all other baselines. These findings highlight the importance of Agentic Secure Code Review, paving the way towards an emerging research area of immature vulnerability detection. Read More