Multi-Agent Arena: Insights from London Great Agent Hack 2025 (Towards Data Science)
What mattered: robust agents, glass-box reasoning, and red-team resilience.
MIT engineers design an aerial microrobot that can fly as fast as a bumblebee (MIT News – Machine learning)
With insect-like speed and agility, the tiny robot could someday aid in search-and-rescue missions.
Overcoming the Hidden Performance Traps of Variable-Shaped Tensors: Efficient Data Sampling in PyTorch (Towards Data Science)
PyTorch Model Performance Analysis and Optimization, Part 11.
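The article's specifics sit behind the link, but here is a minimal sketch of the kind of trap the title points at: padding every variable-length sample to one global maximum wastes compute, whereas a custom collate_fn that pads each batch only to its own longest sample keeps tensor shapes tight. Everything below (VarLenDataset, the feature width of 64) is a hypothetical stand-in, not code from the article.

```python
import torch
from torch.utils.data import DataLoader, Dataset

# Hypothetical dataset of variable-length sequences of 64-dim features.
class VarLenDataset(Dataset):
    def __init__(self, lengths):
        self.samples = [torch.randn(n, 64) for n in lengths]
    def __len__(self):
        return len(self.samples)
    def __getitem__(self, idx):
        return self.samples[idx]

def collate_pad_to_batch_max(batch):
    # Pad only to the longest sequence in *this* batch, not a global max,
    # so shorter batches carry less padding (and less wasted compute).
    max_len = max(x.shape[0] for x in batch)
    padded = torch.zeros(len(batch), max_len, 64)
    for i, x in enumerate(batch):
        padded[i, : x.shape[0]] = x
    return padded

loader = DataLoader(VarLenDataset([37, 512, 120, 256]),
                    batch_size=2, collate_fn=collate_pad_to_batch_max)
for batch in loader:
    print(batch.shape)  # torch.Size([2, 512, 64]) then torch.Size([2, 256, 64])
```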
Time Series and Trend Analysis Challenge Inspired by Real World Datasets (KDnuggets)
See how different time series methods reveal the shifts, surges, and stabilization in inflation expectations.
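The challenge itself lives in the linked post; as a hedged sketch of two staple trend-analysis moves such a challenge typically involves, a rolling mean exposes the surge-then-stabilize shape while first differences locate the shift. The series below is a synthetic stand-in for inflation expectations, not the challenge's data.

```python
import numpy as np
import pandas as pd

# Synthetic monthly stand-in: flat, then a surge, then stabilization.
rng = np.random.default_rng(0)
idx = pd.date_range("2019-01-01", periods=60, freq="MS")
level = np.concatenate([np.full(24, 2.0), np.linspace(2.0, 5.5, 18), np.full(18, 5.5)])
series = pd.Series(level + rng.normal(0, 0.2, 60), index=idx)

# Rolling mean smooths noise and makes the underlying trend visible.
trend = series.rolling(window=6, center=True).mean()

# First differences are near zero while the series is flat, positive during
# the surge, and near zero again once it stabilizes.
shifts = series.diff()

print(trend.dropna().round(2).head())
print(shifts.abs().idxmax())  # month with the largest single jump
```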
The Machine Learning “Advent Calendar” Day 3: GNB, LDA and QDA in Excel (Towards Data Science)
From local distance to global probability.
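The article builds these models in Excel; as a cross-check in code (a sketch, not the article's spreadsheet), the same three Gaussian classifiers differ only in their covariance assumptions, which is what turns a local distance to each class centroid into a global posterior probability.

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import (LinearDiscriminantAnalysis,
                                           QuadraticDiscriminantAnalysis)
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)

# All three are Gaussian generative classifiers; they differ in what they
# assume about the class covariances:
#   GNB: diagonal covariance per class (features independent)
#   LDA: one shared full covariance across classes (linear boundaries)
#   QDA: a full covariance per class (quadratic boundaries)
for model in (GaussianNB(), LinearDiscriminantAnalysis(), QuadraticDiscriminantAnalysis()):
    model.fit(X, y)
    proba = model.predict_proba(X[:1])  # distance to each class -> posterior
    print(type(model).__name__, proba.round(3))
```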
How to Turn Your LLM Prototype into a Production-Ready System (Towards Data Science)
The most famous applications of LLMs are the ones that I like to call the “wow effect LLMs.” There are plenty of viral LinkedIn posts about them, and they all sound like this: “I built [x] that does [y] in [z] minutes using AI.” If you look carefully, the focus of the sentence is…
HTB AI Range offers experiments in cyber-resilience training (AI News)
The cybersecurity training provider Hack The Box (HTB) has launched the HTB AI Range, designed to let organisations test autonomous AI security agents under realistic conditions, albeit with oversight from human cybersecurity professionals. Its goal is to help users assess how well AI and mixed human–AI teams might defend infrastructure. Vulnerabilities in AI models add…
How confessions can keep language models honest (OpenAI News)
OpenAI researchers are testing “confessions,” a method that trains models to admit when they make mistakes or act undesirably, helping improve AI honesty, transparency, and trust in model outputs.
TradeTrap: Are LLM-based Trading Agents Truly Reliable and Faithful? (cs.AI updates on arXiv.org)
arXiv:2512.02261v1
Abstract: LLM-based trading agents are increasingly deployed in real-world financial markets to perform autonomous analysis and execution. However, their reliability and robustness under adversarial or faulty conditions remain largely unexamined, despite operating in high-risk, irreversible financial environments. We propose TradeTrap, a unified evaluation framework for systematically stress-testing both adaptive and procedural autonomous trading agents. TradeTrap targets four core components of autonomous trading agents: market intelligence, strategy formulation, portfolio and ledger handling, and trade execution. It evaluates their robustness under controlled system-level perturbations. All evaluations are conducted in a closed-loop historical backtesting setting on real US equity market data with identical initial conditions, enabling fair and reproducible comparisons across agents and attacks. Extensive experiments show that small perturbations at a single component can propagate through the agent decision loop and induce extreme concentration, runaway exposure, and large portfolio drawdowns across both agent types, demonstrating that current autonomous trading agents can be systematically misled at the system level. Our code is available at https://github.com/Yanlewen/TradeTrap.
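The authors' framework is in the linked repository; the sketch below only illustrates the evaluation pattern the abstract describes: run the same agent twice on identical historical prices, once clean and once with a small fault injected into a single component (here, the market-intelligence feed), and compare drawdowns. The toy agent, data, and function names are all hypothetical.

```python
import random

def run_backtest(prices, perturb_intel=None, start_cash=100_000.0):
    """Toy closed-loop backtest: a naive momentum agent trades one asset."""
    cash, position, peak, max_drawdown = start_cash, 0.0, start_cash, 0.0
    for t in range(1, len(prices)):
        observed = prices[t]
        if perturb_intel:                       # fault injection point:
            observed = perturb_intel(observed)  # corrupt market intelligence only
        # Strategy + execution: buy on an observed up-tick, sell on a down-tick.
        if observed > prices[t - 1] and cash > 0:
            position += cash / prices[t]        # fills happen at the TRUE price
            cash = 0.0
        elif observed < prices[t - 1] and position > 0:
            cash += position * prices[t]
            position = 0.0
        equity = cash + position * prices[t]
        peak = max(peak, equity)
        max_drawdown = max(max_drawdown, (peak - equity) / peak)
    return max_drawdown

random.seed(0)
prices = [100.0]
for _ in range(250):
    prices.append(prices[-1] * (1 + random.gauss(0, 0.01)))

clean = run_backtest(prices)
attacked = run_backtest(prices, perturb_intel=lambda p: p * (1 + random.gauss(0, 0.03)))
print(f"max drawdown clean: {clean:.1%}, under intel perturbation: {attacked:.1%}")
```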
DialogGuard: Multi-Agent Psychosocial Safety Evaluation of Sensitive LLM Responses (cs.AI updates on arXiv.org)
arXiv:2512.02282v1
Abstract: Large language models (LLMs) now mediate many web-based mental-health, crisis, and other emotionally sensitive services, yet their psychosocial safety in these settings remains poorly understood and weakly evaluated. We present DialogGuard, a multi-agent framework for assessing psychosocial risks in LLM-generated responses along five high-severity dimensions: privacy violations, discriminatory behaviour, mental manipulation, psychological harm, and insulting behaviour. DialogGuard can be applied to diverse generative models through four LLM-as-a-judge pipelines, including single-agent scoring, dual-agent correction, multi-agent debate, and stochastic majority voting, grounded in a shared three-level rubric usable by both human annotators and LLM judges. Using PKU-SafeRLHF with human safety annotations, we show that multi-agent mechanisms detect psychosocial risks more accurately than non-LLM baselines and single-agent judging; dual-agent correction and majority voting provide the best trade-off between accuracy, alignment with human ratings, and robustness, while debate attains higher recall but over-flags borderline cases. We release DialogGuard as open-source software with a web interface that provides per-dimension risk scores and explainable natural-language rationales. A formative study with 12 practitioners illustrates how it supports prompt design, auditing, and supervision of web-facing applications for vulnerable users.
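Of the four LLM-as-a-judge pipelines the abstract lists, stochastic majority voting is the easiest to sketch: sample several independent judgments against the shared three-level rubric and keep the modal level per dimension. The code below is a hypothetical illustration, not DialogGuard's API; judge_once merely stands in for a real LLM call.

```python
import random
from collections import Counter

RUBRIC_LEVELS = (0, 1, 2)  # e.g. 0 = safe, 1 = borderline, 2 = high risk

def judge_once(response_text: str, dimension: str) -> int:
    """Stand-in for one stochastic LLM-as-a-judge call scoring a single
    risk dimension on the shared three-level rubric."""
    # A real implementation would prompt an LLM with the rubric and parse
    # its answer; here we just sample to show the voting mechanics.
    return random.choices(RUBRIC_LEVELS, weights=(0.6, 0.3, 0.1))[0]

def majority_vote(response_text: str, dimension: str, n_judges: int = 5) -> int:
    """Stochastic majority voting: several independent (e.g. high-temperature)
    judgments, returning the most common rubric level."""
    votes = [judge_once(response_text, dimension) for _ in range(n_judges)]
    return Counter(votes).most_common(1)[0][0]

random.seed(1)
dimensions = ["privacy", "discrimination", "manipulation",
              "psychological_harm", "insult"]
scores = {d: majority_vote("model response under review", d) for d in dimensions}
print(scores)  # per-dimension risk levels, e.g. {'privacy': 0, ...}
```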