These protocols will help AI agents navigate our messy lives
MIT Technology Review, August 4, 2025 at 3:00 pm
A growing number of companies are launching AI agents that can do things on your behalf: actions like sending an email, making a document, or editing a database. Initial reviews for these agents have been mixed at best, though, because they struggle to interact with all the different components of our digital lives. Part of the…
World Model-Based Learning for Long-Term Age of Information Minimization in Vehicular Networks
cs.AI updates on arXiv.org, August 4, 2025 at 4:00 am
arXiv:2505.01712v2 Announce Type: replace
Abstract: Traditional reinforcement learning (RL)-based learning approaches for wireless networks rely on expensive trial-and-error mechanisms and real-time feedback based on extensive environment interactions, which leads to low data efficiency and short-sighted policies. These limitations become particularly problematic in complex, dynamic networks with high uncertainty and long-term planning requirements. To address these limitations, in this paper, a novel world model-based learning framework is proposed to minimize packet-completeness-aware age of information (CAoI) in a vehicular network. Particularly, a challenging representative scenario is considered pertaining to a millimeter-wave (mmWave) vehicle-to-everything (V2X) communication network, which is characterized by high mobility, frequent signal blockages, and extremely short coherence time. Then, a world model framework is proposed to jointly learn a dynamic model of the mmWave V2X environment and use it to imagine trajectories for learning how to perform link scheduling. In particular, the long-term policy is learned in differentiable imagined trajectories instead of environment interactions. Moreover, owing to its imagination abilities, the world model can jointly predict time-varying wireless data and optimize link scheduling in real-world wireless and V2X networks. Thus, during intervals without actual observations, the world model remains capable of making efficient decisions. Extensive experiments are performed on a realistic simulator based on Sionna that integrates physics-based end-to-end channel modeling, ray-tracing, and scene geometries with material properties. Simulation results show that the proposed world model achieves a significant improvement in data efficiency, and achieves 26% improvement and 16% improvement in CAoI, respectively, compared to the model-based RL (MBRL) method and the model-free RL (MFRL) method.
The Download: how China’s universities approach AI, and the pitfalls of welfare algorithms
MIT Technology Review, July 28, 2025 at 12:10 pm
This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology. Chinese universities want students to use more AI, not less. Just two years ago, students in China were told to avoid using AI for their assignments. At the time, to get around a…
End-to-End AWS RDS Setup with Bastion Host Using Terraform
Towards Data Science, July 28, 2025 at 3:27 pm
Learn how to automate secure AWS infrastructure using Terraform, including a VPC, public/private subnets, a MySQL RDS database, and a Bastion host for secure access.
China doubles chooses AI self-reliance amid intense US competition
AI News, July 29, 2025 at 10:01 am
The artificial intelligence sector in China has entered a new phase of intensifying AI competition with the United States, as Chinese megacities launch massive subsidy programmes. At the same time, domestic firms are hoping to reduce their dependence on US technology. The stakes extend far beyond technological supremacy, with both nations viewing AI dominance as critical…
How Your Prompts Lead AI Astray
Towards Data Science, July 29, 2025 at 4:08 pm
Practical tips to recognise and avoid prompt bias.
How to Evaluate Graph Retrieval in MCP Agentic Systems
Towards Data Science, July 29, 2025 at 3:33 pm
A framework for measuring retrieval quality in Model Context Protocol agents.
OpenAI is launching a version of ChatGPT for college students
MIT Technology Review, July 29, 2025 at 5:18 pm
OpenAI is launching Study Mode, a version of ChatGPT for college students that it promises will act less like a lookup tool and more like a friendly, always-available tutor. It’s part of a wider push by the company to get AI more embedded into classrooms when the new academic year starts in September. A demonstration…
Model Tampering Attacks Enable More Rigorous Evaluations of LLM Capabilities
cs.AI updates on arXiv.org, July 28, 2025 at 4:00 am
arXiv:2502.05209v4 Announce Type: replace-cross
Abstract: Evaluations of large language model (LLM) risks and capabilities are increasingly being incorporated into AI risk management and governance frameworks. Currently, most risk evaluations are conducted by designing inputs that elicit harmful behaviors from the system. However, this approach suffers from two limitations. First, input-output evaluations cannot fully evaluate realistic risks from open-weight models. Second, the behaviors identified during any particular input-output evaluation can only lower-bound the model’s worst-possible-case input-output behavior. As a complementary method for eliciting harmful behaviors, we propose evaluating LLMs with model tampering attacks which allow for modifications to latent activations or weights. We pit state-of-the-art techniques for removing harmful LLM capabilities against a suite of 5 input-space and 6 model tampering attacks. In addition to benchmarking these methods against each other, we show that (1) model resilience to capability elicitation attacks lies on a low-dimensional robustness subspace; (2) the success rate of model tampering attacks can empirically predict and offer conservative estimates for the success of held-out input-space attacks; and (3) state-of-the-art unlearning methods can easily be undone within 16 steps of fine-tuning. Together, these results highlight the difficulty of suppressing harmful LLM capabilities and show that model tampering attacks enable substantially more rigorous evaluations than input-space attacks alone.
Chinese universities want students to use more AI, not less
MIT Technology Review, July 28, 2025 at 9:00 am
Just two years ago, Lorraine He, now a 24-year-old law student, was told to avoid using AI for her assignments. At the time, to get around a national block on ChatGPT, students had to buy a mirror-site version from a secondhand marketplace. Its use was common, but it was at best tolerated and more often…