China doubles down on AI self-reliance amid intense US competition (AI News, July 29, 2025 at 10:01 am)
The artificial intelligence sector in China has entered a new phase of intensifying AI competition with the United States, as Chinese megacities launch massive subsidy programmes. At the same time, domestic firms are hoping to reduce their dependence on US technology. The stakes extend far beyond technological supremacy, with both nations viewing AI dominance as critical…
How Your Prompts Lead AI Astray (Towards Data Science, July 29, 2025 at 4:08 pm)
Practical tips to recognise and avoid prompt bias.
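The teaser only promises tips on prompt bias, so as a rough, hypothetical illustration (not drawn from the article itself): a leading prompt can presuppose its own conclusion, while a neutral rewrite leaves the model free to weigh both sides. The prompts and the keyword heuristic below are invented for this sketch.

```python
# Hypothetical illustration of prompt bias: the leading prompt presupposes
# its conclusion, while the neutral prompt asks the model to weigh evidence.
LEADING_PROMPT = "Explain why microservices are always better than monoliths."
NEUTRAL_PROMPT = (
    "Compare microservices and monoliths: list advantages and disadvantages "
    "of each, and note when either is the better choice."
)

# Crude lexical check for loaded phrasing; real bias detection would need
# far more than keyword matching.
LOADED_MARKERS = ("always", "obviously", "explain why", "everyone knows")

def looks_leading(prompt: str) -> bool:
    """Flag prompts that contain presuppositional or absolute phrasing."""
    lowered = prompt.lower()
    return any(marker in lowered for marker in LOADED_MARKERS)

for prompt in (LEADING_PROMPT, NEUTRAL_PROMPT):
    print(looks_leading(prompt), "-", prompt)
```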
How to Evaluate Graph Retrieval in MCP Agentic Systems (Towards Data Science, July 29, 2025 at 3:33 pm)
A framework for measuring retrieval quality in Model Context Protocol agents.
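The summary names a measurement framework without detailing it, so the sketch below shows only generic set-based retrieval metrics (precision, recall, F1) applied to graph node IDs returned by an agent's retrieval step. The node IDs and framing are assumptions, not the article's actual framework.

```python
# Minimal sketch: set-based retrieval metrics for one graph-retrieval call.
# In a real MCP evaluation, `retrieved` would come from agent tool-call
# traces and `gold` from human-labelled relevance judgments.

def precision_recall_f1(retrieved: set[str], gold: set[str]) -> tuple[float, float, float]:
    """Precision, recall, and F1 of retrieved items against a gold set."""
    if not retrieved or not gold:
        return 0.0, 0.0, 0.0
    hits = len(retrieved & gold)
    precision = hits / len(retrieved)
    recall = hits / len(gold)
    f1 = 0.0 if hits == 0 else 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical node IDs for one query.
retrieved_nodes = {"paper:123", "author:42", "venue:7"}
gold_nodes = {"paper:123", "author:42", "paper:456"}

p, r, f1 = precision_recall_f1(retrieved_nodes, gold_nodes)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```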
OpenAI is launching a version of ChatGPT for college students (MIT Technology Review, July 29, 2025 at 5:18 pm)
OpenAI is launching Study Mode, a version of ChatGPT for college students that it promises will act less like a lookup tool and more like a friendly, always-available tutor. It’s part of a wider push by the company to get AI more embedded into classrooms when the new academic year starts in September. A demonstration…
Safe AI Usage: The Complete Workplace Guide (Derrick D. Jackson, Founder & Senior Director of Cloud Security Architecture & Risk; CISSP, CRISC, CCSP; last updated July 28, 2025)
Scope: This Safe AI Usage guide covers AI usage for typical business applications (content creation, analysis, customer support) used by knowledge workers. For organizations developing high-risk AI systems or […]
Model Tampering Attacks Enable More Rigorous Evaluations of LLM Capabilities (cs.AI updates on arXiv.org, July 28, 2025 at 4:00 am)
arXiv:2502.05209v4 Announce Type: replace-cross
Abstract: Evaluations of large language model (LLM) risks and capabilities are increasingly being incorporated into AI risk management and governance frameworks. Currently, most risk evaluations are conducted by designing inputs that elicit harmful behaviors from the system. However, this approach suffers from two limitations. First, input-output evaluations cannot fully evaluate realistic risks from open-weight models. Second, the behaviors identified during any particular input-output evaluation can only lower-bound the model’s worst-possible-case input-output behavior. As a complementary method for eliciting harmful behaviors, we propose evaluating LLMs with model tampering attacks which allow for modifications to latent activations or weights. We pit state-of-the-art techniques for removing harmful LLM capabilities against a suite of 5 input-space and 6 model tampering attacks. In addition to benchmarking these methods against each other, we show that (1) model resilience to capability elicitation attacks lies on a low-dimensional robustness subspace; (2) the success rate of model tampering attacks can empirically predict and offer conservative estimates for the success of held-out input-space attacks; and (3) state-of-the-art unlearning methods can easily be undone within 16 steps of fine-tuning. Together, these results highlight the difficulty of suppressing harmful LLM capabilities and show that model tampering attacks enable substantially more rigorous evaluations than input-space attacks alone.
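The abstract's headline finding is that state-of-the-art unlearning can be undone within 16 steps of fine-tuning. As a hedged sketch of that fine-tuning flavour of model tampering, the toy network and synthetic data below stand in for an unlearned LLM and capability-eliciting data; only the 16-step budget comes from the abstract.

```python
# Sketch of fine-tuning-based model tampering: take a model whose capability
# was suppressed, run a small number of gradient steps on data exercising
# that capability, then measure how much of the capability returns.
# The tiny regression net and data are placeholders, not the paper's setup.
import torch
import torch.nn as nn

torch.manual_seed(0)

model = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))  # stand-in for an unlearned LLM
x = torch.randn(64, 8)          # stand-in for capability-eliciting inputs
y = x.sum(dim=1, keepdim=True)  # stand-in for the suppressed behaviour

loss_fn = nn.MSELoss()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)

print("loss before tampering:", loss_fn(model(x), y).item())

# The abstract reports unlearning being undone within 16 fine-tuning steps;
# the same step budget is mirrored here on the toy task.
for _ in range(16):
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()

print("loss after 16 steps:", loss_fn(model(x), y).item())
```

A sharply lower loss after tampering corresponds, in the paper's setting, to the suppressed capability resurfacing.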
Chinese universities want students to use more AI, not less (MIT Technology Review, July 28, 2025 at 9:00 am)
Just two years ago, Lorraine He, now a 24-year-old law student, was told to avoid using AI for her assignments. At the time, to get around a national block on ChatGPT, students had to buy a mirror-site version from a secondhand marketplace. Its use was common, but it was at best tolerated and more often…
Distilling a Small Utility-Based Passage Selector to Enhance Retrieval-Augmented Generation (cs.AI updates on arXiv.org, July 28, 2025 at 4:00 am)
arXiv:2507.19102v1 Announce Type: cross
Abstract: Retrieval-augmented generation (RAG) enhances large language models (LLMs) by incorporating retrieved information. The standard retrieval process prioritizes relevance, focusing on topical alignment between queries and passages. In RAG, by contrast, the emphasis has shifted to utility, which considers the usefulness of passages for generating accurate answers. Despite empirical evidence showing the benefits of utility-based retrieval in RAG, the high computational cost of using LLMs for utility judgments limits the number of passages evaluated. This restriction is problematic for complex queries requiring extensive information. To address this, we propose a method to distill the utility judgment capabilities of LLMs into smaller, more efficient models. Our approach focuses on utility-based selection rather than ranking, enabling dynamic passage selection tailored to specific queries without the need for fixed thresholds. We train student models to learn pseudo-answer generation and utility judgments from teacher LLMs, using a sliding window method that dynamically selects useful passages. Our experiments demonstrate that utility-based selection provides a flexible and cost-effective solution for RAG, significantly reducing computational costs while improving answer quality. We present the distillation results using Qwen3-32B as the teacher model for both relevance ranking and utility-based selection, distilled into RankQwen1.7B and UtilityQwen1.7B. Our findings indicate that for complex questions, utility-based selection is more effective than relevance ranking in enhancing answer generation performance. We will release the relevance ranking and utility-based selection annotations for the MS MARCO dataset, supporting further research in this area.
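The abstract describes sliding-window, utility-based selection with no fixed threshold; the sketch below mirrors that shape with a keyword-overlap judge standing in for a distilled student such as UtilityQwen1.7B. The window logic is a simplification for illustration, not the paper's exact procedure.

```python
# Simplified sketch of utility-based passage selection with a sliding window:
# scan candidates window by window and keep the passages the utility judge
# deems useful for answering the query (no fixed score threshold or top-k).

def judge_utility(query: str, passage: str) -> bool:
    """Toy utility judgment via term overlap; stands in for a distilled model."""
    q_terms = set(query.lower().split())
    p_terms = set(passage.lower().split())
    return len(q_terms & p_terms) >= 2

def select_passages(query: str, passages: list[str], window: int = 3) -> list[str]:
    """Slide a fixed-size window over candidates, keeping judged-useful ones."""
    selected: list[str] = []
    for start in range(0, len(passages), window):
        for passage in passages[start:start + window]:
            if passage not in selected and judge_utility(query, passage):
                selected.append(passage)
    return selected

query = "what causes tides on earth"
candidates = [
    "Tides on earth are caused mainly by the moon's gravity.",
    "The stock market closed higher on Tuesday.",
    "Solar gravity also raises smaller tides on earth.",
    "Sourdough bread recipes vary by region.",
]
print(select_passages(query, candidates))
```

Because selection is a per-passage judgment rather than a ranking cut-off, the number of passages kept adapts to the query, which is the flexibility the abstract emphasizes.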
CCSP Certification Overview for 2025: Why This Certification Commands $171,524 Average Salary (Derrick Jackson & Lisa Yu)
Cloud breaches cost companies an average of $5.17 million per incident, according to IBM’s 2024 Cost of a Data Breach Report¹. Most organizations rushing to the cloud don’t realize they’re fundamentally changing their security responsibilities. The old perimeter-based […]
Why Network+ Still Matters When Everything’s Going Cloud-Native (Derrick Jackson & Lisa Yu)
Cloud computing didn’t eliminate networking. It made it exponentially more complex. Organizations thought they’d move everything to the cloud and networking would become someone else’s problem. Instead, they discovered hybrid environments require deeper networking knowledge than ever before. You’re not just managing […]