The Machine Learning Lessons I’ve Learned This Month
Towards Data Science, August 31, 2025 at 2:00 pm
August 2025: logging, lab notebooks, overnight runs
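The teaser centers on experiment logging for overnight runs. As a minimal sketch of that idea (not the author's code; the run name and log directory are hypothetical), a logger that mirrors output to the console and a timestamped file turns an unattended run into a lab-notebook entry you can audit the next morning:

```python
import logging
from datetime import datetime
from pathlib import Path

def make_run_logger(run_name: str, log_dir: str = "logs") -> logging.Logger:
    """Create a logger that writes to both the console and a timestamped file.

    The file doubles as a lightweight lab-notebook entry for overnight runs.
    """
    Path(log_dir).mkdir(exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    logger = logging.getLogger(run_name)
    logger.setLevel(logging.INFO)
    fmt = logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    for handler in (
        logging.StreamHandler(),
        logging.FileHandler(Path(log_dir) / f"{run_name}-{stamp}.log"),
    ):
        handler.setFormatter(fmt)
        logger.addHandler(handler)
    return logger

logger = make_run_logger("overnight-train")  # hypothetical run name
logger.info("epoch=3 loss=0.412 lr=1e-4")
```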
Author: Derrick D. Jackson
Title: Founder & Senior Director of Cloud Security Architecture & Risk
Credentials: CISSP, CRISC, CCSP
Last updated: September 1st, 2025
Pressed for time? Review or download our 2-3 min Quick Slides or the 5-7 min Article Insights to gain knowledge with the time you have! […]
How to Import Pre-Annotated Data into Label Studio and Run the Full Stack with Docker
Towards Data Science, August 29, 2025 at 12:30 pm
From VOC to JSON: Importing pre-annotations made simple
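Label Studio imports pre-annotations as JSON tasks whose `predictions[].result` entries carry percent-based bounding boxes. A minimal sketch of the VOC-to-JSON step (field names follow the documented `rectanglelabels` schema; the file paths, `from_name`/`to_name` values, and image URL are hypothetical and must match your labeling config):

```python
import json
import xml.etree.ElementTree as ET

def voc_to_labelstudio_task(xml_path: str, image_url: str) -> dict:
    """Convert one Pascal VOC annotation file into a Label Studio task
    carrying pre-annotations as percent-based rectanglelabels."""
    root = ET.parse(xml_path).getroot()
    img_w = float(root.find("size/width").text)
    img_h = float(root.find("size/height").text)
    results = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        xmin, ymin = float(box.find("xmin").text), float(box.find("ymin").text)
        xmax, ymax = float(box.find("xmax").text), float(box.find("ymax").text)
        results.append({
            "from_name": "label",         # must match the labeling config
            "to_name": "image",
            "type": "rectanglelabels",
            "value": {
                "x": xmin / img_w * 100,  # Label Studio uses percentages
                "y": ymin / img_h * 100,
                "width": (xmax - xmin) / img_w * 100,
                "height": (ymax - ymin) / img_h * 100,
                "rectanglelabels": [obj.find("name").text],
            },
        })
    return {"data": {"image": image_url},
            "predictions": [{"model_version": "voc-import", "result": results}]}

# Hypothetical paths; the resulting JSON imports via the Label Studio UI or API.
tasks = [voc_to_labelstudio_task("annotations/img1.xml", "/data/local-files/?d=img1.jpg")]
with open("tasks.json", "w") as f:
    json.dump(tasks, f, indent=2)
```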
Toward Digital Well-Being: Using Generative AI to Detect and Mitigate Bias in Social Networks
Towards Data Science, August 29, 2025 at 2:45 pm
This research answered the question: How can machine learning and artificial intelligence help us to unlearn bias?
Surveying the Operational Cybersecurity and Supply Chain Threat Landscape when Developing and Deploying AI Systems
cs.AI updates on arXiv.org, August 29, 2025 at 4:00 am
arXiv:2508.20307v1 Announce Type: cross
Abstract: The rise of AI has transformed the software and hardware landscape, enabling powerful capabilities through specialized infrastructures, large-scale data storage, and advanced hardware. However, these innovations introduce unique attack surfaces and objectives which traditional cybersecurity assessments often overlook. Cyber attackers are shifting their objectives from conventional goals like privilege escalation and network pivoting to manipulating AI outputs to achieve desired system effects, such as slowing system performance, flooding outputs with false positives, or degrading model accuracy. This paper serves to raise awareness of the novel cyber threats that are introduced when incorporating AI into a software system. We explore the operational cybersecurity and supply chain risks across the AI lifecycle, emphasizing the need for tailored security frameworks to address evolving threats in the AI-driven landscape. We highlight previous exploitations and provide insights from working in this area. By understanding these risks, organizations can better protect AI systems and ensure their reliability and resilience.
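One concrete control implied by the abstract's supply-chain framing is integrity-checking model artifacts before they enter a pipeline. A minimal sketch, not the paper's method (the digest registry and file path are hypothetical; the pinned hash would come from a trusted source):

```python
import hashlib

# Hypothetical registry of known-good SHA-256 digests for model artifacts.
EXPECTED_DIGESTS = {
    "models/classifier-v3.onnx": "9f2b...replace-with-pinned-digest...",
}

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the file so large model weights never need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_artifact(path: str) -> None:
    """Refuse to load any artifact whose digest is unknown or mismatched."""
    expected = EXPECTED_DIGESTS.get(path)
    if expected is None or sha256_of(path) != expected:
        raise RuntimeError(f"refusing to load unverified artifact: {path}")
```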
Poison Once, Refuse Forever: Weaponizing Alignment for Injecting Bias in LLMs
cs.AI updates on arXiv.org, August 29, 2025 at 4:00 am
arXiv:2508.20333v1 Announce Type: cross
Abstract: Large Language Models (LLMs) are aligned to meet ethical standards and safety requirements by training them to refuse answering harmful or unsafe prompts. In this paper, we demonstrate how adversaries can exploit LLMs’ alignment to implant bias or enforce targeted censorship without degrading the model’s responsiveness to unrelated topics. Specifically, we propose Subversive Alignment Injection (SAI), a poisoning attack that leverages the alignment mechanism to trigger refusal on specific topics or queries predefined by the adversary. Although it is perhaps not surprising that refusal can be induced through overalignment, we demonstrate how this refusal can be exploited to inject bias into the model. Surprisingly, SAI evades state-of-the-art poisoning defenses, including LLM state forensics, as well as robust aggregation techniques that are designed to detect poisoning in federated learning (FL) settings. We demonstrate the practical dangers of this attack by illustrating its end-to-end impacts on LLM-powered application pipelines. For chat-based applications such as ChatDoctor, with 1% data poisoning, the system refuses to answer healthcare questions for a targeted racial category, leading to high bias ($\Delta DP$ of 23%). We also show that bias can be induced in other NLP tasks: for a resume selection pipeline aligned to refuse to summarize CVs from a selected university, high bias in selection ($\Delta DP$ of 27%) results. Even higher bias ($\Delta DP$ of ~38%) results on 9 other chat-based downstream applications.
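The abstract reports bias as the demographic parity difference ($\Delta DP$): the absolute gap in refusal rates between the targeted group and everyone else. A minimal sketch of the metric (the record layout and audit log are hypothetical, not from the paper):

```python
def demographic_parity_diff(records, group_key, outcome_key, target_group):
    """Delta DP: |P(outcome | target group) - P(outcome | other groups)|.

    `records` is an iterable of dicts; a refusal-rate gap of 0.23
    corresponds to the paper's reported Delta DP of 23%.
    """
    in_group = [r[outcome_key] for r in records if r[group_key] == target_group]
    out_group = [r[outcome_key] for r in records if r[group_key] != target_group]
    rate = lambda xs: sum(xs) / len(xs) if xs else 0.0
    return abs(rate(in_group) - rate(out_group))

# Hypothetical audit log of refusals from an LLM chat application.
log = [
    {"group": "A", "refused": 1},
    {"group": "A", "refused": 1},
    {"group": "B", "refused": 0},
    {"group": "B", "refused": 1},
]
print(demographic_parity_diff(log, "group", "refused", "A"))  # 0.5
```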
Data-Efficient Symbolic Regression via Foundation Model Distillation
cs.AI updates on arXiv.org, August 28, 2025 at 4:00 am
arXiv:2508.19487v1 Announce Type: cross
Abstract: Discovering interpretable mathematical equations from observed data (a.k.a. equation discovery or symbolic regression) is a cornerstone of scientific discovery, enabling transparent modeling of physical, biological, and economic systems. While foundation models pre-trained on large-scale equation datasets offer a promising starting point, they often suffer from negative transfer and poor generalization when applied to small, domain-specific datasets. In this paper, we introduce EQUATE (Equation Generation via QUality-Aligned Transfer Embeddings), a data-efficient fine-tuning framework that adapts foundation models for symbolic equation discovery in low-data regimes via distillation. EQUATE combines symbolic-numeric alignment with evaluator-guided embedding optimization, enabling a principled embedding-search-generation paradigm. Our approach reformulates discrete equation search as a continuous optimization task in a shared embedding space, guided by data-equation fitness and simplicity. Experiments across three standard public benchmarks (Feynman, Strogatz, and black-box datasets) demonstrate that EQUATE consistently outperforms state-of-the-art baselines in both accuracy and robustness, while preserving low complexity and fast inference. These results highlight EQUATE as a practical and generalizable solution for data-efficient symbolic regression in foundation model distillation settings.
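EQUATE's embedding search is guided by both data-equation fitness and simplicity. A minimal sketch of such a composite objective (the linear weighting, the node-count complexity proxy, and the toy candidates are assumptions for illustration, not the paper's exact formulation):

```python
import numpy as np

def candidate_score(expr_fn, n_nodes, X, y, alpha=0.01):
    """Composite objective: negative MSE minus a complexity penalty.

    expr_fn : callable evaluating a candidate equation on X
    n_nodes : expression-tree size (simplicity proxy; an assumption)
    alpha   : fit/simplicity trade-off weight (an assumption)
    """
    pred = expr_fn(X)
    mse = float(np.mean((pred - y) ** 2))
    return -mse - alpha * n_nodes

# Toy usage: score two hypothetical candidates for the target y = 2x.
X = np.linspace(0, 1, 50)
y = 2 * X
print(candidate_score(lambda x: 2 * x, n_nodes=3, X=X, y=y))         # exact and simple
print(candidate_score(lambda x: 2 * x + x**3, n_nodes=7, X=X, y=y))  # worse fit, larger tree
```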
What Rollup News says about battling disinformation
AI News, August 28, 2025 at 7:41 am
Swarm Network, a platform developing decentralised protocols for AI agents, recently announced the successful results of its first Swarm, a tool (perhaps “organism” is the better term) built to tackle disinformation. Called Rollup News, the swarm is not an app, a software platform, or a centralised algorithm. It is a decentralised collection of AI agents
Get AI-Ready: How to Prepare for a World of Agentic AI as Tech Professionals
Towards Data Science, August 27, 2025 at 6:30 pm
Explore how Agentic AI is reshaping tech careers, from data to decision-making, and how professionals can prepare for the future of work
AI security wars: Can Google Cloud defend against tomorrow’s threats?
AI News, August 28, 2025 at 11:02 am
In Google’s sleek Singapore office at Block 80, Level 3, Mark Johnston stood before a room of technology journalists at 1:30 PM with a startling admission: after five decades of cybersecurity evolution, defenders are still losing the war. “In 69% of incidents in Japan and Asia Pacific, organisations were notified of their own breaches by