AI Threat Landscape 2026: Supply Chain, Adversarial Attacks, and Data Security
A practitioner's guide to the threats targeting AI systems in production, from training data poisoning to model extraction and privacy violations.
AI systems face a threat landscape that is fundamentally different from traditional software. Attacks can happen before a model ever reaches production, during training or through the data supply chain, and at inference time through carefully crafted inputs designed to break model behavior. And the stakes keep rising: organizations are deploying AI into hiring, healthcare, financial services, and critical infrastructure, where a single compromised model can cause cascading harm.
This guide maps the current AI threat landscape across six dimensions: supply chain risk, adversarial attacks, data leakage, bias failures, adversarial testing requirements, and risk register integration. Every threat is tied to actionable controls from NIST AI RMF, ISO 42001, the EU AI Act, OWASP Top 10 for LLM, and MITRE ATLAS.
AI Supply Chain Risk
The AI supply chain introduces attack surface at every layer, from training data sourcing through model deployment. Unlike traditional software supply chains, AI supply chains include data provenance, model lineage, and learned behaviors that are difficult to audit after the fact.
Training Data Poisoning
Corrupted training data introduces biases, backdoors, or targeted misclassifications that persist through retraining. Even small fractions of poisoned training data can embed persistent triggers that survive model updates.
Critical · MITRE ATLAS T0020
Model Integrity
Tampered weights, unauthorized modifications, or compromised model registries allow attackers to alter model behavior without detection. Supply chain attacks on model hosting platforms are increasing.
Critical · OWASP LLM03
Dependency Attacks
Compromised ML libraries, frameworks, and packages (PyTorch, TensorFlow, Hugging Face) introduce vulnerabilities through the software supply chain. Malicious packages mimicking popular ML tools appear regularly on package registries.
High · NIST GOVERN 6
Data Provenance
Unknown origin of training datasets introduces unquantified risk. Web-scraped data may contain copyrighted material, biased samples, or deliberately planted adversarial examples. Without provenance tracking, organizations cannot verify data integrity.
High · ISO 42001 A.7
Transfer Learning Risks
Fine-tuning pre-trained models inherits biases, vulnerabilities, and potentially backdoored behaviors from the base model. Downstream users rarely audit the full training lineage of foundation models they build on.
Medium · NIST MAP 3.4
Open-Source Model Risks
Unaudited community models are deployed in production without security review. Model cards may be incomplete or misleading. Fine-tuned variants may strip safety training or introduce new vulnerabilities not present in the base release.
High · OWASP LLM03
Adversarial Attacks on Production Models
Production AI systems face four primary adversarial attack categories. Each targets a different phase of the ML pipeline and requires distinct detection and mitigation strategies.
Evasion Attacks
Carefully crafted inputs that cause misclassification at inference time. Adversaries add imperceptible perturbations to images, audio, or text to change model outputs while appearing normal to human reviewers. These attacks do not require access to model internals.
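The mechanics are easiest to see on a toy model. Below is a minimal FGSM-style sketch against a hypothetical logistic-regression classifier (all weights and inputs are synthetic, not from any real system): the attacker perturbs the input by epsilon times the sign of the loss gradient with respect to that input.

```python
import numpy as np

# Minimal FGSM sketch on a toy logistic-regression "model".
# Weights, data, and epsilon here are illustrative.

rng = np.random.default_rng(0)
w = rng.normal(size=8)          # hypothetical model weights
b = 0.1
x = rng.normal(size=8)          # a benign input
y = 1.0                         # true label

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(x):
    return sigmoid(w @ x + b)

# Gradient of binary cross-entropy loss w.r.t. the INPUT (not the weights):
# for logistic regression, dL/dx = (p - y) * w.
p = predict(x)
grad_x = (p - y) * w

# FGSM: step in the direction that increases the loss, bounded by epsilon.
epsilon = 0.5
x_adv = x + epsilon * np.sign(grad_x)

print(f"clean score: {predict(x):.3f}")
print(f"adversarial score: {predict(x_adv):.3f}")  # pushed toward misclassification
```

Note that the perturbation budget epsilon bounds each feature change, which is why such inputs can look normal to human reviewers while flipping the model's decision.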
Model Extraction
Stealing model parameters by systematically querying a production API and using the input-output pairs to train a replica. The resulting stolen model can then be used to craft evasion attacks or sold to competitors. API rate limiting is necessary but not sufficient.
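The query-and-replicate loop can be illustrated with a toy linear model standing in for a production API (the `victim_api` function and its hidden weights are hypothetical): the attacker only sees input-output pairs, yet recovers the parameters.

```python
import numpy as np

# Sketch of model extraction against a hypothetical prediction API.
# The attacker never sees true_w, only the API's outputs.

rng = np.random.default_rng(1)
true_w = rng.normal(size=5)              # hidden victim parameters

def victim_api(X):
    return X @ true_w                    # attacker observes outputs only

# Attacker systematically queries the API and records input/output pairs.
X_queries = rng.normal(size=(200, 5))
y_observed = victim_api(X_queries)

# Fit a replica on the stolen pairs (ordinary least squares).
stolen_w, *_ = np.linalg.lstsq(X_queries, y_observed, rcond=None)

print("max weight error:", np.max(np.abs(stolen_w - true_w)))
```

Real models need far more queries and approximate recovery, but the principle is the same, which is why rate limiting alone does not stop a patient attacker.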
Model Inversion
Reconstructing sensitive training data from model outputs. Attackers exploit the fact that models memorize patterns from their training data and can be coerced into revealing private information through targeted queries. Particularly dangerous for models trained on PII or medical data.
Data Poisoning (Runtime)
Corrupting model behavior through feedback loops and online learning mechanisms. Unlike training-time poisoning, runtime poisoning targets models that continue learning from production data. Attackers influence retraining datasets by manipulating user interactions.
Data Leakage and Privacy Violations
AI systems create unique data leakage vectors that traditional DLP tools do not catch. Models can inadvertently memorize and reproduce sensitive data, creating compliance exposure under GDPR, CCPA, and the EU AI Act.
Training Data Memorization
Large language models memorize and can reproduce verbatim passages from training data, including sensitive documents, credentials, and proprietary code. Research shows models with more parameters memorize more data, and deduplication of training data only partially mitigates the risk.
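A basic memorization check can be sketched as an n-gram overlap scan between model outputs and the training corpus (the corpus, output string, and 8-token threshold below are all illustrative):

```python
# Sketch: flag verbatim training-data reproduction by checking whether a
# model's output shares a long n-gram with any training document.

def ngrams(tokens, n):
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def memorization_hits(output: str, corpus: list[str], n: int = 8):
    """Return training docs sharing an n-token verbatim span with the output."""
    out_grams = ngrams(output.split(), n)
    return [doc for doc in corpus if ngrams(doc.split(), n) & out_grams]

training_corpus = [
    "the api key for the staging environment is sk-test-1234 do not share",
    "quarterly revenue figures are confidential until the public filing",
]
model_output = ("here is an example: the api key for the staging "
                "environment is sk-test-1234 do not share it with anyone")

print(memorization_hits(model_output, training_corpus))  # flags the first doc
```

Production-grade checks use suffix arrays or bloom filters to scale this scan to billions of tokens, but the detection logic is the same.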
PII Exposure in Outputs
Models generating text, recommendations, or summaries may surface personal data from training sets. Names, addresses, phone numbers, and email addresses have been extracted from production LLMs through targeted prompting. EU AI Act Art. 10 requires data governance to prevent this.
Inference Attacks
Deriving private information from model behavior without direct data exposure. Attackers infer sensitive attributes (health status, financial standing, protected characteristics) by observing how a model responds to carefully constructed queries. Model confidence scores make this easier.
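A minimal confidence-threshold membership inference test, using synthetic confidence scores in place of a real model's outputs, shows why exposing confidence scores matters: models tend to be more confident on records they were trained on.

```python
import numpy as np

# Sketch of a confidence-threshold membership inference attack.
# The beta-distributed scores are synthetic stand-ins for a real model's
# confidence on training members vs. unseen records.

rng = np.random.default_rng(2)
member_conf = rng.beta(8, 2, size=500)     # overconfident on training data
nonmember_conf = rng.beta(4, 4, size=500)  # less confident on unseen data

def infer_membership(confidence, threshold=0.7):
    """Guess 'was in the training set' when confidence exceeds the threshold."""
    return confidence > threshold

tpr = infer_membership(member_conf).mean()      # members correctly flagged
fpr = infer_membership(nonmember_conf).mean()   # non-members wrongly flagged
print(f"attack TPR={tpr:.2f}, FPR={fpr:.2f}")   # a large gap means leakage
```

Returning only the top label, or rounding confidence scores, narrows this gap, which is why limiting output granularity is a standard mitigation.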
Cross-Tenant Data Leakage
Multi-tenant AI services where data from one customer influences outputs for another. Shared infrastructure, shared fine-tuning pipelines, and insufficient tenant isolation in AI platforms create risks that do not exist in traditional SaaS. Context window contamination in shared LLM services is a growing concern.
Bias and Fairness Failures as Risk Events
Bias is not just an ethics concern. It is a risk event with legal, financial, and reputational consequences. The EU AI Act and multiple US state laws now impose penalties for discriminatory AI outputs. Treating bias as a threat means detecting, measuring, and remediating it with the same rigor as any security vulnerability.
Historical Bias Amplification
Models trained on historical data encode and amplify existing societal biases. Hiring algorithms, credit scoring, and predictive policing have all demonstrated this pattern. The model does not create bias, but it scales it to every decision at machine speed.
Representation Bias
Training data that underrepresents or overrepresents certain populations produces models that perform poorly for minority groups. Face recognition accuracy disparities, language model performance gaps, and medical AI misdiagnosis patterns all trace to representation bias.
Measurement Bias
Evaluation metrics that mask performance disparities across subgroups. Aggregate accuracy hides the fact that a model works well for some populations and fails for others. Disaggregated evaluation by protected characteristic is required by EU AI Act Art. 9.
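A toy example (labels, predictions, and group tags are synthetic) shows how this hiding works: aggregate accuracy looks uniform while the per-group numbers reveal a complete failure for one subgroup.

```python
import numpy as np

# Sketch of disaggregated evaluation: aggregate accuracy masks a
# subgroup failure. All data below is illustrative.

y_true = np.array([1, 0, 1, 1, 0, 1, 0, 1, 1, 0])
y_pred = np.array([1, 0, 1, 1, 0, 0, 1, 0, 0, 1])
group  = np.array(["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"])

def accuracy(mask):
    return float((y_true[mask] == y_pred[mask]).mean())

overall = accuracy(np.ones_like(y_true, dtype=bool))
per_group = {g: accuracy(group == g) for g in np.unique(group)}

# Aggregate hides the disparity: group A is perfect, group B always wrong.
print(f"overall={overall:.2f}, per-group={per_group}")
```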
Adversarial Testing Requirements
The NIST AI RMF MEASURE function establishes adversarial testing requirements across multiple subcategories (MEASURE 2.6 for safety/robustness, MEASURE 2.11 for fairness/bias). These are not optional hardening exercises. They are required assurance activities that feed directly into risk scoring and go/no-go decisions.
🔍 Bias Audit Methodology
- Disaggregated performance testing across protected characteristics
- Fairness metrics: demographic parity, equalized odds, calibration
- Intersectional analysis (multiple protected attributes combined)
- Pre-deployment and post-deployment monitoring cadence
- Documented remediation thresholds and escalation triggers
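Two of the fairness metrics in the checklist above can be sketched in a few lines (all data is synthetic; `A` and `B` stand in for groups defined by a protected attribute):

```python
import numpy as np

# Sketch of demographic parity and the TPR component of equalized odds.
# Predictions, labels, and group membership are illustrative.

y_true = np.array([1, 1, 0, 0, 1, 1, 0, 0])
y_pred = np.array([1, 1, 1, 0, 1, 0, 0, 0])
group  = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])

def selection_rate(g):
    """Fraction of group g receiving a positive prediction."""
    return float(y_pred[group == g].mean())

def tpr(g):
    """True-positive rate within group g."""
    m = (group == g) & (y_true == 1)
    return float(y_pred[m].mean())

# Demographic parity gap: difference in positive-prediction rates.
dp_gap = abs(selection_rate("A") - selection_rate("B"))
# Equalized-odds (TPR) gap: difference in true-positive rates.
tpr_gap = abs(tpr("A") - tpr("B"))

print(f"demographic parity gap={dp_gap:.2f}, TPR gap={tpr_gap:.2f}")
```

Remediation thresholds are then expressed against these gaps, e.g. "escalate if any gap exceeds 0.1", tying the audit directly to the documented triggers above.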
🛡 Robustness Testing
- Adversarial example generation (FGSM, PGD, C&W attacks)
- Edge case and boundary condition testing
- Distribution shift detection and out-of-distribution inputs
- Stress testing under data quality degradation
- Backdoor and trojan detection scanning
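One item from this list, distribution-shift detection, can be sketched as a simple z-score check against training statistics (the 4-sigma threshold and data are illustrative; production systems typically layer on stronger detectors):

```python
import numpy as np

# Sketch of an out-of-distribution input check: flag inputs whose
# features fall far outside the training distribution. Data and the
# z_max threshold are illustrative.

rng = np.random.default_rng(3)
X_train = rng.normal(loc=0.0, scale=1.0, size=(1000, 4))
mu, sigma = X_train.mean(axis=0), X_train.std(axis=0)

def is_ood(x, z_max=4.0):
    """Flag an input if any feature is more than z_max stds from training mean."""
    return bool((np.abs((x - mu) / sigma) > z_max).any())

in_dist = np.zeros(4)                       # a typical input
shifted = np.array([0.0, 9.0, 0.0, 0.0])    # one wildly shifted feature

print(is_ood(in_dist), is_ood(shifted))
```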
🔐 Security Assessment
- Penetration testing targeting ML-specific attack vectors
- Model extraction resistance testing (API abuse scenarios)
- Data exfiltration pathway analysis
- Authentication and authorization review for model endpoints
- Supply chain integrity verification
⚠ Fail-Safe Validation
- Graceful degradation under adversarial conditions
- Fallback behavior when model confidence drops below threshold
- Human-in-the-loop escalation triggers
- Kill-switch functionality verification
- Recovery time objectives for compromised models
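The confidence-threshold fallback in this checklist can be sketched as a wrapper around an inference call (the `model` function, its scores, and the 0.8 threshold are hypothetical stand-ins):

```python
# Sketch of confidence-based fail-safe routing: low-confidence
# predictions go to a human queue instead of being acted on.

def model(request: str) -> tuple[str, float]:
    # Hypothetical model returning (decision, confidence).
    return ("approve", 0.42) if "edge-case" in request else ("approve", 0.95)

def guarded_predict(request: str, threshold: float = 0.8):
    decision, confidence = model(request)
    if confidence < threshold:
        # Human-in-the-loop escalation trigger.
        return {"decision": None, "route": "human_review", "confidence": confidence}
    return {"decision": decision, "route": "auto", "confidence": confidence}

print(guarded_predict("routine request"))
print(guarded_predict("edge-case request"))
```

Validating this path means testing that the escalation actually fires under adversarial and degraded conditions, not just that the threshold exists in config.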
How Threat Intelligence Feeds the Risk Register
Threat identification is only useful if it drives action. Here is how each threat maps to your risk register, triggers reassessment, and connects to continuous monitoring.
Threat Identified
New attack vector, vulnerability disclosure, or incident report from MITRE ATLAS, OWASP, or CSA feeds
Likelihood Scoring
A 5×5 matrix assessment: likelihood (1–5) × impact (1–5) produces a risk score that determines response priority
Control Mapping
New threats trigger risk reassessment for all affected AI systems. Existing controls are evaluated for effectiveness against the new vector.
Continuous Monitoring
Updated detection rules, alerting thresholds, and monitoring dashboards integrate threat intelligence into operational workflows
| Impact ↓ / Likelihood → | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| 5 | 5 | 10 | 15 | 20 | 25 |
| 4 | 4 | 8 | 12 | 16 | 20 |
| 3 | 3 | 6 | 9 | 12 | 15 |
| 2 | 2 | 4 | 6 | 8 | 10 |
| 1 | 1 | 2 | 3 | 4 | 5 |
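The 5×5 scoring step can be sketched in a few lines (the priority band cutoffs below are illustrative; calibrate them to your own risk appetite):

```python
# Sketch of 5x5 risk scoring: likelihood x impact maps to a response
# priority band. Band cutoffs are illustrative, not prescribed by any
# framework.

def risk_score(likelihood: int, impact: int) -> int:
    assert 1 <= likelihood <= 5 and 1 <= impact <= 5
    return likelihood * impact

def priority(score: int) -> str:
    if score >= 15:
        return "critical"   # immediate remediation
    if score >= 8:
        return "high"       # remediate this cycle
    if score >= 4:
        return "medium"     # track and monitor
    return "low"            # accept or defer

score = risk_score(likelihood=4, impact=5)
print(score, priority(score))
```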
Agentic AI Threats
Agent-specific threats like prompt injection, excessive agency, and cascading failures have their own dedicated analysis. These threats are distinct from general AI risks because agents can take autonomous actions, chain tool calls, and interact with external systems without human approval at each step.
Agent Threat Landscape
Full mapping of agent-specific attack vectors, from prompt injection to identity spoofing and multi-agent collusion scenarios.
Read the Analysis →
Agentic Security · Prompt Injection for Agents
How prompt injection attacks change when agents have tool access, persistent memory, and the ability to take real-world actions.
Read the Analysis →
Agentic Security · Tool Misuse and Excessive Agency
When agents call tools they should not, escalate privileges, or take actions beyond their intended scope. The excessive agency problem.
Read the Analysis →
Agentic Security · Agent Incident Response
When an AI agent causes harm: detection, containment, investigation, and recovery procedures tailored for autonomous systems.
Read the Analysis →
Every threat, control, and recommendation in this guide is sourced from primary authoritative frameworks, not opinions.
This article draws from 130+ primary sources across international standards bodies, government agencies, and security research organizations.