Every AI System Has a Security Lifecycle
An AI system is built through a six-stage pipeline, and every stage is a target. From data collection through production monitoring, each phase introduces distinct threats that demand specific skills, frameworks, and roles to defend. This page maps the complete AI security lifecycle so you can see exactly where your career fits.
6 Stages, 6 Attack Surfaces
Click any stage below to explore its threats, security controls, and the career roles that defend it. Each stage maps to specific MITRE ATLAS tactics and Google SAIF elements.
Imagine you are the AI Security Engineer at a healthcare company. Your data science team is building a diagnostic model from patient records sourced from three hospital systems, two public datasets, and a commercial data vendor. Your morning starts with a question no traditional security role would ask: "Can we trust this data?"
You pull the provenance records for last week's data ingestion. The two public datasets have known issues — one was scraped from a forum where users self-reported symptoms, and another was cited in a paper that later retracted findings on label accuracy. The commercial vendor provides a data dictionary but no audit trail for how records were cleaned. You flag this to the data science lead, who pushes back: "We've always used this vendor."
This is the daily tension of Stage 1. Your job is not to block progress — it is to quantify the risk that bad data introduces, document it so auditors can trace it, and recommend controls (statistical validation, access restrictions, provenance logging) that make the pipeline defensible. Nobody will thank you for catching a poisoned dataset. But everyone will blame you if a corrupted model reaches patients.
Grounded in: NIST AI RMF Map function (risk identification); Google SAIF Element 1 (secure foundations); MITRE SAFE-AI Data Environment controls; AI Security Specialist role description
You are the AI Infrastructure Security Specialist at a fintech company. The ML team has a training cluster — 64 A100 GPUs running across 8 nodes in a cloud VPC. They are training a fraud detection model on transaction data that includes PII from 12 million customers. The training job will run for three days.
Your concerns are layered. First, isolation: can anything reach these nodes from the corporate network? You check the VPC peering rules and find that someone added a rule last quarter to allow SSH from a developer bastion host "for debugging." That rule is still live. Second, egress: if an attacker compromised a training node, could they exfiltrate the model weights? You verify that egress is restricted to the model registry and logging endpoints only. Third, integrity: how would you know if someone tampered with the training process mid-run? You implement checkpoint hashing — every 4 hours, the training script writes a hash of the current weights to an immutable log.
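The checkpoint-hashing control described above can be sketched in a few lines. This is a minimal illustration, not a production implementation: the function names, the JSON Lines log format, and the assumption that the log file itself is made append-only and immutable at the storage layer are all hypothetical choices for this sketch.

```python
import hashlib
import json
import time

def hash_checkpoint(ckpt_path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the checkpoint file through SHA-256 without loading it into memory."""
    h = hashlib.sha256()
    with open(ckpt_path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def record_checkpoint(ckpt_path: str, log_path: str) -> dict:
    """Append a hash record for the current weights to an append-only log.

    In a real deployment the log would live on write-once storage so a
    mid-run tamperer cannot rewrite history along with the weights.
    """
    entry = {
        "timestamp": time.time(),
        "checkpoint": ckpt_path,
        "sha256": hash_checkpoint(ckpt_path),
    }
    with open(log_path, "a") as log:
        log.write(json.dumps(entry) + "\n")
    return entry
```

The training script would call `record_checkpoint` on its 4-hour cadence; any later audit can replay the log and re-hash retained checkpoints to detect mid-run tampering.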
The RAND Playbook for Securing AI Model Weights describes model weights as "the crown jewels" of AI organizations. At the training stage, those crown jewels are being forged — and they are at their most vulnerable because they exist in a distributed, compute-intensive environment that engineers are constantly tempted to make more accessible for debugging and iteration.
Grounded in: RAND Playbook for Securing AI Model Weights; MITRE SAFE-AI Platform controls; Google SAIF Element 4 (harmonize platform-level controls)
You are an AI Red Teamer. The product team has built a customer-facing chatbot and wants to ship next Thursday. They hand you the model and say, "We need a security sign-off." You have five working days.
You start with automated adversarial probing using NVIDIA Garak — running hundreds of prompt injection variants, jailbreak templates, and information extraction attempts. Garak catches the obvious things: the model can be tricked into ignoring its system prompt with certain role-play scenarios. The product team patches the system prompt. You run Garak again. Most of the bypasses are gone.
But here is where Microsoft's insight becomes critical: "Red teaming generative AI is probabilistic, not deterministic." Garak cannot tell you if the model will produce harmful content in a conversation that evolves over 15 turns. It cannot evaluate whether the model's advice on a medical question is dangerously misleading but superficially plausible. It cannot assess whether a creative scenario prompt will cause the model to reveal its system instructions in a way that no template anticipated. For that, you need human expertise — people who understand the domain, the threat model, and the social engineering tactics that real adversaries will use.
Your red team report does not say "pass" or "fail." It says: "Here are the residual risks we identified, here is our confidence level in coverage, and here are the attack categories we could not adequately test in five days." The product team makes a risk-informed decision, not a binary one.
Grounded in: Microsoft AI Red Team (3 Takeaways from Red Teaming 100 Products); Google AI Red Team methodology; EU AI Act adversarial testing requirements; NVIDIA Garak documentation
Deployment day. Your team is pushing a fine-tuned language model into production behind a REST API. The ML engineers want to use the same model artifact they tested — downloaded from an internal model registry that mirrors selected models from Hugging Face. You ask three questions that would never come up in a traditional software deployment:
1. "What format are the weights in?" The model was saved in a serialization format that can execute code on load. You require conversion to SafeTensors before deployment — a format designed to be safe for untrusted model files. The ML team says this will take four hours. You explain that four hours now is better than a compromised production server later.
2. "What permissions does the inference container have?" You discover the container has network access to the internal metrics database — a leftover from a debugging session. You strip it. The model serving process should have access to: the model weights (read-only), the input queue, and the output endpoint. Nothing else.
3. "How do we know this is the same model that passed red team evaluation?" You verify the cryptographic hash of the model artifact against the signed hash from the red team's evaluation report. They match. If they did not, the deployment would stop — because any gap between "what was tested" and "what is deployed" is an integrity failure.
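The integrity check in question 3 reduces to a small amount of code. The sketch below is illustrative: the function names are invented here, the comparison assumes you already trust the recorded digest (a real pipeline would verify a cryptographic signature over it), and the hard-stop behavior is modeled as a raised exception.

```python
import hashlib
import hmac

def artifact_sha256(path: str) -> str:
    """Recompute the model artifact's SHA-256 by streaming the file."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def gate_deployment(path: str, evaluated_sha256: str) -> None:
    """Stop the deployment if the artifact differs from what was red-teamed.

    compare_digest gives a constant-time comparison; the gap between
    "what was tested" and "what is deployed" is treated as a hard failure.
    """
    if not hmac.compare_digest(artifact_sha256(path), evaluated_sha256.lower()):
        raise RuntimeError(
            f"integrity check failed for {path}: "
            "deployed artifact does not match the evaluated artifact"
        )
```

Wiring `gate_deployment` into the CI/CD step that promotes the artifact makes the tested/deployed equivalence a pipeline invariant rather than a manual checklist item.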
These are the kinds of questions that make deployment in AI security fundamentally different from traditional DevOps. The OWASP LLM Top 10 (LLM03: Supply Chain Vulnerabilities) exists because organizations learned these lessons the hard way.
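The serialization concern behind question 1 can also be screened mechanically before any artifact reaches a loader. The sketch below is a heuristic file-format gate, not the actual conversion (that would use the safetensors library itself): it relies on the published safetensors layout of an 8-byte little-endian header length followed by a JSON header, and the size ceiling is an arbitrary sanity bound chosen for this example.

```python
import json
import struct

# Sanity ceiling for the header length; an assumption for this sketch.
MAX_HEADER_BYTES = 100_000_000

def looks_like_safetensors(path: str) -> bool:
    """Heuristic check that a file follows the safetensors layout.

    safetensors files begin with an 8-byte little-endian header length,
    then a JSON header. Pickle-based formats fail this check: raw pickle
    streams start with byte 0x80, and torch zip archives start with "PK",
    so the decoded length is implausible or the header is not JSON.
    """
    with open(path, "rb") as f:
        prefix = f.read(8)
        if len(prefix) < 8:
            return False
        (header_len,) = struct.unpack("<Q", prefix)
        if header_len == 0 or header_len > MAX_HEADER_BYTES:
            return False
        header = f.read(header_len)
    try:
        json.loads(header)
        return True
    except ValueError:
        return False
```

A registry ingestion job could run this check and reject anything that is not safetensors, forcing the conversion conversation to happen before, not during, deployment day.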
Grounded in: OWASP LLM Top 10 2025 (LLM03); MITRE SAFE-AI Model integrity controls; Google SAIF Element 1 (secure foundations); SentinelOne AI Model Security CISO Guide
It is 2:00 PM on a Tuesday. You are the AI Security Specialist monitoring a production LLM that handles 50,000 customer interactions per day. Your anomaly detection dashboard flags something: a single user account has made 847 API calls in the last three hours, each with subtly different phrasing of the same question. The confidence scores returned by each call are being recorded — someone is mapping your model's decision boundary.
This is a model extraction attempt in progress. MITRE ATLAS catalogs this as AML.T0024 — Exfiltration via ML Inference API. The attacker is using your own API, with valid credentials, to steal your model. Nothing in a traditional WAF or SIEM would flag this. The requests are well-formed. The authentication is valid. The payloads contain no malicious code.
You implement an immediate rate limit on the account and escalate. But the deeper question is architectural: why was the API returning raw confidence scores at all? Each additional piece of information in an API response — logits, token probabilities, embedding vectors — gives an attacker more signal to reconstruct your model. You draft a proposal to truncate confidence scores to two decimal places and suppress logits entirely from the public API. The ML team pushes back because a downstream application uses those scores for routing logic. Now you are negotiating between security and functionality — the daily reality of inference-stage defense.
Grounded in: MITRE ATLAS AML.T0024; IBM Adversarial Robustness Toolbox documentation; Google SAIF Element 3 (automate defenses); AI Security Specialist role description (TJS)
You are the AI Governance Lead at a mid-size enterprise. The CISO just forwarded you an email from the CTO: "How many AI models do we have in production?" You do not know. Nobody does.
You start building an inventory. The official model registry shows 12 models deployed by the data science team. But as you dig, the real number emerges: Engineering deployed 3 open-source models through a separate pipeline that bypasses the registry. Marketing is using an AI content tool with API access to your CMS. The sales team built a lead scoring model in a no-code platform using exported CRM data. Finance has an LLM summarizing earnings transcripts. None of these went through security review. None have documented owners. HiddenLayer's research quantifies the pattern: 73% of organizations report no clear ownership of AI security responsibilities.
This is the monitoring and governance gap in action. You cannot secure what you cannot see. Your first deliverable is not a policy document — it is a discovery scan. You work with IT to identify AI-related API calls in network logs, cloud billing anomalies that suggest GPU usage, and SaaS subscriptions that include "AI" or "ML" in their service descriptions. Within a week, your inventory goes from 12 models to 31 — and you still suspect you are missing some.
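A first-pass discovery scan of egress logs can be sketched simply. Both the log format (one "client_ip domain" pair per line) and the indicator list below are assumptions for illustration; a real scan would also pull cloud billing data and SaaS inventories, as described above.

```python
# Hypothetical indicator list: domains whose presence in egress logs
# suggests AI service usage. Real scans would maintain a much larger,
# curated list.
AI_SERVICE_INDICATORS = (
    "api.openai.com",
    "api.anthropic.com",
    "generativelanguage.googleapis.com",
    "huggingface.co",
)

def find_ai_traffic(log_lines):
    """Map each matched indicator to the set of internal clients using it.

    Matches exact domains and their subdomains; skips malformed lines
    rather than failing the whole scan.
    """
    hits: dict[str, set] = {}
    for line in log_lines:
        parts = line.split()
        if len(parts) < 2:
            continue
        client, domain = parts[0], parts[1]
        for indicator in AI_SERVICE_INDICATORS:
            if domain == indicator or domain.endswith("." + indicator):
                hits.setdefault(indicator, set()).add(client)
    return hits
```

The output is a starting list of internal consumers to interview, which is how an inventory grows from the 12 registered models toward the real number.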
The ISO 42001 standard exists for exactly this reason: establishing an AI management system that provides the organizational structure to govern AI across the enterprise. Without it, security is reactive. With it, you have a framework for continuous oversight.
Grounded in: HiddenLayer 2026 AI Threat Landscape (73% no clear AI security ownership, 76% shadow AI); ISO/IEC 42001; NIST AI RMF Govern function; Google SAIF Element 6 (contextualize risks in business processes)
Sources: MITRE ATLAS v5.1.0; NIST AI 100-2e2023; Google SAIF; OWASP LLM Top 10 2025; HiddenLayer 2026 AI Threat Landscape; MITRE SAFE-AI Framework
The Frameworks That Map This Lifecycle
Five major frameworks provide the vocabulary and structure for AI security across the lifecycle. Each has a different focus; understanding when to apply which framework is a core competency for AI security professionals.
ATLAS is the ATT&CK equivalent for AI/ML systems. It catalogs real-world adversarial techniques organized by tactic (the adversary's goal) and technique (how they achieve it). It is the primary reference for threat-informed defense of ML systems, covering the full lifecycle from reconnaissance through impact.
ATLAS v5.1.0 represents the most comprehensive catalog of AI-specific threats available, with techniques mapped to 42 documented case studies from real incidents. The MITRE Center for Threat-Informed Defense has also released the SAFE-AI Framework, which maps ATLAS techniques to NIST SP 800-53 security controls.
Use ATLAS when you need to think like an attacker. If you are building a threat model for an AI system, ATLAS gives you the adversary's playbook organized by goal (tactic) and method (technique). Reach for it when: planning red team exercises, writing AI-specific threat models, building detection rules for AI-targeted attacks, or responding to an incident involving model compromise. ATLAS is how you answer the question: "What could an adversary do to our AI system, and how would they do it?"
If you come from traditional security, think of ATLAS as ATT&CK for AI — same structure, same rigor, different domain. If you already use ATT&CK for network defense, ATLAS extends your threat-informed approach to ML systems.
Source: atlas.mitre.org — ATLAS v5.1.0
The most widely referenced risk catalog for LLM-powered applications. Focuses on application-layer risks specific to large language models, from prompt injection (LLM01) to unbounded consumption (LLM10). Essential knowledge for anyone building or securing LLM-based systems.
Key risks: LLM01 Prompt Injection, LLM02 Sensitive Information Disclosure, LLM03 Supply Chain, LLM04 Data and Model Poisoning, LLM05 Improper Output Handling, LLM06 Excessive Agency, LLM07 System Prompt Leakage, LLM08 Vector and Embedding Weaknesses, LLM09 Misinformation, LLM10 Unbounded Consumption.
Use this when you are building or reviewing an LLM-powered application. This is your developer-facing checklist. If your team is integrating an LLM into a product — a chatbot, a RAG system, a code assistant, an agent — walk through each of the 10 risks and ask: "Are we exposed to this?" It is deliberately practical: each risk comes with attack scenarios, prevention strategies, and example payloads.
Where ATLAS tells you how adversaries attack AI systems broadly, the OWASP LLM Top 10 tells you how they attack your application specifically. It is the security review checklist for anyone shipping LLM features. If you do one thing before launching an LLM application, review these ten risks.
Google's conceptual framework for securing AI systems, built on six core elements. SAIF 2.0 extends the original framework with specific controls for agentic AI systems. Includes an interactive Risk Assessment Tool for practitioners and is aligned with the HackTheBox AI Red Teamer certification path.
6 Elements: (1) Expand strong security foundations to AI, (2) Extend detection and response, (3) Automate defenses, (4) Harmonize platform-level controls, (5) Adapt controls based on context, (6) Contextualize AI system risks in business processes.
Use SAIF when you are operationalizing AI security at the infrastructure level. SAIF is the framework for security teams who need to extend existing security programs (IAM, network segmentation, incident response) to cover AI workloads. It is particularly strong on runtime defense, supply chain hardening, and agentic AI controls (SAIF 2.0).
Practically: if you are the security team figuring out how to add AI model serving to your existing cloud security posture, SAIF provides the mental model. Element 1 says "don't reinvent the wheel — extend your existing security foundations." Element 3 says "automate defenses because the attack surface is too large for manual review." Element 4 says "enforce controls at the platform level so individual teams can't misconfigure." This is the framework for security operators, not just security architects.
The U.S. federal framework for managing AI risk across the entire lifecycle. Built on four core functions: Govern (organizational context), Map (risk identification), Measure (risk analysis), and Manage (risk treatment). NIST AI 600-1 extends the framework with generative AI-specific guidance.
While voluntary, NIST AI RMF is becoming the de facto standard for U.S. organizations. Federal agencies are required to align with it, and regulated industries (financial services, healthcare) increasingly use it as their compliance baseline.
Use NIST AI RMF when you need to anchor AI security in organizational governance and policy. This is the framework that speaks to boards, regulators, and risk committees. Its four functions — Govern, Map, Measure, Manage — provide a structured approach to AI risk management that maps to how enterprises already think about risk.
Reach for NIST AI RMF when: establishing your organization's AI risk tolerance, defining roles and responsibilities for AI oversight, building the policy architecture that other frameworks' controls will plug into, or when you need to demonstrate to auditors that you have a defensible AI risk management program. If ATLAS is the adversary's playbook and OWASP is the developer's checklist, NIST AI RMF is the executive's governance structure.
The most comprehensive open-source AI security resource, with 300+ pages covering all AI types (not just LLMs). An OWASP Flagship project that has contributed to ISO/IEC standards development. Covers threats, controls, and best practices across the entire AI lifecycle.
Unlike the LLM Top 10 which focuses on the ten highest risks, the AI Exchange is an exhaustive reference covering traditional ML, deep learning, generative AI, and emerging modalities. It maps to multiple frameworks and is continuously updated by the community.
Use the OWASP AI Exchange when your AI system is not just an LLM. The LLM Top 10 covers language models specifically. The AI Exchange covers everything: computer vision models, recommendation systems, time-series forecasters, reinforcement learning agents, and multi-modal systems. If you are securing a fraud detection model, an autonomous vehicle perception stack, or a medical imaging classifier, this is your primary reference.
It is also the deepest resource available — 300+ pages compared to the LLM Top 10's focused ten risks. Use it when you need to go beyond "what are the top risks" to "what is the complete threat and control landscape for my specific AI system type." Its contribution to ISO/IEC standards means the guidance has been reviewed at the international standards level.
Which Roles Defend Which Stages?
This matrix maps 12 key AI security roles to the lifecycle stages where they operate. A filled circle indicates a primary responsibility; a ring indicates secondary involvement. Use this to understand where your current skills have the most impact.
| Role | Data | Training | Evaluation | Deploy | Inference | Monitoring |
|---|---|---|---|---|---|---|
| AI Security Engineer | ● | ○ | ○ | ● | ● | ○ |
| AI Red Teamer | ○ | ○ | ● | ○ | ● | |
| AI Model Validator | ○ | ● | ● | ○ | | |
| MLOps Governance Engineer | ● | ● | ○ | ● | ● | |
| AI Privacy Engineer | ● | ● | ○ | ● | | |
| Data Governance Manager (AI) | ● | ○ | ● | | | |
| AI Governance Lead | ○ | ○ | ○ | ● | | |
| AI Risk Manager | ○ | ● | ○ | ● | | |
| AI Auditor | ○ | ● | ● | | | |
| AI Compliance Manager | ○ | ● | ● | | | |
| AI Bias Mitigation Specialist | ● | ○ | ● | ○ | | |
| AI Infrastructure Security Specialist | ● | ● | ● | ○ | | |
● Primary responsibility ○ Secondary involvement
Go Deeper: Lifecycle & Framework Resources
What to Read Next
All Sub-Pages in This Series
Why AI Security Matters
The AI Security Lifecycle
Career Transition Playbooks
Frameworks & Practices Deep Dive
Your First 90 Days
Related Tech Jacks Solutions Resources
Ready to explore the 20 AI security career paths?
Explore All 20 AI Security Roles →