What Is Amazon Bedrock? AWS's Foundation Model Platform Explained (2026)
Last verified: May 14, 2026 · Format: Breakdown
Bedrock is not a model. It is a menu. AWS built a fully managed service that gives you API access to over 100 foundation models from 15+ providers, and you never touch a server, provision a GPU, or manage an endpoint.
That distinction matters because it changes who can build with generative AI. If your team already runs infrastructure on AWS, Bedrock turns model selection into a configuration choice rather than an infrastructure project. The same API call that hits Claude today can point to Llama tomorrow. For the 100,000+ organizations already using it, that flexibility is the point.
What Is Amazon Bedrock
Amazon Bedrock is a fully managed, serverless service that provides secure, enterprise-grade access to high-performing foundation models through a single API. It went generally available on September 28, 2023, after an initial preview announcement on April 13, 2023.
The value proposition is straightforward. You pick a model (or several models), call it through a standardized API, and AWS handles the compute, scaling, and infrastructure behind the scenes. No model weights to download. No GPU instances to configure. No cold-start debugging at 2 AM.
Bedrock is a platform, not a model. It provides API access to 100+ models from 15+ providers. You choose which model to call per request.
Need custom training loops or full control over the inference stack? That is SageMaker. Bedrock offers fine-tuning and distillation, not ground-up model training.
Bedrock integrates deeply with S3, Aurora, IAM, and VPC. The model calls are portable via OpenAI-compatible APIs, but the orchestration layer (Knowledge Bases, Agents, Flows) is AWS-native.
The model catalog spans Amazon's own Nova family, Anthropic's Claude, Meta's Llama, Mistral AI, AI21 Labs, Cohere, DeepSeek, Stability AI, and several others. The Bedrock Marketplace adds another 100+ models beyond the core catalog. This multi-provider approach is what separates Bedrock from Azure AI Foundry (which centers on OpenAI) and Google Vertex AI (which centers on Gemini). You are not locked to a single vendor's model roadmap.
Bedrock connects directly to S3, Aurora, OpenSearch, and the broader AWS ecosystem. If your data already lives in AWS, the integration path is short. For background on where Bedrock fits in the broader AI tool landscape, see the AI Tools Hub and the AWS AI Services sub-hub. Ready to build? The How to Use Amazon Bedrock guide walks through setup from zero to first API call.
How Bedrock Works
The architecture follows a pattern familiar to anyone who has used AWS managed services. You make an API call. AWS routes it to the appropriate model endpoint, handles inference compute, and returns the response. You pay per token (or per image, per second of audio, per API call) with no upfront commitment required.
APIs
Bedrock supports five API formats, which means you do not have to rewrite your application when switching models:
- Converse API: AWS's unified format that works across all Bedrock models
- Invoke API: Direct model invocation for provider-native payloads
- Messages API: Anthropic-native format for Claude models
- Chat Completions API: OpenAI-compatible format for migration from GPT-based apps
- Responses API: OpenAI-compatible format for newer OpenAI model patterns
Inference Options
Three routing modes let you balance compliance, latency, and throughput:
- In-Region: Data stays in a single AWS region. Required for strict regulatory compliance (HIPAA, data residency).
- Geo Cross-Region: Routes across regions within a geography (US, EU, or Australia) for higher throughput while keeping data within regulatory boundaries.
- Global Cross-Region: Maximum throughput with routing across all available regions.
Key Features
Bedrock is more than model hosting. AWS has layered enterprise features on top of the inference layer that handle the hard parts of production AI: retrieval, orchestration, safety, and customization.
Knowledge Bases (Managed RAG)
Fully managed retrieval-augmented generation. Connect data sources (S3, Confluence, Salesforce, SharePoint, or a web crawler), and Bedrock handles chunking, embedding, and vector storage. Supported vector stores include Aurora PostgreSQL, OpenSearch Serverless, Neptune Analytics, MongoDB Atlas, Pinecone, and Redis. GraphRAG support adds knowledge graph traversal for complex multi-hop queries. Every response includes source attribution so you can trace answers back to the original document.
Agents and AgentCore
Bedrock Agents handle multi-step task automation with tool use, code interpretation, and memory retention across sessions. Multi-agent collaboration lets you chain specialized agents together for complex workflows.
AgentCore, which reached general availability in late 2025, goes further. It is an end-to-end agentic platform with its own runtime, gateway, memory, identity management, policy engine, code interpreter, browser tool, evaluations, and observability. The key difference: AgentCore works with any framework (LangChain, CrewAI, custom) and any model (not just Bedrock models). The Managed Harness added in April 2026 handles deployment, scaling, and security for agentic workloads.
Guardrails
Six configurable safety policies: content filters with adjustable sensitivity thresholds across six categories, denied topics, word filters, PII detection and redaction, contextual grounding checks, and automated reasoning that uses formal logic for factual validation (AWS Bedrock Guardrails documentation). Two tiers (Standard and Classic) with the ApplyGuardrail API available for models running outside Bedrock. For a deeper dive into configuration options and deployment patterns, see What Is Amazon Bedrock Guardrails.
Custom Models
Four customization paths: supervised fine-tuning (SFT), reinforcement fine-tuning (RFT), continued pre-training on your own data, and model distillation. Distillation lets you compress a large model's capabilities into a smaller, cheaper one - AWS claims up to 500% faster inference and 75% lower cost with distilled models (AWS Bedrock Model Distillation documentation). You can also import custom models trained elsewhere.
Prompt Management and Optimization
Prompt caching with 5-minute and 1-hour TTL windows reduces costs by up to 90% and latency by 85% for repeated prompt prefixes (AWS Bedrock Prompt Caching documentation). Intelligent Prompt Routing automatically selects the most cost-effective model for each request - AWS claims up to 30% cost reduction from routing alone (AWS Bedrock Intelligent Prompt Routing documentation). Prompt Optimization rewrites your prompts for better model performance.
Flows
A visual workflow builder for chaining foundation models, agents, knowledge bases, and guardrails into multi-step pipelines. Drag-and-drop orchestration without writing glue code.
Data Automation
GenAI-powered processing for documents, images, audio, and video. Extracts structured data from unstructured inputs at scale.
The Nova Model Family
Amazon's first-party model lineup launched in late 2024. The Nova family covers text, image, video, and speech, giving AWS a model option at every price point without depending entirely on third-party providers.
The Nova models are priced aggressively. Nova Micro at $0.035 per million input tokens is one of the cheapest foundation models available from any provider. Nova Premier at $2.50/$12.50 (input/output) targets workloads where you need a high-capability model but do not want to pay Claude Opus or GPT-5 pricing.
Models and Pricing
Bedrock's pricing is pay-per-token with no minimum commitment on the Standard tier. The model catalog spans price points from $0.035 per million input tokens (Nova Micro) to $15.00 per million (Claude Opus 4). Verified May 2026.
| Model | Input / 1M | Output / 1M | Notes |
|---|---|---|---|
| Nova Micro | $0.035 | $0.14 | Text only, 128K context |
| Nova Lite | $0.06 | $0.24 | Multimodal, 300K context |
| Nova Pro | $0.80 | $3.20 | Multimodal, 300K, fine-tunable |
| Nova Premier | $2.50 | $12.50 | 1M context, distillation teacher |
| Claude 3.5 Haiku | $0.25 | $1.25 | Anthropic fast tier |
| Mistral Large 3 | $0.50 | $1.50 | Mistral AI flagship |
| Gemma 3 4B | $0.04 | $0.08 | Google open model |
| Claude 3.5 Sonnet v2 | $3.00 | $15.00 | Anthropic balanced tier |
| Claude Opus 4 | $15.00 | $75.00 | Anthropic premium tier |
Service Tiers
- No upfront commitment
- On-demand pricing
- All models available
- Best for variable workloads
- Flexible scheduling
- 50% discount vs Standard
- Async-friendly workloads
- Batch inference available
- Guaranteed high throughput
- Lowest latency routing
- Production-critical workloads
- SLA-backed performance
- Dedicated capacity
- Predictable pricing
- 1-6 month terms
- Highest throughput guarantee
Additional costs: Knowledge Bases $0.001-$0.002/query. Guardrails content filters $0.15/1K text units. Batch inference 50% off on-demand. Prompt caching reduces costs up to 90%. Pricing verified May 14, 2026.
Bedrock vs the Competition
Three platforms dominate managed foundation model access. Each reflects the strategic priorities of its parent company.
| Dimension | Amazon Bedrock | Azure AI Foundry | Google Vertex AI |
|---|---|---|---|
| Model breadth | 100+ models, 15+ providers | OpenAI-centric + select partners | Gemini-centric + Model Garden |
| Flagship model | Claude (Anthropic) + Nova (Amazon) | GPT-5.x (OpenAI) | Gemini 3.x (Google) |
| RAG support | Knowledge Bases (managed) | AI Search + Azure OpenAI | Vertex AI Search + Grounding |
| Agent platform | Agents + AgentCore | Azure AI Agent Service | Vertex AI Agent Builder |
| Safety | Guardrails (6 policies) | Content Safety + Prompt Shields | Responsible AI Toolkit |
| Differentiator | Multi-model choice, AWS integration | Deep Microsoft 365 integration | BigQuery-native ML, Search grounding |
Bedrock's advantage is model diversity. If you need Claude for reasoning, Llama for cost-sensitive batch processing, and Stability AI for image generation, Bedrock lets you use all three through the same API with the same security controls. Azure and Google both steer you toward their in-house flagship models.
The trade-off: if your organization is already deep in Microsoft 365 or Google Workspace, the competing platforms offer tighter integration with those specific productivity ecosystems. Bedrock's ecosystem advantage only applies if you are already on AWS.
Who Should Use Bedrock
Serverless access to 100+ models through a unified API. No GPU provisioning, no endpoint management. Build a GenAI feature with a single API call and let AWS handle the scaling. The OpenAI-compatible API makes migration from existing GPT-based apps straightforward.
Best fit: Standard tier + Converse APIMulti-model strategy without vendor lock-in to a single AI provider. Guardrails, IAM integration, VPC endpoints, and compliance certifications (FedRAMP High, HIPAA, SOC) satisfy enterprise security requirements. AgentCore runs on any framework and any model.
Best fit: Reserved tier + Guardrails + AgentCoreFine-tuning, reinforcement fine-tuning, continued pre-training, and model distillation without managing training infrastructure. Import custom models trained on SageMaker or elsewhere. Model evaluation tools for side-by-side comparison before deployment.
Best fit: Custom models + SageMaker integrationStart with Nova Micro at $0.035/M input tokens, prototype fast, then swap to Claude or Llama when you need more capability. No infrastructure investment. No long-term commitment. Flex tier cuts costs another 50% for batch workloads.
Best fit: Standard/Flex tier + Nova modelsLimitations
Bedrock solves the infrastructure problem, but it introduces its own constraints. Enterprise buyers should weigh these against the platform's strengths before committing.
Not all 100+ models are available in every AWS region. Specific models may be limited to US East or US West regions. If you need data residency in a particular geography, verify model availability for that region before building. The 33-region headline number does not mean all models run everywhere.
Four service tiers (Standard, Flex, Priority, Reserved), per-model token pricing, separate charges for Knowledge Bases, Guardrails, agent invocations, and custom model training. Cost prediction requires careful modeling. The Flex tier's 50% discount only applies to flexible-schedule workloads, not real-time inference.
Bedrock's deepest integrations (S3, Aurora, IAM, VPC, CloudWatch) only work within AWS. Knowledge Bases, Agents, and Flows are AWS-native services with no portable equivalent. If you later move to Azure or GCP, the model calls are portable (via OpenAI-compatible API), but the orchestration layer is not.
Bedrock's managed approach trades control for convenience. If you need custom training loops, non-standard model architectures, or full control over the inference stack, SageMaker is the AWS tool for that. Bedrock's fine-tuning and distillation cover common customization needs, but not every edge case.