Gallery

Contacts

411 University St, Seattle, USA

engitech@oceanthemes.net

+1 -800-456-478-23

AWS AI Services

What Is Amazon Bedrock? AWS's Foundation Model Platform Explained (2026)

Last verified: May 14, 2026  ·  Format: Breakdown

100K+
Organizations using Amazon Bedrock worldwide
Source: AWS official documentation
100+
Foundation models available from 15+ AI providers
Source: AWS Bedrock model catalog
33
AWS regions with Bedrock availability globally
Source: AWS Region table, May 2026
15+
AI model providers including Anthropic, Meta, Mistral, and Amazon Nova
Source: AWS Bedrock documentation

Bedrock is not a model. It is a menu. AWS built a fully managed service that gives you API access to over 100 foundation models from 15+ providers, and you never touch a server, provision a GPU, or manage an endpoint.

That distinction matters because it changes who can build with generative AI. If your team already runs infrastructure on AWS, Bedrock turns model selection into a configuration choice rather than an infrastructure project. The same API call that hits Claude today can point to Llama tomorrow. For the 100,000+ organizations already using it, that flexibility is the point.

What Is Amazon Bedrock

Amazon Bedrock is a fully managed, serverless service that provides secure, enterprise-grade access to high-performing foundation models through a single API. It went generally available on September 28, 2023, after an initial preview announcement on April 13, 2023.

The value proposition is straightforward. You pick a model (or several models), call it through a standardized API, and AWS handles the compute, scaling, and infrastructure behind the scenes. No model weights to download. No GPU instances to configure. No cold-start debugging at 2 AM.

What Bedrock Is and What It Is Not
Not a Single Model

Bedrock is a platform, not a model. It provides API access to 100+ models from 15+ providers. You choose which model to call per request.

Not a Training Platform

Need custom training loops or full control over the inference stack? That is SageMaker. Bedrock offers fine-tuning and distillation, not ground-up model training.

Not Cloud-Agnostic

Bedrock integrates deeply with S3, Aurora, IAM, and VPC. The model calls are portable via OpenAI-compatible APIs, but the orchestration layer (Knowledge Bases, Agents, Flows) is AWS-native.

The model catalog spans Amazon's own Nova family, Anthropic's Claude, Meta's Llama, Mistral AI, AI21 Labs, Cohere, DeepSeek, Stability AI, and several others. The Bedrock Marketplace adds another 100+ models beyond the core catalog. This multi-provider approach is what separates Bedrock from Azure AI Foundry (which centers on OpenAI) and Google Vertex AI (which centers on Gemini). You are not locked to a single vendor's model roadmap.

Bedrock connects directly to S3, Aurora, OpenSearch, and the broader AWS ecosystem. If your data already lives in AWS, the integration path is short. For background on where Bedrock fits in the broader AI tool landscape, see the AI Tools Hub and the AWS AI Services sub-hub. Ready to build? The How to Use Amazon Bedrock guide walks through setup from zero to first API call.

How Bedrock Works

The architecture follows a pattern familiar to anyone who has used AWS managed services. You make an API call. AWS routes it to the appropriate model endpoint, handles inference compute, and returns the response. You pay per token (or per image, per second of audio, per API call) with no upfront commitment required.

APIs

Bedrock supports five API formats, which means you do not have to rewrite your application when switching models:

  • Converse API: AWS's unified format that works across all Bedrock models
  • Invoke API: Direct model invocation for provider-native payloads
  • Messages API: Anthropic-native format for Claude models
  • Chat Completions API: OpenAI-compatible format for migration from GPT-based apps
  • Responses API: OpenAI-compatible format for newer OpenAI model patterns

Inference Options

Three routing modes let you balance compliance, latency, and throughput:

  • In-Region: Data stays in a single AWS region. Required for strict regulatory compliance (HIPAA, data residency).
  • Geo Cross-Region: Routes across regions within a geography (US, EU, or Australia) for higher throughput while keeping data within regulatory boundaries.
  • Global Cross-Region: Maximum throughput with routing across all available regions.
80%
Cost reduction achieved by Robinhood after scaling from 500M to 5B tokens/day on Bedrock in 6 months
Source: AWS case study, Robinhood

Key Features

Bedrock is more than model hosting. AWS has layered enterprise features on top of the inference layer that handle the hard parts of production AI: retrieval, orchestration, safety, and customization.

Knowledge Bases (Managed RAG)

Fully managed retrieval-augmented generation. Connect data sources (S3, Confluence, Salesforce, SharePoint, or a web crawler), and Bedrock handles chunking, embedding, and vector storage. Supported vector stores include Aurora PostgreSQL, OpenSearch Serverless, Neptune Analytics, MongoDB Atlas, Pinecone, and Redis. GraphRAG support adds knowledge graph traversal for complex multi-hop queries. Every response includes source attribution so you can trace answers back to the original document.

Agents and AgentCore

Bedrock Agents handle multi-step task automation with tool use, code interpretation, and memory retention across sessions. Multi-agent collaboration lets you chain specialized agents together for complex workflows.

AgentCore, which reached general availability in late 2025, goes further. It is an end-to-end agentic platform with its own runtime, gateway, memory, identity management, policy engine, code interpreter, browser tool, evaluations, and observability. The key difference: AgentCore works with any framework (LangChain, CrewAI, custom) and any model (not just Bedrock models). The Managed Harness added in April 2026 handles deployment, scaling, and security for agentic workloads.

Guardrails

Six configurable safety policies: content filters with adjustable sensitivity thresholds across six categories, denied topics, word filters, PII detection and redaction, contextual grounding checks, and automated reasoning that uses formal logic for factual validation (AWS Bedrock Guardrails documentation). Two tiers (Standard and Classic) with the ApplyGuardrail API available for models running outside Bedrock. For a deeper dive into configuration options and deployment patterns, see What Is Amazon Bedrock Guardrails.

Custom Models

Four customization paths: supervised fine-tuning (SFT), reinforcement fine-tuning (RFT), continued pre-training on your own data, and model distillation. Distillation lets you compress a large model's capabilities into a smaller, cheaper one - AWS claims up to 500% faster inference and 75% lower cost with distilled models (AWS Bedrock Model Distillation documentation). You can also import custom models trained elsewhere.

Prompt Management and Optimization

Prompt caching with 5-minute and 1-hour TTL windows reduces costs by up to 90% and latency by 85% for repeated prompt prefixes (AWS Bedrock Prompt Caching documentation). Intelligent Prompt Routing automatically selects the most cost-effective model for each request - AWS claims up to 30% cost reduction from routing alone (AWS Bedrock Intelligent Prompt Routing documentation). Prompt Optimization rewrites your prompts for better model performance.

Flows

A visual workflow builder for chaining foundation models, agents, knowledge bases, and guardrails into multi-step pipelines. Drag-and-drop orchestration without writing glue code.

Data Automation

GenAI-powered processing for documents, images, audio, and video. Extracts structured data from unstructured inputs at scale.

The Nova Model Family

Amazon's first-party model lineup launched in late 2024. The Nova family covers text, image, video, and speech, giving AWS a model option at every price point without depending entirely on third-party providers.

Nova Model Evolution
1
Dec 2024
Nova v1 Launch
Four text/understanding models: Premier (1M context), Pro (300K), Lite (300K), Micro (128K, text-only at $0.035/M input). Plus Canvas (image gen) and Reel (video gen).
2
Early 2025
Nova Sonic
Real-time speech-to-speech model supporting conversational AI in 5 languages. Enables voice-native applications without separate ASR/TTS pipelines.
3
2026
Nova 2 Family
Nova 2 Lite and Nova 2 Pro (with extended thinking). Nova 2 Omni in preview: unified multimodal understanding and generation in a single model.
4
Apr 2026
AgentCore Managed Harness + OpenAI Models
AgentCore Managed Harness for production agentic workloads. OpenAI models (GPT and Codex series) added in limited preview via Chat Completions and Responses APIs. Structured outputs in GovCloud.

The Nova models are priced aggressively. Nova Micro at $0.035 per million input tokens is one of the cheapest foundation models available from any provider. Nova Premier at $2.50/$12.50 (input/output) targets workloads where you need a high-capability model but do not want to pay Claude Opus or GPT-5 pricing.

Models and Pricing

Bedrock's pricing is pay-per-token with no minimum commitment on the Standard tier. The model catalog spans price points from $0.035 per million input tokens (Nova Micro) to $15.00 per million (Claude Opus 4). Verified May 2026.

Model Input / 1M Output / 1M Notes
Nova Micro $0.035 $0.14 Text only, 128K context
Nova Lite $0.06 $0.24 Multimodal, 300K context
Nova Pro $0.80 $3.20 Multimodal, 300K, fine-tunable
Nova Premier $2.50 $12.50 1M context, distillation teacher
Claude 3.5 Haiku $0.25 $1.25 Anthropic fast tier
Mistral Large 3 $0.50 $1.50 Mistral AI flagship
Gemma 3 4B $0.04 $0.08 Google open model
Claude 3.5 Sonnet v2 $3.00 $15.00 Anthropic balanced tier
Claude Opus 4 $15.00 $75.00 Anthropic premium tier

Service Tiers

Standard
Pay-per-token
  • No upfront commitment
  • On-demand pricing
  • All models available
  • Best for variable workloads
Priority
75% premium over Standard
  • Guaranteed high throughput
  • Lowest latency routing
  • Production-critical workloads
  • SLA-backed performance
Reserved
Custom term commitment
  • Dedicated capacity
  • Predictable pricing
  • 1-6 month terms
  • Highest throughput guarantee

Additional costs: Knowledge Bases $0.001-$0.002/query. Guardrails content filters $0.15/1K text units. Batch inference 50% off on-demand. Prompt caching reduces costs up to 90%. Pricing verified May 14, 2026.

Bedrock vs the Competition

Three platforms dominate managed foundation model access. Each reflects the strategic priorities of its parent company.

Dimension Amazon Bedrock Azure AI Foundry Google Vertex AI
Model breadth 100+ models, 15+ providers OpenAI-centric + select partners Gemini-centric + Model Garden
Flagship model Claude (Anthropic) + Nova (Amazon) GPT-5.x (OpenAI) Gemini 3.x (Google)
RAG support Knowledge Bases (managed) AI Search + Azure OpenAI Vertex AI Search + Grounding
Agent platform Agents + AgentCore Azure AI Agent Service Vertex AI Agent Builder
Safety Guardrails (6 policies) Content Safety + Prompt Shields Responsible AI Toolkit
Differentiator Multi-model choice, AWS integration Deep Microsoft 365 integration BigQuery-native ML, Search grounding
15+ Providers
Bedrock offers models from more AI providers through a single API than Azure AI Foundry or Google Vertex AI, which center on OpenAI and Gemini respectively
Source: AWS Bedrock model catalog, May 2026

Bedrock's advantage is model diversity. If you need Claude for reasoning, Llama for cost-sensitive batch processing, and Stability AI for image generation, Bedrock lets you use all three through the same API with the same security controls. Azure and Google both steer you toward their in-house flagship models.

The trade-off: if your organization is already deep in Microsoft 365 or Google Workspace, the competing platforms offer tighter integration with those specific productivity ecosystems. Bedrock's ecosystem advantage only applies if you are already on AWS.

Who Should Use Bedrock

Who Gets the Most Value
💻
Application Developers

Serverless access to 100+ models through a unified API. No GPU provisioning, no endpoint management. Build a GenAI feature with a single API call and let AWS handle the scaling. The OpenAI-compatible API makes migration from existing GPT-based apps straightforward.

Best fit: Standard tier + Converse API
🏗
Enterprise Architects

Multi-model strategy without vendor lock-in to a single AI provider. Guardrails, IAM integration, VPC endpoints, and compliance certifications (FedRAMP High, HIPAA, SOC) satisfy enterprise security requirements. AgentCore runs on any framework and any model.

Best fit: Reserved tier + Guardrails + AgentCore
🔬
ML Engineers

Fine-tuning, reinforcement fine-tuning, continued pre-training, and model distillation without managing training infrastructure. Import custom models trained on SageMaker or elsewhere. Model evaluation tools for side-by-side comparison before deployment.

Best fit: Custom models + SageMaker integration
🚀
Startup CTOs

Start with Nova Micro at $0.035/M input tokens, prototype fast, then swap to Claude or Llama when you need more capability. No infrastructure investment. No long-term commitment. Flex tier cuts costs another 50% for batch workloads.

Best fit: Standard/Flex tier + Nova models

Limitations

Bedrock solves the infrastructure problem, but it introduces its own constraints. Enterprise buyers should weigh these against the platform's strengths before committing.

Key Limitations
Model Availability Varies by Region

Not all 100+ models are available in every AWS region. Specific models may be limited to US East or US West regions. If you need data residency in a particular geography, verify model availability for that region before building. The 33-region headline number does not mean all models run everywhere.

Pricing Complexity

Four service tiers (Standard, Flex, Priority, Reserved), per-model token pricing, separate charges for Knowledge Bases, Guardrails, agent invocations, and custom model training. Cost prediction requires careful modeling. The Flex tier's 50% discount only applies to flexible-schedule workloads, not real-time inference.

AWS Ecosystem Lock-In

Bedrock's deepest integrations (S3, Aurora, IAM, VPC, CloudWatch) only work within AWS. Knowledge Bases, Agents, and Flows are AWS-native services with no portable equivalent. If you later move to Azure or GCP, the model calls are portable (via OpenAI-compatible API), but the orchestration layer is not.

Less Customization Than SageMaker

Bedrock's managed approach trades control for convenience. If you need custom training loops, non-standard model architectures, or full control over the inference stack, SageMaker is the AWS tool for that. Bedrock's fine-tuning and distillation cover common customization needs, but not every edge case.

Frequently Asked Questions
Amazon Bedrock is a fully managed AWS service that provides API access to 100+ foundation models from 15+ providers, including Anthropic Claude, Meta Llama, Mistral AI, and Amazon's own Nova family. It is serverless: you pay per token with no infrastructure to manage. GA since September 28, 2023.
Bedrock uses pay-per-token pricing. Costs range from $0.035 per million input tokens (Nova Micro) to $15.00 per million (Claude Opus 4). Four service tiers are available: Standard (on-demand), Flex (50% off for flexible scheduling), Priority (75% premium for guaranteed throughput), and Reserved (dedicated capacity with term commitment). Batch inference is 50% off on-demand rates. Prompt caching reduces costs up to 90%.
Over 100 models from 15+ providers: Amazon Nova (Micro, Lite, Pro, Premier, Canvas, Reel, Sonic), Anthropic Claude (Opus 4, 3.5 Sonnet v2, 3.5 Haiku), Meta Llama, Mistral AI, AI21 Labs, Cohere, DeepSeek, Stability AI, Luma AI, MiniMax, Moonshot AI (Kimi), Qwen, TwelveLabs, Writer, and OpenAI (limited preview). The Bedrock Marketplace adds 100+ additional models.
Yes. All data is encrypted in transit and at rest. Your data is never shared with model providers and is never used to train or improve base models. Bedrock holds SOC 1/2/3, ISO 27001/27017/27018, HIPAA eligibility, GDPR compliance, FedRAMP High authorization, and CSA STAR Level 2 certification. AWS provides uncapped intellectual property indemnity for Bedrock outputs.
Bedrock is for building applications on top of existing foundation models (serverless, API-based, managed). SageMaker is for training, fine-tuning, and deploying custom ML models with full control over the infrastructure and training pipeline. Use Bedrock when you want to call a model. Use SageMaker when you want to build or deeply customize a model. They complement each other: you can fine-tune on SageMaker and import the result into Bedrock.
AgentCore is an end-to-end agentic platform within Bedrock (GA late 2025). It provides a runtime, gateway, memory, identity management, policy engine, code interpreter, browser tool, evaluations, and observability. Unlike Bedrock Agents, AgentCore works with any framework (LangChain, CrewAI, custom) and any model, not just Bedrock-hosted models. The Managed Harness (April 2026) adds deployment, scaling, and security automation for agentic workloads.