How to Use Amazon Bedrock Guardrails: Complete Setup Guide (2026)
Last verified: May 14, 2026 · Format: Guide · Est. time: 20-25 min
Your LLM application is live, but what stops it from generating harmful content, leaking customer PII in a response, or producing a hallucination like a refund policy that does not exist? Amazon Bedrock Guardrails is AWS's answer: a configurable enforcement pipeline that sits between any foundation model and your end users, evaluating every input and output against policies you define.
This guide walks through every step of setting up Guardrails from scratch. You will create content filters, configure denied topics, set up PII detection, add word and regex filters, enable contextual grounding, and use the ApplyGuardrail API to protect models outside the Bedrock ecosystem. Each step includes verification instructions so you can confirm the guardrail works before moving to production. For a deeper understanding of what Guardrails is and how it fits the broader AI safety landscape, see What Is Amazon Bedrock Guardrails?
What You Need Before Starting
Bedrock Guardrails is a managed service within Amazon Bedrock. If you are new to the platform, our What Is Amazon Bedrock? breakdown covers the full model catalog and architecture. You do not need special hardware, but you do need a properly configured AWS account. Walk through this checklist before proceeding.
bedrock:CreateGuardrail, bedrock:UpdateGuardrail, bedrock:GetGuardrail, bedrock:ApplyGuardrail- ✓Step 1: Navigate the Guardrails Console
- ✓Step 2: Create Your First Guardrail
- ✓Step 3: Configure Denied Topics
- ✓Step 4: Add PII Filters
- ✓Step 5: Set Up Word and Regex Filters
- ✓Step 6: Enable Contextual Grounding
- ✓Step 7: Use the ApplyGuardrail API
- ✓Step 8: Test and Monitor
Step 1: Navigating the Guardrails Console
Guardrails are configured through the Amazon Bedrock console, AWS CLI, CloudFormation, or Terraform (via the aws_bedrock_guardrail resource). This guide uses the console for visual clarity, but every action has a CLI equivalent.
- Sign in to the AWS Management Console and navigate to Amazon Bedrock.
- In the left navigation pane, select Guardrails under the Safeguards section.
- You will see a list of existing guardrails (empty if this is your first setup). The Create guardrail button is in the top-right corner.
- Confirm you are in a supported region. Automated Reasoning requires US East (Ohio or N. Virginia), US West (Oregon), or select European regions (Frankfurt, Ireland, Paris).
Verification: You should see the Guardrails management page with a "Create guardrail" button. If you see an access denied error, check that your IAM user or role has the bedrock:*Guardrail* permissions. If Guardrails does not appear in the navigation, confirm Bedrock is enabled in your selected region.
Step 2: Creating Your First Guardrail (Content Filters)
Content filters are the foundation of every guardrail. They detect and block harmful content across six categories: Hate, Insults, Sexual, Violence, Misconduct, and Prompt Attack (jailbreak and injection attempts). You set the sensitivity threshold for each category independently.
- Click Create guardrail.
- Enter a descriptive name (for example,
customer-service-guardrail) and an optional description. - Under Content filters, you will see the six categories. For each category, set the filter strength:
- None: Disabled for this category
- Low: Blocks only the most explicit content
- Medium: Balanced detection (recommended starting point)
- High: Aggressive filtering, may catch borderline content
- Set Prompt Attack to High. This protects against jailbreak and prompt injection attempts. The other categories depend on your use case, but Medium is a reasonable default for customer-facing applications.
- Choose your safeguard tier:
- Standard: 15%+ recall improvement, 60-language support, code-level protection. Recommended for most production deployments. Requires cross-region inference opt-in.
- Classic: Lower latency, simpler evaluations. Suitable for straightforward moderation without multilingual or code requirements.
- Set a blocked messaging response. This is the message users see when their input or the model's output is blocked. Write something clear: "I'm unable to respond to that request. Please rephrase your question."
- Click Next (do not save yet; you will add more policies in the following steps).
Verification: After configuring content filters, the console should show green checkmarks next to each enabled category with the selected threshold level. The pricing preview in the console sidebar should show $0.15 per 1,000 text units for content filters.
High sensitivity catches borderline content and increases false positives. Start at Medium for most categories and only raise to High for Prompt Attack. Tune thresholds based on real traffic data, not assumptions.
If you do not set a custom blocked messaging response, users see a generic AWS error. Write a clear, brand-appropriate message that tells users how to rephrase their request.
Standard tier provides 15%+ recall improvement and 60-language support at the same price as Classic. Unless your application has strict latency budgets under 100ms, test Standard first before defaulting to Classic.
Step 3: Configuring Denied Topics
Denied topics prevent your application from discussing subjects that are off-limits. Unlike word filters, denied topics use semantic understanding: the guardrail recognizes the intent behind a question, not just specific keywords.
- In the guardrail configuration wizard, navigate to the Denied topics section.
- Click Add denied topic.
- Give the topic a name (for example, "Investment Advice").
- Write a natural language description of the topic: "Any recommendations, suggestions, or guidance about buying, selling, or holding financial securities, stocks, bonds, cryptocurrency, or other investment instruments."
- Add example prompts that should be blocked:
- "Should I buy Tesla stock?"
- "What is the best cryptocurrency to invest in right now?"
- "How should I allocate my 401(k)?"
- Repeat for each topic you want to deny. Common examples for enterprise applications:
- Competitor product recommendations
- Medical diagnosis or treatment advice
- Legal counsel or interpretation
- Internal company financials
Verification: Each denied topic should appear in the configuration list with its name and description. After the guardrail is created, test by sending a prompt that matches a denied topic. The response should return your configured blocked message, not the model's answer. Try rephrasing the question to verify semantic detection works beyond exact keyword matching.
Step 4: Adding Sensitive Information Filters (PII Detection)
Sensitive information filters detect and handle personally identifiable information. You control the action per PII type: block the entire request or anonymize the PII by replacing it with a placeholder.
- Navigate to the Sensitive information filters section in the wizard.
- Enable the PII types relevant to your application. Available types include:
- Names, email addresses, phone numbers
- Social Security numbers, credit card numbers
- IP addresses, AWS account IDs
- Driver's license numbers, passport numbers
- For each PII type, choose the action:
- Block: Reject the entire request or response
- Anonymize: Replace the detected PII with a placeholder (for example,
{EMAIL},{PHONE},{SSN})
- For most applications, use Anonymize for common identifiers (email, phone, name) and Block for high-sensitivity data (SSN, credit card numbers).
- Add custom regex patterns for organization-specific identifiers. For example, if your employee IDs follow the pattern
EMP-followed by six digits:
Regex-based filters are free; no per-unit charge applies.EMP-\d{6}
Verification: After the guardrail is active, send a test prompt containing a fake email address (for example, "Send the report to test@example.com"). If anonymization is enabled, the model should receive the prompt with {EMAIL} replacing the address. If blocking is enabled, the request should be rejected entirely.
Step 5: Setting Up Word Filters and Regex
Word filters are the simplest policy type: exact-match keyword blocking. Use them for profanity, competitor names, internal project codenames, or any term that should never appear in inputs or outputs. Word filters are free.
- Navigate to the Word filters section.
- Enable the managed word list to block common profanity. AWS maintains this list.
- Add custom words and phrases. Enter each term on a new line or upload a CSV file. Examples:
- Competitor product names: "CompetitorX", "RivalProduct"
- Internal codenames: "Project Phoenix", "Alpha Build"
- Sensitive terms specific to your industry
- Note that word filters are case-insensitive and match exact strings. The phrase "CompetitorX" will also catch "competitorx" but will not catch "Competitor X" (with a space). Add variations as needed.
Verification: After saving, send a test prompt containing one of your blocked words. The guardrail should trigger the blocked response. Then test with a slight variation (different casing) to confirm case-insensitive matching works. Test with a spaced variation to confirm it does not match (this is expected behavior for exact-match filters).
Step 6: Enabling Contextual Grounding
Contextual grounding checks evaluate whether the model's response is grounded in the provided reference source and relevant to the user's query. This is the hallucination detection layer for RAG (Retrieval-Augmented Generation) applications. If your application retrieves documents and uses them to ground model responses, this filter catches cases where the model drifts from the source material.
- Navigate to the Contextual grounding section.
- Set the Grounding threshold (0.0 to 1.0). This measures how closely the response aligns with the reference source. A threshold of 0.7 is a reasonable starting point. Higher values are stricter.
- Set the Relevance threshold (0.0 to 1.0). This measures whether the response addresses the user's actual question. A threshold of 0.7 is a reasonable default.
- Responses that fall below either threshold are blocked.
- Note: Contextual grounding costs $0.10 per 1,000 text units. The text unit count combines the source document, query, and response characters.
Verification: Test with a Knowledge Base query where the answer is clearly supported by the retrieved documents. The response should pass. Then test with a follow-up question that the documents do not cover. If the model fabricates an answer, the grounding check should block it and return your configured blocked message.
Step 7: Using the ApplyGuardrail API (Model-Independent)
The ApplyGuardrail API decouples the evaluation engine from model invocation. You send text directly to the API, and it returns the evaluation result without calling any foundation model. This enables three critical patterns: applying guardrails to third-party model outputs (OpenAI, Google Gemini), using guardrails with self-hosted models on SageMaker or EC2, and running pre-flight validation on prompts before they enter any model pipeline. For teams using autonomous agents in production, pairing Guardrails with agentic workflows like the AWS DevOps Agent adds an additional enforcement boundary.
Basic API Call (Python Boto3)
import boto3
client = boto3.client('bedrock-runtime', region_name='us-east-1')
response = client.apply_guardrail(
guardrailIdentifier='your-guardrail-id',
guardrailVersion='DRAFT', # or a specific version number
source='OUTPUT', # 'INPUT' for prompt checks, 'OUTPUT' for response checks
content=[
{
'text': {
'text': 'The model response text you want to evaluate goes here.'
}
}
]
)
# Check the result
action = response['action'] # 'NONE' (passed) or 'GUARDRAIL_INTERVENED' (blocked)
print(f"Action: {action}")
if action == 'GUARDRAIL_INTERVENED':
for output in response.get('outputs', []):
print(f"Blocked message: {output['text']}")
for assessment in response.get('assessments', []):
print(f"Assessment details: {assessment}")
Integration Pattern for Third-Party Models
To protect an OpenAI or Gemini deployment with Bedrock Guardrails:
- Send the user's prompt to the ApplyGuardrail API with
source='INPUT'. - If the action is
NONE, forward the prompt to your third-party model. - When the model responds, send the response to the ApplyGuardrail API with
source='OUTPUT'. - If the action is
NONE, return the response to the user. IfGUARDRAIL_INTERVENED, return the blocked message.
Verification: Run the API call with a benign text string. The response should return action: NONE. Then test with text that violates one of your configured policies (for example, a string containing a blocked word). The response should return action: GUARDRAIL_INTERVENED with the assessment showing which policy was triggered.
Step 8: Testing and Monitoring Your Guardrails
A guardrail that is not tested is a guardrail you cannot trust. Bedrock provides a built-in test interface in the console, and you should supplement it with programmatic tests before production deployment.
Console Testing
- In the Guardrails console, select your guardrail and click Test.
- Choose a foundation model to test against (or use the ApplyGuardrail API mode for model-independent testing).
- Enter test prompts that should pass and prompts that should be blocked. Document the results.
- For Automated Reasoning policies (if enabled), AWS now generates test Q&A pairs automatically from your policy documents. Review these generated tests to identify edge cases your policy may not cover.
Monitoring in Production
- CloudWatch Metrics: Bedrock publishes guardrail evaluation metrics to CloudWatch. Monitor
GuardrailInvocations,GuardrailIntervened, andGuardrailPassedto track block rates over time. - CloudWatch Logs: Enable detailed logging to capture which policy type triggered each intervention. This is essential for tuning thresholds.
- CloudTrail: All guardrail API calls are logged in CloudTrail for audit and compliance purposes.
Pricing Awareness
Each enabled filter charges independently per evaluation. For a customer service chatbot processing 100,000 interactions per month (average 3,000 characters per interaction = 3 text units each):
- Content filters: 300,000 text units × $0.15/1K = $45.00
- Denied topics: 300,000 text units × $0.15/1K = $45.00
- PII filters: 300,000 text units × $0.10/1K = $30.00
- Total: $120.00/month for three active policy types
Word filters and regex-based sensitive information filters are free. Contextual grounding costs $0.10/1K text units. Automated Reasoning costs $0.17/1K text units per policy.
Verification: After deploying to production, confirm CloudWatch metrics are flowing. The GuardrailIntervened metric should show non-zero counts if your application handles real user traffic. If the metric is zero after significant traffic, your guardrail may not be properly attached to the model invocation. Check the Bedrock InvokeModel call to confirm the guardrail ID and version are included.
The most frequent production issue: the guardrail exists but the InvokeModel or Converse API call does not include the guardrailIdentifier and guardrailVersion parameters. The model responds unfiltered with no error, giving a false sense of safety.
Automated Reasoning is only available in US East (Ohio, N. Virginia), US West (Oregon), and select European regions (Frankfurt, Ireland, Paris). Deploying in an unsupported region silently skips Automated Reasoning checks without error.
If GuardrailIntervened reads zero after significant traffic, the guardrail is likely not attached to the model invocation. Verify the guardrail ID and version in your API call. Also check that the guardrail version is not set to DRAFT in production.
The bedrock:ApplyGuardrail permission is separate from bedrock:InvokeModel. If guardrail evaluation fails silently, confirm your IAM role includes both permissions. Cross-account safeguards require additional Organizations-level permissions.
Troubleshooting and FAQ
source parameter set to INPUT or OUTPUT. This works with OpenAI, Google Gemini, self-hosted open-source models on SageMaker or EC2, or any other foundation model regardless of where it runs.Next Step
Start with a single guardrail using content filters and denied topics on a non-production workload. Monitor the CloudWatch metrics for one week to understand your block rate and false positive rate. Then add PII filters and contextual grounding incrementally. This phased approach lets you tune thresholds based on real data before committing to a full production rollout. For organizations with multiple AWS accounts, evaluate cross-account safeguards to centralize policy enforcement across your entire estate.