OpenRouter is a hosted aggregator that exposes a single, OpenAI-compatible API endpoint giving access to hundreds of AI models from many providers. Instead of integrating each provider separately, you send requests to one endpoint and OpenRouter handles provider selection, automatic fallbacks, and cost-effective routing. As of 2026-06-09, OpenRouter's vendor-reported catalog spans 341 text, 32 image, 26 embedding, 14 video, 10 transcription, 9 speech, 4 audio, and 3 rerank models.

Q: How Much Does OpenRouter Cost?

OpenRouter uses a pay-as-you-go credit system. Cost is computed on each model's native tokenizer, and multimodal cost is dynamic based on reasoning effort, resolution, and complexity. Many models are free ($0), such as Nex-N2-Pro, Gemma 4 26B A4B, and Nemotron 3 Ultra. Paid model prices mirror the underlying provider's cost. For example, as of 2026-06-09, Claude Opus 4.8 is listed at $5 per million input tokens and $25 per million output tokens. Pricing is fast-moving, so verify at openrouter.ai/models before purchasing.

What Is OpenRouter? The Unified AI Model API Explained

Q: How Do You Call the OpenRouter API?

You send a POST request to https://openrouter.ai/api/v1/chat/completions with an Authorization: Bearer header containing your OpenRouter API key, and a JSON body with a model field and a messages array. The endpoint is OpenAI-compatible. There are five integration paths: the raw OpenRouter API, official client SDKs, the Agent SDK (@openrouter/agent) for multi-turn loops and tools, the OpenAI SDK pointed at OpenRouter by changing the base URL, and third-party SDKs.

Q: Does OpenRouter Keep My Data Private?

OpenRouter's privacy posture is per-model, not uniform. Zero Data Retention (ZDR) is a platform feature enabled for some models, such as Relace Apply 3 and Morph V3, while other models explicitly warn that prompts and completions may be logged by the provider, such as Owl Alpha. Never assume blanket Zero Data Retention. Check each model's data policy on its catalog page before routing sensitive data.

OpenRouter is a hosted aggregator that gives you one OpenAI-compatible API endpoint reaching hundreds of AI models across many providers. Instead of wiring up a separate integration for OpenAI, Anthropic, Google, and every other vendor, you point your application at a single URL and OpenRouter handles provider selection, automatic fallbacks, and cost-effective routing on your behalf. If you have ever burned a sprint rewriting request code just to swap one model for another, this is the problem OpenRouter is built to remove.

This breakdown covers what the platform actually is, the size and shape of its model catalog, how the pay-as-you-go pricing works, the five ways to call the API, how routing and fallbacks behave under load, and the one thing most write-ups get wrong: data privacy on OpenRouter is decided per model, not across the whole platform. Catalog counts and prices below are vendor-reported as of 2026-06-09.

What Is OpenRouter?

OpenRouter sits between your application and the model providers as a single proxy layer. You authenticate once, target a model by name, and OpenRouter forwards the request to whichever upstream provider serves that model. Because the endpoint follows the OpenAI Chat Completions shape, most code that already speaks to OpenAI works against OpenRouter with little more than a changed base URL and key.

The value is consolidation. One account, one credit balance, one request format, and one place to see what you spent across every model you touched. When a new model lands in the catalog, you can test it by changing a single string in your request rather than onboarding a new vendor SDK. That makes OpenRouter an LLM gateway in the managed, hosted sense: it is the control point in front of the model layer, run for you rather than self-hosted.

Practitioner note: OpenRouter is not a model and does not train one. It is the routing and billing layer in front of other people's models. The practical win is that model selection becomes a runtime decision instead of an architecture decision. The tradeoff is that you are adding a hop, and you inherit each upstream provider's behavior, latency, and data policy through that hop.

The Model Catalog

OpenRouter spans far more than chat. The catalog is organized by modality, and text generation is only the largest slice of it. The figures below are pulled straight from OpenRouter's own models listing and are vendor-reported as of 2026-06-09. Treat them as a snapshot: the catalog churns constantly as providers add and retire models.

Text: chat and completion models, the core of the catalog. Models: 341. Endpoint: chat/completions.

Image & Video: generation and understanding across visual modalities. Image: 32. Video: 14.

Audio & Speech: transcription, speech synthesis, and audio models. Transcription: 10. Speech: 9. Audio: 4.

Retrieval: embeddings and rerank models for search and RAG. Embeddings: 26. Rerank: 3.

The breadth matters for one reason: a single integration covers your whole pipeline. You can call a text model for reasoning, an embedding model to index your documents, a rerank model to sharpen retrieval, and a transcription model to ingest audio, all through the same key and the same billing ledger. For a RAG stack that would otherwise mean three or four separate vendor relationships, that consolidation is the headline feature.

How Pricing Works

OpenRouter runs on a pay-as-you-go credit system. You load credits, and each request draws down your balance based on the model you called. Cost is computed on the model's native tokenizer, so the same prompt can cost different amounts on different models because each one counts tokens its own way. For multimodal models, cost is dynamic and depends on factors like reasoning effort, output resolution, and request complexity rather than a flat per-token rate.

Two things make the pricing easy to reason about. First, a large set of models are free, listed at $0, which is the cheapest possible way to prototype before you commit spend. Second, paid models mirror the underlying provider's cost, so OpenRouter is passing through the upstream price rather than marking it up into a black box.

Example paid rate ($5 / $25): Claude Opus 4.8 is listed at $5 per million input tokens and $25 per million output tokens, mirroring the provider's cost. Vendor-reported, 2026-06-09. Verify before you budget.

For a concrete sense of the spread, the free tier includes models such as Nex-N2-Pro, Gemma 4 26B A4B, and Nemotron 3 Ultra at $0. On the paid side, the same Claude Opus 4.8 model is also offered in a faster variant listed at $10 per million input and $50 per million output, so you can trade money for latency on the exact same model. Because prices move quickly, always confirm the current number on the OpenRouter models page before you size a budget.
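To make the per-million-token arithmetic concrete, here is a minimal sketch that estimates the cost of one request at the two Claude Opus 4.8 rates quoted above. The token counts are hypothetical, and the helper name estimate_cost_usd is an illustration, not part of any OpenRouter library. Real token counts depend on each model's native tokenizer, so treat the output as a budgeting estimate only.

```python
from decimal import Decimal

# Rates in USD per million tokens, as quoted in this article (vendor-reported, 2026-06-09).
OPUS_STANDARD = (Decimal("5"), Decimal("25"))   # (input rate, output rate)
OPUS_FAST = (Decimal("10"), Decimal("50"))      # faster variant of the same model

MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      rates: tuple[Decimal, Decimal]) -> Decimal:
    """Estimate the cost of one request from token counts and per-million rates."""
    input_rate, output_rate = rates
    return (Decimal(input_tokens) * input_rate
            + Decimal(output_tokens) * output_rate) / MILLION


# Hypothetical request: 12,000 input tokens and 1,500 output tokens.
standard = estimate_cost_usd(12_000, 1_500, OPUS_STANDARD)
fast = estimate_cost_usd(12_000, 1_500, OPUS_FAST)
print(f"standard: ${standard}  fast: ${fast}")
```

Decimal is used instead of float so that small per-request amounts add up without rounding drift when you sum them across many calls.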

API & Integration Methods

You authenticate with an OpenRouter API key, sent as a Bearer token. The core request is a POST to https://openrouter.ai/api/v1/chat/completions with a JSON body that names a model and carries a messages array. Because the contract matches OpenAI's, the request below is the canonical starting point and is the same shape your existing OpenAI code already produces.

OpenRouter Chat Completions (curl)

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-opus-4.8",
    "messages": [
      { "role": "user", "content": "Explain what an LLM gateway does in one sentence." }
    ]
  }'
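For readers who prefer Python, the same request can be sketched with only the standard library. This is one possible translation of the curl call, not an official client: the function names build_chat_request and send_chat_request are illustrative, and the API key shown is a placeholder.

```python
import json
import urllib.request

OPENROUTER_CHAT_URL = "https://openrouter.ai/api/v1/chat/completions"


def build_chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) the same POST request as the curl example."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        OPENROUTER_CHAT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat_request(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Building the request has no side effects; nothing is sent until send_chat_request runs.
request = build_chat_request(
    api_key="YOUR_OPENROUTER_API_KEY",  # placeholder, not a real key
    model="anthropic/claude-opus-4.8",
    prompt="Explain what an LLM gateway does in one sentence.",
)
```

Keeping construction separate from sending makes the request easy to inspect in tests without touching the network or spending credits.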

That raw call is one of five supported integration paths. The right one depends on how much structure you want around the request:

Raw API: The raw HTTP endpoint shown above. POST to chat/completions with a Bearer key. Maximum control, zero added dependencies.

Client SDKs: Official type-safe libraries published to npm and PyPI (installed with pip). Use these when you want autocomplete, typed responses, and fewer hand-rolled fetch calls.

Agent SDK: The @openrouter/agent package handles multi-turn loops, tool calling, and state through a callModel primitive. For building agents rather than single calls.

OpenAI SDK drop-in: Point the official OpenAI SDK at OpenRouter by changing the base URL and key. Existing OpenAI code keeps working with almost no rewrite.

The fifth path is the broad ecosystem of third-party SDKs that target the OpenRouter endpoint. For most teams the decision is simple: if you are migrating an OpenAI codebase, use the drop-in path; if you are starting fresh and building agents, reach for the Agent SDK.
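The drop-in path comes down to two settings, which the sketch below isolates. It assumes the openai Python package is installed for make_client, that the base URL is the chat endpoint's prefix (https://openrouter.ai/api/v1), and that the key lives in an OPENROUTER_API_KEY environment variable. The helper names are illustrative.

```python
import os

OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"


def openrouter_client_kwargs(env=None) -> dict:
    """Return the two settings that change when an OpenAI client targets OpenRouter."""
    env = os.environ if env is None else env
    api_key = env.get("OPENROUTER_API_KEY")
    if not api_key:
        raise RuntimeError("Set OPENROUTER_API_KEY before creating the client")
    return {"base_url": OPENROUTER_BASE_URL, "api_key": api_key}


def make_client(env=None):
    """Create an OpenAI SDK client pointed at OpenRouter (needs `pip install openai`)."""
    from openai import OpenAI  # imported lazily so the helper above has no dependency

    return OpenAI(**openrouter_client_kwargs(env))
```

With a client from make_client(), existing calls such as client.chat.completions.create(model="anthropic/claude-opus-4.8", messages=[...]) should carry over unchanged apart from the model name.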

Routing & Fallbacks

Many popular models are served by more than one upstream provider. OpenRouter's routing layer picks among them, favoring the least-expensive or best-available capacity rather than pinning you to a single backend. When a model has several providers behind it, this is what keeps a request flowing even when one provider is degraded.

Fallbacks are the resilience half of the same system. If a request hits a 5xx server error or a provider rate limit, OpenRouter can automatically retry against an alternative provider for the same model. You can also take manual control by passing a models array that lists your preferred models in order, plus route: 'fallback', which tells the gateway the exact order to try.
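A manual fallback request body might look like the sketch below. It follows the models and route fields as this article describes them; check the current API reference before relying on the exact shape. The backup slug is a hypothetical placeholder, and build_fallback_body is an illustrative helper name.

```python
import json


def build_fallback_body(models: list[str], prompt: str) -> dict:
    """Build a chat body that lists models in the order the gateway should try them."""
    if not models:
        raise ValueError("Provide at least one model")
    return {
        "models": list(models),   # ordered preferences: primary first, backups after
        "route": "fallback",      # per the description above: walk the list in order
        "messages": [{"role": "user", "content": prompt}],
    }


body = build_fallback_body(
    models=[
        "anthropic/claude-opus-4.8",  # primary, taken from the curl example
        "example/backup-model",       # hypothetical slug; replace with a real catalog ID
    ],
    prompt="Summarize this incident report in two sentences.",
)
payload = json.dumps(body)  # send this as the POST body to chat/completions
```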

Practitioner note: The manual fallback list is the feature worth wiring in early. Define a primary model and one or two cheaper or more available backups, then let OpenRouter walk the list on failure. It turns a provider outage from a paging incident into a brief, automatic degradation rather than a hard error your users see.

The Privacy Caveat

This is the part you cannot skim. OpenRouter's data handling is decided per model, not uniformly across the platform. Zero Data Retention (ZDR) is a platform capability, but it is enabled only for some models. Other models carry an explicit warning that prompts and completions may be logged by the upstream provider. The gateway in front of them is the same; the policy behind each one is not.

Some models run with Zero Data Retention enabled, for example Relace Apply 3 and Morph V3. That status does not extend to the rest of the catalog. Check the data policy on each model's own page before assuming retention behavior.

Certain models, such as Owl Alpha, explicitly warn that prompts and completions may be logged by the provider. Do not route regulated or sensitive data through a model without first confirming its retention policy.

You can send an optional user identifier with requests to help OpenRouter with abuse detection. It aids platform safety, but it is not a privacy control and does not change a model's retention policy.
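If you do send a user identifier, one cautious approach is to send an opaque tag rather than a raw email or account ID. The sketch below assumes the field is named user, following the OpenAI-compatible convention, and uses an HMAC with a secret you control. Both helper names are illustrative, and none of this changes a model's retention policy.

```python
import hashlib
import hmac


def stable_user_tag(internal_user_id: str, secret: bytes) -> str:
    """Derive a stable, opaque tag from an internal user ID using HMAC-SHA256."""
    digest = hmac.new(secret, internal_user_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"user_{digest[:32]}"


def with_user(body: dict, internal_user_id: str, secret: bytes) -> dict:
    """Return a copy of a chat body with the optional user identifier attached."""
    # The "user" field name is an assumption based on the OpenAI-compatible request shape.
    return {**body, "user": stable_user_tag(internal_user_id, secret)}
```

The same internal ID always maps to the same tag, so abuse detection can still group requests by user without your application exposing the underlying identifier.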

The operational takeaway: build a per-model policy gate into your own stack. Maintain an allowlist of models whose retention behavior you have verified for the data class you are sending, and never assume blanket Zero Data Retention because OpenRouter supports the feature somewhere in its catalog.
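A per-model policy gate can be very small. The sketch below is one way to express the allowlist idea. The data classes, the model slugs, and every name in it are hypothetical placeholders; populate the table only with models whose data policy you have checked yourself.

```python
# Hypothetical allowlist: data class -> model IDs whose retention policy you have verified.
# The slugs below are placeholders, not real catalog IDs.
VERIFIED_MODELS: dict[str, frozenset[str]] = {
    "public": frozenset({"example/any-model", "example/zdr-model"}),
    "confidential": frozenset({"example/zdr-model"}),
}


class PolicyViolation(Exception):
    """Raised when a model is not approved for the data class being sent."""


def check_model_allowed(model: str, data_class: str) -> None:
    """Fail closed: unknown data classes and unlisted models are both rejected."""
    allowed = VERIFIED_MODELS.get(data_class, frozenset())
    if model not in allowed:
        raise PolicyViolation(f"{model!r} is not verified for {data_class!r} data")


def gated_body(model: str, data_class: str, messages: list[dict]) -> dict:
    """Build a chat body only after the model passes the policy gate."""
    check_model_allowed(model, data_class)
    return {"model": model, "messages": messages}
```

Failing closed is the important design choice here: a model that is new to the catalog, or a data class nobody has classified yet, is blocked until someone verifies it.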

When to Use OpenRouter

OpenRouter earns its place when model choice is a moving target. Here is an honest read on where the managed, hosted approach fits and where you would reach for something else.

Use it when...

You want instant access to a broad catalog without standing up infrastructure, you are comparing models frequently, or you want one credit balance and one bill across many providers. The free models make it the cheapest way to prototype before committing spend.

Look elsewhere when...

You need to self-host the gateway, enforce a single uniform data-retention policy across every call, or run inside your own network boundary. A self-hostable gateway like LiteLLM fits those control requirements better.

Application Developers

If your code already talks to OpenAI, the drop-in SDK path makes adoption almost free. You gain model portability and fallbacks without re-architecting your request layer.

Platform & ML Teams

Useful for centralizing model access and spend visibility across teams. Pair it with your own per-model policy gate so retention and compliance decisions stay under your control, not the catalog's defaults.

Routing through a hosted proxy adds a network hop and a dependency you do not operate. For latency-critical paths, benchmark the added round trip and confirm the gateway's availability fits your service objectives.
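One way to benchmark the added round trip is to time the same logical call through both paths and compare the medians. The sketch below is a generic timing helper, not an OpenRouter tool: you supply the callables that make a direct provider request and a gateway request, and all names here are illustrative.

```python
import statistics
import time
from typing import Callable


def measure_latency_ms(call: Callable[[], object], runs: int = 20) -> dict:
    """Time repeated calls and summarize wall-clock latency in milliseconds."""
    if runs < 2:
        raise ValueError("Use at least 2 runs so percentiles are defined")
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        samples.append((time.perf_counter() - start) * 1000.0)
    # With n=20 there are 19 cut points; the last one approximates the 95th percentile.
    # The inclusive method interpolates within the observed range and never extrapolates.
    p95 = statistics.quantiles(samples, n=20, method="inclusive")[-1]
    return {"median_ms": statistics.median(samples), "p95_ms": p95, "max_ms": max(samples)}


def added_hop_ms(direct: dict, via_gateway: dict) -> float:
    """Difference in median latency between the gateway path and the direct path."""
    return via_gateway["median_ms"] - direct["median_ms"]
```

Run both measurements from the same host and region as your production service, since the size of the extra hop depends heavily on where the requests originate.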

Because retention is set per model, governance cannot be a single switch. Teams with strict compliance needs must track and enforce model-level data policies themselves rather than relying on a platform-wide guarantee.

Frequently Asked Questions

What is OpenRouter used for?

OpenRouter is used to reach many AI models through one unified API. Developers use it to compare models quickly, add automatic fallbacks so a provider outage does not break their app, consolidate billing across providers into a single credit balance, and prototype on free models before committing spend. It covers text, image, video, audio, embeddings, and rerank models through the same endpoint.

How much does the OpenRouter API cost?

OpenRouter uses pay-as-you-go credits, with cost computed on each model's native tokenizer. Many models are free at $0. Paid models mirror the underlying provider's price. As an example, as of 2026-06-09 Claude Opus 4.8 is listed at $5 per million input tokens and $25 per million output tokens. Pricing changes often, so check the OpenRouter models page for current rates.

How do you call the OpenRouter API?

Send a POST request to https://openrouter.ai/api/v1/chat/completions with an Authorization: Bearer header holding your OpenRouter API key and a JSON body containing a model name and a messages array. The endpoint is OpenAI-compatible, so existing OpenAI client code works after changing the base URL. There are five integration paths in total: the raw API, client SDKs, the Agent SDK, the OpenAI SDK drop-in, and third-party SDKs.

Does OpenRouter keep my data private?

Privacy on OpenRouter is set per model, not across the whole platform. Zero Data Retention is enabled for some models, such as Relace Apply 3 and Morph V3, while other models, such as Owl Alpha, explicitly warn that prompts and completions may be logged by the provider. Always confirm a specific model's data policy on its catalog page before routing sensitive data, and never assume blanket Zero Data Retention.

What is the difference between OpenRouter and a self-hosted gateway?

OpenRouter is a managed, hosted aggregator: you get instant catalog access with no infrastructure to run. A self-hostable gateway like LiteLLM gives you the same unified-API idea but inside your own network, with one uniform data policy you control. Choose OpenRouter for speed and breadth; choose a self-hosted gateway when control and uniform governance are the priority.

Fact-checked against vendor documentation and official sources, June 2026. Verify current pricing at openrouter.ai/models before purchasing.

OpenRouter is a trademark of OpenRouter, Inc. Claude and Anthropic are trademarks of Anthropic. OpenAI and GPT are trademarks of OpenAI. Gemini and Gemma are trademarks of Google. All other trademarks belong to their respective owners.
