Is LiteLLM self-hosted and OpenRouter hosted?

Yes. LiteLLM's core SDK and proxy server are open-source and free to self-host, with an optional commercial Enterprise tier. OpenRouter is a managed hosted aggregator: you call its endpoint and it routes your request to the underlying provider. With self-hosted LiteLLM your traffic stays in your own infrastructure; with OpenRouter it passes through a third party.

Does OpenRouter offer free models?

OpenRouter lists many models priced at $0 per token (vendor-reported examples include Nex-N2-Pro, Gemma 4 26B A4B, and Nemotron 3 Ultra). Paid models mirror the underlying provider's cost on a pay-as-you-go credit system. This makes OpenRouter attractive for experimentation, but free and paid models still route through a third party.

Can I switch between OpenRouter and LiteLLM easily?

Both expose an OpenAI-compatible interface, so in many setups you change the base URL and key rather than rewriting application code. That OpenAI-format compatibility is the shared design choice that makes a gateway switch low-friction, and it is also why some teams start on OpenRouter for speed and move to self-hosted LiteLLM as data-control requirements harden.

LLM Gateways

OpenRouter vs LiteLLM: Which LLM Gateway Should You Use in 2026?

Q: Should I use OpenRouter or LiteLLM?

Choose LiteLLM when you want control: a self-hostable, open-source proxy and Python SDK that keeps request data inside your own infrastructure, manages multi-provider API keys centrally, and translates every provider into the OpenAI Chat Completions format. Choose OpenRouter when you want breadth and speed: a hosted aggregator that gives instant access to hundreds of models (including free $0 options) with zero infrastructure to run, ideal for fast experimentation. The trade-off is data control versus convenience.

Both put a single, OpenAI-compatible endpoint in front of dozens of model providers. That shared surface makes them look interchangeable in a feature checklist. They are not. OpenRouter is a hosted aggregator you call over the internet. LiteLLM is an open-source proxy and Python SDK you can run inside your own infrastructure. The right pick comes down to one question that no marketing page wants to lead with: how much do you care about where your prompts go?

Quick Verdict

Skeptic's Verdict

For production and data control, self-host LiteLLM. For breadth, speed, and zero ops, use OpenRouter.

This is one of the few gateway comparisons where a clear position is defensible. LiteLLM is an open-source proxy plus SDK you control: keep request data in your own infrastructure, manage every provider key centrally, and get OpenAI-format translation for 100+ providers (vendor-reported). OpenRouter is a hosted aggregator: instant access to hundreds of models including many priced at $0, with no servers to run. The catch with OpenRouter is that privacy varies per model rather than being uniform, and you are depending on a third party in your request path.

Choose LiteLLM when:

You want request data to stay inside your own infrastructure
You need centralized multi-provider key and budget management
OpenAI-format translation across 100+ providers matters
You are shipping to production and want a self-hosted control plane
Open-source and self-hostable is a procurement requirement

Choose OpenRouter when:

You want instant access to a huge model catalog, no setup
Free $0 models for experimentation are appealing
You have no infrastructure team and want zero ops
Fast prototyping across many models is the priority
Per-model privacy is acceptable for your data sensitivity

What Each One Actually Is

Most "versus" pieces blur these two because both speak the OpenAI Chat Completions dialect. The architecture underneath is where the real decision lives, so it is worth being precise.

LiteLLM: Two Things in One Project

The Python SDK is a drop-in replacement for the OpenAI client. You call completion(), embedding(), or image_generation() and it speaks to whichever provider you target, returning the response in OpenAI format every time. A built-in Router adds retries, fallbacks, and load balancing across deployments. This is the developer-facing form, installed with pip install litellm.

The Proxy Server is the actual gateway: a self-hosted, OpenAI-compatible central service you run yourself. It adds virtual keys with per-key, per-team, and per-user budgets, spend tracking, guardrails such as content filtering and PII masking, an admin dashboard, and the same routing and fallback logic. You install it with pip install 'litellm[proxy]' or a Docker image, and start it with litellm --model . The core SDK and proxy are open-source and free; an Enterprise commercial license adds SSO/SAML, audit logs, and SLAs (pricing is not public). LiteLLM is maintained by BerriAI.

OpenRouter: A Hosted Aggregator

OpenRouter is not something you run. It is a service you call. A single endpoint exposes a unified API to hundreds of models, with automatic fallbacks and automatic routing toward the most cost-effective option. You point your request at /api/v1/chat/completions with a Bearer key, and OpenRouter handles the provider behind the scenes.

There are five ways in: the raw OpenRouter API, official client SDKs for npm and pip, an Agent SDK for multi-turn loops and tool use, the standard OpenAI SDK pointed at OpenRouter's base URL, and assorted third-party SDKs. The breadth is the selling point. As of June 2026 the catalog spans (vendor-reported) 341 text models, 32 image, 26 embeddings, 14 video, 10 transcription, 9 speech, 4 audio, and 3 rerank models. Pricing is pay-as-you-go on a credit system priced against each model's native tokenizer, and a meaningful chunk of the catalog is priced at $0.

Two Different Shapes of Gateway

100+ Providers LiteLLM translates (vendor-reported)

341 Text models in OpenRouter's catalog (vendor-reported)

$0 Price of many OpenRouter catalog models

Side-by-Side Comparison

OpenRouter vs LiteLLM: Feature Comparison

Category	OpenRouter	LiteLLM
Hosting model	Hosted aggregator (managed only) Edge: no ops	Open-source, self-hostable proxy + SDK Edge: control
Data control	Routes through a third party; privacy per-model	Data stays in your infrastructure when self-hosted Edge
Model breadth	Hundreds of models, single catalog (vendor-reported) Edge	100+ providers via translation (vendor-reported)
Free models	Many $0 models for experimentation Edge	Free to self-host; you pay providers directly
Setup effort	Change base URL + key, done Edge	Run and operate the proxy yourself
Key management	One OpenRouter key for the catalog	Virtual keys with per key/team/user budgets Edge
OpenAI compatibility	OpenAI-compatible endpoint Tie	OpenAI-format translation for all providers Tie
Routing & fallbacks	Auto cost/GPU routing; fallback on 5xx or rate-limit	Router with retries, fallbacks, load balancing Tie
Guardrails & budgets	Optional user string for abuse detection	Content filter, PII masking, spend tracking Edge
Pricing	Pay-as-you-go credits, mirrors provider cost	Free OSS; Enterprise license priced privately

Edge indicators reflect category-specific strengths, not overall superiority. The right gateway depends on whether you optimize for control or for breadth and speed.

When LiteLLM Wins

Data Stays in Your Infrastructure

This is the headline reason to self-host. When you run the LiteLLM proxy yourself, your prompts and completions traverse infrastructure you control before they reach a provider. There is no third party sitting in the request path that you did not choose. For regulated workloads, internal data, or anything covered by a data residency requirement, that distinction is the whole ballgame.

Centralized Multi-Provider Key Management

The proxy issues virtual keys scoped per key, per team, and per user, each with its own budget. Spend tracking is built in. Instead of scattering raw provider API keys across services and notebooks, you hold the provider credentials in one place and hand out scoped virtual keys. That is a meaningful reduction in credential sprawl, and it gives finance and platform teams a single point of cost visibility.

OpenAI-Format Translation for 100+ Providers

LiteLLM normalizes every provider into the OpenAI Chat Completions format, including mapping each provider's error types onto OpenAI exception classes. The vendor reports coverage of 100+ providers (OpenAI, Anthropic, Gemini, Vertex AI, Bedrock, Azure, HuggingFace, and more). For a codebase already written against the OpenAI client, switching providers becomes a configuration change rather than a rewrite.

Production Operability

The vendor reports 8ms P95 latency at 1,000 requests per second and load tests handling 1,500-plus requests per second, with -stable Docker images put through 12-hour load tests. Treat those as vendor-reported figures rather than independent benchmarks, but the operational posture is clear: this is built to be run as production infrastructure, with observability hooks for Langfuse, MLflow, Helicone, and Lunary.

When OpenRouter Wins

Instant Access to a Huge Catalog

There is no infrastructure to provision. You get a key, point your existing OpenAI client at OpenRouter's base URL, and you can reach hundreds of models through one account. When the work is "try this prompt against ten different models and see which one is best," OpenRouter removes every step between you and the model. That is hard to beat for breadth.

Free Models Lower the Cost of Experimentation

A real portion of the catalog is priced at $0 per token (vendor-reported examples include Nex-N2-Pro, Gemma 4 26B A4B, and Nemotron 3 Ultra). Paid models follow a pay-as-you-go credit system that mirrors the underlying provider cost, computed on each model's native tokenizer. For early prototyping, the ability to validate an idea against capable models at no token cost is a genuine advantage.

Zero Operations

If you do not have a platform or infrastructure team, the calculus tilts hard toward hosted. OpenRouter handles routing toward least-expensive or best-available GPUs and falls back automatically on 5xx errors or rate limits. You can configure a manual fallback list with route: 'fallback', but the default behavior already absorbs a class of reliability problems you would otherwise engineer yourself.

Multiple Integration Paths

Five integration methods cover most stacks: the raw API, typed client SDKs, an Agent SDK for multi-turn tool-using loops, the OpenAI SDK with a changed base URL, and third-party SDKs. The OpenAI-SDK path in particular means many teams can adopt OpenRouter by editing two lines of configuration.

Privacy and Data Control

This is where the skeptic should slow down, because it is the dimension most likely to be glossed over in a quick comparison. The two products handle your data very differently, and the difference is structural rather than a matter of policy wording.

OpenRouter privacy varies per model. Zero Data Retention is a platform feature that is enabled for some models, such as Relace Apply 3 and Morph V3, but it is not applied uniformly across the catalog. Other models explicitly warn that prompts and completions may be logged by the underlying provider, for example Owl Alpha. There is no blanket ZDR guarantee. An optional user string can be passed to aid abuse detection. The practical takeaway: if data handling matters, you must check the policy for each specific model you intend to use, every time you add one.

LiteLLM, self-hosted, sidesteps the third party entirely. When you run the proxy, your traffic goes from your application, through infrastructure you operate, directly to the providers you have chosen. You are still subject to each provider's data terms at the far end, but there is no intermediate aggregator with its own retention policy to reason about. For teams whose threat model includes "where does this prompt physically travel," that is the deciding factor.

Data Handling Considerations

OpenRouter: Per-Model Privacy

No uniform Zero Data Retention. Some models support ZDR; others warn that the provider may log prompts and completions. Verify each model before sending sensitive data.

LiteLLM: Self-Hosted Path

Run the proxy and request data stays in your infrastructure before reaching providers. No intermediate aggregator retention policy to track.

Both: Provider Terms Still Apply

A gateway does not override the downstream provider's data terms. Review each provider's data processing agreement regardless of which gateway you choose.

LiteLLM: You Operate the Surface

Self-hosting means you own the proxy's security posture: patching, network isolation, and key scoping become your responsibility, not a vendor's.

Honest Limitations

No comparison is honest without naming what each option costs you. These are the trade-offs that do not show up in a feature grid.

OpenRouter Limitations

You depend on a third party: the aggregator sits in your request path. Its availability, routing decisions, and policies become part of your reliability and compliance surface.
Privacy is not uniform: retention behavior differs per model, so a setup that is acceptable for one model may not be for another you add later.
Less central governance: the per-key budget, virtual-key, and guardrail tooling that a self-hosted control plane provides is not the product's focus.
Vendor-reported catalog numbers: the model counts are reported by OpenRouter and shift frequently; treat them as a snapshot, not a contract.

LiteLLM Limitations

You run it: self-hosting means provisioning, patching, monitoring, and securing the proxy. That operational burden is real and ongoing.
The proxy is a high-value target: it brokers every provider key, so a compromise has broad blast radius. Network isolation and key scoping are not optional.
Enterprise pricing is opaque: the open-source core is free, but SSO/SAML, audit logs, and SLAs sit behind a commercial license with no public price.
Performance numbers are vendor-reported: the latency and throughput figures come from the maintainer's own load tests, not independent benchmarks.

Real-World Decision Framework

Skip the feature matrix for a moment. Here is how teams actually arrive at an answer.

Start with data sensitivity. If the prompts contain regulated, confidential, or customer data, and especially if you have a data residency obligation, self-hosted LiteLLM is the safer default. If the data is low-sensitivity or synthetic, OpenRouter's convenience is a fair trade.

Count your operators. If you have a platform or infrastructure team that can run and secure a proxy, LiteLLM is well within reach. If you are a small team or a solo developer with no appetite for running services, OpenRouter removes work you would otherwise have to do.

Match the stage. Early experimentation across many models, including free ones, favors OpenRouter. Hardening a specific set of providers into a governed production pipeline favors LiteLLM. Because both speak OpenAI format, starting on one and moving to the other is realistic.

The phased path is common. Prototype on OpenRouter for breadth and speed, then move the workloads you are committing to onto a self-hosted LiteLLM proxy as data-control and governance requirements harden. The OpenAI-compatible interface on both sides is what makes that migration a configuration exercise rather than a rebuild.

Gateway Picker

Which Gateway Fits Your Situation?

Question 1 of 4

How sensitive is the data in your prompts?

Question 2 of 4

Who will operate the gateway?

Question 3 of 4

What is your model-breadth need?

Question 4 of 4

What stage are you at?

Recommendation: OpenRouter

Your priorities are breadth, speed, and minimal operations, with data sensitivity low enough to accept a hosted aggregator in the path. Point your OpenAI client at OpenRouter's base URL and start trying models, including the $0 ones. Just verify the privacy policy for any specific model before sending data you care about.

Recommendation: LiteLLM (self-hosted)

Your needs point to control: data in your own infrastructure, centralized virtual keys and budgets, and OpenAI-format translation across providers. Run the proxy with pip install 'litellm[proxy]' or the -stable Docker image, scope virtual keys per team, and isolate the proxy on the network since it brokers every provider credential.

Recommendation: Start on OpenRouter, plan for LiteLLM

A phased path fits you. Prototype on OpenRouter to move fast across many models, then migrate the workloads you commit to onto a self-hosted LiteLLM proxy as governance and data-control requirements harden. Both speak OpenAI format, so the switch is mostly a base-URL and key change rather than an application rewrite.

Frequently Asked Questions

Should I use OpenRouter or LiteLLM?

Use LiteLLM when you want control: a self-hostable, open-source proxy and Python SDK that keeps request data in your own infrastructure, manages multi-provider keys centrally, and translates every provider into the OpenAI format. Use OpenRouter when you want breadth and speed: a hosted aggregator with instant access to hundreds of models, including free ones, and nothing to operate. The core trade-off is data control versus convenience.

Is OpenRouter's data private?

It depends on the model. OpenRouter does not apply uniform Zero Data Retention. ZDR is enabled for some models while others warn that the provider may log prompts and completions. If consistent data handling matters, check each model's policy or self-host a gateway like LiteLLM to keep data in your own infrastructure.

Is LiteLLM free?

The core SDK and proxy server are open-source and free to self-host. You still pay the underlying model providers directly. LiteLLM also offers a commercial Enterprise license that adds SSO/SAML, audit logs, and SLAs; that pricing is not published, so you contact the maintainer for a quote.

Can I switch between them later?

In most setups, yes. Both expose an OpenAI-compatible interface, so changing gateways is often a base-URL and key change rather than an application rewrite. That shared compatibility is why a phased approach, prototyping on OpenRouter and moving to self-hosted LiteLLM for production, is a realistic plan.

What about Portkey, Cloudflare, or Kong?

Those are also LLM gateways aimed at different needs: Portkey is a hosted control plane, Cloudflare AI Gateway runs on the edge, and Kong AI Gateway is a plugin layer on Kong. This comparison focuses on OpenRouter and LiteLLM because they represent the cleanest split in the category: a hosted aggregator built for breadth versus a self-hostable proxy built for control. The other options are covered in our broader gateway roundup.

Bottom Line

OpenRouter and LiteLLM both put one OpenAI-compatible endpoint in front of many providers, but they answer different questions. OpenRouter answers "how do I reach the most models with the least effort," and it answers it well: hundreds of models including many at $0, automatic routing and fallbacks, and nothing to operate. LiteLLM answers "how do I keep control of my data, keys, and spend," and it answers that one well: an open-source proxy you run yourself, virtual keys with budgets, guardrails, and OpenAI-format translation across 100+ providers (vendor-reported).

The skeptic's position is that this is genuinely not a coin flip. If you are shipping to production, handling sensitive data, or need centralized key and budget governance, self-host LiteLLM. If you are experimenting, moving fast across a wide catalog, and have no infrastructure team, use OpenRouter, and verify the privacy policy of any model you feed real data.

The honest middle path is to use both in sequence: prototype on OpenRouter for breadth and speed, then graduate the workloads you commit to onto a self-hosted LiteLLM proxy as your data-control and governance requirements harden. That is not indecision. It is the pragmatic way the OpenAI-compatible design of both products is meant to be used.

Video Resources

▶

OpenRouter vs LiteLLM Compared

YouTube Search

▶

Self-Hosting the LiteLLM Proxy

YouTube Search

▶

Getting Started With OpenRouter

YouTube Search