LiteLLM AI Gateway
Open Source AI Gateway for 100+ LLMs. Self-hosted. Enterprise-ready. Call any LLM in OpenAI format.
<a href="https://render.com/deploy?repo=https://github.com/BerriAI/litellm" target="_blank" rel="nofollow"><img src="https://render.com/images/deploy-to-render-button.svg" alt="Deploy to Render" height="40"></a>
<a href="https://railway.com/deploy/RhvhdC?referralCode=7mRv9K&utm_medium=integration&utm_source=template&utm_campaign=generic"><img src="https://railway.com/button.svg" alt="Deploy on Railway" height="40"></a>
<a href="https://console.aws.amazon.com/cloudshell/home" target="_blank" rel="nofollow"><img src="https://github.com/BerriAI/litellm/raw/v1.90.0/github/deploy-on-aws.png" alt="Deploy on AWS" height="40"></a>
<a href="https://ssh.cloud.google.com/cloudshell/editor?cloudshell_git_repo=https%3A%2F%2Fgithub.com%2FBerriAI%2Flitellm&cloudshell_workspace=terraform%2Flitellm%2Fgcp%2Fexamples%2Fdefault&cloudshell_tutorial=TUTORIAL.md&cloudshell_image=gcr.io/ds-artifacts-cloudshell/deploystack_custom_image&shellonly=true" target="_blank" rel="nofollow"><img src="https://github.com/BerriAI/litellm/raw/v1.90.0/github/deploy-on-gcp.png" alt="Deploy on GCP" height="40"></a>
LiteLLM is an open source AI Gateway that gives you a single, unified interface to call 100+ LLM providers — OpenAI, Anthropic, Gemini, Bedrock, Azure, and more — using the OpenAI format.
Use it as a Python SDK for direct library integration, or deploy the AI Gateway (Proxy Server) as a centralized service for your team or organization.
Jump to LiteLLM Proxy (LLM Gateway) Docs
Jump to Supported LLM Providers
Managing LLM calls across providers gets complicated fast — different SDKs, auth patterns, request formats, and error types for every model. LiteLLM removes that friction:
Netflix |
LLMs - Call 100+ LLMs (Python SDK + AI Gateway)
All Supported Endpoints - /chat/completions, /responses, /embeddings, /images, /audio, /batches, /rerank, /a2a, /messages and more.
uv add litellm
from litellm import completion
import os
os.environ["OPENAI_API_KEY"] = "your-openai-key"
os.environ["ANTHROPIC_API_KEY"] = "your-anthropic-key"
# OpenAI
response = completion(model="openai/gpt-4o", messages=[{"role": "user", "content": "Hello!"}])
# Anthropic
response = completion(model="anthropic/claude-sonnet-4-20250514", messages=[{"role": "user", "content": "Hello!"}])
Getting Started - E2E Tutorial - Setup virtual keys, make your first request
uv tool install 'litellm[proxy]'
litellm --model gpt-4o
import openai
client = openai.OpenAI(api_key="anything", base_url="http://0.0.0.0:4000")
response = client.chat.completions.create(
model="gpt-4o",
messages=[{"role": "user", "content": "Hello!"}]
)
Agents - Invoke A2A Agents (Python SDK + AI Gateway)
Supported Providers - LangGraph, Vertex AI Agent Engine, Azure AI Foundry, Bedrock AgentCore, Pydantic AI
from litellm.a2a_protocol import A2AClient
from a2a.types import SendMessageRequest, MessageSendParams
from uuid import uuid4
client = A2AClient(base_url="http://localhost:10001")
request = SendMessageRequest(
id=str(uuid4()),
params=MessageSendParams(
message={
"role": "user",
"parts": [{"kind": "text", "text": "Hello!"}],
"messageId": uuid4().hex,
}
)
)
response = await client.send_message(request)
Step 1. Add your Agent to the AI Gateway
Step 2. Call Agent via A2A SDK
from a2a.client import A2ACardResolver, A2AClient
from a2a.types import MessageSendParams, SendMessageRequest
from uuid import uuid4
import httpx
base_url = "http://localhost:4000/a2a/my-agent" # LiteLLM proxy + agent name
headers = {"Authorization": "Bearer sk-1234"} # LiteLLM Virtual Key
async with httpx.AsyncClient(headers=headers) as httpx_client:
resolver = A2ACardResolver(httpx_client=httpx_client, base_url=base_url)
agent_card = await resolver.get_agent_card()
client = A2AClient(httpx_client=httpx_client, agent_card=agent_card)
request = SendMessageRequest(
id=str(uuid4()),
params=MessageSendParams(
message={
"role": "user",
"parts": [{"kind": "text", "text": "Hello!"}],
"messageId": uuid4().hex,
}
)
)
response = await client.send_message(request)
MCP Tools - Connect MCP servers to any LLM (Python SDK + AI Gateway)
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
from litellm import experimental_mcp_client
import litellm
server_params = StdioServerParameters(command="python", args=["mcp_server.py"])
async with stdio_client(server_params) as (read, write):
async with ClientSession(read, write) as session:
await session.initialize()
# Load MCP tools in OpenAI format
tools = await experimental_mcp_client.load_mcp_tools(session=session, format="openai")
# Use with any LiteLLM model
response = await litellm.acompletion(
model="gpt-4o",
messages=[{"role": "user", "content": "What's 3 + 5?"}],
tools=tools
)
Step 1. Add your MCP Server to the AI Gateway
Step 2. Call MCP tools via /chat/completions
curl -X POST 'http://0.0.0.0:4000/v1/chat/completions' \
-H 'Authorization: Bearer sk-1234' \
-H 'Content-Type: application/json' \
-d '{
"model": "gpt-4o",
"messages": [{"role": "user", "content": "Summarize the latest open PR"}],
"tools": [{
"type": "mcp",
"server_url": "litellm_proxy/mcp/github",
"server_label": "github_mcp",
"require_approval": "never"
}]
}'
{
"mcpServers": {
"LiteLLM": {
"url": "http://localhost:4000/mcp/",
"headers": {
"x-litellm-api-key": "Bearer sk-1234"
}
}
}
}
| Provider | /chat/completions |
/messages |
/responses |
/embeddings |
/image/generations |
/audio/transcriptions |
/audio/speech |
/moderations |
/batches |
/rerank |
|---|---|---|---|---|---|---|---|---|---|---|
Abliteration (abliteration) |
✅ | |||||||||
AI/ML API (aiml) |
✅ | ✅ | ✅ | ✅ | ✅ | |||||
AI21 (ai21) |
✅ | ✅ | ✅ | |||||||
AI21 Chat (ai21_chat) |
✅ | ✅ | ✅ | |||||||
| Aleph Alpha | ✅ | ✅ | ✅ | |||||||
| Amazon Nova | ✅ | ✅ | ✅ | |||||||
Anthropic (anthropic) |
✅ | ✅ | ✅ | ✅ | ||||||
Anthropic Text (anthropic_text) |
✅ | ✅ | ✅ | ✅ | ||||||
| Anyscale | ✅ | ✅ | ✅ | |||||||
AssemblyAI (assemblyai) |
✅ | ✅ | ✅ | ✅ | ||||||
Auto Router (auto_router) |
✅ | ✅ | ✅ | |||||||
AWS - Bedrock (bedrock) |
✅ | ✅ | ✅ | ✅ | ✅ | |||||
AWS - Sagemaker (sagemaker) |
✅ | ✅ | ✅ | ✅ | ||||||
Azure (azure) |
✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | |
Azure AI (azure_ai) |
✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | |
Azure Text (azure_text) |
✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | |||
Baseten (baseten) |
✅ | ✅ | ✅ | |||||||
Bytez (bytez) |
✅ | ✅ | ✅ | |||||||
Cerebras (cerebras) |
✅ | ✅ | ✅ | |||||||
Clarifai (clarifai) |
✅ | ✅ | ✅ | |||||||
Cloudflare AI Workers (cloudflare) |
✅ | ✅ | ✅ | |||||||
Codestral (codestral) |
✅ | ✅ | ✅ | |||||||
Cohere (cohere) |
✅ | ✅ | ✅ | ✅ | ✅ | |||||
Cohere Chat (cohere_chat) |
✅ | ✅ | ✅ | |||||||
CometAPI (cometapi) |
✅ | ✅ | ✅ | ✅ | ||||||
CompactifAI (compactifai) |
✅ | ✅ | ✅ | |||||||
Custom (custom) |
✅ | ✅ | ✅ | |||||||
Custom OpenAI (custom_openai) |
✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | |||
| [Dashscope (`d |
$ claude mcp add litellm \
-- python -m otcore.mcp_server <graph>