Control Plane for Multi-region Architecture (Enterprise)
Learn how to deploy LiteLLM across multiple regions while maintaining centralized administration and avoiding duplication of management overhead.
Overview
When scaling LiteLLM for production use, you may want to deploy multiple instances across different regions or availability zones while maintaining a single point of administration. This guide covers how to set up a distributed LiteLLM deployment with:
- Regional Worker Instances: Handle LLM requests for users in specific regions
- Centralized Admin Instance: Manages configuration, users, keys, and monitoring
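The split can be pictured as a simple client-side routing rule: LLM traffic goes to the nearest regional worker, while management calls always target the admin instance. A minimal sketch, using placeholder hostnames (substitute your own deployment's URLs):

```python
# Hypothetical hostnames -- replace with your deployment's actual URLs.
ADMIN_BASE_URL = "https://admin.company.com"
REGIONAL_BASE_URLS = {
    "us": "https://us.company.com",
    "eu": "https://eu.company.com",
}

def base_url_for(operation: str, region: str) -> str:
    """Route admin operations to the control plane, LLM calls to the region."""
    if operation == "admin":
        return ADMIN_BASE_URL
    return REGIONAL_BASE_URLS[region]
```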
Architecture Pattern: Regional + Admin Instances
Typical Deployment Scenario
Benefits of This Architecture
- Reduced Management Overhead: Only one instance needs admin capabilities
- Regional Performance: Users get low-latency access from their region
- Centralized Control: All administration happens from a single interface
- Security: Limit admin access to designated instances only
- Cost Efficiency: Avoid duplicating admin infrastructure
Configuration
Admin Instance Configuration
The admin instance handles all management operations and provides the UI.
Environment Variables for Admin Instance:
# Keep admin capabilities enabled (default behavior)
# DISABLE_ADMIN_UI=false # Admin UI available
# DISABLE_ADMIN_ENDPOINTS=false # Management APIs available
DISABLE_LLM_API_ENDPOINTS=true # LLM APIs disabled
DATABASE_URL=postgresql://user:pass@global-db:5432/litellm
LITELLM_MASTER_KEY=your-master-key
# Configure API Reference page to show data plane URL
API_REFERENCE_BASE_URL=https://us.company.com # Data plane URL to display in API Reference
API_REFERENCE_MODEL=gpt-4 # Optional: Default model to show in examples
Worker Instance Configuration
Worker instances handle LLM requests but have admin capabilities disabled.
Environment Variables for Worker Instances:
# Disable admin capabilities
DISABLE_ADMIN_UI=true # No admin UI
DISABLE_ADMIN_ENDPOINTS=true # No management endpoints
DATABASE_URL=postgresql://user:pass@global-db:5432/litellm
LITELLM_MASTER_KEY=your-master-key
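One way to sanity-check a deployment is to derive each instance's intended role from the three flags described in this guide. The helper below is illustrative only (it is not part of LiteLLM):

```python
def instance_role(env: dict) -> str:
    """Classify an instance from its env flags (illustrative, not a LiteLLM API)."""
    def flag(name: str) -> bool:
        return env.get(name, "false").lower() == "true"

    admin_off = flag("DISABLE_ADMIN_UI") and flag("DISABLE_ADMIN_ENDPOINTS")
    llm_off = flag("DISABLE_LLM_API_ENDPOINTS")
    if admin_off and not llm_off:
        return "worker"   # serves LLM traffic only
    if llm_off and not admin_off:
        return "admin"    # serves management traffic only
    return "mixed"        # default: everything enabled on one instance
```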
Environment Variables Reference
DISABLE_ADMIN_UI
Disables the LiteLLM Admin UI interface.
- Default: false
- Worker Instances: Set to true
- Admin Instance: Leave as false (or don't set)
# Worker instances
DISABLE_ADMIN_UI=true
Effect: When enabled, the web UI at /ui becomes unavailable.
DISABLE_ADMIN_ENDPOINTS
Disables all management/admin API endpoints.
- Default: false
- Worker Instances: Set to true
- Admin Instance: Leave as false (or don't set)
# Worker instances
DISABLE_ADMIN_ENDPOINTS=true
Disabled Endpoints Include:
- /key/* - Key management
- /user/* - User management
- /team/* - Team management
- /config/* - Configuration updates
- All other administrative endpoints
Available Endpoints (when disabled):
- /chat/completions - LLM requests
- /v1/* - OpenAI-compatible APIs
- /vertex_ai/* - Vertex AI pass-through APIs
- /bedrock/* - Bedrock pass-through APIs
- /health - Basic health check
- /metrics - Prometheus metrics
- All other LLM API endpoints
DISABLE_LLM_API_ENDPOINTS
Disables all LLM API endpoints.
- Default: false
- Worker Instances: Leave as false (or don't set)
- Admin Instance: Set to true
# Admin instance
DISABLE_LLM_API_ENDPOINTS=true
Disabled Endpoints Include:
- /chat/completions - LLM requests
- /v1/* - OpenAI-compatible APIs
- /vertex_ai/* - Vertex AI pass-through APIs
- /bedrock/* - Bedrock pass-through APIs
- All other LLM API endpoints
Available Endpoints (when disabled):
- /key/* - Key management
- /user/* - User management
- /team/* - Team management
- /config/* - Configuration updates
- All other administrative endpoints
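Conceptually, the two flags partition the route table by path prefix. A rough sketch of the gating logic, with prefix lists mirroring the endpoint groups listed in this reference (the real proxy's routing is more involved):

```python
# Prefix lists are illustrative summaries of the endpoint groups above.
ADMIN_PREFIXES = ("/key/", "/user/", "/team/", "/config/")
LLM_PREFIXES = ("/chat/completions", "/v1/", "/vertex_ai/", "/bedrock/")

def is_allowed(path: str, disable_admin: bool, disable_llm: bool) -> bool:
    """Decide whether a request path survives the two disable flags."""
    if disable_admin and path.startswith(ADMIN_PREFIXES):
        return False
    if disable_llm and path.startswith(LLM_PREFIXES):
        return False
    return True  # /health, /metrics, etc. stay available on both roles
```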
API_REFERENCE_BASE_URL
✨ This is useful for Control Plane setups.
Overrides the URL displayed on the API Reference page in the admin UI.
- Default: Uses PROXY_BASE_URL value
- Control Plane Use Case: Set to your data plane URL
# Admin instance (control plane)
PROXY_BASE_URL=https://admin.company.com
API_REFERENCE_BASE_URL=https://us.company.com # Data plane URL for LLM requests
Effect: The API Reference page will show code examples using https://us.company.com instead of the control plane URL.
API_REFERENCE_MODEL
Overrides the model name displayed in API Reference code examples.
- Default: gpt-3.5-turbo
- Control Plane Use Case: Set to your preferred model name
API_REFERENCE_MODEL=gpt-4
Effect: The API Reference code examples will show model="gpt-4" instead of the default model.
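The effect of the two API_REFERENCE_* variables can be pictured as straightforward substitution into the generated snippet. A hypothetical rendering (the real UI template differs):

```python
def render_example(base_url: str = "https://us.company.com",
                   model: str = "gpt-4") -> str:
    """Hypothetical API Reference snippet builder; the real UI template differs."""
    return (
        f"curl {base_url}/v1/chat/completions \\\n"
        f'  -H "Authorization: Bearer $LITELLM_KEY" \\\n'
        f'  -d \'{{"model": "{model}", "messages": [...]}}\''
    )
```

With the settings above, the snippet advertises the data plane URL and the configured model rather than the control plane's own address.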
Usage Patterns
Client Usage
For LLM Requests (use regional endpoints):
import openai
# US users
client_us = openai.OpenAI(
base_url="https://us.company.com/v1",
api_key="your-litellm-key"
)
# EU users
client_eu = openai.OpenAI(
base_url="https://eu.company.com/v1",
api_key="your-litellm-key"
)
response = client_us.chat.completions.create(
model="gpt-4",
messages=[{"role": "user", "content": "Hello!"}]
)
For Administration (use admin endpoint):
import requests
# Create a new API key
response = requests.post(
"https://admin.company.com/key/generate",
headers={"Authorization": "Bearer sk-1234"},
json={"duration": "30d"}
)
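Putting both halves together: keys are minted once against the admin instance, then used against any regional worker. A small sketch of that handoff, assuming the /key/generate response contains a "key" field (an assumption based on LiteLLM's key API):

```python
def worker_client_config(key_response: dict, region_url: str) -> dict:
    """Turn a /key/generate response into OpenAI client kwargs for a regional worker.

    Assumes the response JSON has the shape {"key": "sk-..."}.
    """
    return {
        "base_url": f"{region_url}/v1",
        "api_key": key_response["key"],
    }
```

The resulting dict can be splatted directly into openai.OpenAI(**cfg), so the same generated key works against every regional endpoint backed by the shared database.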
Related Documentation
- Virtual Keys - Managing API keys and users
- Health Checks - Monitoring instance health
- Prometheus Metrics - Collecting metrics
- Production Deployment - Production best practices