A configured instance of the LiteLLM library running as an OpenAI-compatible API proxy. The proxy provides a unified interface to multiple LLM providers (OpenAI, Anthropic, DeepSeek, Groq, Mistral, Gemini, etc.) with features like load balancing, cost tracking, and failover.
```bash
cd /home/clara/Documents/AI_AND_API/LiteLLM/litellm
./start.sh
```

The server runs on port 8000 with usage-based routing by default.

```bash
# Stop the server
pkill -f "litellm --config"

# Check if server is running
ps aux | grep "litellm --config"

# View logs
tail -f litellm.log

# Health check (if implemented)
curl http://localhost:8000/health
```

Master Key: `supersecretkey123`
All requests to the proxy must include this bearer token in the Authorization header:
```
Authorization: Bearer supersecretkey123
```
| Endpoint | Method | Description |
|---|---|---|
| `http://localhost:8000/v1/chat/completions` | POST | Chat completions (OpenAI-compatible) |
| `http://localhost:8000/v1/models` | GET | List available models |
| `http://localhost:8000/health` | GET | Health check (inferred from logs) |
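As a quick check, the endpoints above can be exercised from Python with the bearer token shown earlier. This is a minimal sketch, assuming the proxy is running locally and the `requests` package is installed:

```python
import requests

BASE_URL = "http://localhost:8000"
HEADERS = {"Authorization": "Bearer supersecretkey123"}

# List the models the proxy currently exposes (OpenAI-style response format)
models = requests.get(f"{BASE_URL}/v1/models", headers=HEADERS, timeout=10)
models.raise_for_status()
print([m["id"] for m in models.json()["data"]])

# Health check; may not be implemented, so only report the status
health = requests.get(f"{BASE_URL}/health", headers=HEADERS, timeout=10)
print(health.status_code)
```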
API keys for all providers are stored in `.env`. Important: This file contains live API keys and should be secured.
Key variables:
- `LITELLM_MASTER_KEY=supersecretkey123` (proxy authentication)
- `OPENROUTER_API_KEY`, `OPENROUTER_API_KEY_2` through `OPENROUTER_API_KEY_6` (6 OpenRouter keys for load balancing)
- `DEEPSEEK_API_KEY`, `GEMINI_API_KEY`, `GROQ_API_KEY`, `MISTRAL_API_KEY`, etc.
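For local helper scripts that need these variables (for example something like `test_litellm.py`), a minimal sketch using `python-dotenv` could look like this; the variable names match the list above, but the loading approach is only an illustration, not part of the proxy setup itself:

```python
import os
from dotenv import load_dotenv  # pip install python-dotenv

# Load variables from the .env file in the project directory
load_dotenv()

master_key = os.environ["LITELLM_MASTER_KEY"]  # proxy authentication
openrouter_keys = [
    value
    for name, value in os.environ.items()
    if name.startswith("OPENROUTER_API_KEY")
]  # keys used for OpenRouter load balancing
print(f"Loaded master key and {len(openrouter_keys)} OpenRouter key(s)")
```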
Configuration files:

- `config.yaml`: Primary configuration with usage-based routing
- `config_cost_based.yaml`: Alternative configuration with cost-based routing (selects the cheapest model)
- `start.sh`: Startup script
Claude Code can be configured to use this LiteLLM proxy as its AI model gateway. The proxy provides OpenAI-compatible endpoints, so Claude Code needs to be configured to use the OpenAI SDK with the proxy base URL.
Add the following to your Claude Code configuration (`~/.claude.json` or equivalent):
```jsonc
{
  "model": "openrouter-claude",            // Use the proxy's model name for Claude
  "api_base": "http://localhost:8000/v1",
  "api_key": "supersecretkey123",
  "api_type": "openai"                     // Ensure the OpenAI-compatible API is used
}
```

```bash
# For OpenAI SDK compatibility
export OPENAI_API_KEY="supersecretkey123"
export OPENAI_API_BASE="http://localhost:8000/v1"

# For Claude Code specific variables (if supported)
export ANTHROPIC_API_BASE="http://localhost:8000/v1"
export ANTHROPIC_API_KEY="supersecretkey123"
```

When Claude Code makes API calls, you can intercept and route them through the proxy by setting the appropriate base URL and authentication header. The proxy supports the following Claude models via OpenRouter:
- `openrouter-claude` (Claude-3-Haiku via OpenRouter with load balancing across 6 API keys)
- Start the LiteLLM proxy:
  ```bash
  ./start.sh
  ```
- Test with a simple curl request using the `openrouter-claude` model:
  ```bash
  curl http://localhost:8000/v1/chat/completions \
    -H "Authorization: Bearer supersecretkey123" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "openrouter-claude",
      "messages": [{"role": "user", "content": "Hello!"}]
    }'
  ```
- Configure Claude Code with the above settings and verify it routes through the proxy.
Note: Claude Code may require additional configuration to use OpenAI-compatible endpoints. Check Claude Code documentation for custom API base URL support.
Example requests:

```bash
# Chat completion
curl http://localhost:8000/v1/chat/completions \
  -H "Authorization: Bearer supersecretkey123" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-chat",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ],
    "max_tokens": 100
  }'

# List available models
curl http://localhost:8000/v1/models \
  -H "Authorization: Bearer supersecretkey123"
Use the provided test script:

```bash
python3 ../test_litellm.py
```

The proxy provides access to multiple models through different providers:
- `gpt-3.5-turbo` (OpenAI)
- `deepseek-chat` (DeepSeek)
- `gemini-pro` (Gemini)
- `groq-llama3` (Groq)
- `cerebras-llama3` (Cerebras)
- `mistral-large` (Mistral)
- `codestral` (Codestral)
- `voyage-embed` (Voyage)
- `morph` (Morph)
- `openrouter-claude` (Claude-3-Haiku via OpenRouter, 6 API keys for load balancing)
- `free-tier` (smart logic pool with failover: includes DeepSeek, Mistral, Groq models, Gemini, Codestral, KiloCode, and more)
- `allam-2-7b`, `llama-3.1-8b`, `llama-3.3-70b`, `llama-4-maverick-17b`, `llama-4-scout`
- `whisper-large-v3`, `whisper-large-v3-turbo` (audio transcription)
- `groq-compound`, `groq-compound-mini`
- `llama-guard-4-12b`, `llama-prompt-guard-2-22m`, `llama-prompt-guard-2-86m`
- `kimi-k2-instruct`, `kimi-k2-instruct-0905`
- `gpt-oss-120b`, `gpt-oss-20b`, `gpt-oss-safeguard-20b`
- `qwen3-32b`
Routing strategies:

- Usage-based routing (`config.yaml`): distributes load across available models based on usage patterns.
- Cost-based routing (`config_cost_based.yaml`): selects the cheapest model based on token costs.
Both strategies include the following settings; the sketch after this list shows how they map onto LiteLLM's Python `Router` options:
- 2 retries per failed request
- 3 allowed fails before banning a model
- 24-hour cooldown for banned models
- 30-second timeout per request
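A minimal sketch of how these values could be expressed with LiteLLM's Python `Router`; the model entry is a placeholder for illustration, not the actual contents of `config.yaml`:

```python
from litellm import Router

# Placeholder deployment list; the real entries live in config.yaml
model_list = [
    {
        "model_name": "deepseek-chat",
        "litellm_params": {"model": "deepseek/deepseek-chat", "api_key": "sk-..."},
    },
]

router = Router(
    model_list=model_list,
    routing_strategy="usage-based-routing",  # or "cost-based-routing"
    num_retries=2,        # 2 retries per failed request
    allowed_fails=3,      # 3 allowed fails before banning a model
    cooldown_time=86400,  # 24-hour cooldown for banned models (seconds)
    timeout=30,           # 30-second timeout per request
)
```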
Security considerations:

- API keys exposed: `.env` contains live API keys for 17+ providers
- Weak master key: `supersecretkey123` is hardcoded and should be changed
- Network exposure: proxy runs on `0.0.0.0:8000` (accessible from the network)
- No global rate limiting: basic RPM/TPM limits per model but no global rate limiting
- No request logging/auditing: only basic server logs in `litellm.log`
Recommended security improvements:
- Rotate all API keys in `.env`
- Generate a strong random master key (see the snippet after this list)
- Implement IP-based rate limiting
- Add request logging and auditing
- Consider running behind a reverse proxy with authentication
- Add `.env` to `.gitignore` if not already
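For the master-key recommendation above, a one-liner like the following generates a suitably strong random value; the `sk-` prefix is just a convention used here, not a requirement:

```python
import secrets

# Generate a strong random master key to replace supersecretkey123
print("sk-" + secrets.token_urlsafe(32))
```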
The `.claude-flow/` directory contains integration metrics and configuration for the Claude-Flow multi-agent system. The LiteLLM proxy serves as the AI model gateway for Claude-Flow agents.
Troubleshooting:

```bash
# Check port conflict
lsof -i :8000

# Check Python environment
.venv/bin/python --version

# Run with debug
.venv/bin/litellm --config config.yaml --port 8000 --debug
```

If model requests fail:

- Check `.env` for valid API keys
- Review rate limits in config
- Check `litellm.log` for specific errors
- Test individual models with curl
- Consider switching to `config_cost_based.yaml`
- Review free-tier model failures
- Check network connectivity to providers
Critical files to back up:
- `config.yaml` - Primary configuration
- `config_cost_based.yaml` - Alternative configuration
- `.env` - API keys (store securely)
- `start.sh` - Startup script
Recovery procedure:
- Restore configuration files
- Update API keys in `.env` if necessary
- Start the server: `./start.sh`
- Test: `python3 ../test_litellm.py`
- Python 3.13 (from virtual environment)
- `litellm==1.80.10`
- `litellm_enterprise==0.1.25`
- `litellm_proxy_extras==0.4.14`
For issues, check the logs in `litellm.log` or consult the LiteLLM documentation.
Note: This proxy is configured for Rakel's AI development stack and integrates with Claude-Flow for multi-agent AI workflows.