LiteLLM: One Interface for 100+ LLMs — And a Cautionary Supply Chain Tale
If you’ve ever had to wrangle multiple LLM providers — switching between OpenAI, Anthropic, Google, and Azure depending on the task — you know the pain. Different SDKs, different response formats, different retry logic, different billing dashboards. It’s a mess.
LiteLLM solves exactly that problem. One interface, 100+ models, and it all looks like OpenAI.
This week it also became the target of a sophisticated supply chain attack that’s worth understanding — not just as a security story, but as a window into how fragile the open-source AI ecosystem really is.
What LiteLLM Actually Does
At its core, LiteLLM is a translation layer. You write code using the standard OpenAI SDK format, and LiteLLM routes it to whatever model you want — GPT-4o, Claude 3.5, Gemini 1.5, Llama 3, Mistral, Bedrock, Groq, Cohere, and 100+ others.
from litellm import completion
# Switch between models with one line change
response = completion(model="gpt-4o", messages=[...])
response = completion(model="claude-3-5-sonnet", messages=[...])
response = completion(model="gemini/gemini-1.5-pro", messages=[...])
Every model returns the same response format. Your application code doesn’t change — only the model string does.
But the real power is in what you get beyond basic routing.
Built-in Reliability
LiteLLM handles retries and fallbacks automatically. If OpenAI goes down or rate-limits you, it can automatically switch to Claude or Gemini. You define the priority order; LiteLLM handles the rest.
response = completion(
    model="gpt-4o",
    fallbacks=["claude-3-5-sonnet", "gemini/gemini-1.5-pro"],
    messages=[...]
)
Cost Tracking and Budget Caps
This is where it gets genuinely useful for teams. LiteLLM tracks spend per user, per team, per project — and lets you enforce hard limits.
Set a $500/month cap per team. Once they hit it, requests stop (or fall back to a cheaper model). No surprise bills. No angry Slack messages from your CFO.
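The enforcement logic is simple enough to sketch. This is an illustrative toy of the hard-cap-with-fallback behavior described above, not LiteLLM's code — the class and its parameters are hypothetical:

```python
# Toy sketch of hard-cap budget enforcement with a cheaper fallback
# model. Illustrates the behavior, not LiteLLM's implementation.
class TeamBudget:
    def __init__(self, cap_usd, fallback_model=None):
        self.cap = cap_usd
        self.spent = 0.0
        self.fallback_model = fallback_model

    def route(self, model, est_cost):
        """Return the model to use for this request, or raise if capped."""
        if self.spent + est_cost <= self.cap:
            self.spent += est_cost
            return model
        if self.fallback_model is not None:
            return self.fallback_model   # degrade instead of failing
        raise RuntimeError("monthly budget exhausted")
```

In the real proxy you'd configure caps per key or per team rather than writing this yourself; the point is that the cap is enforced at routing time, before the provider is ever called.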
Virtual API Keys
Instead of sharing your master OpenAI key with every developer, LiteLLM issues virtual keys. Each key can have its own model access, rate limits, and budget. Revoke one without touching the others.
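The idea is easy to model: each virtual key carries its own scope and can be revoked without touching the master credential or anyone else's key. A toy sketch of that concept (not LiteLLM's key store — the names here are hypothetical):

```python
import secrets

# Toy model of virtual keys: each key has its own team, model access,
# and budget, and revoking one leaves the others untouched.
class KeyStore:
    def __init__(self):
        self._keys = {}

    def issue(self, team, allowed_models, max_budget):
        key = "sk-virt-" + secrets.token_hex(8)
        self._keys[key] = {"team": team,
                           "models": set(allowed_models),
                           "budget": max_budget}
        return key

    def authorize(self, key, model):
        meta = self._keys.get(key)
        return meta is not None and model in meta["models"]

    def revoke(self, key):
        self._keys.pop(key, None)   # master key never changes
```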
Load Balancing
Spread traffic across multiple deployments — Azure East, Azure West, OpenAI direct — with round-robin or latency-based routing. At scale this matters: LiteLLM claims 8ms P95 latency at 1,000 requests per second.
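Round-robin is the simplest of these strategies — rotate through interchangeable deployments in order. A minimal sketch (deployment names are illustrative, and this is not LiteLLM's router):

```python
from itertools import cycle

# Round-robin routing across interchangeable deployments of the same
# model. Latency-based routing would instead pick the deployment with
# the lowest recent response time.
class RoundRobinRouter:
    def __init__(self, deployments):
        self._cycle = cycle(deployments)

    def pick(self):
        return next(self._cycle)
```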
The Proxy Gateway
You can run LiteLLM as a self-hosted proxy server. Your entire team points their OpenAI SDK at http://your-litellm-server instead of https://api.openai.com, and you get centralized logging, cost tracking, guardrails, PII redaction, and caching — without changing a line of application code.
# Run the proxy
litellm --model gpt-4o --port 8000
# Your app now talks to localhost instead of OpenAI
client = OpenAI(base_url="http://localhost:8000", api_key="your-virtual-key")
There’s also an admin dashboard UI for monitoring all of this.
Why This Matters
The “single interface” pattern isn’t new — but LiteLLM has executed it well enough that it’s become infrastructure for thousands of AI applications. With 3.4 million daily PyPI downloads, it’s in a lot of production pipelines.
That ubiquity is exactly what made it a target.
The Supply Chain Attack (March 24, 2026)
Here’s where the story gets dark — and instructive.
On March 24, 2026, two versions of LiteLLM on PyPI (1.82.7 and 1.82.8) were found to contain malicious code. They were live for approximately three hours before PyPI quarantined them.
The attack wasn’t a direct breach of LiteLLM. It was a cascade:
Step 1: Compromise Trivy
Trivy is a popular open-source security scanner. In late February 2026, an attacker submitted a malicious pull request against Trivy’s CI pipeline, exploiting a pull_request_target workflow to exfiltrate credentials from aqua-bot, Trivy’s CI service account.
By March 19, threat actor TeamPCP had rewritten Trivy’s GitHub Action tags to point to a malicious release — meaning any project using aquasecurity/trivy-action in their CI/CD was now running attacker-controlled code.
Step 2: Steal LiteLLM’s PyPI Credentials
LiteLLM used Trivy in its CI/CD security scanning workflow. When the pipeline ran with the compromised Trivy action, TeamPCP harvested LiteLLM’s PyPI publishing credentials.
Step 3: Publish Backdoored Packages
With PyPI credentials in hand, TeamPCP published litellm==1.82.7 and 1.82.8 containing a four-part payload:
- Credential harvester (cloud keys, API tokens, environment variables)
- Encrypted exfiltration to models.litellm.cloud (a domain registered the day before, March 23)
- Persistent backdoor via Python startup hooks (litellm_init.pth)
- Kubernetes worm for lateral movement
Wiz’s head of threat exposure summarized it bluntly: “Trivy gets compromised → LiteLLM gets compromised → credentials from tens of thousands of environments end up in attacker hands → and those credentials lead to the next compromise.”
A recursive security failure: the tool you use to find vulnerabilities becomes the vector for introducing them.
If you’re affected: Versions ≤ 1.82.6 are safe. The malicious versions have been removed from PyPI. If you ran 1.82.7 or 1.82.8, treat your environment as compromised — rotate all API keys, cloud credentials, and secrets that were accessible in that environment.
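If you want to check an environment programmatically, something like this works — the bad-version list comes straight from the advisory above:

```python
from importlib.metadata import version

# The two compromised releases named in the advisory.
COMPROMISED = {"1.82.7", "1.82.8"}

def litellm_is_compromised(installed=None):
    """True if the installed (or given) litellm version is a known-bad release."""
    installed = installed or version("litellm")
    return installed in COMPROMISED
```

Remember that a clean version check only tells you what's installed now — if a bad version ever ran, you still need to rotate credentials.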
LiteLLM has paused new releases pending a full supply chain review. Their official post-mortem is at docs.litellm.ai/blog/security-update-march-2026.
The Broader Lesson
This attack illustrates something that’s been true for years but is increasingly dangerous in the AI era: open source supply chains are deeply interconnected, and that interdependence is a liability.
LiteLLM didn’t do anything obviously wrong. They used a widely trusted security tool in their CI/CD. That tool got compromised. Their credentials got stolen. Their package got backdoored.
A few practices that would have helped:
- Pin dependencies in CI/CD — use commit SHAs instead of mutable tags for GitHub Actions
- Separate publishing credentials — use short-lived OIDC tokens for PyPI rather than long-lived API keys
- Monitor PyPI publish events — alert when a new package version is published
- Use Docker images over pip for production — LiteLLM’s official Docker image was not affected because it pins dependencies in requirements.txt
That last point is worth noting: users running the official LiteLLM Proxy Docker image were protected. The attack targeted pip installs specifically.
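In workflow terms, pinning an action means referencing a full commit SHA instead of a mutable tag or branch. The SHA below is a placeholder, not a real Trivy release commit:

```yaml
# Risky: mutable reference -- whoever controls the tag controls your CI
- uses: aquasecurity/trivy-action@master

# Better: pin to a full commit SHA; the trailing comment is informational
# (the SHA here is a placeholder, not an actual release commit)
- uses: aquasecurity/trivy-action@<full-commit-sha>  # vX.Y.Z
```

A tag like @master or @v1 can be rewritten to point at attacker-controlled code, which is exactly what happened here; a commit SHA cannot.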
Getting Started
If you want to try LiteLLM:
pip install litellm # make sure you're on ≥ 1.82.9 when releases resume
Or run the proxy via Docker (the safer production path):
docker run -e OPENAI_API_KEY=your-key \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-latest \
--model gpt-4o --port 4000
The GitHub repo is github.com/BerriAI/litellm — 20k+ stars, Apache 2.0 licensed.
The irony isn’t lost: a tool designed to make AI infrastructure simpler and more manageable got hit through the exact kind of complexity and trust that security tooling is supposed to guard against. It’s a good reminder that in open source, your security posture is only as strong as the weakest link in your dependency chain — including the tools you use to find weak links.
Use LiteLLM. It’s excellent. Just pin your versions.