`skim`

Stop paying for tokens you never meant to send.

The runtime layer that sits between your AI tools and the LLM API — stripping waste, injecting caching, and showing you exactly where every token goes.

⚡ Quickstart · 🔍 How it works · 📊 Dashboard · 🏢 Enterprise · ⌨️ CLI · 📚 Docs · ▶️ Live Demo

Note

One env var. Zero code changes. Claude Code reads a package-lock.json — 122k tokens, $0.37 — just to answer a question about a 200-line file. History compounds. Your context window fills silently and quality degrades while you fly blind. skim fixes this in the API call path, in real time.

flowchart LR
    A["🤖 Claude Code<br/>Cursor · your app"] -->|ANTHROPIC_BASE_URL| B1

    subgraph SKIM ["⚡ skim proxy"]
        direction TB
        B1["✂️ strip lock files<br/>& build artifacts"]
        B2["◈ inject prompt caching<br/>50–90% cheaper"]
        B3["🛡️ enforce budgets<br/>hard 429 block"]
        B4["📊 live dashboard<br/>+ local SQLite"]
        B1 --> B2 --> B3 --> B4
    end

    B4 --> C["☁️ Anthropic<br/>OpenAI · Gemini"]

    style A fill:#161920,stroke:#6c63ff,color:#e4e6f0
    style SKIM fill:#0d0f14,stroke:#6c63ff,color:#6c63ff
    style C fill:#161920,stroke:#00d4aa,color:#e4e6f0
    style B1 fill:#161920,stroke:#252a3a,color:#e4e6f0
    style B2 fill:#161920,stroke:#252a3a,color:#e4e6f0
    style B3 fill:#161920,stroke:#252a3a,color:#e4e6f0
    style B4 fill:#161920,stroke:#252a3a,color:#e4e6f0

⚡ Quickstart

1. Install

pip install skim-llm

2. Start the proxy

skim proxy

Browser opens automatically to your live dashboard.

3. Point your tool at it

export ANTHROPIC_API_KEY=sk-ant-...   # required for Claude Code
export ANTHROPIC_BASE_URL=http://localhost:7474

That's it. Every call now flows through skim.

┌────────────────────────────────────┐
│  skim v0.5.0  — runtime token proxy │
├────────────────────────────────────┤
│  listening  localhost:7474          │
│  dashboard  localhost:7474/dashboard│
│  filtering  ✓ on                    │
│  caching    ✓ on                    │
├────────────────────────────────────┤
│  ⠋ LIVE  waiting for calls...       │
└────────────────────────────────────┘

Tip

skim auto-detects your plan — x-api-key for API users, Authorization: Bearer for OAuth clients — and routes each accordingly, with full waste filtering and tracking either way.

Warning

Claude Code on a Pro/Max subscription cannot use a local proxy. Subscription traffic ignores ANTHROPIC_BASE_URL and routes straight to Anthropic — the proxy will sit on "waiting for calls". To intercept Claude Code, use API-key auth (export ANTHROPIC_API_KEY=sk-ant-… alongside ANTHROPIC_BASE_URL, in the same shell before launching claude). skim also works as-is with Cursor, the SDK, and any OpenAI-compatible tool.

🔍 How it works

✂️

Waste filtering

Detects lock files, build artifacts & generated code inside tool_result blocks and strips them before they hit your context.

package-lock.json → a 12-token note instead of 122k tokens.

◈

Caching injection

Wraps your system prompt + large context with cache_control automatically.

First call caches it. Every call after is free. CLAUDE.md loads at zero cost on calls 2+.

📊

Live dashboard

Opens in your browser on start. No login, no setup. Persists to ~/.skim/events.db.

Real-time SSE updates — watch tokens & cost as they happen.

Auto-detected waste signatures

File	Detected by
`package-lock.json`	`"lockfileVersion"` + `"resolved": "https://"`
`yarn.lock`	`# yarn lockfile v1` + `resolved`
`pnpm-lock.yaml`	`lockfileVersion:` + `resolution:`
`Cargo.lock`	`@generated` + `[[package]]`
`poetry.lock`	`@generated` + `[[package]]`
`composer.lock`	`"content-hash":` + `"packages":`

Plus anything in your project's .llmignore. Stripped blocks are replaced with a one-line note showing what was removed and how to disable it.

How plan detection works

One method, _auth_type(), owns all routing logic:

_auth_type() → ("apikey", key)    # API plan      → filtering + caching + tracking
             → ("oauth",  token)  # Pro/Max plan  → filtering + tracking (no cache injection)
             → ("", "")           # no auth       → 401

Adding a new plan type (enterprise SSO, team tokens) is a single elif. Caching injection is skipped for Pro/OAuth because the Pro plan manages its own cache layer.

📊 Dashboard

Five fully-built pages. Dark theme, live charts, real-time SSE updates — no refresh button needed.

🟣 Overview	⚡ Sessions	📈 Usage	🤖 Models	💰 Savings
tokens, cost, savings, cache	full call log, searchable	hourly + daily charts	cost/1k, cache %, waste %	cumulative savings & ROI

skim proxy              # local dashboard, zero setup, opens in browser

The local dashboard works for everyone — solo devs, Pro users, anyone. Data never leaves your machine unless you explicitly connect a team server.

🏢 Enterprise

Important

Everything below is open-source and self-hosted — same pip package, no paywall, no telemetry.

🛡️ Budget enforcement

Hard-block calls that exceed token/cost limits. Proxy returns 429 before forwarding.

skim admin budget set --owner-type team \
  --owner-id engineering --usd 500 --period monthly

🔔 Webhook alerts

Slack (& Teams) or any HTTP endpoint on budget events.

skim admin webhooks add --channel slack \
  --url https://hooks.slack.com/...

✉️ User invites

Self-registration via single-use links. No manual accounts.

skim admin users invite --email new@corp.com \
  --role user --team platform

🔑 Scoped API keys

ingest · read · admin — with expiry dates and revocation.

👥 RBAC

admin · team_admin · user — enforced data isolation per role.

📋 Audit log

Every sensitive action logged immutably. Queryable by action + date.

skim admin audit --days 30 --action auth.login

📤 Data export

CSV event logs + JSON summaries for accounting & BI.

skim admin export --days 30 --out report.csv

Team deployment in 3 commands

# 1. Run the server (auto-creates admin, uses gunicorn if installed)
pip install 'skim-llm[web]'
SKIM_ADMIN_EMAIL=you@corp.com skim server --host 0.0.0.0 --port 7475

# 2. Each developer connects their proxy
export SKIM_SERVER_URL=https://skim.corp.internal
export SKIM_SERVER_TOKEN=sk-skim-...     # generate in Settings

# 3. Manage from anywhere
skim admin users list

Auth: local password · LDAP/AD (SKIM_LDAP_*) · Google/GitHub/Azure/Okta (SKIM_OIDC_*)

Full guide → docs/enterprise.md · docs/deployment.md

⌨️ CLI Reference

🔬 Static analysis _{no API key}

skim scan       # token cost per file
skim analyze    # detect waste patterns
skim fix        # auto-write .llmignore
skim check      # CI budget gate
skim generate   # .llmignore + CLAUDE.md
skim secrets    # leaked credential scan

⚙️ Runtime & ops

skim proxy      # the interceptor
skim server     # team dashboard + API
skim admin      # manage users/budgets/keys
skim audit      # local operation log
skim hooks      # git pre-commit gate
skim baseline   # token regression checks

Example — skim fix auto-cleanup

  skim fix  —  ./my-project
  ──────────────────────────────────────────────────────
  Before  : 166.8k tokens  (83.4% ctx)  $0.50/session

  Pattern              Severity    Tokens saved  Rules
  ────────────────────────────────────────────────────
  Lock files           HIGH           160.3k     +7
  Test snapshots       MEDIUM           4.1k     +2

  ✓ Written to .llmignore

  After   : 6.5k tokens  (3.2% ctx)  $0.02/session
  Saved   : 160.3k tokens  (96.1% reduction)  $0.48/session
  Now     : 51 sessions / $1

🐍 Python API

from adapters import ClaudeAdapter

claude = ClaudeAdapter(
    model="claude-sonnet-4-6",
    system_prompt="You are a terse coding assistant.",
    enable_caching=True,          # prompt caching, automatic
)
response = claude.chat("Refactor the auth module")
claude.print_stats()
# Session: 12,400 tokens | Cache hit rate: 87% | Cost: $0.0037

_{Adapters: ClaudeAdapter · OpenAIAdapter · GeminiAdapter · OllamaAdapter}

📦 Install

pip install skim-llm                    # core — zero hard deps
pip install 'skim-llm[tiktoken]'        # accurate token counting
pip install 'skim-llm[web]'             # dashboard server
pip install 'skim-llm[web,sso,ldap]'    # enterprise auth
pip install 'skim-llm[all]'             # everything

📚 Documentation

Guide	What it covers
Quickstart	Zero to running in 2 minutes
Proxy	Deep-dive — every feature, every flag
Dashboard	Local & team dashboards
Enterprise	Budgets, webhooks, invites, RBAC, audit
Admin CLI	`skim admin` complete reference
REST API	All 31 endpoints with schemas
Configuration	Every env var & `.skimrc` option
Deployment	Docker, systemd, nginx, scaling
MCP Setup	Claude Desktop integration

🔌 MCP Server

{ "mcpServers": { "skim": { "command": "skim-mcp" } } }

_{Tools: scan_tokens · analyze_context · check_budget · fix_context · generate_llmignore}

_{GitHub · PyPI · Issues · Changelog · Live Demo
Built for developers who'd rather not pay for noise. · MIT License}

_{⭐ Star the repo if skim saved you some tokens.}

Name		Name	Last commit message	Last commit date
Latest commit History 45 Commits
.claude/commands		.claude/commands
.github		.github
.llm		.llm
adapters		adapters
assets		assets
benchmarks		benchmarks
core		core
demo		demo
docs		docs
scripts		scripts
server		server
skim_mcp		skim_mcp
tests		tests
web		web
.claudeignore		.claudeignore
.gitignore		.gitignore
.llmignore		.llmignore
CHANGELOG.md		CHANGELOG.md
CLAUDE.md		CLAUDE.md
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
Dockerfile		Dockerfile
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
SECURITY.md		SECURITY.md
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt
setup.sh		setup.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

`skim`

Stop paying for tokens you never meant to send.

⚡ Quickstart

🔍 How it works

✂️

◈

📊

📊 Dashboard

🏢 Enterprise

🛡️ Budget enforcement

🔔 Webhook alerts

✉️ User invites

🔑 Scoped API keys

👥 RBAC

📋 Audit log

📤 Data export

⌨️ CLI Reference

🐍 Python API

📦 Install

📚 Documentation

🔌 MCP Server

About

Uh oh!

Releases 3

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

skim

Stop paying for tokens you never meant to send.

⚡ Quickstart

🔍 How it works

✂️

◈

📊

📊 Dashboard

🏢 Enterprise

🛡️ Budget enforcement

🔔 Webhook alerts

✉️ User invites

🔑 Scoped API keys

👥 RBAC

📋 Audit log

📤 Data export

⌨️ CLI Reference

🐍 Python API

📦 Install

📚 Documentation

🔌 MCP Server

About

Topics

Resources

License

Code of conduct

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases 3

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

`skim`

Packages