Upgrade your plan

Same optimization quality on every tier. Pro removes the daily cap.

Free ($0, free forever)
  • All 6 optimization strategies
  • LLMLingua-2 compression
  • 200K token savings/day
  • Basic dashboard

Enterprise ($30/seat/mo, for teams of 5+)
  • Everything in Pro
  • Shared context pools
  • SSO + audit logs
  • On-premise deployment

Can I cancel anytime? Yes. No lock-in, no questions asked.

Documentation

Trimli AI is a transparent optimization proxy that reduces token consumption across AI coding tools. It intercepts API requests, compresses messages using 6 strategies, and forwards to the upstream provider. Your tools work exactly as before — just faster and cheaper.

Quickstart

1. Install the VS Code extension

Search "Trimli AI" in the VS Code Marketplace, or install from the command line:

code --install-extension trimliai.trimli-vscode

The extension starts a local optimization proxy on http://localhost:8765 and auto-configures supported tools.

2. Sign in (optional)

Open the Command Palette (Cmd+Shift+P) and run "Trimli AI: Sign In". This links your account for dashboard analytics and tier upgrades. The optimizer works without signing in.

3. Use your AI tools as normal

That's it. The proxy optimizes messages transparently. Check the status bar in VS Code for a live token savings counter.

No API keys stored. The proxy forwards your Authorization header to the upstream API unchanged. Trimli never sees, stores, or logs your API keys.
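The pass-through behavior described above can be sketched as a small pure function. This is an illustrative sketch only (not Trimli's actual code); the header names and the `HOP_BY_HOP` set are assumptions about what a typical forwarding proxy would strip.

```python
# Illustrative sketch: forward the client's Authorization header to the
# upstream API untouched, and never parse, store, or log it.
HOP_BY_HOP = {"connection", "keep-alive", "transfer-encoding", "host"}

def build_upstream_headers(incoming: dict[str, str]) -> dict[str, str]:
    """Copy request headers for forwarding, dropping hop-by-hop headers.

    The Authorization value is copied verbatim and is never inspected.
    """
    return {
        name: value
        for name, value in incoming.items()
        if name.lower() not in HOP_BY_HOP
    }
```

Because the proxy only copies the header, rotating or revoking a key upstream requires no change on the Trimli side.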

How it works

Trimli operates as a reverse proxy between your AI tool and the provider's API:

  1. The request arrives at localhost:8765
  2. Trimli detects the API format (OpenAI, Anthropic, or Google)
  3. Messages are compressed using up to 6 strategies (cheapest first)
  4. The optimized request is forwarded to the real API
  5. The response streams back unchanged

Important: tool_use blocks (structured JSON) are never modified. Only text content within tool_result blocks is optimized. System messages are never compressed.
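The cheapest-first loop in steps 3–4 can be sketched roughly as below. All names here (`optimize`, `collapse_spaces`, the 4-characters-per-token estimate) are hypothetical illustrations, not Trimli's internals.

```python
# Rough sketch of the cheapest-first optimization loop described above.
# Names and the token estimator are illustrative, not Trimli's API.

def estimate_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: roughly 4 characters per token.
    return len(text) // 4

def collapse_spaces(text: str) -> str:
    # Trivial lossless strategy: collapse runs of whitespace.
    return " ".join(text.split())

def optimize(messages: list[dict], budget: int, strategies: list) -> list[dict]:
    """Apply strategies cheapest-first, stopping once the budget is met.

    System messages are left untouched, matching the rule above.
    """
    for strategy in strategies:
        total = sum(estimate_tokens(m["content"]) for m in messages)
        if total <= budget:
            break  # budget met: skip the remaining, more aggressive strategies
        messages = [
            m if m["role"] == "system" else {**m, "content": strategy(m["content"])}
            for m in messages
        ]
    return messages
```

Ordering strategies from lossless to lossy means a request that fits the budget early never pays the quality cost of the aggressive strategies.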

VS Code extension

The VS Code extension manages the proxy lifecycle, auto-configures tools, and provides a dashboard.

Installation

  1. Open VS Code
  2. Go to Extensions (Cmd+Shift+X)
  3. Search "Trimli AI"
  4. Click Install

Commands

Trimli AI: Show Dashboard        — Open the savings dashboard
Trimli AI: Sign In               — Link your account via magic link
Trimli AI: Toggle Forward Proxy  — Enable env var injection for terminal tools
Trimli AI: Optimize Now          — Optimize selected text in the editor

Settings

tokOptimizer.enabled              — Enable/disable optimization (default: true)
tokOptimizer.pythonServiceUrl     — Custom Python service URL (default: localhost:8766)
tokOptimizer.hostedServiceUrl     — Hosted service URL (default: Railway)
tokOptimizer.forwardProxy.enabled — Enable forward proxy mode (default: true)

Claude Code

Status: Auto-configured

Claude Code picks up the ANTHROPIC_BASE_URL environment variable automatically when launched from a VS Code terminal.

Setup

  1. Make sure the Trimli VS Code extension is installed and running
  2. Open a terminal inside VS Code (Ctrl+`)
  3. Run claude as usual

Verify it's working: After your first message, check the VS Code status bar — the counter should increase.

Manual setup (outside VS Code)

# Add to your shell profile (~/.zshrc, ~/.bashrc, etc.)
export ANTHROPIC_BASE_URL=http://localhost:8765

# Then run Claude Code as normal
claude

Note: The proxy must be running (VS Code extension active) for this to work.

Continue

Status: Auto-configured

Trimli auto-configures Continue's config.json on activation.

Manual setup

// ~/.continue/config.json
{
  "models": [{
    "title": "GPT-4o",
    "provider": "openai",
    "model": "gpt-4o",
    "apiKey": "sk-...",
    "apiBase": "http://localhost:8765/v1/"
  }]
}

Cline

Status: Auto-configured

Trimli updates Cline's VS Code settings on activation.

OpenAI mode

Set Base URL to http://localhost:8765/v1. Use gpt-4.1-mini or gpt-4.1.

Anthropic mode

Set Base URL to http://localhost:8765 (no /v1 suffix).

Base URL rule: in OpenAI mode use http://localhost:8765/v1; in Anthropic mode use http://localhost:8765 (no /v1 suffix). Using the wrong form produces doubled-/v1 path errors.
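The rule above can be captured in a tiny helper. This is an illustration only; `proxy_base_url` is not a function Trimli ships.

```python
# Illustrative helper encoding the base-URL rule above; not part of Trimli.
def proxy_base_url(mode: str) -> str:
    """Return the correct local base URL for a given API mode."""
    base = "http://localhost:8765"
    if mode == "openai":
        return base + "/v1"   # OpenAI mode needs the /v1 suffix
    if mode == "anthropic":
        return base           # Anthropic mode must NOT have /v1
    raise ValueError(f"unknown mode: {mode}")
```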

Cursor

Status: Not supported

Cursor routes all traffic through its own servers and does not honor custom base URLs.

OpenAI-compatible tools

Any tool with a custom OpenAI base URL works. Set it to http://localhost:8765.

export OPENAI_BASE_URL=http://localhost:8765
export ANTHROPIC_BASE_URL=http://localhost:8765

Optimization strategies

Strategies run cheapest-first and stop when the token budget is met.

Strategy              Type      Savings  Description
whitespace-normalize  Lossless  3-8%     Collapses spaces, blank lines, trailing whitespace
deduplicate           Lossless  5-20%    Removes repeated sentences
intent-distill        Lossy     10-30%   Strips filler phrases
reference-substitute  Lossy     10-25%   Aliases long repeated strings
history-summarize     Lossy     30-50%   Compresses old turns (no LLM)
context-prune         Lossy     20-40%   Drops low-relevance messages
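For intuition, the two lossless strategies can be approximated in a few lines. These are simplified sketches under the assumption of plain prose input; the shipped implementations are presumably more careful about code blocks and formatting.

```python
import re

# Simplified sketches of the two lossless strategies in the table above;
# not the shipped implementations.

def whitespace_normalize(text: str) -> str:
    """Collapse runs of spaces, squeeze blank lines, strip trailing whitespace."""
    lines = [re.sub(r"[ \t]+", " ", line).rstrip() for line in text.split("\n")]
    return re.sub(r"\n{3,}", "\n\n", "\n".join(lines)).strip()

def deduplicate(text: str) -> str:
    """Drop sentences that repeat verbatim, keeping first occurrences."""
    seen = set()
    kept = []
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        key = sentence.strip().lower()
        if key and key not in seen:
            seen.add(key)
            kept.append(sentence.strip())
    return " ".join(kept)
```

Both transforms are reversible in meaning if not in bytes, which is why they run before any lossy strategy.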

FAQ

Does Trimli affect response quality?

No. All 59 accuracy tests pass with zero quality degradation at roughly 46% average compression.

What if the proxy is not running?

Your tool will fail to connect. Start VS Code (which starts the proxy) or remove the env vars.

Can I use it with Azure OpenAI?

Yes. The proxy detects Azure via the api-version query parameter.
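Detection by query parameter might look like the sketch below (illustrative only; `is_azure_request` is not Trimli's code, and the example URL path is a hypothetical Azure-style path).

```python
from urllib.parse import urlparse, parse_qs

# Illustrative sketch of detecting an Azure OpenAI request by the
# api-version query parameter mentioned above; not Trimli's code.

def is_azure_request(url: str) -> bool:
    """True if the request URL carries Azure's api-version query parameter."""
    query = parse_qs(urlparse(url).query)
    return "api-version" in query
```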

Is my data sent to Trimli servers?

Optimization runs locally. On Pro/Enterprise, messages may go to the hosted Python service for ML compression. Nothing is stored or logged.