Plans
Same optimization quality on every tier. Pro removes the daily cap.
Free
- All 6 optimization strategies
- LLMLingua-2 compression
- 200K token savings/day
- Basic dashboard
Pro
- Everything in Free
- Unlimited token savings
- Full analytics + 30-day charts
- Request history export
Enterprise
- Everything in Pro
- Shared context pools
- SSO + audit logs
- On-premise deployment
Documentation
Trimli AI is a transparent optimization proxy that reduces token consumption across AI coding tools. It intercepts API requests, compresses messages using 6 strategies, and forwards to the upstream provider. Your tools work exactly as before — just faster and cheaper.
Quickstart
1. Install the VS Code extension
Search "Trimli AI" in the VS Code Marketplace, or install from the command line:
code --install-extension trimliai.trimli-vscode
The extension starts a local optimization proxy on http://localhost:8765 and auto-configures supported tools.
2. Sign in (optional)
Open the Command Palette (Cmd+Shift+P) and run "Trimli AI: Sign In". This links your account for dashboard analytics and tier upgrades. The optimizer works without signing in.
3. Use your AI tools as normal
That's it. The proxy optimizes messages transparently. Check the status bar in VS Code for a live token savings counter.
How it works
Trimli operates as a reverse proxy between your AI tool and the provider's API:
- The request arrives at localhost:8765
- Trimli detects the API format (OpenAI, Anthropic, or Google)
- Messages are compressed using up to 6 strategies (cheapest first)
- The optimized request is forwarded to the real API
- The response streams back unchanged
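The flow above can be sketched in a few lines of Python. This is an illustrative model, not Trimli's code: the function names, the path heuristics, and the word-count token estimate are all assumptions.

```python
def detect_format(path: str) -> str:
    """Guess the upstream API family from the request path (illustrative heuristic)."""
    if "/chat/completions" in path:
        return "openai"
    if "/messages" in path:
        return "anthropic"
    if "generateContent" in path:
        return "google"
    return "unknown"

def compress(text: str, strategies, budget: int) -> str:
    """Run strategies cheapest-first; stop once the text fits the budget
    (tokens approximated here as whitespace-separated words)."""
    for strategy in strategies:
        if len(text.split()) <= budget:
            break
        text = strategy(text)
    return text

def optimize_message(msg: dict, strategies, budget: int) -> dict:
    """Compress only text inside tool_result blocks; system messages and
    tool_use blocks pass through untouched."""
    if msg.get("role") == "system" or not isinstance(msg.get("content"), list):
        return msg
    for block in msg["content"]:
        if block.get("type") == "tool_result" and isinstance(block.get("content"), str):
            block["content"] = compress(block["content"], strategies, budget)
    return msg

collapse = lambda s: " ".join(s.split())  # stand-in for whitespace-normalize

msg = {"role": "user", "content": [
    {"type": "tool_use", "id": "t1", "input": {"cmd": "ls"}},  # structured JSON, never modified
    {"type": "tool_result", "tool_use_id": "t1",
     "content": "file_a.py   file_b.py\n\n\nfile_c.py"},
]}
optimize_message(msg, [collapse], budget=2)
print(msg["content"][1]["content"])  # → file_a.py file_b.py file_c.py
```

Streaming, the six real strategies, and provider-specific schemas are omitted; the point is the ordering (cheapest first) and the selective scope of compression.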
tool_use blocks (structured JSON) are never modified. Only text content within tool_result blocks is optimized. System messages are never compressed.
VS Code extension
The VS Code extension manages the proxy lifecycle, auto-configures tools, and provides a dashboard.
Installation
- Open VS Code
- Go to Extensions (Cmd+Shift+X)
- Search "Trimli AI"
- Click Install
Commands
Trimli AI: Show Dashboard — Open the savings dashboard
Trimli AI: Sign In — Link your account via magic link
Trimli AI: Toggle Forward Proxy — Enable env var injection for terminal tools
Trimli AI: Optimize Now — Optimize selected text in the editor
Settings
tokOptimizer.enabled — Enable/disable optimization (default: true)
tokOptimizer.pythonServiceUrl — Custom Python service URL (default: localhost:8766)
tokOptimizer.hostedServiceUrl — Hosted service URL (default: Railway)
tokOptimizer.forwardProxy.enabled — Enable forward proxy mode (default: true)
Claude Code
Claude Code is auto-configured: it picks up the ANTHROPIC_BASE_URL environment variable automatically when launched from a VS Code terminal.
Setup
- Make sure the Trimli VS Code extension is installed and running
- Open a terminal inside VS Code (Ctrl+`)
- Run claude as usual
Manual setup (outside VS Code)
# Add to your shell profile (~/.zshrc, ~/.bashrc, etc.)
export ANTHROPIC_BASE_URL=http://localhost:8765
# Then run Claude Code as normal
claude
Continue
Continue is auto-configured: Trimli updates Continue's config.json on activation.
Manual setup
// ~/.continue/config.json
{
"models": [{
"title": "GPT-4o",
"provider": "openai",
"model": "gpt-4o",
"apiKey": "sk-...",
"apiBase": "http://localhost:8765/v1/"
}]
}
Cline
Cline is auto-configured: Trimli updates Cline's VS Code settings on activation.
OpenAI mode
Set Base URL to http://localhost:8765/v1. Use gpt-4.1-mini or gpt-4.1.
Anthropic mode
Set Base URL to http://localhost:8765 (no /v1 suffix).
Note: OpenAI mode uses localhost:8765/v1; Anthropic mode uses localhost:8765. The wrong format causes double /v1 errors.
Cursor
Cursor is not supported: it routes all traffic through its own servers and does not honor custom base URLs.
OpenAI-compatible tools
Any tool with a custom OpenAI base URL works. Set it to http://localhost:8765.
export OPENAI_BASE_URL=http://localhost:8765
export ANTHROPIC_BASE_URL=http://localhost:8765
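Most OpenAI-compatible clients resolve their endpoint from that variable. A minimal sketch of the lookup (the fallback URL is OpenAI's standard endpoint, not something Trimli sets):

```python
import os

def resolve_base_url(default: str = "https://api.openai.com/v1") -> str:
    """Prefer OPENAI_BASE_URL when set, as most OpenAI-compatible tools do."""
    return os.environ.get("OPENAI_BASE_URL", default)

os.environ["OPENAI_BASE_URL"] = "http://localhost:8765"
print(resolve_base_url())  # → http://localhost:8765
```

Unset the variable (or close the VS Code terminal that injected it) and the tool falls back to talking to the provider directly.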
Optimization strategies
Strategies run cheapest-first and stop when the token budget is met.
| Strategy | Type | Savings | Description |
|---|---|---|---|
| whitespace-normalize | Lossless | 3-8% | Collapses spaces, blank lines, trailing whitespace |
| deduplicate | Lossless | 5-20% | Removes repeated sentences |
| intent-distill | Lossy | 10-30% | Strips filler phrases |
| reference-substitute | Lossy | 10-25% | Aliases long repeated strings |
| history-summarize | Lossy | 30-50% | Compresses old turns (no LLM) |
| context-prune | Lossy | 20-40% | Drops low-relevance messages |
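The two lossless strategies are simple enough to approximate directly. A sketch, assuming sentences split on terminal punctuation; these are illustrative re-implementations, not Trimli's internals:

```python
import re

def whitespace_normalize(text: str) -> str:
    """Collapse runs of spaces/tabs, strip trailing whitespace, squeeze blank lines."""
    lines = [re.sub(r"[ \t]+", " ", line).rstrip() for line in text.splitlines()]
    out, prev_blank = [], False
    for line in lines:
        if line == "" and prev_blank:
            continue  # drop consecutive blank lines
        out.append(line)
        prev_blank = (line == "")
    return "\n".join(out)

def deduplicate(text: str) -> str:
    """Drop sentences that repeat verbatim, keeping first occurrences in order."""
    seen, kept = set(), []
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        if sentence not in seen:
            seen.add(sentence)
            kept.append(sentence)
    return " ".join(kept)

print(deduplicate("Build passed. Build passed. Deploying now."))  # → Build passed. Deploying now.
```

Both transforms are reversible in spirit (no meaning is lost), which is why they run first; the lossy strategies only kick in when the budget is still exceeded.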
FAQ
Does Trimli affect response quality?
No. All 59 accuracy tests pass with zero quality degradation at ~46% average compression.
What if the proxy is not running?
Your tool will fail to connect. Start VS Code (which starts the proxy) or remove the env vars.
Can I use it with Azure OpenAI?
Yes. The proxy detects Azure via the api-version query parameter.
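That detection can be sketched with the standard library. The api-version parameter name is Azure's; the rest is an assumption about how Trimli checks for it:

```python
from urllib.parse import urlparse, parse_qs

def is_azure_request(url: str) -> bool:
    """Azure OpenAI requests carry an api-version query parameter;
    the standard OpenAI API does not."""
    return "api-version" in parse_qs(urlparse(url).query)

print(is_azure_request(
    "http://localhost:8765/openai/deployments/gpt-4o/chat/completions?api-version=2024-02-01"
))  # → True
```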
Is my data sent to Trimli servers?
Optimization runs locally. On Pro/Enterprise, messages may go to the hosted Python service for ML compression. Nothing is stored or logged.