A free, open-source desktop app that transparently records every LLM API call: tokens, latency, cost, model, and full request/response. Supports 12+ providers including DeepSeek, Qwen, GLM, Doubao, and more.
Know exactly how every token is spent, in real time.
Watch every API call as it happens: model, tokens in/out, latency, status code. No more guessing where tokens go.
Double-click to install. Change one line (base_url) in your code. No Docker, no CLI, no proxy config files.
Works with OpenAI, Anthropic, Gemini, DeepSeek, Moonshot, Zhipu GLM, Doubao, Qwen, Yi, MiniMax, SiliconFlow, and more.
All data stays on your machine. No cloud, no telemetry, no accounts. Your API keys never leave localhost.
MIT licensed. Inspect the code, fork it, contribute. Built by NovAI for the developer community.
First-class support for 8 Chinese AI providers. One-click upstream switching: select DeepSeek, Qwen, or Doubao in settings.
One proxy, all your AI providers. Switch upstream with a dropdown.
Download the installer for your platform from GitHub Releases, or build from source:
```bash
git clone https://github.com/vvvvking/tokenscope.git
cd tokenscope/desktop
npm install
npm start          # Run in development mode
npm run build:win  # Build Windows installer
```
Double-click TokenScope in your Start Menu (or Applications on macOS). The app starts in the system tray and automatically launches the proxy on http://127.0.0.1:17666.
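If you want to confirm the proxy is up before pointing your code at it, a plain TCP port check is enough. A minimal sketch in Python (it only verifies the port is open, nothing more):

```python
import socket

# Probe TokenScope's default proxy port; connect_ex returns 0 on success.
with socket.socket() as s:
    s.settimeout(1)
    up = s.connect_ex(("127.0.0.1", 17666)) == 0
print("proxy is listening" if up else "proxy is not reachable")
```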
Change your base_url to http://127.0.0.1:17666/v1. That's it.
```python
# Python (OpenAI SDK)
from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:17666/v1",  # ← only this line changes
    api_key="your-api-key",
)
resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "hello"}],
)
print(resp.choices[0].message.content)
```
TokenScope supports 8 Chinese AI providers out of the box. To use them:
Click the tray icon → Open Main Window → Settings tab.
In the "Default Upstream (OpenAI Protocol)" dropdown, select your provider (e.g. DeepSeek, Moonshot, Zhipu GLM).
Click Save. The change takes effect immediately; no need to restart the proxy. Now all requests through 127.0.0.1:17666 will be forwarded to your selected provider.
```python
# Example: Using DeepSeek through TokenScope
# Settings → Default Upstream → DeepSeek
from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:17666/v1",
    api_key="your-deepseek-key",
)
resp = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "hello"}],
)
print(resp.choices[0].message.content)
```
Supported Chinese providers and example models:
| Provider | Upstream Key | Example Model |
|---|---|---|
| DeepSeek | deepseek | deepseek-chat |
| Moonshot Kimi | moonshot | moonshot-v1-8k |
| Zhipu GLM | zhipu | glm-4-flash |
| Volcengine Doubao | doubao | doubao-seed-1-6 |
| Alibaba Qwen | qwen | qwen-plus |
| 01.AI Yi | yi | yi-lightning |
| MiniMax | minimax | abab6.5s-chat |
| SiliconFlow | siliconflow | Qwen/Qwen2.5-7B-Instruct |
You can also override the upstream per request with the X-Upstream header. This is useful if you want to route different calls to different providers without changing global settings, as in the sketch below.
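A minimal sketch using the OpenAI Python SDK's default_headers option to pin one client to DeepSeek while the global default upstream stays untouched (the upstream key "deepseek" comes from the table above; the model name is illustrative):

```python
from openai import OpenAI

# This client adds X-Upstream to every request, so the proxy routes it to
# DeepSeek regardless of the global default upstream in Settings.
deepseek = OpenAI(
    base_url="http://127.0.0.1:17666/v1",
    api_key="your-deepseek-key",
    default_headers={"X-Upstream": "deepseek"},  # upstream key from the table above
)

resp = deepseek.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "hello"}],
)
print(resp.choices[0].message.content)
```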
To monitor the Claude Code CLI through TokenScope, point its Anthropic endpoint at the proxy:

```powershell
# Windows PowerShell
$env:ANTHROPIC_BASE_URL = "http://127.0.0.1:17666"
$env:ANTHROPIC_API_KEY = "your-anthropic-key"
claude
```

```bash
# macOS / Linux
export ANTHROPIC_BASE_URL="http://127.0.0.1:17666"
export ANTHROPIC_API_KEY="your-anthropic-key"
claude
```
Go to Settings → Models → Override OpenAI Base URL and enter:
http://127.0.0.1:17666/v1
Select OpenAI Compatible as the provider, then set:
Base URL: http://127.0.0.1:17666/v1
API Key: your-api-key
Model ID: gpt-4o-mini (or any model)
```typescript
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'http://127.0.0.1:17666/v1',
  apiKey: 'your-api-key',
});

const resp = await client.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'hello' }],
});
console.log(resp.choices[0].message.content);
```
```bash
curl http://127.0.0.1:17666/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"hi"}]}'
```
| Setting | Default | Description |
|---|---|---|
| Proxy Port | 17666 | The local HTTP port your code connects to |
| Control Port | 17667 | Internal WebSocket port for the dashboard |
| Max Records | 5000 | How many calls to keep in local history |
| Auto Start Proxy | On | Start proxy automatically when app launches |
| Launch at Login | Off | Start TokenScope when you log into your computer |
| Default Upstream | OpenAI | Where to forward OpenAI-protocol requests by default |
Settings are stored in %APPDATA%/tokenscope-desktop/settings.json (Windows) or ~/Library/Application Support/tokenscope-desktop/settings.json (macOS).
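Since the file is plain JSON, you can inspect it directly. A quick sketch using the paths above (it makes no assumptions about the keys inside):

```python
import json
import os
import platform

# Resolve the settings.json path documented above for the current OS.
if platform.system() == "Windows":
    path = os.path.join(os.environ["APPDATA"], "tokenscope-desktop", "settings.json")
else:  # macOS
    path = os.path.expanduser(
        "~/Library/Application Support/tokenscope-desktop/settings.json"
    )

with open(path, encoding="utf-8") as f:
    print(json.dumps(json.load(f), indent=2))
```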
TokenScope runs a transparent HTTP proxy on your machine. When your code sends an API call to 127.0.0.1:17666, the proxy:
1. Detects the protocol from the request path: OpenAI (/v1/chat/completions), Anthropic (/v1/messages), or Gemini (:generateContent).
2. Selects the upstream provider: your default from Settings, or a per-request X-Upstream header.
3. Rewrites the path for providers with non-standard endpoints (e.g. Zhipu's /api/paas/v4).
4. Forwards the request and records the model, tokens, latency, cost, and full request/response.

All records are stored locally in records.ndjson: one JSON object per line, easy to grep or import into other tools.
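Because records.ndjson is one JSON object per line, ad-hoc analysis takes a few lines of Python. A sketch (the field names "model" and "total_tokens" are assumptions; check a line of your own file for the actual schema):

```python
import json
from collections import Counter

calls, tokens = Counter(), Counter()
with open("records.ndjson", encoding="utf-8") as f:
    for line in f:
        rec = json.loads(line)
        model = rec.get("model", "unknown")          # assumed field name
        calls[model] += 1
        tokens[model] += rec.get("total_tokens", 0)  # assumed field name

for model, n in calls.most_common():
    print(f"{model}: {n} calls, {tokens[model]} tokens")
```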
The proxy does not modify your requests; it forwards everything transparently. The only header it removes is X-Upstream (if present), which it consumes to decide routing.
Both SSE streaming and non-streaming responses are fully supported and recorded.
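Streaming code needs no changes beyond the base_url. For instance, with the OpenAI Python SDK:

```python
from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:17666/v1", api_key="your-api-key")

stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "hello"}],
    stream=True,
)
for chunk in stream:
    # Each SSE chunk carries an incremental delta; some chunks may be empty.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()
```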
You can mix providers: set a default upstream in Settings, and override per request using the X-Upstream header. Different applications can hit different providers through the same proxy.
TokenScope handles non-standard API paths automatically for known providers. Zhipu GLM (/api/paas/v4) and Doubao (/api/v3) have built-in path rewriting, so your code still sends to /v1/chat/completions as usual.
Input and output text previews are capped at 2000 characters each. The "Max Records" setting (default 5000) controls how many calls are kept in history.
Free, open source, and takes less than 60 seconds to set up.