If you build with LLM APIs, you already know the pain: you run a prompt, the bill comes in, and you have no clue which call burned your credits. TokenScope solves that: in the browser, offline, free.
TokenScope is a tiny, zero-dependency web tool that counts tokens and estimates cost across every major AI model in 2026. Paste your prompt, pick a model, and you instantly see: token count, input cost, output cost projection, and which model is actually cheapest for your workload.
Open TokenScope → (loads in <1s, no backend).
We run NovAI, an AI API gateway. Every day a customer asks "why was this call so expensive?" The answer is always "you sent 12K tokens to Opus 4.7." But developers don't want to read BPE tables; they want one box: paste prompt, see the number.
Existing calculators are either behind signup walls, riddled with ads, or only support OpenAI models. TokenScope is different:
Paste once, compare 10+ models side by side. No signup.
Open TokenScope →

Every major LLM API charges per token, not per word. And every model uses a different tokenizer, so the same sentence can cost different amounts. Here's the updated pricing landscape (May 2026):
| Model | Input / 1M tok | Output / 1M tok | Context |
|---|---|---|---|
| GPT-5.5 | $5.00 | $30.00 | 400K |
| Claude Opus 4.7 | $5.00 | $25.00 | 1M |
| Gemini 3.1 Pro | $1.25 | $10.00 | 2M |
| DeepSeek V4 | $0.27 | $1.10 | 128K |
| DeepSeek V3.2 | $0.20 | $0.40 | 128K |
| Qwen3-Max | $1.20 | $6.00 | 256K |
| GLM-4.6V | $0.40 | $1.20 | 200K |
| MiniMax-Text-01 | $0.20 | $1.60 | 1M |
A single "Hello, how are you?" is 6 tokens in GPT, 5 in Claude, 4 in Qwen. Scale that to 10K API calls a day and the difference is real money.
A 3000-token system prompt you set once looks cheap. But it gets sent with every request. TokenScope shows you that a 3K-token system prompt × 10K calls/day × $5/1M = $150/day, on input alone.
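That arithmetic is easy to script yourself; a minimal sketch using the prices from the table above (not an official TokenScope API):

```python
def daily_input_cost(prompt_tokens: int, calls_per_day: int, price_per_m: float) -> float:
    """Daily input-side spend for a fixed system prompt resent on every call."""
    return prompt_tokens * calls_per_day * price_per_m / 1_000_000

# 3K-token system prompt, 10K calls/day, $5 per 1M input tokens
print(daily_input_cost(3_000, 10_000, 5.00))  # → 150.0
```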
Many developers set `max_tokens=4096` "just in case." Opus 4.7 output is $25/1M. If your real answer is 200 tokens, you're not billed for unused output, but the reserved budget can trip rate limits and block parallelism. Right-size it.
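One way to right-size it: sample your real output lengths and cap at a high percentile plus headroom. A sketch; the 95th-percentile-plus-25% rule here is an illustrative choice, not a TokenScope feature:

```python
import statistics

def right_size_max_tokens(observed_output_tokens: list[int], headroom: float = 1.25) -> int:
    """Pick max_tokens from the 95th percentile of observed outputs, plus headroom."""
    p95 = statistics.quantiles(observed_output_tokens, n=20)[-1]  # 95th percentile cut
    return int(p95 * headroom)

# Output token counts sampled from real traffic (hypothetical numbers)
samples = [180, 210, 195, 240, 160, 205, 220, 198, 230, 175]
print(right_size_max_tokens(samples))  # far below a blanket 4096
```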
Classifying 10M user reviews with GPT-5.5 ($5 in + $30 out) when DeepSeek V3.2 ($0.20 + $0.40) handles it identically is a ~50x overpay. Paste a sample into TokenScope, see the cost side by side, switch.
TokenScope accepts raw text or a full messages[] array (Ctrl+Enter to count, Ctrl+D to duplicate). Example:

Prompt: "Classify this ticket into {billing, bug, feature}.
Ticket: My invoice shows $127 but I only used $40 of credits..."
```
GPT-5.5         : 87 in, ~10 out → $0.000735 / call
Claude Opus 4.7 : 79 in,  ~9 out → $0.000620 / call
DeepSeek V4     : 91 in, ~10 out → $0.000035 / call (21x cheaper)
DeepSeek V3.2   : 91 in, ~10 out → $0.000022 / call (33x cheaper)
```
At 1M tickets/year: GPT-5.5 costs $735, DeepSeek V3.2 costs $22. Same accuracy on a classification task that doesn't need frontier reasoning. That's what TokenScope helps you see in 30 seconds.
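The per-call numbers above are simple arithmetic over the May 2026 table prices; a sketch you can adapt:

```python
def cost_per_call(tokens_in: int, tokens_out: int,
                  price_in_per_m: float, price_out_per_m: float) -> float:
    """Cost of one call: input and output tokens at per-million-token rates."""
    return (tokens_in * price_in_per_m + tokens_out * price_out_per_m) / 1_000_000

gpt55 = cost_per_call(87, 10, 5.00, 30.00)    # ≈ $0.000735
v32   = cost_per_call(91, 10, 0.20, 0.40)     # ≈ $0.000022
print(f"{gpt55 / v32:.0f}x cheaper")           # prints "33x cheaper"
```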
Once you know your numbers, you need an API that bills accurately. Most developers in Asia/Europe hit three pain points with native OpenAI/Anthropic: latency, payment, and pricing.
NovAI fixes all three: Hong Kong servers (<80ms to most of Asia), OpenAI-compatible endpoint (drop-in base URL swap), USDT / Alipay / PayPal payments, and the same model pricing as OpenRouter or cheaper.
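The drop-in swap can be sketched with nothing but the standard library; the base URL below is a placeholder, not NovAI's documented endpoint:

```python
import json
import urllib.request

# Placeholder base URL -- substitute the real endpoint from your NovAI dashboard.
NOVAI_BASE_URL = "https://api.novai.example/v1"

def build_chat_request(api_key: str, model: str, messages: list) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request against a swapped base URL."""
    return urllib.request.Request(
        f"{NOVAI_BASE_URL}/chat/completions",
        data=json.dumps({"model": model, "messages": messages}).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("sk-...", "gpt-5.5", [{"role": "user", "content": "hi"}])
print(req.full_url)
```

With the official SDKs the same swap is usually a single `base_url` parameter at client construction.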
Count tokens free, run the cheapest API for your use case. Get $0.50 free credit, no card required.
Get Free API Key →

**Is TokenScope really free?** Yes. It's a static web tool. No login, no ads, no backend. Source is on GitHub (MIT license).
**Does it work offline?** Yes, once loaded. Everything runs client-side; the tokenizer WASM is cached.
**How accurate are the counts?** We ship the official tiktoken, anthropic-tokenizer, and qwen-tokenizer ports. For DeepSeek / GLM / MiniMax (no public tokenizer) we use an empirically calibrated BPE, accurate to within ±2% on 10K random prompts.
**Can I embed it in my own site?** Yes. TokenScope is open source: vendor it, iframe it, or lift the tokenizer JS. See the repo for integration examples.
**Why do counts differ between models?** Each model's tokenizer is different. Claude's tokenizer splits on different byte-pair boundaries than GPT's. Chinese text especially: GPT-4 used 3-4 tokens per Chinese character, GPT-5.5 brought it down to ~1.3, and DeepSeek is ~1.0. That's why TokenScope shows every count side-by-side.
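For a quick sanity check before opening any tool, you can rough out a count from characters. The per-model ratios below are coarse assumptions drawn from the figures above, not TokenScope's calibrated tokenizers:

```python
# Rough tokens-per-character ratios; real tokenizers vary with actual content.
RATIOS = {
    "gpt-5.5":  {"ascii": 0.25, "cjk": 1.3},  # ~4 chars/token English, ~1.3 tok/char Chinese
    "deepseek": {"ascii": 0.25, "cjk": 1.0},
}

def rough_token_estimate(text: str, model: str) -> int:
    """Heuristic token estimate: count CJK chars and ASCII-ish chars separately."""
    r = RATIOS[model]
    cjk = sum(1 for ch in text if "\u4e00" <= ch <= "\u9fff")
    ascii_like = len(text) - cjk
    return round(ascii_like * r["ascii"] + cjk * r["cjk"])

print(rough_token_estimate("Hello, how are you?", "gpt-5.5"))
```

Treat the result as a ballpark only; for billing decisions, use the real tokenizer counts side-by-side.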