
Gemini token counter.

Count tokens for Gemini 2.5 Pro, Gemini 2.5 Flash, and Gemini 2.5 Flash Lite. Free, browser-only. No API key. No signup. Instant cost estimates at current Google AI rates, with support for the full 1M-token context window.

$ npm i -g @v0idd0/tokcount   # one-time install
$ tokcount prompt.md --model gemini-2.5-flash
model: gemini-2.5-flash · tokens: 1,842 · cost: $0.0003 in + $0.0011 out
$ tokcount big-context/ --model gemini-2.5-pro
model: gemini-2.5-pro · tokens: 284,512 · cost: $0.7113 in + $2.8451 out
284K tokens → billed at >200K rate · 28.5% of 1M context
$ cat docs/* | tokcount --model gemini --all --tag google   # full cost comparison across all Google models

Gemini API pricing — 2026.

tokcount calculates Gemini costs automatically — including the two-tier pricing for Gemini 2.5 Pro above 200K tokens.

Gemini 2.5 Flash

gemini-2.5-flash
context: 1M tokens
input: $0.15 / 1M
output: $0.60 / 1M
thinking: $3.50 / 1M

Flash Lite

gemini-2.5-flash-lite
context: 1M tokens
input: $0.075 / 1M
output: $0.30 / 1M
best for: high volume

Pricing from Google AI Studio public docs. Gemini 2.5 Flash thinking tokens are billed separately. Context caching available at discounted rates.
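As a rough illustration of how the flat rates on these cards turn into a dollar figure, here is a minimal Python sketch. It is not tokcount's actual implementation: the `RATES` table and the `flat_cost` helper are hypothetical names, and the assumption that output is priced for the same token count as the input is inferred from the CLI example, not documented.

```python
# Rates in USD per 1M tokens, copied from the pricing cards above.
# RATES and flat_cost are illustrative names, not part of tokcount.
RATES = {
    "gemini-2.5-flash": {"input": 0.15, "output": 0.60},
    "gemini-2.5-flash-lite": {"input": 0.075, "output": 0.30},
}


def flat_cost(model: str, input_tokens: int, output_tokens: int) -> tuple[float, float]:
    """Return (input_cost, output_cost) in USD at the flat per-1M rates."""
    rate = RATES[model]
    return (
        input_tokens * rate["input"] / 1_000_000,
        output_tokens * rate["output"] / 1_000_000,
    )


# Assumption: price the output side for the same 1,842 tokens as the input.
cost_in, cost_out = flat_cost("gemini-2.5-flash", 1_842, 1_842)
print(f"${cost_in:.4f} in + ${cost_out:.4f} out")  # $0.0003 in + $0.0011 out
```

Under that assumption the result matches the 1,842-token Flash example shown in the CLI output.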

Count Gemini tokens in the Gemini app.

The tokcount browser extension shows live token counts as you type in the Gemini chat interface — including the remaining context budget from Gemini's 1,000,000-token window.

CLI — tokcount
  • handles 1M-token docs natively
  • two-tier Pro pricing auto-calculated
  • multi-model comparison (--all)
  • pipe files and stdin
Extension — live in Gemini app
  • inline count as you type
  • % of 1M context used
  • cost estimate per message
  • Chrome + Firefox + Edge

Gemini tokenization — FAQ.

How does Gemini's tokenizer compare to GPT-4o?
Gemini uses a SentencePiece-based vocabulary, while GPT-4o uses a byte-level BPE (byte pair encoding) vocabulary. For plain English text, they tokenize at roughly the same rate (about 1 token per 0.75 words, or 4 tokens for every 3 words). For code, non-Latin scripts (CJK, Arabic), and special characters, the counts can differ by 10–20%. Gemini tends to tokenize Python and JavaScript code slightly more efficiently.
What is the maximum context window for Gemini?
Gemini 2.5 Pro and Gemini 2.5 Flash both support a 1,000,000-token (1M) context window — approximately 750,000 English words or 5–6 full-length novels. This is the largest context window of any commercially available model as of mid-2026. Gemini 2.5 Flash Lite also supports 1M tokens.
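To make the 1M-token budget concrete, the sketch below computes how much of the window a prompt uses and converts a word count into an approximate token count. It relies only on the figures quoted in this FAQ (a 1,000,000-token window and roughly 0.75 English words per token); the function names are hypothetical, and the word-based estimate is a rule of thumb, not a tokenizer.

```python
CONTEXT_WINDOW = 1_000_000  # tokens, as stated in the answer above
WORDS_PER_TOKEN = 0.75      # rough English average quoted in this FAQ


def context_used_pct(tokens: int) -> float:
    """Percentage of the 1M-token window that a prompt occupies."""
    return 100 * tokens / CONTEXT_WINDOW


def estimate_tokens_from_words(word_count: int) -> int:
    """Rule-of-thumb token estimate for plain English text."""
    return round(word_count / WORDS_PER_TOKEN)


print(f"{context_used_pct(284_512):.1f}% of 1M context")  # 28.5% of 1M context
print(estimate_tokens_from_words(750_000))                # 1000000
```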
Why does Gemini 2.5 Pro have two pricing tiers?
Google charges $1.25/1M input tokens for prompts up to 200,000 tokens, and $2.50/1M for prompts above 200,000 tokens. This tiered pricing applies per-request based on context length. tokcount automatically applies the correct tier when your input exceeds 200K tokens and shows you the cost breakdown.
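The tier selection described above can be sketched in a few lines of Python. This is an illustration, not tokcount's code: the constant and function names are hypothetical, and it assumes the higher rate applies to the whole prompt once it exceeds 200K tokens, which is consistent with the 284,512-token CLI example ($0.7113 input).

```python
PRO_TIER_THRESHOLD = 200_000  # prompt tokens
PRO_INPUT_RATE_LOW = 1.25     # USD per 1M tokens, prompts up to 200K
PRO_INPUT_RATE_HIGH = 2.50    # USD per 1M tokens, prompts above 200K


def pro_input_cost(prompt_tokens: int) -> float:
    """Input cost in USD for one Gemini 2.5 Pro request.

    Assumption: the whole prompt is billed at the higher rate once it
    crosses the threshold, not just the tokens above 200K.
    """
    if prompt_tokens > PRO_TIER_THRESHOLD:
        rate = PRO_INPUT_RATE_HIGH
    else:
        rate = PRO_INPUT_RATE_LOW
    return prompt_tokens * rate / 1_000_000


print(f"${pro_input_cost(284_512):.4f}")  # $0.7113
print(f"${pro_input_cost(200_000):.4f}")  # $0.2500
```

Because the tier is chosen per request, splitting a large prompt into requests of 200K tokens or fewer would keep each one at the lower rate under this model.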
Can I count Gemini tokens without an API key?
Yes. tokcount uses a local SentencePiece approximation for Gemini models, so it needs no API key, no Google account, and no network request. For exact counts in a production system, use the countTokens method in the Google AI SDK, which calls the API. tokcount is accurate to within 1–3% for typical text, which is sufficient for cost planning.
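For the exact-count path, a minimal sketch in Python might look like the following. It assumes the `google-genai` package, where the equivalent of countTokens is `client.models.count_tokens`; the helper names `exact_token_count` and `relative_error` are hypothetical, and the call needs an API key and a network connection.

```python
def exact_token_count(client, model: str, text: str) -> int:
    """Ask the Gemini API for the exact token count of `text`.

    `client` is assumed to be a google.genai.Client; this performs a
    network request, unlike a local approximation.
    """
    response = client.models.count_tokens(model=model, contents=text)
    return response.total_tokens


def relative_error(estimate: int, exact: int) -> float:
    """Fractional difference between a local estimate and the API count."""
    return abs(estimate - exact) / exact


# Usage (requires `pip install google-genai` and a GEMINI_API_KEY):
#
#   import os
#   from google import genai
#
#   client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
#   exact = exact_token_count(client, "gemini-2.5-flash", "Hello, Gemini!")
#   print(exact)
```

Comparing a local estimate against this count with `relative_error` is one way to check whether the 1–3% figure holds for your own text.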
How do thinking tokens work in Gemini Flash?
Gemini 2.5 Flash and Pro can use "thinking": extended reasoning before the final response. Thinking tokens are billed at $3.50/1M for Flash (higher than standard output). tokcount counts the visible text you give it; thinking tokens are generated inside the model at response time and are not part of that text, so they are not included in tokcount's count. Be aware that thinking mode can add significant cost beyond the visible output token count.
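To see how much thinking can add, the sketch below combines the two Flash rates quoted on this page. It assumes visible output is billed at $0.60/1M and thinking tokens separately at $3.50/1M, as described above; the function name and the 8,000-token thinking figure are hypothetical, chosen only for illustration.

```python
FLASH_OUTPUT_RATE = 0.60    # USD per 1M visible output tokens
FLASH_THINKING_RATE = 3.50  # USD per 1M thinking tokens


def flash_output_cost(visible_tokens: int, thinking_tokens: int) -> float:
    """Output-side cost in USD, assuming the two rates are billed separately."""
    return (
        visible_tokens * FLASH_OUTPUT_RATE
        + thinking_tokens * FLASH_THINKING_RATE
    ) / 1_000_000


# Same 1,842 visible tokens, with and without a hypothetical 8,000 thinking tokens.
print(f"${flash_output_cost(1_842, 0):.4f}")      # $0.0011
print(f"${flash_output_cost(1_842, 8_000):.4f}")  # $0.0291
```

In this example the thinking tokens account for most of the output cost, even though none of them appear in the visible response.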