Gemini Pricing Calculator (3.1 Pro, 3.5 Flash, Gemma)
| Model | Provider | Input (USD / 1M tokens) | Output (USD / 1M tokens) | Context (tokens) |
|---|---|---|---|---|
| Gemini 3.1 Pro | Google | $1.25 | $10.00 | 1M |
| Gemini 3.5 Flash | Google | $0.30 | $2.50 | 1M |
| Gemini 2.5 Pro | Google | $1.25 | $10.00 | 2M |
| Gemini 2.5 Flash | Google | $0.30 | $2.50 | 1M |
| Gemma 4 31B | Google | $0.05 | $0.15 | 256K |
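
The per-request arithmetic behind the table can be sketched in a few lines of Python. This is a minimal illustration, not an official calculator: the dictionary keys are hypothetical labels chosen for this example (not confirmed API model IDs), and the rates are simply copied from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    input_per_m: float   # USD per 1M input tokens
    output_per_m: float  # USD per 1M output tokens


# Rates copied from the table above. The keys are hypothetical labels
# for this sketch, not official API model identifiers.
PRICES = {
    "gemini-3.1-pro": Price(1.25, 10.00),
    "gemini-3.5-flash": Price(0.30, 2.50),
    "gemini-2.5-pro": Price(1.25, 10.00),
    "gemini-2.5-flash": Price(0.30, 2.50),
    "gemma-4-31b": Price(0.05, 0.15),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the list-price cost in USD of a single request."""
    price = PRICES[model]
    return (
        input_tokens * price.input_per_m
        + output_tokens * price.output_per_m
    ) / 1_000_000


if __name__ == "__main__":
    # Compare every model on the same workload: 200K tokens in, 5K tokens out.
    for name in PRICES:
        print(f"{name}: ${request_cost(name, 200_000, 5_000):.4f}")
```

Multiplying the result by an expected monthly request count gives a rough bill estimate at list price, before any caching or batch discounts.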
Which Gemini model should you use?
Gemini 3.5 Flash is the best-value model in the lineup: it offers a 1M-token context window and multimodal input for $0.30 per 1M input tokens. Gemini 3.1 Pro is the reasoning tier, priced at parity with GPT-5.5. Gemma 4 31B is open-weight and the cheapest option for workloads you can self-host.
Gemini vs GPT-5 vs Claude pricing
On the frontier tier, Gemini 3.1 Pro matches GPT-5.5 exactly and undercuts Claude Sonnet 4.6 by 2.4x. On the fast tier, Gemini 3.5 Flash is cheaper than GPT-5.4 mini and far cheaper than Claude Haiku, and Gemini is the only one of the three families with a 1M-token context window across its entire price range.
How to lower your Gemini bill
(1) Use context caching for stable prompts: cached tokens are 75% off. (2) Move long-context RAG to Flash; it handles 1M tokens at roughly one-quarter the price of Pro. (3) Batch non-realtime traffic: Google's Batch mode is 50% off. (4) For pure-text classification, drop to Gemma 4; $0.05 per 1M input tokens is hard to beat.
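
To see how the caching and batch discounts change a bill, the sketch below applies the two percentages quoted above to a single request. It rests on assumptions that the text does not confirm: that the 75% discount applies only to the cached portion of the input, that the 50% batch discount applies to the whole request, and that the two discounts stack. The function and constant names are hypothetical, and cache storage fees are ignored.

```python
CACHE_DISCOUNT = 0.75  # "75% off cached tokens", as quoted above
BATCH_DISCOUNT = 0.50  # "Batch mode is 50% off", as quoted above


def discounted_cost(
    input_tokens: int,
    output_tokens: int,
    input_per_m: float,
    output_per_m: float,
    cached_tokens: int = 0,
    batch: bool = False,
) -> float:
    """Estimate the USD cost of one request after discounts.

    Assumptions (not confirmed by the pricing text): the cache discount
    applies only to cached input tokens, the batch discount applies to
    the whole request, and the two discounts stack.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")

    fresh_tokens = input_tokens - cached_tokens
    # Cached tokens are billed at 25% of the normal input rate.
    billable_input = fresh_tokens + cached_tokens * (1 - CACHE_DISCOUNT)
    total = (billable_input * input_per_m + output_tokens * output_per_m) / 1_000_000

    if batch:
        total *= 1 - BATCH_DISCOUNT
    return total


if __name__ == "__main__":
    # Gemini 3.1 Pro rates from the table: $1.25 in, $10.00 out per 1M tokens.
    # 100K input tokens (80K of them cached) and 2K output tokens.
    base = discounted_cost(100_000, 2_000, 1.25, 10.00)
    cached = discounted_cost(100_000, 2_000, 1.25, 10.00, cached_tokens=80_000)
    both = discounted_cost(100_000, 2_000, 1.25, 10.00, cached_tokens=80_000, batch=True)
    print(f"list price: ${base:.4f}, cached: ${cached:.4f}, cached + batch: ${both:.4f}")
```

Under these assumptions, a request with a large stable prefix gets most of its input savings from caching, while batching halves whatever remains.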