GPT-5 Pricing Explained: Every Tier in 2026

OpenAI's GPT-5 family is now four tiers deep, and the pricing spread is wider than any prior generation. The cheapest tier (GPT-5.4 nano) is 25× cheaper on input than GPT-5.5 and 250× cheaper than legacy GPT-4. If you're still running everything on a single tier, you're leaving real money on the table.

This guide is the full GPT-5 pricing breakdown: every list rate, the cached-input math, batch API savings, and what each tier actually costs on real workloads. Cross-check the numbers in our cost calculator or jump to the OpenAI pricing calculator.

The GPT-5 pricing table (mid-2026)

USD per 1M tokens, list rate. Cached input is 50% off list. Batch API is 50% off both input and output for non-realtime jobs.

Model	Input	Cached input	Output	Context
GPT-5.5 (xhigh / high / medium / low)	$1.25	$0.625	$10.00	922K
GPT-5.3 Codex	$1.25	$0.625	$10.00	400K
GPT-5.4 mini	$0.25	$0.125	$2.00	400K
GPT-5.4 nano	$0.05	$0.025	$0.40	400K

Source: OpenAI pricing page, June 2026. Live data in our OpenAI pricing calculator.

What each tier actually costs on real workloads

List prices are abstract. Here's what 1,000 typical interactions cost on each GPT-5 tier at list rate, with no caching or batch discount applied:

Workload	GPT-5.5	5.4 mini	5.4 nano
Chat (1.5K in / 400 out)	$5.88	$1.18	$0.24
RAG (8K in / 600 out)	$16.00	$3.20	$0.64
Agent loop (12K in / 2K out)	$35.00	$7.00	$1.40
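
The figures in the table follow directly from the list rates, and the same arithmetic works for your own token counts. Here is a minimal sketch, assuming list pricing with no caching or batch discount. The `PRICES` dict, the model keys and the `cost_per_1k_interactions` helper are illustrative names, not part of any OpenAI SDK.

```python
# List rates from the pricing table, USD per 1M tokens: (input, output).
# The dictionary keys are illustrative labels, not official API model IDs.
PRICES = {
    "gpt-5.5": (1.25, 10.00),
    "gpt-5.4-mini": (0.25, 2.00),
    "gpt-5.4-nano": (0.05, 0.40),
}


def cost_per_1k_interactions(model: str, tokens_in: int, tokens_out: int) -> float:
    """USD cost of 1,000 interactions at list rate (no caching, no batch)."""
    price_in, price_out = PRICES[model]
    per_call = (tokens_in * price_in + tokens_out * price_out) / 1_000_000
    return per_call * 1_000


if __name__ == "__main__":
    workloads = {
        "Chat": (1_500, 400),
        "RAG": (8_000, 600),
        "Agent loop": (12_000, 2_000),
    }
    for name, (t_in, t_out) in workloads.items():
        costs = {m: cost_per_1k_interactions(m, t_in, t_out) for m in PRICES}
        print(name, {m: round(c, 3) for m, c in costs.items()})
```

Swap in your own average input and output token counts to see where a workload lands. The chat row on nano works out to $0.235, which the table rounds to $0.24.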

Which GPT-5 tier should you actually use?

GPT-5.5 — premium reasoning

Use the xhigh or high variants for hard reasoning, code generation, or anything where a quality miss costs more than $10 of engineering time. The "low" variant is the speed-optimized version: same price, lower latency, slightly lower scores on hard reasoning benchmarks.

GPT-5.3 Codex — code specialist

Tuned for coding agents and IDE integrations. Same price as GPT-5.5 but with better FIM (fill-in-middle) support and tool-use reliability. Default choice if your product writes code.

GPT-5.4 mini — the workhorse

The price/intelligence sweet spot. 5× cheaper than GPT-5.5 with ~85% of the reasoning quality on most production tasks. Most chat, RAG and routing workloads should default here.

GPT-5.4 nano — bulk operations

At $0.05/$0.40 per 1M, nano is the cheapest serious OpenAI model. Use for classification, extraction, structured outputs, or as the first hop in a cascading router. Quality is well below mini, so test on your evals before deploying.

The three pricing levers

1. Prompt caching (50% off input)

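If your system prompt repeats across users, caching halves the input cost of the repeated portion. For a 4K-token system prompt sent once by each of 100K users (400M prompt tokens in total), caching saves $0.625 per 1M cached tokens on GPT-5.5, or about $250 overall, assuming every request hits the cache. Always put static content at the top of the prompt, because caching matches on the prompt prefix.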
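
The caching arithmetic is easy to check for your own prompt sizes. The sketch below assumes the 50% cached-input discount from the pricing table and, optimistically, a 100% cache hit rate by default. `caching_savings` and its parameters are hypothetical names used for illustration only.

```python
def caching_savings(
    prompt_tokens: int,
    requests: int,
    input_price: float,
    cached_discount: float = 0.5,
    hit_rate: float = 1.0,
) -> float:
    """USD saved on a shared prompt prefix across `requests` calls.

    input_price is the list input rate in USD per 1M tokens.
    hit_rate is the fraction of requests served from cache (assumed, not measured).
    """
    cached_tokens = prompt_tokens * requests * hit_rate
    return cached_tokens / 1_000_000 * input_price * cached_discount


if __name__ == "__main__":
    # 4K-token system prompt, one request from each of 100K users.
    print("GPT-5.5:", caching_savings(4_000, 100_000, input_price=1.25))
    print("GPT-5.4 mini:", caching_savings(4_000, 100_000, input_price=0.25))
```

Real hit rates are usually below 100%, so treat the result as an upper bound and lower `hit_rate` once you have measurements from your own traffic.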

2. Batch API (50% off both directions)

Non-realtime workloads (evals, backfills, classification jobs) get 50% off input and output in exchange for a 24-hour completion window. If the caching and batch discounts stack, batch on GPT-5.4 mini drops the effective cost to $0.0625 / $1.00 per 1M.
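
The combined figure can be reproduced with a short helper. This sketch assumes the caching and batch discounts stack multiplicatively, which is what the $0.0625 figure implies. Confirm against your actual invoice before relying on it. `effective_rates` is an illustrative name, not an API call.

```python
from typing import Tuple


def effective_rates(
    input_price: float,
    output_price: float,
    cached: bool = False,
    batch: bool = False,
) -> Tuple[float, float]:
    """Effective (input, output) USD per 1M tokens after discounts.

    Assumes caching applies 50% to input only, batch applies 50% to both,
    and the two discounts multiply when combined.
    """
    cache_mult = 0.5 if cached else 1.0
    batch_mult = 0.5 if batch else 1.0
    return input_price * cache_mult * batch_mult, output_price * batch_mult


if __name__ == "__main__":
    # GPT-5.4 mini at list rate: $0.25 input, $2.00 output.
    print(effective_rates(0.25, 2.00, cached=True, batch=True))
```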

3. Model cascading

Send requests to GPT-5.4 nano first and escalate to mini or 5.5 only when nano returns low confidence, aiming for roughly 80% of traffic to be answered by nano. Real-world bill cuts: 60–80% with negligible quality loss. See the cost-cutting playbook for the routing logic.
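
A cascade can be sketched in a few lines. The version below is a minimal illustration, not the playbook's routing logic. It assumes each tier is wrapped in a callable that returns an answer plus a confidence score in [0, 1]. How you derive that score (token log-probabilities, a self-rating, a validator) is up to you. `cascade`, `blended_cost` and the 0.8 threshold are all hypothetical.

```python
from typing import Callable, Sequence, Tuple

# A tier is (name, call); call(prompt) returns (answer, confidence in [0, 1]).
Tier = Tuple[str, Callable[[str], Tuple[str, float]]]


def cascade(prompt: str, tiers: Sequence[Tier], threshold: float = 0.8) -> Tuple[str, str]:
    """Try tiers cheapest-first; return (tier_name, answer) from the first confident one."""
    if not tiers:
        raise ValueError("at least one tier is required")
    for name, call in tiers[:-1]:
        answer, confidence = call(prompt)
        if confidence >= threshold:
            return name, answer
    # The last tier is the fallback: accept its answer regardless of confidence.
    name, call = tiers[-1]
    answer, _ = call(prompt)
    return name, answer


def blended_cost(first_hop_cost: float, escalation_cost: float, escalation_rate: float) -> float:
    """Expected cost per unit of traffic: every request pays the first hop,
    and the escalated fraction additionally pays the larger model."""
    return first_hop_cost + escalation_rate * escalation_cost


if __name__ == "__main__":
    # Chat workload, USD per 1,000 interactions at list rate, 20% escalation.
    nano, mini, full = 0.235, 1.175, 5.875
    print("nano -> mini:", blended_cost(nano, mini, 0.2), "vs all-mini:", mini)
    print("nano -> 5.5:", blended_cost(nano, full, 0.2), "vs all-5.5:", full)
```

With the chat workload and a 20% escalation rate, the blended cost is about 60% below running everything on mini and about 76% below running everything on GPT-5.5, which lines up with the 60–80% range. Note that escalated requests pay for both hops, so savings shrink quickly if the escalation rate climbs.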

GPT-5 vs the competition

At $1.25/$10 per 1M, GPT-5.5 is now cheaper than Claude Sonnet 4.6 ($3/$15) and matches Gemini 3.1 Pro exactly on price. The closest cost-per-quality competitor is DeepSeek V4 Pro at $0.27/$1.10, roughly 4.6× cheaper on input and 9× cheaper on output, but with a measurable quality gap on hard reasoning. See the full breakdown in Claude vs GPT cost comparison or the 2026 LLM price comparison.
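
Because input and output rates differ so much between providers, it helps to compute the two ratios separately rather than quoting a single multiplier. Here is a small sketch using only the rates quoted in this section. `price_ratio` is an illustrative helper.

```python
from typing import Tuple

Rate = Tuple[float, float]  # (input, output) in USD per 1M tokens


def price_ratio(model: Rate, competitor: Rate) -> Tuple[float, float]:
    """How many times more expensive `model` is than `competitor` on (input, output).

    Values below 1.0 mean `model` is the cheaper of the two.
    """
    return model[0] / competitor[0], model[1] / competitor[1]


if __name__ == "__main__":
    gpt_5_5 = (1.25, 10.00)
    rivals = {
        "Claude Sonnet 4.6": (3.00, 15.00),
        "DeepSeek V4 Pro": (0.27, 1.10),
    }
    for name, rate in rivals.items():
        r_in, r_out = price_ratio(gpt_5_5, rate)
        print(f"GPT-5.5 vs {name}: {r_in:.2f}x input, {r_out:.2f}x output")
```

Which ratio matters more depends on your traffic shape: input-heavy RAG workloads track the input ratio, while long-generation agent loops track the output ratio.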

The bottom line

The GPT-5 lineup is more like a cost menu than a single product. Default to GPT-5.4 mini, escalate to 5.5 for hard reasoning, drop to nano for bulk. Add caching and batch wherever you can. Run the numbers for your own traffic in the cost calculator, then sanity-check against other providers on the comparison page.