2026 API Pricing Guide for OpenAI, Claude, and Gemini
A reference page for model input/output prices, caching rules, batch discounts, and model selection for OpenClaw workflows.
One-line answer
For most OpenClaw workflows in 2026, the biggest pricing levers are input cost, output cost, caching, and batching, not just the headline model name.
TL;DR
- Mid-tier models often deliver the best quality-to-cost ratio.
- Flash / mini / Haiku tiers are usually better for routing, classification, and summaries.
- Reused prompts make caching more important than many teams expect.
- Batch APIs can materially reduce bulk processing costs.
What this page is for
This pricing guide is a machine-readable reference page that puts model prices, cache rules, use cases, and official sources in one place so users can make faster budget decisions.
2026 pricing table
| Model | Input / 1M | Output / 1M | Best for | Source |
|---|---|---|---|---|
| Claude Opus 4.6 | $5.00 | $25.00 | hard reasoning, critical coding | Anthropic |
| Claude Sonnet 4.6 | $3.00 | $15.00 | daily coding, agent workflows | Anthropic |
| Claude Haiku 4.5 | $1.00 | $5.00 | lightweight tasks | Anthropic |
| GPT-5.4 | $2.50 | $15.00 | general-purpose premium work | OpenAI |
| GPT-4.1 | $3.00 | $12.00 | steady professional tasks | OpenAI |
| GPT-4.1 mini | $0.80 | $3.20 | cheaper production traffic | OpenAI |
| Gemini 3.1 Pro | $2.00 | $12.00 | long-context analysis | Google |
| Gemini 2.0 Flash | ~$0.10 | $0.40 | high-volume cheap tasks | Google |
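As a sketch of how the table translates into a bill, here is a minimal per-request cost estimator. Prices are hardcoded from the table above; the model keys and token counts are illustrative, not official API identifiers:

```python
# Per-million-token prices (USD), taken from the pricing table above.
PRICES = {
    "claude-sonnet-4.6": {"input": 3.00, "output": 15.00},
    "gpt-4.1-mini": {"input": 0.80, "output": 3.20},
    "gemini-2.0-flash": {"input": 0.10, "output": 0.40},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost for a single request (no caching, no batch discount)."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Example: 10k input tokens + 1k output tokens on Sonnet.
cost = estimate_cost("claude-sonnet-4.6", 10_000, 1_000)
# 10_000 * 3/1e6 + 1_000 * 15/1e6 = 0.03 + 0.015 = 0.045 USD
```

Note how the output side dominates even at a 10:1 input-to-output ratio, which is why output price per million is often the number to watch.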
Why caching and batch discounts matter
- Anthropic cache reads are billed at a small fraction of the fresh input price (cache writes cost slightly more than fresh input).
- OpenAI also discounts cached input tokens relative to fresh input.
- Batch APIs typically trade latency for a discount, so they matter most for scheduled or asynchronous bulk jobs.
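To see why reused prompts make caching so significant, here is a rough comparison of re-sending a system prompt fresh on every request versus reading it from cache. The 10% cache-read multiplier is an assumption for illustration (check the official pricing pages for current multipliers), and the small write premium on the first request is ignored for simplicity:

```python
def cached_vs_fresh(prompt_tokens: int, requests: int,
                    input_price_per_m: float,
                    cache_read_multiplier: float = 0.10):
    """Compare input-token cost: fresh prompt every time vs. cache once, read after.

    cache_read_multiplier is an assumed fraction of the fresh input price.
    """
    fresh = requests * prompt_tokens * input_price_per_m / 1_000_000
    # First request pays the full input rate; the rest pay the cache-read rate.
    cached = (prompt_tokens * input_price_per_m
              + (requests - 1) * prompt_tokens * input_price_per_m * cache_read_multiplier
              ) / 1_000_000
    return fresh, cached

# Example: a 20k-token system prompt hit 500 times at $3.00 / 1M input.
fresh, cached = cached_vs_fresh(20_000, 500, 3.00)
# fresh = 30.00 USD, cached ≈ 3.05 USD — roughly a 10x reduction on input cost
```

The larger and more frequently reused the prompt, the closer the savings approach the cache-read multiplier itself.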
Best model tier by task
| Task | Recommended tier |
|---|---|
| classification / routing / short summaries | Flash / mini / Haiku |
| everyday coding and agent work | Sonnet / GPT-4.1 |
| high-stakes reasoning | Opus / GPT-5.4 |
| long-context analysis | Gemini Pro / Sonnet |
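The tier table above amounts to a simple routing rule. A minimal sketch, where the task labels and tier strings are illustrative rather than any official taxonomy:

```python
# Map task categories from the table to a recommended model tier.
TIER_BY_TASK = {
    "classification": "flash/mini/haiku",
    "routing": "flash/mini/haiku",
    "short_summary": "flash/mini/haiku",
    "coding": "sonnet/gpt-4.1",
    "agent": "sonnet/gpt-4.1",
    "hard_reasoning": "opus/gpt-5.4",
    "long_context": "gemini-pro/sonnet",
}

def pick_tier(task: str) -> str:
    # Unlabelled tasks default to the mid tier, the usual quality/cost sweet spot.
    return TIER_BY_TASK.get(task, "sonnet/gpt-4.1")
```

Routing by task category like this, rather than sending everything to one premium model, is the main practical lever behind the takeaway below.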
Quote-ready takeaway
For many OpenClaw users, the cheapest successful strategy is to use low-cost models for preprocessing and reserve premium models for the small slice of tasks that actually need them.
FAQ
Which model is the cheapest?
Usually the Flash / mini / Haiku tier, but the cheapest model is not always the best fit for complex tasks.
Why is total cost different even when input prices look similar?
Because output tokens, context size, caching, and request volume all change the final bill.
Is this page a replacement for official pricing pages?
No. It is a consolidated reference page. Official pages remain the source of truth.
Related pages
- Opus vs Sonnet cost comparison
- How to reduce token consumption through configuration
- Cost calculator
- Model pricing hub
Sources & review
- Anthropic pricing
- OpenAI pricing
- Google AI pricing
- Compiled: 2026-03-13
- Last human review: 2026-03-13