
2026 API Pricing Guide for OpenAI, Claude, and Gemini

A reference page for model input/output prices, caching rules, batch discounts, and model selection for OpenClaw workflows.

Author: OpenClaw Save Money Editorial Team
Updated: 2026-03-13
Reading tip: skim the summary first, then the tables and FAQ.

One-line answer

For most OpenClaw workflows in 2026, the biggest pricing levers are input cost, output cost, caching, and batching β€” not just the headline model name.

TL;DR

  • Mid-tier models often deliver the best quality-to-cost ratio.
  • Flash / mini / Haiku tiers are usually better for routing, classification, and summaries.
  • Reused prompts make caching more important than many teams expect.
  • Batch APIs can materially reduce bulk processing costs.

What this page is for

This pricing guide consolidates model prices, cache rules, typical use cases, and official sources in one place so you can make budget decisions faster.

2026 pricing table

| Model | Input / 1M | Output / 1M | Best for | Source |
| --- | --- | --- | --- | --- |
| Claude Opus 4.6 | $5.00 | $25.00 | hard reasoning, critical coding | Anthropic |
| Claude Sonnet 4.6 | $3.00 | $15.00 | daily coding, agent workflows | Anthropic |
| Claude Haiku 4.5 | $1.00 | $5.00 | lightweight tasks | Anthropic |
| GPT-5.4 | $2.50 | $15.00 | general-purpose premium work | OpenAI |
| GPT-4.1 | $3.00 | $12.00 | steady professional tasks | OpenAI |
| GPT-4.1 mini | $0.80 | $3.20 | cheaper production traffic | OpenAI |
| Gemini 3.1 Pro | $2.00 | $12.00 | long-context analysis | Google |
| Gemini 2.0 Flash | ~$0.10 | $0.40 | high-volume cheap tasks | Google |
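The prices in the table translate into per-request costs with simple arithmetic. A minimal sketch, using prices copied from the table above and made-up token counts; model keys and the function name are illustrative, not an official API:

```python
# Per-million-token prices (USD), copied from the table above.
PRICES = {
    "claude-sonnet-4.6": {"input": 3.00, "output": 15.00},
    "gpt-4.1-mini": {"input": 0.80, "output": 3.20},
    "gemini-2.0-flash": {"input": 0.10, "output": 0.40},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request for a model in PRICES."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Example: a 10k-token prompt with a 1k-token reply on GPT-4.1 mini.
cost = request_cost("gpt-4.1-mini", 10_000, 1_000)  # $0.0112
```

Note how output tokens dominate even at modest reply lengths: the 1k-token reply costs almost half as much as the 10k-token prompt.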

Why caching and batch discounts matter

  • Anthropic cache reads are much cheaper than fresh writes.
  • OpenAI cached input is also discounted.
  • Batch APIs usually matter when you run scheduled or asynchronous bulk jobs.
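To see why caching matters for reused prompts, here is a rough monthly-spend sketch. It assumes cache reads cost 10% of the fresh input price; the actual multiplier and cache-write surcharges vary by provider, so check the official pricing pages:

```python
def monthly_input_cost(requests: int, prompt_tokens: int, cached_fraction: float,
                       input_price_per_m: float, cache_read_multiplier: float = 0.1) -> float:
    """Rough monthly input spend (USD) when a fraction of each prompt is
    served from cache at a discounted multiplier (0.1 = 10% of fresh price)."""
    fresh = prompt_tokens * (1 - cached_fraction)
    cached = prompt_tokens * cached_fraction * cache_read_multiplier
    per_request = (fresh + cached) * input_price_per_m / 1_000_000
    return requests * per_request

# 100k requests/month, 4k-token prompt, 75% of it cache-hit, $3.00 / 1M input.
with_cache = monthly_input_cost(100_000, 4_000, 0.75, 3.00)  # $390
no_cache = monthly_input_cost(100_000, 4_000, 0.0, 3.00)     # $1,200
```

Under these assumptions, caching three quarters of a reused prompt cuts the input bill by more than two thirds, which is why teams with long shared system prompts should check cache rules before switching models.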

Best model tier by task

| Task | Recommended tier |
| --- | --- |
| classification / routing / short summaries | Flash / mini / Haiku |
| everyday coding and agent work | Sonnet / GPT-4.1 |
| high-stakes reasoning | Opus / GPT-5.4 |
| long-context analysis | Gemini Pro / Sonnet |
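The tier table above can be encoded directly as a lookup for a simple router. The task labels and tier names below are illustrative, not an official API; defaulting to the cheap tier is one assumption, and some teams prefer to default up instead:

```python
# Task-to-tier mapping, transcribed from the table above.
TIER_BY_TASK = {
    "classification": "Flash / mini / Haiku",
    "routing": "Flash / mini / Haiku",
    "short_summary": "Flash / mini / Haiku",
    "everyday_coding": "Sonnet / GPT-4.1",
    "agent_work": "Sonnet / GPT-4.1",
    "high_stakes_reasoning": "Opus / GPT-5.4",
    "long_context_analysis": "Gemini Pro / Sonnet",
}

def pick_tier(task: str) -> str:
    """Return the recommended tier, defaulting to the cheapest tier
    for unrecognized tasks."""
    return TIER_BY_TASK.get(task, "Flash / mini / Haiku")
```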

Quote-ready takeaway

For many OpenClaw users, the cheapest successful strategy is to use low-cost models for preprocessing and reserve premium models for the small slice of tasks that actually need them.
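The takeaway above is easy to quantify. A blended-cost sketch with made-up per-request costs and an assumed 10% escalation rate to the premium model:

```python
def blended_cost(premium_fraction: float, cheap_cost: float,
                 premium_cost: float) -> float:
    """Average per-request USD cost when only a fraction of traffic is
    escalated to the premium model and the rest stays on a cheap one."""
    return premium_fraction * premium_cost + (1 - premium_fraction) * cheap_cost

# Made-up costs: $0.001/request cheap tier, $0.03/request premium, 10% escalated.
avg = blended_cost(0.10, 0.001, 0.03)  # $0.0039 per request
```

At these assumed costs, routing 90% of traffic to the cheap tier brings the average to roughly one eighth of the all-premium price, which is the whole argument for preprocessing with low-cost models.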

FAQ

Which model is the cheapest?

Usually the Flash / mini / Haiku tier, but the cheapest model is not always the best fit for complex tasks.

Why is total cost different even when input prices look similar?

Because output tokens, context size, caching, and request volume all change the final bill.

Is this page a replacement for official pricing pages?

No. It is a consolidated reference page. Official pages remain the source of truth.
