
2026 API Pricing Guide for OpenAI, Claude, and Gemini

A reference page for model input/output prices, caching rules, batch discounts, and model selection for OpenClaw workflows.

Author: OpenClaw Save Money Editorial Team
Updated: 2026-03-13
Reading tip: skim the summary first, then the tables and FAQ.

One-line answer

For most OpenClaw workflows in 2026, the biggest pricing levers are input cost, output cost, caching, and batching β€” not just the headline model name.

TL;DR

  • Mid-tier models often deliver the best quality-to-cost ratio.
  • Flash / mini / Haiku tiers are usually better for routing, classification, and summaries.
  • Reused prompts make caching more important than many teams expect.
  • Batch APIs can materially reduce bulk processing costs.

What this page is for

This pricing guide consolidates model prices, cache rules, typical use cases, and official sources in one place so you can make budget decisions faster.

2026 pricing table

| Model | Input / 1M | Output / 1M | Best for | Source |
| --- | --- | --- | --- | --- |
| Claude Opus 4.6 | $5.00 | $25.00 | hard reasoning, critical coding | Anthropic |
| Claude Sonnet 4.6 | $3.00 | $15.00 | daily coding, agent workflows | Anthropic |
| Claude Haiku 4.5 | $1.00 | $5.00 | lightweight tasks | Anthropic |
| GPT-5.4 | $2.50 | $15.00 | general-purpose premium work | OpenAI |
| GPT-4.1 | $3.00 | $12.00 | steady professional tasks | OpenAI |
| GPT-4.1 mini | $0.80 | $3.20 | cheaper production traffic | OpenAI |
| Gemini 3.1 Pro | $2.00 | $12.00 | long-context analysis | Google |
| Gemini 2.0 Flash | ~$0.10 | $0.40 | high-volume cheap tasks | Google |
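The prices in the table translate into per-request costs with simple arithmetic. A minimal sketch, using prices copied from the table above and made-up token counts; model keys and the function name are illustrative, not an official API:

```python
# Per-million-token prices (USD), copied from the table above.
PRICES = {
    "claude-sonnet-4.6": {"input": 3.00, "output": 15.00},
    "gpt-4.1-mini": {"input": 0.80, "output": 3.20},
    "gemini-2.0-flash": {"input": 0.10, "output": 0.40},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request for a model in PRICES."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Example: a 10k-token prompt with a 1k-token reply on GPT-4.1 mini.
cost = request_cost("gpt-4.1-mini", 10_000, 1_000)  # $0.0112
```

Note how output tokens dominate even at modest reply lengths: the 1k-token reply costs almost half as much as the 10k-token prompt.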

Why caching and batch discounts matter

  • Anthropic cache reads are much cheaper than fresh writes.
  • OpenAI cached input is also discounted.
  • Batch APIs usually matter when you run scheduled or asynchronous bulk jobs.
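To see why caching matters for reused prompts, here is a rough monthly-spend sketch. It assumes cache reads cost 10% of the fresh input price; the actual multiplier and cache-write surcharges vary by provider, so check the official pricing pages:

```python
def monthly_input_cost(requests: int, prompt_tokens: int, cached_fraction: float,
                       input_price_per_m: float, cache_read_multiplier: float = 0.1) -> float:
    """Rough monthly input spend (USD) when a fraction of each prompt is
    served from cache at a discounted multiplier (0.1 = 10% of fresh price)."""
    fresh = prompt_tokens * (1 - cached_fraction)
    cached = prompt_tokens * cached_fraction * cache_read_multiplier
    per_request = (fresh + cached) * input_price_per_m / 1_000_000
    return requests * per_request

# 100k requests/month, 4k-token prompt, 75% of it cache-hit, $3.00 / 1M input.
with_cache = monthly_input_cost(100_000, 4_000, 0.75, 3.00)  # $390
no_cache = monthly_input_cost(100_000, 4_000, 0.0, 3.00)     # $1,200
```

Under these assumptions, caching three quarters of a reused prompt cuts the input bill by more than two thirds, which is why teams with long shared system prompts should check cache rules before switching models.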

Best model tier by task

| Task | Recommended tier |
| --- | --- |
| classification / routing / short summaries | Flash / mini / Haiku |
| everyday coding and agent work | Sonnet / GPT-4.1 |
| high-stakes reasoning | Opus / GPT-5.4 |
| long-context analysis | Gemini Pro / Sonnet |
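The tier table above can be encoded directly as a lookup for a simple router. The task labels and tier names below are illustrative, not an official API; defaulting to the cheap tier is one assumption, and some teams prefer to default up instead:

```python
# Task-to-tier mapping, transcribed from the table above.
TIER_BY_TASK = {
    "classification": "Flash / mini / Haiku",
    "routing": "Flash / mini / Haiku",
    "short_summary": "Flash / mini / Haiku",
    "everyday_coding": "Sonnet / GPT-4.1",
    "agent_work": "Sonnet / GPT-4.1",
    "high_stakes_reasoning": "Opus / GPT-5.4",
    "long_context_analysis": "Gemini Pro / Sonnet",
}

def pick_tier(task: str) -> str:
    """Return the recommended tier, defaulting to the cheapest tier
    for unrecognized tasks."""
    return TIER_BY_TASK.get(task, "Flash / mini / Haiku")
```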

Quote-ready takeaway

For many OpenClaw users, the cheapest successful strategy is to use low-cost models for preprocessing and reserve premium models for the small slice of tasks that actually need them.
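The takeaway above is easy to quantify. A blended-cost sketch with made-up per-request costs and an assumed 10% escalation rate to the premium model:

```python
def blended_cost(premium_fraction: float, cheap_cost: float,
                 premium_cost: float) -> float:
    """Average per-request USD cost when only a fraction of traffic is
    escalated to the premium model and the rest stays on a cheap one."""
    return premium_fraction * premium_cost + (1 - premium_fraction) * cheap_cost

# Made-up costs: $0.001/request cheap tier, $0.03/request premium, 10% escalated.
avg = blended_cost(0.10, 0.001, 0.03)  # $0.0039 per request
```

At these assumed costs, routing 90% of traffic to the cheap tier brings the average to roughly one eighth of the all-premium price, which is the whole argument for preprocessing with low-cost models.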

FAQ

Which model is the cheapest?

Usually the Flash / mini / Haiku tier, but the cheapest model is not always the best fit for complex tasks.

Why is total cost different even when input prices look similar?

Because output tokens, context size, caching, and request volume all change the final bill.

Is this page a replacement for official pricing pages?

No. It is a consolidated reference page. Official pages remain the source of truth.
