LLM Cost per Call Estimator

Use total tokens and call count from a bill or log to derive average cost per API call, average token usage and total spend.

Inputs

Model

Total cached input tokens

Total uncached input tokens

Total output tokens

Call count

Advanced (rate)

⚠ This estimator uses the shared LLM API Pricing Table and its checked date. Provider prices may change before this site updates; official pricing pages are authoritative.

Exchange rate: 1 USD = ? CNY

Average cost per call

Metric	USD	CNY
Per call (avg)	—	—
Total	—	—

What it is

This estimator derives unit cost from total token usage. Paste the totals from a bill, log or usage report, and it computes the average cost per API call, the average input and output tokens per call, the cache hit rate and the total spend. Everything is computed locally in your browser.

FAQ

How is this different from the per-call tool?

The per-call tool starts from a single call’s usage and projects daily and monthly totals. This one starts from a period’s totals and derives the per-call average. Use this tool with bills and logs, and the per-call tool for planning.

How is the hit rate computed?

Hit rate = total cached input ÷ (total cached + total uncached input). Set cached to 0 if you don’t use caching.
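As a minimal sketch of this formula in Python (the function name is hypothetical, and returning 0 when there is no input at all is an assumption, not documented behavior of the tool):

```python
def cache_hit_rate(cached_tokens: int, uncached_tokens: int) -> float:
    """Share of input tokens served from cache, between 0.0 and 1.0."""
    total_input = cached_tokens + uncached_tokens
    if total_input == 0:
        # Assumption: report 0.0 instead of dividing by zero.
        return 0.0
    return cached_tokens / total_input


print(f"{cache_hit_rate(5_000_000, 25_000_000):.1%}")  # 16.7%
print(f"{cache_hit_rate(0, 25_000_000):.1%}")          # 0.0%
```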

What if the call count is wrong?

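Call count affects only the derived per-call averages; the total is computed from the total tokens. A larger count means smaller derived per-call usage and cost. For tiered-pricing models, the count can also change which tier is selected, because the tier is chosen from the derived per-call length.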
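The effect of the call count can be illustrated with a short Python sketch. It assumes a flat-rate model, so the total is already known; the function name and the 100 USD total are hypothetical values chosen for illustration.

```python
def per_call_averages(total_cost_usd: float, total_input_tokens: int,
                      total_output_tokens: int, calls: int) -> dict:
    """Derive per-call averages from period totals (flat-rate model assumed)."""
    if calls <= 0:
        raise ValueError("call count must be a positive integer")
    return {
        "total_usd": total_cost_usd,  # does not depend on the call count
        "cost_per_call_usd": total_cost_usd / calls,
        "input_per_call": total_input_tokens / calls,
        "output_per_call": total_output_tokens / calls,
    }


# Hypothetical period: 100 USD total, 30M input and 12M output tokens.
for calls in (10_000, 20_000):
    print(calls, per_call_averages(100.0, 30_000_000, 12_000_000, calls))
```

Doubling the count halves every per-call figure and leaves the total unchanged.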

Do tiered-pricing models work here?

Yes. The tool selects the tier from the derived per-call input/output length and shows the effective tier price.
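A rough Python sketch of tier selection from the derived per-call length follows. The tier boundary and prices are hypothetical and not taken from any provider's price list, and selecting on input length alone is a simplifying assumption; the tool's actual rules come from the shared pricing table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tier:
    max_input_per_call: Optional[int]  # None means no upper bound
    input_usd_per_mtok: float
    output_usd_per_mtok: float


# Hypothetical tiers for illustration only.
TIERS = (
    Tier(200_000, 3.00, 15.00),
    Tier(None, 6.00, 22.50),
)


def select_tier(total_input_tokens: int, calls: int, tiers=TIERS) -> Tier:
    """Pick the first tier whose bound covers the derived per-call input."""
    if calls <= 0:
        raise ValueError("call count must be a positive integer")
    input_per_call = total_input_tokens / calls
    for tier in tiers:
        if tier.max_input_per_call is None or input_per_call <= tier.max_input_per_call:
            return tier
    raise ValueError("no tier covers this per-call input length")


print(select_tier(30_000_000, 10_000).input_usd_per_mtok)  # 3,000 tokens per call -> 3.0
print(select_tier(30_000_000, 100).input_usd_per_mtok)     # 300,000 tokens per call -> 6.0
```

Because the tier depends on the derived per-call length, the same token totals can land in different tiers when the call count changes.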

Is my data uploaded?

No. Everything is computed locally in your browser; the page records none of the values or text you enter.

How to use

Three steps

1. Pick a model (grouped by provider, billing currency labeled).
2. Enter the period’s total cached input, total uncached input, total output tokens and call count.
3. The panel shows the average per-call cost and the total (USD & CNY), plus the derived per-call usage and tier price.

Example

A typical example

Take 10,000 calls in a day with 5M cached input, 25M uncached input and 12M output tokens. With Claude Sonnet 4.5 and a rate of 1 USD = 7.2 CNY, the tool derives about 3,000 input and 1,200 output tokens per call, a hit rate of about 16.7%, a cost of about ¥0.18 per call and a total of about ¥1,850 for the day.
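The arithmetic behind this example can be sketched in Python. The per-million-token rates below (3.00 USD uncached input, 0.30 USD cached input, 15.00 USD output) are assumptions chosen because they reproduce the figures above; the tool itself reads its rates from the shared LLM API Pricing Table, and the function name is hypothetical.

```python
MTOK = 1_000_000  # rates are quoted per million tokens


def estimate(cached: int, uncached: int, output: int, calls: int,
             usd_to_cny: float, input_rate: float, cached_rate: float,
             output_rate: float) -> dict:
    """Derive totals and per-call averages from a period's token totals."""
    if calls <= 0:
        raise ValueError("call count must be a positive integer")
    total_usd = (uncached * input_rate
                 + cached * cached_rate
                 + output * output_rate) / MTOK
    total_input = cached + uncached
    return {
        "input_per_call": total_input / calls,
        "output_per_call": output / calls,
        "hit_rate": cached / total_input if total_input else 0.0,
        "total_usd": total_usd,
        "total_cny": total_usd * usd_to_cny,
        "per_call_cny": total_usd * usd_to_cny / calls,
    }


# Assumed rates in USD per million tokens; check the official pricing page.
r = estimate(cached=5_000_000, uncached=25_000_000, output=12_000_000,
             calls=10_000, usd_to_cny=7.2,
             input_rate=3.00, cached_rate=0.30, output_rate=15.00)

print(f"{r['input_per_call']:.0f} input + {r['output_per_call']:.0f} output per call")
print(f"hit rate {r['hit_rate']:.1%}")
print(f"per call ¥{r['per_call_cny']:.2f}, total ¥{r['total_cny']:,.0f}")
```

Under these assumed rates the total is 256.50 USD, which at 7.2 converts to roughly ¥1,847, in line with the rounded figure in the example.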

Limits

Before you rely on this

The result is a per-call average and hides per-call variance. To start from a typical single call instead, use the per-call → totals tool.
Model rates come from the shared LLM API Pricing Table with official source links and checked dates. Visitors cannot edit site data; provider pricing pages are authoritative for real billing.
Hit rate = cached ÷ (cached + uncached) input; models with no cache price ignore the cached portion.

Last updated: 2026-06-14 · Formula: per-call input = (cached+uncached)/calls, hit rate = cached/(cached+uncached), then priced by the shared LLM API Pricing Table; official pages are authoritative.