LLM Cost per Call Estimator
Use total tokens and call count from a bill or log to derive average cost per API call, average token usage and total spend.
This estimator derives unit cost from total token usage: paste totals from a bill, log or usage report, and it computes the average cost per API call, average input and output tokens per call, cache hit rate and total spend. Everything is computed locally in your browser.
FAQ
How is this different from the per-call tool?
The per-call tool starts from one call’s usage and projects daily and monthly totals; this one starts from a period’s totals and derives the per-call average. Use this tool with bills and logs, and the per-call tool for planning.
How is the hit rate computed?
Hit rate = total cached input ÷ (total cached + total uncached input). Set cached to 0 if you don’t use caching.
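The hit-rate formula above can be sketched in a few lines of Python. The function name `cache_hit_rate` and the choice to return 0 when there is no input at all are assumptions made for illustration, not the tool's actual code.

```python
def cache_hit_rate(cached_input: int, uncached_input: int) -> float:
    """Share of input tokens served from cache, as a fraction in [0, 1]."""
    total_input = cached_input + uncached_input
    if total_input == 0:
        # Assumption: report 0 instead of dividing by zero.
        return 0.0
    return cached_input / total_input


print(f"{cache_hit_rate(5_000_000, 25_000_000):.1%}")  # 16.7%
print(f"{cache_hit_rate(0, 25_000_000):.1%}")          # 0.0% (no caching)
```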
What if the call count is wrong?
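Call count only affects the derived per-call averages, not the total, because the total is computed from total tokens. A larger count means smaller derived per-call usage. One caveat follows from how tiers are selected: for tiered-pricing models, a wrong count can shift the derived per-call length into a different tier and change the effective price.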
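To illustrate, here is a minimal sketch of how the per-call averages might be derived. The helper `per_call_averages` is a hypothetical name; the point is that the token totals stay fixed while only the averages move with the call count.

```python
def per_call_averages(total_input: int, total_output: int, calls: int) -> tuple[float, float]:
    """Average input and output tokens per call for a billing period."""
    if calls <= 0:
        raise ValueError("call count must be positive")
    return total_input / calls, total_output / calls


total_input = 30_000_000   # cached + uncached input tokens for the period
total_output = 12_000_000  # output tokens for the period

# Doubling the call count halves the derived averages; the totals are untouched.
for calls in (10_000, 20_000):
    avg_in, avg_out = per_call_averages(total_input, total_output, calls)
    print(calls, avg_in, avg_out)
```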
Do tiered-pricing models work here?
Yes. The tool selects the tier from the derived per-call input/output length and shows the effective tier price.
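One way tier selection could work is sketched below. The `Tier` class, the 200,000-token threshold and all prices are hypothetical values chosen for illustration, and the sketch keys only on input length, whereas the tool may also consider output length.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    max_input_tokens: float  # inclusive upper bound on per-call input length
    input_usd_per_m: float   # USD per million input tokens
    output_usd_per_m: float  # USD per million output tokens


# Hypothetical tier table, not real provider pricing.
TIERS = [
    Tier(200_000, 3.00, 15.00),
    Tier(float("inf"), 6.00, 22.50),
]


def select_tier(avg_input_per_call: float, tiers: list[Tier] = TIERS) -> Tier:
    """Return the first tier whose bound covers the derived per-call input length."""
    for tier in sorted(tiers, key=lambda t: t.max_input_tokens):
        if avg_input_per_call <= tier.max_input_tokens:
            return tier
    raise ValueError("no tier covers this input length")


# 30M input tokens over 10,000 calls gives 3,000 tokens per call: the first tier.
print(select_tier(30_000_000 / 10_000).input_usd_per_m)  # 3.0
print(select_tier(250_000).input_usd_per_m)              # 6.0
```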
Is my data uploaded?
No. Everything is computed locally in your browser; the page records none of the values or text you enter.
Three steps
- Pick a model (grouped by provider, billing currency labeled).
- Enter the period’s total cached input, total uncached input, total output tokens and call count.
- The panel shows the average per-call cost and the total (USD & CNY), plus the derived per-call usage and tier price.
A typical example
10,000 calls in a day: 5M cached input, 25M uncached input and 12M output tokens. With Claude Sonnet 4.5 at an exchange rate of 7.2 CNY per USD, the derived usage is ~3,000 input + 1,200 output tokens per call with a ~16.7% hit rate, which comes to about ¥0.18 per call and ~¥1,850 total for the day.
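The arithmetic behind this example can be checked with a short Python sketch. The per-million-token prices below are assumptions that are not stated on this page ($3.00 uncached input, $0.30 cached input, $15.00 output); they happen to reproduce the figures above, but the provider's pricing page remains the authority.

```python
# Assumed prices in USD per million tokens (not taken from this page).
INPUT_USD_PER_M = 3.00
CACHED_INPUT_USD_PER_M = 0.30
OUTPUT_USD_PER_M = 15.00
CNY_PER_USD = 7.2  # exchange rate used in the example

cached_input = 5_000_000
uncached_input = 25_000_000
output_tokens = 12_000_000
calls = 10_000

# The total comes from token totals alone; the call count is not involved.
total_usd = (
    uncached_input * INPUT_USD_PER_M
    + cached_input * CACHED_INPUT_USD_PER_M
    + output_tokens * OUTPUT_USD_PER_M
) / 1_000_000
total_cny = total_usd * CNY_PER_USD

# The per-call figure is the total spread evenly over the calls.
per_call_cny = total_cny / calls

print(f"total: {total_usd:,.2f} USD = {total_cny:,.2f} CNY")  # 256.50 USD = 1,846.80 CNY
print(f"per call: {per_call_cny:.4f} CNY")                    # 0.1847 CNY
```

Rounded, these match the ~¥1,850 total and ~¥0.18 per call quoted above.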
Before you rely on this
- The result is an average per call and hides per-call variance; for a typical single call use the per-call → totals tool.
- Model rates come from the shared LLM API Pricing Table with official source links and checked dates. Visitors cannot edit site data; provider pricing pages are authoritative for real billing.
- Hit rate = cached ÷ (cached + uncached) input; models with no cache price ignore the cached portion.