Every LLM provider charges per token — but the real monthly bill depends on far more than list price. Our calculator models your actual usage pattern: what percentage of tokens are input vs. output, how much of your input qualifies for prompt caching, and whether your workload can use batch APIs at a discount.
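As a rough illustration of how those three factors combine, here is a minimal sketch of a monthly estimate. The `Usage` and `Price` classes, their field names, and the 50% batch discount default are assumptions made for this example, not the calculator's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Hypothetical description of one month's traffic."""
    total_tokens: float        # input + output tokens per month
    input_share: float         # fraction of tokens that are input (0..1)
    cache_hit_rate: float      # fraction of input tokens read from cache (0..1)
    batch_share: float = 0.0   # fraction of traffic sent through a batch API (0..1)


@dataclass
class Price:
    """Hypothetical per-million-token prices in USD."""
    input_per_m: float
    output_per_m: float
    cached_input_per_m: float    # price for cache-read input tokens
    batch_discount: float = 0.5  # assumed discount on batched traffic


def monthly_cost(usage: Usage, price: Price) -> float:
    """Estimate one month's spend in USD."""
    input_tokens = usage.total_tokens * usage.input_share
    output_tokens = usage.total_tokens - input_tokens
    cached = input_tokens * usage.cache_hit_rate
    uncached = input_tokens - cached

    cost = (
        uncached / 1e6 * price.input_per_m
        + cached / 1e6 * price.cached_input_per_m
        + output_tokens / 1e6 * price.output_per_m
    )
    # Only the batched share of traffic receives the batch discount.
    return cost * (1 - usage.batch_share * price.batch_discount)
```

For example, `monthly_cost(Usage(1_000_000, 0.8, 0.5), Price(1.0, 2.0, 0.1))` splits 1M tokens into 800k input (half of it cached) and 200k output before pricing each part separately.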

We start with the workflow profile. A customer support chatbot sends mostly input tokens (system prompt + conversation history) with short responses, and the system prompt repeats across every request, so caching can cut input costs by 50% or more. A coding assistant, by contrast, generates proportionally more output and benefits less from caching but needs strong reasoning capability. Each of the eight workflow types maps to a specific input/output ratio and cache hit rate derived from real-world usage patterns.
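To see how a cache hit rate translates into input savings, consider the sketch below. The two profiles and the assumption that cached reads cost 10% of the normal input price are illustrative placeholders, not the calculator's real values.

```python
def input_savings(cache_hit_rate: float, cached_price_ratio: float) -> float:
    """Fraction of input spend saved by caching.

    cache_hit_rate: share of input tokens read from cache (0..1).
    cached_price_ratio: cached-read price divided by the normal input price.
    """
    return cache_hit_rate * (1 - cached_price_ratio)


# Illustrative profiles only; the real per-workflow numbers are not shown here.
PROFILES = {
    "support_chatbot": {"input_share": 0.85, "cache_hit_rate": 0.60},
    "coding_assistant": {"input_share": 0.60, "cache_hit_rate": 0.20},
}

for name, profile in PROFILES.items():
    saved = input_savings(profile["cache_hit_rate"], cached_price_ratio=0.10)
    print(f"{name}: {saved:.0%} saved on input costs")
```

Under these assumed numbers the chatbot profile saves a little over half of its input spend, while the coding profile saves under a fifth, which is the pattern the paragraph above describes.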

Next we apply your scale. Token costs are linear, but the absolute gap between models widens dramatically at volume: a $0.10/M model and a $15/M model differ by about $15 at 1M tokens/month, but by about $14,900 at 1B tokens/month. Our slider lets you see exactly where the crossover points are.
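The arithmetic behind that comparison is a single multiplication. A minimal sketch, using the two example prices above and a hypothetical helper name `monthly_gap`:

```python
def monthly_gap(cheap_per_m: float, pricey_per_m: float, tokens: float) -> float:
    """Difference in monthly spend (USD) between two per-million-token prices."""
    return (pricey_per_m - cheap_per_m) * tokens / 1e6


for tokens in (1e6, 1e8, 1e9):
    gap = monthly_gap(0.10, 15.0, tokens)
    print(f"{tokens:>15,.0f} tokens/month: ${gap:,.2f}")
```

Because the price difference per million tokens is fixed at $14.90, the monthly gap scales in direct proportion to volume.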

Finally, we score each of the 2003+ models across cost efficiency, feature fit (vision, function calling, long context), latency profile, and quality tier. The scoring weights shift depending on whether you chose Budget, Balanced, or Premium — so a budget search heavily penalizes expensive models while a premium search rewards capability over price.
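One way to picture priority-dependent scoring is a weighted sum over the four dimensions. The weights, the 0-to-1 sub-scores, and the two sample models below are invented for illustration; the calculator's actual weights and formula may differ.

```python
# Hypothetical weights per priority; each row sums to 1.0.
WEIGHTS = {
    "budget":   {"cost": 0.60, "features": 0.15, "latency": 0.10, "quality": 0.15},
    "balanced": {"cost": 0.35, "features": 0.25, "latency": 0.15, "quality": 0.25},
    "premium":  {"cost": 0.10, "features": 0.25, "latency": 0.15, "quality": 0.50},
}


def score(model_scores: dict[str, float], priority: str) -> float:
    """Weighted sum of per-dimension scores, each assumed to lie in 0..1."""
    weights = WEIGHTS[priority]
    return sum(weights[key] * model_scores[key] for key in weights)


# Made-up sub-scores: higher "cost" means cheaper (more cost-efficient).
cheap = {"cost": 0.95, "features": 0.60, "latency": 0.80, "quality": 0.50}
flagship = {"cost": 0.20, "features": 1.00, "latency": 0.50, "quality": 0.95}

for priority in WEIGHTS:
    print(priority, round(score(cheap, priority), 3), round(score(flagship, priority), 3))
```

With these sample numbers the cheap model ranks first under Budget and the flagship ranks first under Premium, which matches the behavior described above.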

Pricing data is sourced from LiteLLM's community-maintained model registry and refreshed daily. All costs shown are estimates — actual bills depend on your exact prompt lengths, caching behavior, and provider-specific billing details.

AI Cost Calculator

Top Recommendations for Customer Support Chatbot at 1.0M tokens/mo

How We Calculate AI Costs

Popular Cost Questions