💬

How Much Does an AI Customer Support Chatbot Cost? (2026 Breakdown)

A typical SaaS chatbot serving 30k conversations per month costs $45–$340 depending on the model.

Cost range: $45–$340/mo

Scenario

A B2B SaaS company runs a customer-facing chatbot on their support page. The bot answers product questions, walks users through workflows, and escalates to a human when stuck. Traffic averages 1,000 conversations per day with about 8 turns each. The system prompt is large (~500 tokens) because it includes product context, tone instructions, and escalation rules — but it repeats every turn, so prompt caching saves real money.

Assumption	Value
Conversations / day	1,000
Turns / conversation	8 (average)
System prompt	500 tokens (cached)
User input	~200 tokens / turn
Model output	~150 tokens / turn
Days / month	30

The system prompt fires on every turn → ~50% of input tokens hit the prompt cache on providers that support it (Claude, OpenAI, Gemini, DeepSeek).

Monthly cost across recommended models

Calculated at 168M input tokens + 36.0M output tokens, with 50% prompt cache hit rate.

Model	Input cost	Output cost	Cache savings	Total / mo
Deepseek Chat Cheapest	$47.04	$15.12	−$21.17	$40.99
Gpt 5 Mini	$42.00	$72.00	−$18.90	$95.10
Gemini 2.5 Flash	$50.40	$90.00	−$22.68	$118
Claude Haiku 4 5	$168	$180	−$75.60	$272

💡 Switching from Claude Haiku 4 5 to Deepseek Chat saves $231/month (85% reduction).

Why these models

Chatbots need low latency, decent reasoning, and high cache utilization. Mid-tier models hit the sweet spot. Avoid frontier models (GPT-5, Claude Opus) — 6-20× more expensive with marginal UX gain for casual support questions. Sample 2-5% of conversations through a frontier model for quality monitoring instead.

Key insights

1. Prompt caching cuts the bill by 30-50% on every provider that supports it. If your model choice does not support caching, you are paying 2× what you should.
2. Claude Haiku 4.5 and GPT-5 Mini deliver comparable chatbot quality at similar price points — pick whichever has lower latency in your customer region.
3. Gemini 2.5 Flash is the longest-context option here (1M tokens), useful if your system prompt grows beyond 50k tokens.
4. DeepSeek is the cheapest by far but routing to Chinese servers may be a compliance / latency concern for enterprise customers.

Cost at different scales

Scale	Deepseek Chat	Gpt 5 Mini	Gemini 2.5 Flash	Claude Haiku 4 5
100 users (10× smaller)	$4.10	$9.51	$11.77	$27.24
Baseline (1,000 conv/day)	$40.99	$95.10	$118	$272
10,000 conv/day (10× growth)	$410	$951	$1177	$2724
100,000 conv/day (enterprise)	$4099	$9510	$11.8k	$27.2k

Try your own scenario

The numbers above use our best-guess assumptions. For your actual workflow, use the interactive calculator to plug in your real token volumes and quality requirements.

All cost figures are estimates based on publicly-listed pricing as of the data refresh date. Verify with the provider's official pricing page before making business decisions. Embedding costs, vector database costs, and infrastructure costs are not included unless explicitly noted.