Glm 4.7 Flash

Provider: openrouter · openrouter/z-ai/glm-4.7-flash

Pricing per million tokens

Component	USD per 1M tokens	USD per 1k tokens
Input	$0.07	$0.000070
Output	$0.40	$0.000400
Cached input (read)	$0.00	$0.00
Cache write	$0.00	$0.00

💡 With prompt caching, you save up to 100% on cached input tokens — massive for repeated context like system prompts, RAG retrieval, or long conversations.

Monthly cost estimates

Assuming 30% prompt-cache hit rate where available. Adjust for your actual usage.

Small

1M in / 200k out

$0.13

per month

Saves $0.02 via caching

Medium

10M in / 2000k out

$1.29

per month

Saves $0.21 via caching

Large

100M in / 20000k out

$12.90

per month

Saves $2.10 via caching

Context

Input window: 200,000 tokens
Max output: 32,000 tokens

Capabilities

✅ Vision (image input)
✅ Function / tool calling
⬜ Prompt caching
⬜ Web search
⬜ JSON / response schema

Compare Glm 4.7 Flash with similar models

Mistral Small 3.1 24b Instruct

openrouter

$0.10 in / $0.30 out

Mistral Small 3.2 24b Instruct

Pricing data sourced from LiteLLM and refreshed regularly. Last updated May 20, 2026. Always verify with the provider's official pricing page before making business decisions.