GLM-4.5V

Provider: novita · novita/zai-org/glm-4.5v

Pricing per million tokens

Component	USD per 1M tokens	USD per 1k tokens
Input	$0.60	$0.000600
Output	$1.80	$0.001800
Cached input (read)	$0.11	$0.000110

💡 Cached input reads cost $0.11 instead of $0.60 per 1M tokens, a saving of about 82% on those tokens. This matters most for repeated context such as system prompts, RAG retrieval, or long conversations.
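As a quick check on the 82% figure, the discount can be derived from the two input rates in the pricing table above. This is a minimal sketch; the variable names are illustrative only.

```python
# Rates taken from the pricing table above (USD per 1M tokens).
INPUT_RATE = 0.60
CACHED_INPUT_RATE = 0.11

# Fraction saved on each token billed at the cached-read rate
# instead of the full input rate.
cache_discount = 1 - CACHED_INPUT_RATE / INPUT_RATE

print(f"Cached-read discount: {cache_discount:.1%}")  # prints 81.7%
```

The discount applies only to tokens that are actually read from the cache, so the overall saving on a bill depends on the cache hit rate.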

Monthly cost estimates

Estimates assume that 30% of input tokens are served from the prompt cache at the cached-read rate. Adjust for your actual usage.

Tier	Monthly volume	Estimated cost	Saved via caching
Small	1M in / 200k out	$0.81 per month	$0.15
Medium	10M in / 2M out	$8.13 per month	$1.47
Large	100M in / 20M out	$81.30 per month	$14.70
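The page does not state the formula behind these estimates. The sketch below assumes that the cache hit rate is the share of input tokens billed at the cached-read rate, with the remainder billed at the full input rate; under that assumption it reproduces the Small, Medium, and Large figures above. The function name `monthly_cost` and its parameters are hypothetical.

```python
# Rates from the pricing table (USD per 1M tokens).
INPUT_RATE = 0.60
OUTPUT_RATE = 1.80
CACHED_INPUT_RATE = 0.11


def monthly_cost(input_tokens_m, output_tokens_m, cache_hit_rate=0.30):
    """Return (estimated cost, savings from caching) in USD.

    Token volumes are given in millions. cache_hit_rate is the assumed
    share of input tokens billed at the cached-read rate.
    """
    cached = input_tokens_m * cache_hit_rate
    uncached = input_tokens_m - cached
    cost = (
        uncached * INPUT_RATE
        + cached * CACHED_INPUT_RATE
        + output_tokens_m * OUTPUT_RATE
    )
    # Savings relative to paying the full input rate for every input token.
    savings = cached * (INPUT_RATE - CACHED_INPUT_RATE)
    return cost, savings


# Volumes match the three tiers listed above (millions of tokens in, out).
for name, tokens_in, tokens_out in [("Small", 1, 0.2), ("Medium", 10, 2), ("Large", 100, 20)]:
    cost, savings = monthly_cost(tokens_in, tokens_out)
    print(f"{name}: ${cost:.2f} per month, saves ${savings:.2f} via caching")
```

To model your own workload, pass your monthly token volumes and a measured hit rate, for example `monthly_cost(5, 1, cache_hit_rate=0.5)`.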

Context

Input window: 65,536 tokens
Max output: 16,384 tokens

Capabilities

✅ Vision (image input)
✅ Function / tool calling
⬜ Prompt caching
⬜ Web search
✅ JSON / response schema

Compare GLM-4.5V with similar models

Qwen3-VL-235B-A22B-Thinking

Qwen3-VL-235B-A22B-Instruct

Qwen3-235B-A22B-Thinking-2507

novita

$0.30 in / $3.00 out

Pricing data sourced from LiteLLM and refreshed regularly. Last updated May 20, 2026. Always verify with the provider's official pricing page before making business decisions.