Llama 3.3 Nemotron Super 49B V1.5

Provider: deepinfra · deepinfra/nvidia/Llama-3.3-Nemotron-Super-49B-v1.5

Pricing per million tokens

Component	USD per 1M tokens	USD per 1k tokens
Input	$0.10	$0.000100
Output	$0.40	$0.000400

Monthly cost estimates

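Estimates assume a 30% prompt-cache hit rate where caching is available. This model does not list prompt caching as a capability, so the figures below use the full input price. Adjust for your actual usage.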

Tier	Monthly tokens	Estimated cost per month
Small	1M in / 200k out	$0.18
Medium	10M in / 2M out	$1.80
Large	100M in / 20M out	$18.00
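
The estimates above can be reproduced from the per-million prices with a short calculation. The sketch below is illustrative only: the function name `monthly_cost` and the constants are assumptions introduced here, not part of any provider SDK, and it applies no cache discount because this model does not list prompt caching.

```python
from decimal import Decimal

# Prices copied from the pricing table, in USD per 1M tokens.
INPUT_USD_PER_M = Decimal("0.10")
OUTPUT_USD_PER_M = Decimal("0.40")
MILLION = Decimal(1_000_000)


def monthly_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate monthly cost in USD at full price (no cache discount)."""
    input_cost = Decimal(input_tokens) / MILLION * INPUT_USD_PER_M
    output_cost = Decimal(output_tokens) / MILLION * OUTPUT_USD_PER_M
    # Round to whole cents for display.
    return (input_cost + output_cost).quantize(Decimal("0.01"))


# The three usage tiers listed on this page: (input tokens, output tokens).
tiers = {
    "Small": (1_000_000, 200_000),
    "Medium": (10_000_000, 2_000_000),
    "Large": (100_000_000, 20_000_000),
}

for name, (tokens_in, tokens_out) in tiers.items():
    print(f"{name}: ${monthly_cost(tokens_in, tokens_out)}")
# Small: $0.18
# Medium: $1.80
# Large: $18.00
```

Decimal arithmetic is used here instead of floats so that the cent-level totals match the listed figures exactly. Substitute your own token counts to estimate other workloads.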

Context

Input window: 131,072 tokens
Max output: 131,072 tokens

Capabilities

⬜ Vision (image input)
✅ Function / tool calling
⬜ Prompt caching
⬜ Web search
⬜ JSON / response schema

Compare Llama 3.3 Nemotron Super 49B V1.5 with similar models

$0.15 in / $0.40 out

Qwen2.5 VL 32B Instruct

$0.20 in / $0.60 out

Qwen3 235B A22B Instruct 2507

$0.09 in / $0.60 out

Qwen3 Next 80B A3B Instruct

$0.14 in / $1.40 out

Qwen3 Next 80B A3B Thinking

$0.14 in / $1.40 out

DeepSeek R1 Distill Llama 70B

$0.20 in / $0.60 out

Pricing data sourced from LiteLLM and refreshed regularly. Last updated May 20, 2026. Always verify with the provider's official pricing page before making business decisions.