AI Model Comparison

Compare pricing, benchmarks, and capabilities across 20 AI models

20 models tracked, 0 open source


Model	Provider	Input $/1M	Output $/1M	Context	Intelligence	Speed	Latency
Llama 3.1 Nemotron Ultra 253B v1 (Reasoning)	NVIDIA	—	—	—	15	41 tok/s	0.7s
Llama Nemotron Super 49B v1.5 (Reasoning)	NVIDIA	—	—	—	18.7	51 tok/s	0.3s
NVIDIA Nemotron 3 Nano 30B A3B (Reasoning)	NVIDIA	—	—	—	24.3	137 tok/s	1.2s
Llama 3.3 Nemotron Super 49B v1 (Reasoning)	NVIDIA	—	—	—	18.5	—	—
NVIDIA Nemotron Nano 12B v2 VL (Reasoning)	NVIDIA	—	—	—	14.9	—	—
NVIDIA Nemotron Nano 9B V2 (Non-reasoning)	NVIDIA	—	—	—	13.2	141 tok/s	0.5s
NVIDIA Nemotron Nano 9B V2 (Reasoning)	NVIDIA	—	—	—	14.8	125 tok/s	0.2s
Llama 3.3 Nemotron Super 49B v1 (Non-reasoning)	NVIDIA	—	—	—	14.3	—	—
Llama 3.1 Nemotron Instruct 70B	NVIDIA	—	—	—	13.4	284 tok/s	0.3s
Llama Nemotron Super 49B v1.5 (Non-reasoning)	NVIDIA	—	—	—	14.6	52 tok/s	0.3s
NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning)	NVIDIA	—	—	—	10.1	235 tok/s	0.7s
NVIDIA Nemotron 3 Nano 30B A3B (Non-reasoning)	NVIDIA	—	—	—	13.2	81 tok/s	0.3s
Llama 3.1 Nemotron Nano 4B v1.1 (Reasoning)	NVIDIA	—	—	—	14.4	—	—
Magpie Multilingual	NVIDIA	—	—	—	—	—	—
NVIDIA Nemotron 3 Super 120B A12B (Reasoning)	NVIDIA	—	—	—	36	159 tok/s	0.9s
Nemotron Cascade 2 30B A3B	NVIDIA	—	—	—	28.4	—	—
NVIDIA Nemotron 3 Nano 4B	NVIDIA	—	—	—	14.7	—	—
Nemotron 3 Nano Omni 30B A3B Reasoning	NVIDIA	—	—	—	21.4	307 tok/s	0.6s
Magpie-Multilingual 357M	NVIDIA	—	—	—	—	—	—
Magpie-Multilingual 357M (Feb 2026)	NVIDIA	—	—	—	—	—	—

Estimate Your Monthly Cost

Enter your expected usage to compare costs across models

Input tokens per month

e.g. 1,000,000 tokens ≈ 750,000 words

Output tokens per month

Usually 30–50% of input volume

Select models to compare


6 models selected

Model	Provider	Input Cost	Output Cost	Total/Month	vs Cheapest
Llama 3.1 Nemotron Ultra 253B v1 (Reasoning)	NVIDIA	—	—	—	—
Llama Nemotron Super 49B v1.5 (Reasoning)	NVIDIA	—	—	—	—
NVIDIA Nemotron 3 Nano 30B A3B (Reasoning)	NVIDIA	—	—	—	—
Llama 3.3 Nemotron Super 49B v1 (Reasoning)	NVIDIA	—	—	—	—
NVIDIA Nemotron Nano 12B v2 VL (Reasoning)	NVIDIA	—	—	—	—
NVIDIA Nemotron Nano 9B V2 (Non-reasoning)	NVIDIA	—	—	—	—

Prices are approximate and may vary. Check provider documentation for current pricing.
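
The monthly estimate above can be sketched in a few lines of Python. This is a minimal illustration, not the page's actual calculator: it assumes simple linear pricing (tokens divided by one million, times the listed $/1M rate), it assumes "vs Cheapest" means the dollar difference from the lowest total, and it skips any model whose price is not listed. The names ModelPrice, monthly_cost, compare, and words_to_tokens are hypothetical, and the prices in the usage note are made-up placeholders, since no prices are listed for the models on this page.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

TOKENS_PER_MILLION = 1_000_000
WORDS_PER_TOKEN = 0.75  # from the page's hint: 1,000,000 tokens is roughly 750,000 words


@dataclass(frozen=True)
class ModelPrice:
    """Hypothetical record: prices in dollars per 1M tokens, None if unlisted."""
    name: str
    input_per_million: Optional[float]
    output_per_million: Optional[float]


def words_to_tokens(words: int) -> int:
    """Rough token estimate from a word count, using the 0.75 words/token hint."""
    return round(words / WORDS_PER_TOKEN)


def monthly_cost(
    price: ModelPrice, input_tokens: int, output_tokens: int
) -> Optional[Tuple[float, float, float]]:
    """Return (input cost, output cost, total), or None when a price is missing."""
    if price.input_per_million is None or price.output_per_million is None:
        return None
    input_cost = input_tokens / TOKENS_PER_MILLION * price.input_per_million
    output_cost = output_tokens / TOKENS_PER_MILLION * price.output_per_million
    return input_cost, output_cost, input_cost + output_cost


def compare(
    prices: List[ModelPrice], input_tokens: int, output_tokens: int
) -> List[Tuple[str, float, float, float, float]]:
    """Rows of (name, input cost, output cost, total, vs cheapest), cheapest first.

    Assumption: "vs cheapest" is the dollar difference from the lowest total.
    """
    rows = []
    for price in prices:
        cost = monthly_cost(price, input_tokens, output_tokens)
        if cost is not None:  # unpriced models are left out of the comparison
            rows.append((price.name, *cost))
    if not rows:
        return []
    cheapest_total = min(row[3] for row in rows)
    rows.sort(key=lambda row: row[3])
    return [(*row, row[3] - cheapest_total) for row in rows]
```

For example, with placeholder prices of $1.00 input and $2.00 output per 1M tokens, 1,000,000 input tokens and 400,000 output tokens (40% of input, inside the 30–50% range suggested above) come to about $1.80 per month. A model with no listed price returns None from monthly_cost and is omitted by compare, which mirrors the dashes in the table.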