∞AI

AI Model Comparison

Compare pricing, benchmarks, and capabilities across 487 AI models

487 models tracked · 0 open source
Model | Provider | Input $/1M | Output $/1M | Context | Intelligence | Speed | Latency | API
DeepSeek R2
★
DeepSeek · $0.55 / $2.19 · 128K
91%
60 tok/s—
GPT-4.1
★
OpenAI · $2 / $8 · 1M
90.5%
80 tok/s—
Claude Opus 4.6
★
Anthropic · $15 / $75 · 200K
88.7%
60 tok/s—
GPT-4o
★
OpenAI · $5 / $15 · 128K
87.2%
120 tok/s—
Claude Sonnet 4.6
★
Anthropic · $3 / $15 · 200K
86.8%
100 tok/s—
o3
OpenAI · $10 / $40 · 200K
96.7%
40 tok/s—
o4-mini
OpenAI · $1.10 / $4.40 · 200K
93.4%
100 tok/s—
Gemini 3 Ultra
Google DeepMind · $7 / $21 · 1M
90.1%
70 tok/s—
Claude Opus 4.5 (Reasoning)
Anthropic———
49.7
72 tok/s11.7s
Gemini 3 Pro Preview (low)
Google———
41.3
——
Claude Opus 4.5 (Non-reasoning)
Anthropic———
43.1
63 tok/s1.1s
Gemini 3 Flash Preview (Reasoning)
Google———
46.4
195 tok/s5.9s
Claude 4.1 Opus (Reasoning)
Anthropic———
42
42 tok/s8.0s
MiniMax-M2.1
MiniMax———
39.4
59 tok/s2.4s
Claude 4.5 Sonnet (Reasoning)
Anthropic———
43
59 tok/s10.4s
Grok 3
xAI · $3 / $15 · 131K
87.5%
90 tok/s—
Grok 4
xAI———
41.5
64 tok/s7.4s
Gemini 3 Pro
Google DeepMind · $3.50 / $10.50 · 1M
87%
100 tok/s—
Claude 4 Opus (Reasoning)
Anthropic———
39
41 tok/s8.0s
GPT-5 (medium)
OpenAI———
42
95 tok/s40.4s
GPT-5 Codex (high)
OpenAI———
44.6
207 tok/s11.4s
Qwen3-Max
Alibaba Cloud · $0.40 / $1.20 · 32K
87%
90 tok/s—
GPT-5 (high)
OpenAI———
44.6
86 tok/s99.7s
GPT-5.1 (high)
OpenAI———
47.7
118 tok/s25.1s
GPT-5.2 (xhigh)
OpenAI———
51.3
72 tok/s81.3s
DeepSeek V3.2 (Reasoning)
DeepSeek———
41.7
29 tok/s1.4s
GPT-5 (low)
OpenAI———
39.2
75 tok/s10.3s
GLM-4.7 (Reasoning)
Z AI———
42.1
109 tok/s0.7s
GPT-5.2 (medium)
OpenAI———
46.6
——
Claude 4 Opus (Non-reasoning)
Anthropic———
33
37 tok/s1.4s
GPT-5.1 Codex (high)
OpenAI———
43.1
167 tok/s6.7s
Gemini 2.5 Pro
Google———
34.6
127 tok/s22.0s
Claude 4.5 Sonnet (Non-reasoning)
Anthropic———
37.1
56 tok/s1.2s
DeepSeek V3.2 Speciale
DeepSeek———
29.4
——
Gemini 2.5 Pro Preview (Mar '25)
Google———
30.3
——
DeepSeek V3.1 (Reasoning)
DeepSeek———
27.7
——
DeepSeek R1 0528 (May '25)
DeepSeek———
27.1
——
Kimi K2 Thinking
Kimi———
40.9
41 tok/s1.1s
Grok 4 Fast (Reasoning)
xAI———
35.1
216 tok/s3.4s
Cogito v2.1 (Reasoning)
Deep Cogito———
85%
57 tok/s0.5s
Grok 4.1 Fast (Reasoning)
xAI———
38.6
142 tok/s9.2s
DeepSeek V3.2 Exp (Reasoning)
DeepSeek———
32.9
30 tok/s1.4s
DeepSeek V3.1 Terminus (Reasoning)
DeepSeek———
33.9
——
Doubao Seed Code
ByteDance Seed———
33.5
——
GLM-4.5 (Reasoning)
Z AI———
26.4
38 tok/s0.9s
Claude 4 Sonnet (Reasoning)
Anthropic———
38.7
59 tok/s8.5s
Claude 4 Sonnet (Non-reasoning)
Anthropic———
33
52 tok/s0.8s
Claude 3.7 Sonnet (Reasoning)
Anthropic———
34.7
——
Gemini 2.5 Flash Preview (Sep '25) (Non-reasoning)
Google———
25.7
——
o1
OpenAI———
30.8
112 tok/s23.6s
Qwen3 VL 235B A22B (Reasoning)
Alibaba———
27.6
45 tok/s1.2s
Qwen3 Max (Preview)
Alibaba———
26.1
47 tok/s1.8s
Qwen3 235B A22B 2507 (Reasoning)
Alibaba———
29.5
51 tok/s1.3s
K-EXAONE (Reasoning)
LG AI Research———
32.1
——
GPT-5 mini (high)
OpenAI———
41.2
74 tok/s91.5s
MiMo-V2-Flash (Reasoning)
Xiaomi———
39.2
123 tok/s1.8s
Mistral Large
Mistral AI · $2 / $6 · 128K
84%
90 tok/s—
DeepSeek R1 (Jan '25)
DeepSeek———
18.8
——
DeepSeek V3.2 (Non-reasoning)
DeepSeek———
32.1
30 tok/s1.3s
DeepSeek V3.1 Terminus (Non-reasoning)
DeepSeek———
28.5
——
DeepSeek V3.2 Exp (Non-reasoning)
DeepSeek———
28.4
31 tok/s1.3s
Gemini 2.5 Flash Preview (Sep '25) (Reasoning)
Google———
31.1
——
Gemini 2.5 Pro Preview (May '25)
Google———
29.5
——
Grok 3 Mini
xAI · $0.30 / $0.50 · 131K
83%
160 tok/s—
ERNIE 5.0 Thinking Preview
Baidu———
29.1
——
GPT-5 mini (medium)
OpenAI———
38.9
77 tok/s20.0s
DeepSeek V3.1 (Non-reasoning)
DeepSeek———
28.1
——
Nova 2.0 Pro Preview (medium)
Amazon———
35.7
120 tok/s17.9s
Qwen3 235B A22B 2507 Instruct
Alibaba———
25
70 tok/s1.2s
Hermes 4 - Llama-3.1 405B (Reasoning)
Nous Research———
18.6
32 tok/s0.8s
GLM-4.6 (Reasoning)
Z AI———
32.5
36 tok/s0.9s
Llama 3.1 Nemotron Ultra 253B v1 (Reasoning)
NVIDIA———
15
42 tok/s0.7s
Grok 3 mini Reasoning (high)
xAI———
32.1
216 tok/s0.4s
Gemini 2.5 Flash (Reasoning)
Google———
27
205 tok/s13.3s
Qwen3 235B A22B (Reasoning)
Alibaba———
19.8
65 tok/s1.3s
Nova 2.0 Lite (high)
Amazon———
34.5
195 tok/s21.4s
Kimi K2
Kimi———
26.3
35 tok/s1.3s
INTELLECT-3
Prime Intellect———
22.2
——
Qwen3 VL 32B (Reasoning)
Alibaba———
24.7
97 tok/s1.4s
GPT-4o mini
OpenAI · $0.15 / $0.60 · 128K
82%
200 tok/s—
Ling-1T
InclusionAI———
19
——
Magistral Medium 1.2
Mistral———
27.1
95 tok/s0.4s
Gemini 3 Flash
Google DeepMind · $0.075 / $0.30 · 1M
82%
250 tok/s—
MiniMax M1 80k
MiniMax———
24.4
——
GPT-5 (ChatGPT)
OpenAI———
21.8
158 tok/s0.6s
MiniMax-M2
MiniMax———
36.1
61 tok/s2.2s
GLM-4.5-Air
Z AI———
23.2
65 tok/s1.3s
GPT-5.1 Codex mini (high)
OpenAI———
38.6
197 tok/s5.9s
EXAONE 4.0 32B (Reasoning)
LG AI Research———
16.7
——
Kimi K2 0905
Kimi———
30.9
22 tok/s2.1s
Qwen3 Next 80B A3B Instruct
Alibaba———
20.1
166 tok/s1.0s
Nova 2.0 Pro Preview (low)
Amazon———
31.9
143 tok/s6.8s
Seed-OSS-36B-Instruct
ByteDance Seed———
25.2
42 tok/s1.8s
Qwen3 VL 235B A22B Instruct
Alibaba———
20.8
57 tok/s1.2s
DeepSeek V3 0324
DeepSeek———
22.3
——
Qwen3 Max Thinking (Preview)
Alibaba———
32.5
43 tok/s1.8s
Qwen3 Next 80B A3B (Reasoning)
Alibaba———
26.7
164 tok/s1.1s
Hermes 4 - Llama-3.1 70B (Reasoning)
Nous Research———
16
62 tok/s0.6s
Mi:dm K 2.5 Pro Preview
Korea Telecom———
81%
——
GPT-5 (minimal)
OpenAI———
23.9
74 tok/s1.1s
Llama 4 Maverick
Meta———
18.4
115 tok/s0.6s
Qwen3 VL 30B A3B (Reasoning)
Alibaba———
19.7
127 tok/s1.0s
Qwen3 30B A3B 2507 (Reasoning)
Alibaba———
22.4
148 tok/s1.1s
gpt-oss-120B (high)
OpenAI———
33.3
215 tok/s0.5s
Gemini 2.5 Flash-Lite Preview (Sep '25) (Reasoning)
Google———
21.6
——
Mistral Large 3
Mistral———
22.8
56 tok/s0.6s
MiniMax M1 40k
MiniMax———
20.9
——
Nova 2.0 Lite (medium)
Amazon———
29.7
177 tok/s13.8s
Nova 2.0 Omni (medium)
Amazon———
28
——
Solar Pro 2 (Reasoning)
Upstage———
14.9
——
Gemini 2.5 Flash (Non-reasoning)
Google———
20.6
180 tok/s0.5s
Llama Nemotron Super 49B v1.5 (Reasoning)
NVIDIA———
18.7
60 tok/s0.3s
Gemini 2.0 Pro Experimental (Feb '25)
Google———
18.1
——
GPT-5.2 (Non-reasoning)
OpenAI———
33.6
63 tok/s0.8s
K-EXAONE (Non-reasoning)
LG AI Research———
23.4
——
Ring-1T
InclusionAI———
22.8
——
KAT-Coder-Pro V1
KwaiKAT———
36
112 tok/s1.0s
Mi:dm K 2.5 Pro
Korea Telecom———
23.1
——
Nova 2.0 Omni (low)
Amazon———
23.2
——
Qwen3 32B (Reasoning)
Alibaba———
16.5
103 tok/s1.1s
DeepSeek R1 Distill Llama 70B
DeepSeek———
16
41 tok/s0.5s
Motif-2-12.7B-Reasoning
Motif Technologies———
19.1
——
Gemini 2.5 Flash Preview (Reasoning)
Google———
24.3
——
GPT-4o (March 2025, chatgpt-4o-latest)
OpenAI———
18.6
——
o3-mini (high)
OpenAI———
25.2
149 tok/s27.7s
GLM-4.6V (Reasoning)
Z AI———
23.4
27 tok/s1.2s
Claude 3.7 Sonnet (Non-reasoning)
Anthropic———
30.8
——
Claude 4.5 Haiku (Non-reasoning)
Anthropic———
31.1
120 tok/s0.5s
Gemini 2.5 Flash-Lite Preview (Sep '25) (Non-reasoning)
Google———
19.4
——
Gemini 2.0 Flash Thinking Experimental (Jan '25)
Google———
19.6
——
GPT-5.1 (Non-reasoning)
OpenAI———
27.4
108 tok/s0.8s
Qwen3 Coder 480B A35B Instruct
Alibaba———
24.8
65 tok/s1.7s
NVIDIA Nemotron 3 Nano 30B A3B (Reasoning)
NVIDIA———
24.3
133 tok/s1.3s
Ring-flash-2.0
InclusionAI———
14
87 tok/s1.4s
GLM-4.7 (Non-reasoning)
Z AI———
34.2
106 tok/s0.7s
Llama 3.3 Nemotron Super 49B v1 (Reasoning)
NVIDIA———
18.5
——
Grok Code Fast 1
xAI———
28.7
185 tok/s5.4s
GLM-4.5V (Reasoning)
Z AI———
15.1
45 tok/s1.0s
Qwen3 VL 32B Instruct
Alibaba———
17.2
83 tok/s1.3s
Apriel-v1.6-15B-Thinker
ServiceNow———
27.6
——
Qwen3 Omni 30B A3B (Reasoning)
Alibaba———
15.6
93 tok/s1.0s
K2-V2 (high)
MBZUAI Institute of Foundation Models———
20.6
——
o3-mini
OpenAI———
25.9
151 tok/s8.1s
HyperCLOVA X SEED Think (32B)
Naver———
23.7
——
Nova 2.0 Lite (low)
Amazon———
24.6
210 tok/s5.1s
GPT-5 mini (minimal)
OpenAI———
20.7
96 tok/s1.1s
Command R+
Cohere · $2.50 / $10 · 128K
78%
80 tok/s—
Gemini 2.0 Flash (Feb '25)
Google———
18.5
——
Gemini 2.0 Flash (experimental)
Google———
16.8
——
GLM-4.6 (Non-reasoning)
Z AI———
30.2
67 tok/s0.9s
GPT-4.1 mini
OpenAI———
22.9
90 tok/s0.6s
GPT-5 nano (high)
OpenAI———
26.8
144 tok/s100.6s
gpt-oss-120B (low)
OpenAI———
24.5
218 tok/s0.5s
Ling-flash-2.0
InclusionAI———
15.7
94 tok/s1.5s
Qwen3 30B A3B 2507 Instruct
Alibaba———
15
92 tok/s1.3s
Qwen3 30B A3B (Reasoning)
Alibaba———
15.3
70 tok/s1.2s
Gemini 2.5 Flash Preview (Non-reasoning)
Google———
17.8
——
ERNIE 4.5 300B A47B
Baidu———
15
29 tok/s1.8s
Claude 3.5 Sonnet (Oct '24)
Anthropic———
15.9
——
GPT-5 nano (medium)
OpenAI———
25.9
145 tok/s50.0s
Apriel-v1.5-15B-Thinker
ServiceNow———
28.3
——
Magistral Small 1.2
Mistral———
18.2
188 tok/s0.4s
Solar Pro 2 (Preview) (Reasoning)
Upstage———
18.8
——
Nova 2.0 Pro Preview (Non-reasoning)
Amazon———
23.1
151 tok/s0.7s
EXAONE 4.0 32B (Non-reasoning)
LG AI Research———
11.7
——
Qwen3 14B (Reasoning)
Alibaba———
16.2
65 tok/s1.1s
GPT-4o (ChatGPT)
OpenAI———
14.1
——
Qwen3 235B A22B (Non-reasoning)
Alibaba———
17
63 tok/s1.2s
Claude 4.5 Haiku (Reasoning)
Anthropic———
37.1
156 tok/s10.0s
NVIDIA Nemotron Nano 12B v2 VL (Reasoning)
NVIDIA———
14.9
151 tok/s0.5s
Olmo 3.1 32B Think
Allen Institute for AI———
13.9
——
K2-V2 (medium)
MBZUAI Institute of Foundation Models———
18.7
——
Gemini 2.5 Flash-Lite (Reasoning)
Google———
17.6
295 tok/s12.3s
Mistral Medium 3
Mistral———
18.8
62 tok/s0.5s
Sonar Pro
Perplexity———
15.2
——
Qwen3 VL 30B A3B Instruct
Alibaba———
16.1
123 tok/s1.0s
Qwen2.5 Max
Alibaba———
16.3
46 tok/s1.1s
QwQ 32B
Alibaba———
19.7
33 tok/s0.4s
Devstral 2
Mistral———
22
79 tok/s0.5s
Olmo 3 32B Think
Allen Institute for AI———
12.1
——
Claude Haiku 4.5
Anthropic · $0.80 / $4 · 200K
75.2%
250 tok/s—
Magistral Medium 1
Mistral———
18.8
——
Claude 3.5 Sonnet (June '24)
Anthropic———
14.2
——
Magistral Small 1
Mistral———
16.8
——
Qwen3 VL 8B (Reasoning)
Alibaba———
16.7
135 tok/s1.1s
Solar Pro 2 (Non-reasoning)
Upstage———
13.6
——
Llama 4 Scout
Meta———
13.5
137 tok/s0.5s
GLM-4.6V (Non-reasoning)
Z AI———
17.1
23 tok/s5.9s
gpt-oss-20B (high)
OpenAI———
24.5
252 tok/s0.3s
GLM-4.5V (Non-reasoning)
Z AI———
12.7
39 tok/s29.9s
Gemini 1.5 Pro (Sep '24)
Google———
16
——
DeepSeek R1 Distill Qwen 14B
DeepSeek———
15.8
——
NVIDIA Nemotron Nano 9B V2 (Reasoning)
NVIDIA———
14.8
117 tok/s0.3s
GPT-4o (May '24)
OpenAI———
14.5
101 tok/s0.5s
DeepSeek R1 0528 Qwen3 8B
DeepSeek———
16.4
——
Nova 2.0 Lite (Non-reasoning)
Amazon———
18
182 tok/s0.8s
DeepSeek R1 Distill Qwen 32B
DeepSeek———
17.2
42 tok/s0.5s
Grok 4.1 Fast (Non-reasoning)
xAI———
23.6
131 tok/s0.4s
NVIDIA Nemotron Nano 9B V2 (Non-reasoning)
NVIDIA———
13.2
153 tok/s0.7s
MiMo-V2-Flash (Non-reasoning)
Xiaomi———
30.4
124 tok/s1.5s
o1-mini
OpenAI———
20.4
——
Qwen3 4B 2507 (Reasoning)
Alibaba———
18.2
——
Qwen3 8B (Reasoning)
Alibaba———
13.2
91 tok/s1.0s
Grok 4 Fast (Non-reasoning)
xAI———
23.1
196 tok/s0.4s
Hermes 4 - Llama-3.1 405B (Non-reasoning)
Nous Research———
17.6
32 tok/s0.9s
Nova Premier
Amazon———
19
70 tok/s1.2s
Llama 3.1 Instruct 405B
Meta———
17.4
31 tok/s0.7s
Qwen3 32B (Non-reasoning)
Alibaba———
14.5
102 tok/s1.2s
Qwen3 Omni 30B A3B Instruct
Alibaba———
10.7
106 tok/s1.1s
Falcon-H1R-7B
TII UAE———
15.8
——
Solar Pro 2 (Preview) (Non-reasoning)
Upstage———
16
——
Llama 3.1 Tulu3 405B
Allen Institute for AI———
14.1
——
Command R
Cohere · $0.15 / $0.60 · 128K
72%
150 tok/s—
Mistral Small
Mistral AI · $0.10 / $0.30 · 32K
72%
200 tok/s—
Gemini 3.1 Flash-Lite
Google DeepMind · $0.01 / $0.04 · 1M
72%
500 tok/s—
Nova 2.0 Omni (Non-reasoning)
Amazon———
16.6
227 tok/s0.9s
gpt-oss-20B (low)
OpenAI———
20.8
261 tok/s0.4s
Gemini 2.0 Flash-Lite (Feb '25)
Google———
14.7
——
Qwen2.5 Instruct 72B
Alibaba———
15.6
55 tok/s1.2s
Gemini 2.5 Flash-Lite (Non-reasoning)
Google———
12.7
260 tok/s0.4s
K2-V2 (low)
MBZUAI Institute of Foundation Models———
14.4
——
Grok 2 (Dec '24)
xAI———
13.9
——
Command A
Cohere———
13.5
40 tok/s0.6s
Qwen3 Coder 30B A3B Instruct
Alibaba———
20
113 tok/s1.4s
Qwen3 30B A3B (Non-reasoning)
Alibaba———
12.5
67 tok/s1.2s
Llama 3.3 Instruct 70B
Meta———
14.5
96 tok/s0.6s
Devstral Medium
Mistral———
18.7
145 tok/s0.5s
Llama 3.3 Nemotron Super 49B v1 (Non-reasoning)
NVIDIA———
14.3
——
Qwen2.5 Instruct 32B
Alibaba———
13.2
——
Sarvam M (Reasoning)
Sarvam———
8.4
——
Qwen3 4B (Reasoning)
Alibaba———
14.2
104 tok/s1.0s
Qwen3 VL 4B (Reasoning)
Alibaba———
13.7
——
Grok Beta
xAI———
13.3
——
Pixtral Large
Mistral———
14
51 tok/s0.5s
Claude 3 Opus
Anthropic———
18
——
Mistral Large 2 (Nov '24)
Mistral———
15.1
41 tok/s0.5s
Ministral 3 14B
Mistral———
16
99 tok/s0.3s
Sonar
Perplexity———
15.5
——
Nova Pro
Amazon———
13.5
——
Llama 3.1 Nemotron Instruct 70B
NVIDIA———
13.4
46 tok/s0.3s
GPT-4 Turbo
OpenAI———
13.7
32 tok/s1.2s
Qwen3 VL 8B Instruct
Alibaba———
14.3
148 tok/s0.9s
Llama Nemotron Super 49B v1.5 (Non-reasoning)
NVIDIA———
14.6
58 tok/s0.3s
Mistral Large 2 (Jul '24)
Mistral———
13
——
Mistral Medium 3.1
Mistral———
21.3
89 tok/s0.4s
Devstral Small 2
Mistral———
19.5
80 tok/s0.7s
Gemini 1.5 Flash (Sep '24)
Google———
13.8
——
Qwen3 14B (Non-reasoning)
Alibaba———
12.8
65 tok/s1.0s
Mistral Small 3.2
Mistral———
15.1
155 tok/s0.3s
Llama 3.1 Instruct 70B
Meta———
12.5
31 tok/s0.8s
Qwen3 4B 2507 Instruct
Alibaba———
12.9
——
Ling-mini-2.0
InclusionAI———
9.2
——
Reka Flash 3
Reka AI———
9.5
94 tok/s1.3s
Llama 3.2 Instruct 90B (Vision)
Meta———
11.9
42 tok/s0.5s
Mistral Small 3.1
Mistral———
14.5
153 tok/s0.5s
Gemini 1.5 Pro (May '24)
Google———
12
——
Hermes 4 - Llama-3.1 70B (Non-reasoning)
Nous Research———
12.6
63 tok/s0.6s
Olmo 3 7B Think
Allen Institute for AI———
9.4
——
GPT-4.1 nano
OpenAI———
13
200 tok/s0.4s
Mistral Small 3
Mistral———
12.7
154 tok/s0.5s
NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning)
NVIDIA———
10.1
175 tok/s0.7s
QwQ 32B-Preview
Alibaba———
15.2
43 tok/s0.5s
Qwen2.5 Coder Instruct 32B
Alibaba———
12.9
——
Qwen3 8B (Non-reasoning)
Alibaba———
10.6
94 tok/s0.9s
Ministral 3 8B
Mistral———
14.8
180 tok/s0.3s
Qwen2.5 Turbo
Alibaba———
12
68 tok/s1.2s
Devstral Small (May '25)
Mistral———
18
——
Qwen3 VL 4B Instruct
Alibaba———
9.6
——
Claude 3.5 Haiku
Anthropic———
18.7
——
Devstral Small (Jul '25)
Mistral———
15.2
202 tok/s0.4s
Granite 4.0 H Small
IBM———
10.8
453 tok/s8.7s
Qwen2 Instruct 72B
Alibaba———
11.7
——
Mistral Saba
Mistral———
12.1
——
Gemma 3 12B Instruct
Google———
8.8
30 tok/s10.2s
Exaone 4.0 1.2B (Reasoning)
LG AI Research———
8.3
——
Kimi Linear 48B A3B Instruct
Kimi———
14.4
——
Qwen3 4B (Non-reasoning)
Alibaba———
12.5
105 tok/s1.0s
Nova Lite
Amazon———
12.7
221 tok/s0.7s
DeepHermes 3 - Mistral 24B Preview (Non-reasoning)
Nous Research———
10.9
——
Jamba Reasoning 3B
AI21 Labs———
9.6
——
Jamba 1.7 Large
AI21 Labs———
10.9
49 tok/s1.1s
NVIDIA Nemotron 3 Nano 30B A3B (Non-reasoning)
NVIDIA———
13.2
78 tok/s0.3s
Claude 3 Sonnet
Anthropic———
10.3
——
Gemini 1.5 Flash-8B
Google———
11.1
——
Jamba 1.5 Large
AI21 Labs———
10.7
——
Hermes 3 - Llama-3.1 70B
Nous Research———
10.6
28 tok/s0.4s
Qwen3 1.7B (Reasoning)
Alibaba———
8
138 tok/s1.0s
Gemini 1.5 Flash (May '24)
Google———
10.5
——
Llama 3 Instruct 70B
Meta———
8.9
42 tok/s0.7s
Jamba 1.6 Large
AI21 Labs———
10.6
48 tok/s0.9s
GPT-5 nano (minimal)
OpenAI———
13.8
142 tok/s1.0s
Llama 3.1 Nemotron Nano 4B v1.1 (Reasoning)
NVIDIA———
14.4
——
Mixtral 8x22B Instruct
Mistral———
9.8
——
DeepSeek R1 Distill Llama 8B
DeepSeek———
12.1
——
Nova Micro
Amazon———
10.3
314 tok/s0.6s
Ministral 3 3B
Mistral———
11.2
307 tok/s0.3s
Olmo 3 7B Instruct
Allen Institute for AI———
8.2
——
OLMo 2 32B
Allen Institute for AI———
10.6
——
LFM2 8B A1B
Liquid AI———
7
——
Claude 2.1
Anthropic———
9.3
——
Exaone 4.0 1.2B (Non-reasoning)
LG AI Research———
8.1
——
Gemma 3n E4B Instruct
Google———
6.4
14 tok/s0.4s
Phi-4 Multimodal Instruct
Microsoft Azure———
10
16 tok/s0.4s
Mistral Medium
Mistral———
9
89 tok/s0.4s
Claude 2.0
Anthropic———
9.1
——
Llama 3.1 Instruct 8B
Meta———
11.8
170 tok/s0.4s
Gemma 3n E4B Instruct Preview (May '25)
Google———
10.1
——
Qwen2.5 Coder Instruct 7B
Alibaba———
10
——
Phi-4 Mini Instruct
Microsoft Azure———
8.4
44 tok/s0.3s
Granite 3.3 8B (Non-reasoning)
IBM———
7
427 tok/s7.3s
Llama 3.2 Instruct 11B (Vision)
Meta———
8.7
79 tok/s0.5s
GPT-3.5 Turbo
OpenAI———
9
89 tok/s0.5s
Granite 4.0 Micro
IBM———
7.7
——
Phi-3 Mini Instruct 3.8B
Microsoft Azure———
10.1
——
Claude Instant
Anthropic———
7.4
——
DeepSeek Coder V2 Lite Instruct
DeepSeek———
8.5
——
LFM 40B
Liquid AI———
8.8
——
Command-R+ (Apr '24)
Cohere———
8.3
——
Gemini 1.0 Pro
Google———
8.5
——
Mistral Small (Feb '24)
Mistral———
9
154 tok/s0.5s
Gemma 3 4B Instruct
Google———
6.3
30 tok/s1.1s
Qwen3 1.7B (Non-reasoning)
Alibaba———
6.8
141 tok/s0.9s
Llama 2 Chat 13B
Meta———
8.4
——
Llama 3 Instruct 8B
Meta———
6.4
82 tok/s0.5s
Llama 2 Chat 70B
Meta———
8.4
——
Jamba 1.7 Mini
AI21 Labs———
8.1
——
Mixtral 8x7B Instruct
Mistral———
7.7
——
Gemma 3n E2B Instruct
Google———
4.8
51 tok/s0.5s
Molmo 7B-D
Allen Institute for AI———
9.2
——
DeepHermes 3 - Llama-3.1 8B Preview (Non-reasoning)
Nous Research———
7.6
——
Jamba 1.6 Mini
AI21 Labs———
7.9
178 tok/s0.8s
Jamba 1.5 Mini
AI21 Labs———
8
——
Llama 3.2 Instruct 3B
Meta———
9.7
53 tok/s0.6s
Qwen3 0.6B (Reasoning)
Alibaba———
6.5
189 tok/s0.9s
Command-R (Mar '24)
Cohere———
7.4
——
Granite 4.0 1B
IBM———
7.3
——
OpenChat 3.5 (1210)
OpenChat———
8.3
——
LFM2 2.6B
Liquid AI———
8
——
OLMo 2 7B
Allen Institute for AI———
9.3
——
Granite 4.0 H 1B
IBM———
8
——
DeepSeek R1 Distill Qwen 1.5B
DeepSeek———
9.1
——
LFM2 1.2B
Liquid AI———
6.3
——
Mistral 7B Instruct
Mistral———
7.4
190 tok/s0.3s
Qwen3 0.6B (Non-reasoning)
Alibaba———
5.7
194 tok/s0.9s
Llama 3.2 Instruct 1B
Meta———
6.3
88 tok/s0.6s
Llama 2 Chat 7B
Meta———
9.7
108 tok/s12.6s
Gemma 3 1B Instruct
Google———
5.5
48 tok/s0.6s
Granite 4.0 H 350M
IBM———
5.4
——
Granite 4.0 350M
IBM———
6.1
——
Gemma 3 270M
Google———
7.7
——
GLM-5.1 (Reasoning)
Z AI———
51.4
43 tok/s1.2s
GLM 5V Turbo (Reasoning)
Z AI———
42.9
——
Tiny Aya Global
Cohere———
4.7
——
GLM-5 (Reasoning)
Z AI———
49.8
67 tok/s0.9s
Qwen3.5 397B A17B (Reasoning)
Alibaba———
45
52 tok/s1.5s
Qwen3.5 0.8B (Reasoning)
Alibaba———
10.5
——
Qwen3.5 2B (Non-reasoning)
Alibaba———
14.7
232 tok/s0.3s
Qwen3.5 0.8B (Non-reasoning)
Alibaba———
9.9
285 tok/s0.3s
Qwen3.5 4B (Non-reasoning)
Alibaba———
22.6
178 tok/s0.3s
o1-preview
OpenAI———
23.7
——
Qwen3 Coder Next
Alibaba———
28.3
165 tok/s0.8s
Qwen3.5 9B (Reasoning)
Alibaba———
32.4
56 tok/s0.4s
Qwen3.5 2B (Reasoning)
Alibaba———
16.3
——
Qwen3.5 35B A3B (Reasoning)
Alibaba———
37.1
149 tok/s1.2s
Qwen3.5 27B (Non-reasoning)
Alibaba———
37.2
92 tok/s1.4s
Qwen3.5 122B A10B (Reasoning)
Alibaba———
41.6
159 tok/s1.1s
Qwen3 Max Thinking
Alibaba———
39.9
36 tok/s1.7s
K2 Think V2
MBZUAI Institute of Foundation Models———
24.1
——
Sarvam 105B (high)
Sarvam———
18.2
124 tok/s1.2s
Sarvam 30B (high)
Sarvam———
12.3
294 tok/s1.2s
Qwen3.5 Omni Plus
Alibaba———
38.6
55 tok/s1.3s
Qwen3.5 4B (Reasoning)
Alibaba———
27.1
177 tok/s0.3s
MiMo-V2-Omni-0327
Xiaomi———
44.9
——
KAT Coder Pro V2
KwaiKAT———
43.8
114 tok/s1.8s
Qwen3.6 Plus
Alibaba———
50
53 tok/s1.6s
Qwen3.5 397B A17B (Non-reasoning)
Alibaba———
40.1
52 tok/s1.4s
Qwen3.5 122B A10B (Non-reasoning)
Alibaba———
35.9
152 tok/s1.1s
Qwen3.5 Omni Flash
Alibaba———
25.9
170 tok/s1.2s
Qwen3.5 27B (Reasoning)
Alibaba———
42.1
92 tok/s1.4s
MiMo-V2-Pro
Xiaomi———
49.2
67 tok/s2.1s
MiMo-V2-Omni
Xiaomi———
43.4
——
Qwen3.5 35B A3B (Non-reasoning)
Alibaba———
30.7
153 tok/s1.1s
MiMo-V2-Flash (Feb 2026)
Xiaomi———
41.5
127 tok/s1.5s
GPT-3.5 Turbo (0613)
OpenAI——————
NVIDIA Nemotron 3 Nano 4B
NVIDIA———
14.7
——
DeepSeek-V2.5
DeepSeek———
12.3
——
o3-pro
OpenAI———
40.7
19 tok/s95.4s
o1-pro
OpenAI———
25.8
——
Mercury 2
Inception———
32.8
872 tok/s4.7s
GPT-4o (Aug '24)
OpenAI———
18.6
108 tok/s0.6s
Molmo2-8B
Allen Institute for AI———
7.3
——
GPT-5.2 Codex (xhigh)
OpenAI———
49
107 tok/s7.4s
GPT-4o Realtime (Dec '24)
OpenAI——————
GPT-4
OpenAI———
12.8
35 tok/s0.8s
GPT-4o mini Realtime (Dec '24)
OpenAI——————
Step3 VL 10B
StepFun———
15.4
——
GPT-4.5 (Preview)
OpenAI———
20
——
Olmo 3.1 32B Instruct
Allen Institute for AI———
12.2
54 tok/s0.3s
Gemini 2.0 Flash-Lite (Preview)
Google———
14.5
——
Kimi K2.5 (Non-reasoning)
Kimi———
37.3
32 tok/s1.4s
Step 3.5 Flash 2603
StepFun———
38.5
186 tok/s0.8s
Step 3.5 Flash
StepFun———
37.8
163 tok/s0.8s
Nemotron Cascade 2 30B A3B
NVIDIA———
28.4
——
Llama 65B
Meta———
7.4
——
Kimi K2.5 (Reasoning)
Kimi———
46.8
32 tok/s1.3s
Gemini 1.0 Ultra
Google———
10.1
——
Gemini 2.0 Flash Thinking Experimental (Dec '24)
Google———
12.3
——
PALM-2
Google———
8.6
——
LFM2 24B A2B
Liquid AI———
10.5
163 tok/s0.3s
Solar Open 100B (Reasoning)
Upstage———
21.7
——
Ling 2.6 Flash
InclusionAI———
26.2
202 tok/s0.8s
Qwen3.6 Max Preview
Alibaba———
51.8
57 tok/s1.9s
Claude 3 Haiku
Anthropic———
12.3
131 tok/s0.5s
NVIDIA Nemotron 3 Super 120B A12B (Reasoning)
NVIDIA———
36
154 tok/s1.1s
MiniMax-M2.7
MiniMax———
49.6
47 tok/s1.6s
LFM2.5-1.2B-Thinking
Liquid AI———
8.1
——
LFM2.5-1.2B-Instruct
Liquid AI———
8
——
Solar Pro 3
Upstage———
25.9
——
LFM2.5-VL-1.6B
Liquid AI———
6.2
——
Claude 4.1 Opus (Non-reasoning)
Anthropic———
36
39 tok/s1.4s
Claude Opus 4.7 (Non-reasoning, High Effort)
Anthropic———
51.8
53 tok/s1.2s
DeepSeek-V2.5 (Dec '24)
DeepSeek———
12.5
——
DeepSeek-Coder-V2
DeepSeek———
10.6
——
DeepSeek LLM 67B Chat (V1)
DeepSeek———
8.4
——
Grok 4.20 0309 v2 (Reasoning)
xAI———
49.3
175 tok/s15.5s
Grok 4.20 0309 v2 (Non-reasoning)
xAI———
29
177 tok/s0.4s
R1 1776
Perplexity———
12
——
Grok 4.20 0309 (Non-reasoning)
xAI———
29.7
164 tok/s0.4s
GPT-5.4 mini (Non-Reasoning)
OpenAI———
23.3
176 tok/s0.6s
Sonar Reasoning
Perplexity———
17.9
——
Grok 4.20 0309 (Reasoning)
xAI———
48.5
183 tok/s16.1s
Grok 3 Reasoning Beta
xAI———
21.6
——
Solar Mini
Upstage———
11.9
87 tok/s1.4s
Codestral
Mistral AI · $0.30 / $0.90 · 32K · — · 180 tok/s · —
MiniMax-M2.5
MiniMax———
41.9
59 tok/s2.1s
Gemini 3.1 Pro Preview
Google———
57.2
124 tok/s28.7s
Sonar Reasoning Pro
Perplexity———
24.6
——
Reka Flash (Sep '24)
Reka AI———
12
85 tok/s1.3s
Claude Sonnet 4.6 (Non-reasoning, Low Effort)
Anthropic———
42.6
60 tok/s1.0s
GLM-4.7-Flash (Non-reasoning)
Z AI———
22.1
105 tok/s1.0s
GLM-4.7-Flash (Reasoning)
Z AI———
30.1
91 tok/s0.9s
Mistral Small 4 (Reasoning)
Mistral———
27.8
173 tok/s0.5s
Claude Sonnet 4.6 (Adaptive Reasoning, Max Effort)
Anthropic———
51.7
72 tok/s46.6s
Gemma 4 E2B (Non-reasoning)
Google———
12.1
——
Gemma 4 E4B (Non-reasoning)
Google———
14.8
——
Claude Opus 4.6 (Adaptive Reasoning, Max Effort)
Anthropic———
53
53 tok/s11.7s
Gemma 4 E2B (Reasoning)
Google———
15.2
——
Gemini 3.1 Flash-Lite Preview
Google———
33.5
319 tok/s5.7s
GPT-5.4 nano (xhigh)
OpenAI———
44
157 tok/s2.5s
Gemma 4 E4B (Reasoning)
Google———
18.8
——
GPT-5.4 mini (medium)
OpenAI———
37.7
181 tok/s6.3s
GPT-5.4 mini (xhigh)
OpenAI———
48.9
189 tok/s6.9s
Gemma 4 31B (Non-reasoning)
Google———
32.3
——
Qwen Chat 72B
Alibaba———
8.8
——
Grok-1
xAI———
11.7
——
Arctic Instruct
Snowflake———
8.8
——
Qwen1.5 Chat 110B
Alibaba———
9.5
——
Gemini 3 Deep Think
Google——————
Gemma 4 26B A4B (Non-reasoning)
Google———
27.1
——
Muse Spark
Meta———
52.1
——
Gemma 4 31B (Reasoning)
Google———
39.2
35 tok/s1.0s
GPT-5.4 nano (medium)
OpenAI———
38.1
158 tok/s3.8s
GPT-5.4 nano (Non-Reasoning)
OpenAI———
24.4
161 tok/s0.6s
Claude Opus 4.7 (Adaptive Reasoning, Max Effort)
Anthropic———
57.3
57 tok/s11.6s
GPT-5.4 (xhigh)
OpenAI———
56.8
81 tok/s157.8s
Mistral Small 4 (Non-reasoning)
Mistral———
18.6
149 tok/s0.5s
JT-MINI
China Mobile———
25.4
——
GLM-5.1 (Non-reasoning)
Z AI———
43.8
47 tok/s2.1s
GPT-5.4 (Non-reasoning)
OpenAI———
35.4
62 tok/s0.7s
Qwen3.5 9B (Non-reasoning)
Alibaba———
27.3
143 tok/s0.3s
GPT-5.4 Pro (xhigh)
OpenAI——————
Gemma 4 26B A4B (Reasoning)
Google———
31.2
——
Qwen Chat 14B
Alibaba———
7.4
——
GPT-5.3 Codex (xhigh)
OpenAI———
53.6
85 tok/s60.3s
DeepSeek-V2-Chat
DeepSeek———
9.1
——
Kimi K2.6
Kimi———
53.9
135 tok/s0.8s
Qwen3.6 35B A3B (Reasoning)
Alibaba———
43.5
238 tok/s1.7s
Qwen3.6 35B A3B (Non-reasoning)
Alibaba———
31.5
193 tok/s1.5s
Tri-21B-think Preview
Trillion Labs———
20
——
LongCat Flash Lite
LongCat———
23.9
115 tok/s3.9s
Nanbeige4.1-3B
Nanbeige———
16.1
——
Tri-21B-Think
Trillion Labs———
18.6
——
Apertus 70B Instruct
Swiss AI Initiative———
7.7
——
Apertus 8B Instruct
Swiss AI Initiative———
5.9
——
Trinity Large Thinking
Arcee AI———
31.9
127 tok/s0.6s
GLM-5 (Non-reasoning)
Z AI———
40.6
53 tok/s1.4s
GLM-5-Turbo
Z AI———
46.8
——
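The listing above is straightforward to work with programmatically. A minimal sketch that sorts a handful of the priced rows by input price and by benchmark score; the record layout and field names are my own, and the values are copied from the table above:

```python
# A few priced rows from the comparison table, modeled as records.
# Prices are USD per 1M tokens; "intelligence" is the benchmark % shown above.
models = [
    {"name": "DeepSeek R2", "input": 0.55, "output": 2.19, "context": "128K", "intelligence": 91.0},
    {"name": "GPT-4.1", "input": 2.00, "output": 8.00, "context": "1M", "intelligence": 90.5},
    {"name": "Claude Opus 4.6", "input": 15.00, "output": 75.00, "context": "200K", "intelligence": 88.7},
    {"name": "o3", "input": 10.00, "output": 40.00, "context": "200K", "intelligence": 96.7},
    {"name": "o4-mini", "input": 1.10, "output": 4.40, "context": "200K", "intelligence": 93.4},
]

# Cheapest first: ascending input price per 1M tokens.
cheapest_first = sorted(models, key=lambda m: m["input"])
# Best benchmark: descending intelligence score.
best_benchmark = sorted(models, key=lambda m: m["intelligence"], reverse=True)

print(cheapest_first[0]["name"])   # DeepSeek R2
print(best_benchmark[0]["name"])   # o3
```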

© 2026 ∞AI · everythingai.tech. Built for the AI community.

Estimate Your Monthly Cost

Enter your expected usage to compare costs across models

Input tokens per month: e.g. 1,000,000 ≈ 750,000 words

Output tokens per month: usually 30–50% of input volume

6 models selected (example usage: 1M input tokens, 500K output tokens per month)

Model | Provider | Input Cost | Output Cost | Total/Month | vs Cheapest
DeepSeek R2 | DeepSeek | $0.55 | $1.09 | $1.65 | ✓ Best value
GPT-4.1 | OpenAI | $2.00 | $4.00 | $6.00 | 3.6× more
Claude Sonnet 4.6 | Anthropic | $3.00 | $7.50 | $10.50 | 6.4× more
GPT-4o | OpenAI | $5.00 | $7.50 | $12.50 | 7.6× more
o3 | OpenAI | $10.00 | $20.00 | $30.00 | 18.2× more
Claude Opus 4.6 | Anthropic | $15.00 | $37.50 | $52.50 | 31.9× more

Prices are approximate and may vary. Check provider documentation for current pricing.
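The estimator's arithmetic is simple to reproduce: monthly cost = (input tokens / 1M) × input price + (output tokens / 1M) × output price. A minimal sketch; the prices come from the comparison table above, and the 1M-input / 500K-output volume is an assumption chosen to match the example totals:

```python
# USD per 1M tokens (input, output), taken from the comparison table above.
PRICES = {
    "DeepSeek R2": (0.55, 2.19),
    "GPT-4.1": (2.00, 8.00),
    "Claude Sonnet 4.6": (3.00, 15.00),
    "GPT-4o": (5.00, 15.00),
    "o3": (10.00, 40.00),
    "Claude Opus 4.6": (15.00, 75.00),
}

def monthly_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Monthly cost in USD for a given token volume."""
    inp, out = PRICES[model]
    return input_tokens / 1e6 * inp + output_tokens / 1e6 * out

# Assumed example volume: 1M input tokens, 500K output tokens per month.
costs = {m: monthly_cost(m, 1_000_000, 500_000) for m in PRICES}
cheapest = min(costs.values())
for model, cost in sorted(costs.items(), key=lambda kv: kv[1]):
    print(f"{model}: ${cost:.2f} ({cost / cheapest:.1f}x cheapest)")
```

Run as-is, this reproduces the table's ordering, with DeepSeek R2 cheapest at about $1.65/month and Claude Opus 4.6 at $52.50.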