∞AI

AI Model Comparison

Compare pricing, benchmarks, and capabilities across 596 AI models

596 models tracked, 0 open source
Model | Provider | Input $/1M | Output $/1M | Context | Intelligence | Speed | Latency
DeepSeek R2 ★ | DeepSeek | $0.55 | $2.19 | 128K | 91% | 60 tok/s | —
GPT-4.1 ★ | OpenAI | $2 | $8 | 1M | 90.5% | 80 tok/s | —
Claude Opus 4.6 ★ | Anthropic | $15 | $75 | 200K | 88.7% | 60 tok/s | —
GPT-4o ★ | OpenAI | $5 | $15 | 128K | 87.2% | 120 tok/s | —
Claude Sonnet 4.6 ★ | Anthropic | $3 | $15 | 200K | 86.8% | 100 tok/s | —
o3 | OpenAI | $10 | $40 | 200K | 96.7% | 40 tok/s | —
o4-mini | OpenAI | $1.1 | $4.4 | 200K | 93.4% | 100 tok/s | —
Gemini 3 Ultra | Google DeepMind | $7 | $21 | 1M | 90.1% | 70 tok/s | —
Claude Opus 4.5 (Reasoning)
Anthropic———
49.7
70 tok/s13.9s
Gemini 3 Pro Preview (low)
Google———
41.3
——
Claude Opus 4.5 (Non-reasoning)
Anthropic———
43.1
58 tok/s1.2s
Gemini 3 Flash Preview (Reasoning)
Google———
46.4
203 tok/s6.4s
Claude 4.5 Sonnet (Reasoning)
Anthropic———
43
46 tok/s11.3s
MiniMax-M2.1
MiniMax———
39.4
85 tok/s1.1s
Claude 4.1 Opus (Reasoning)
Anthropic———
42
36 tok/s9.3s
Grok 3 | xAI | $3 | $15 | 131K | 87.5% | 90 tok/s | —
Gemini 3 Pro | Google DeepMind | $3.5 | $10.5 | 1M | 87% | 100 tok/s | —
GPT-5.1 (high)
OpenAI———
47.7
168 tok/s21.6s
Claude 4 Opus (Reasoning)
Anthropic———
39
35 tok/s7.6s
Grok 4
xAI———
41.5
44 tok/s15.3s
GPT-5 (high)
OpenAI———
44.6
88 tok/s102.1s
GPT-5 Codex (high)
OpenAI———
44.6
178 tok/s5.6s
GPT-5.2 (xhigh)
OpenAI———
51.3
80 tok/s91.1s
GPT-5 (medium)
OpenAI———
42
84 tok/s45.4s
Qwen3-Max | Alibaba Cloud | $0.4 | $1.2 | 32K | 87% | 90 tok/s | —
DeepSeek V3.2 (Reasoning)
DeepSeek———
41.7
——
Claude 4 Opus (Non-reasoning)
Anthropic———
33
36 tok/s1.6s
GPT-5.2 (medium)
OpenAI———
46.6
——
Claude 4.5 Sonnet (Non-reasoning)
Anthropic———
37.1
46 tok/s0.9s
GPT-5.1 Codex (high)
OpenAI———
43.1
196 tok/s5.9s
GPT-5 (low)
OpenAI———
39.2
73 tok/s13.5s
GLM-4.7 (Reasoning)
Z AI———
42.1
106 tok/s0.7s
Gemini 2.5 Pro
Google———
34.6
135 tok/s20.0s
Gemini 2.5 Pro Preview (Mar' 25)
Google———
30.3
——
DeepSeek V3.2 Speciale
DeepSeek———
29.4
——
DeepSeek V3.1 Terminus (Reasoning)
DeepSeek———
33.9
——
DeepSeek R1 0528 (May '25)
DeepSeek———
27.1
——
Kimi K2 Thinking
Kimi———
40.9
107 tok/s0.9s
Grok 4.1 Fast (Reasoning)
xAI———
38.6
88 tok/s20.1s
DeepSeek V3.1 (Reasoning)
DeepSeek———
27.7
——
DeepSeek V3.2 Exp (Reasoning)
DeepSeek———
32.9
——
Grok 4 Fast (Reasoning)
xAI———
35.1
92 tok/s9.0s
Doubao Seed Code
ByteDance Seed———
33.5
——
Cogito v2.1 (Reasoning)
Deep Cogito———
85%
86 tok/s0.5s
o1
OpenAI———
30.8
108 tok/s20.4s
Gemini 2.5 Flash Preview (Sep '25) (Non-reasoning)
Google———
25.7
——
Qwen3 Max (Preview)
Alibaba———
26.1
45 tok/s1.7s
Qwen3 235B A22B 2507 (Reasoning)
Alibaba———
29.5
58 tok/s1.2s
DeepSeek V3.2 Exp (Non-reasoning)
DeepSeek———
28.4
——
Gemini 2.5 Flash Preview (Sep '25) (Reasoning)
Google———
31.1
——
GLM-4.5 (Reasoning)
Z AI———
26.4
42 tok/s1.2s
Claude 4 Sonnet (Reasoning)
Anthropic———
38.7
50 tok/s12.5s
Qwen3 VL 235B A22B (Reasoning)
Alibaba———
27.6
38 tok/s1.2s
GPT-5 mini (high)
OpenAI———
41.2
85 tok/s116.6s
DeepSeek V3.2 (Non-reasoning)
DeepSeek———
32.1
——
Mistral Large | Mistral AI | $2 | $6 | 128K | 84% | 90 tok/s | —
K-EXAONE (Reasoning)
LG AI Research———
32.1
——
DeepSeek R1 (Jan '25)
DeepSeek———
18.8
——
MiMo-V2-Flash (Reasoning)
Xiaomi———
39.2
140 tok/s1.6s
Claude 4 Sonnet (Non-reasoning)
Anthropic———
33
47 tok/s0.8s
Claude 3.7 Sonnet (Reasoning)
Anthropic———
34.7
——
DeepSeek V3.1 Terminus (Non-reasoning)
DeepSeek———
28.5
——
Gemini 2.5 Pro Preview (May' 25)
Google———
29.5
——
ERNIE 5.0 Thinking Preview
Baidu———
29.1
——
Qwen3 235B A22B 2507 Instruct
Alibaba———
25
66 tok/s1.1s
Gemini 2.5 Flash (Reasoning)
Google———
27
246 tok/s14.3s
Hermes 4 - Llama-3.1 405B (Reasoning)
Nous Research———
18.6
——
DeepSeek V3.1 (Non-reasoning)
DeepSeek———
28.1
——
Grok 3 Mini | xAI | $0.3 | $0.5 | 131K | 83% | 160 tok/s | —
GPT-5 mini (medium)
OpenAI———
38.9
98 tok/s20.5s
Grok 3 mini Reasoning (high)
xAI———
32.1
191 tok/s0.4s
Nova 2.0 Pro Preview (medium)
Amazon———
35.7
155 tok/s19.7s
Qwen3 235B A22B (Reasoning)
Alibaba———
19.8
61 tok/s1.2s
Llama 3.1 Nemotron Ultra 253B v1 (Reasoning)
NVIDIA———
15
41 tok/s0.7s
GLM-4.6 (Reasoning)
Z AI———
32.5
38 tok/s0.8s
EXAONE 4.0 32B (Reasoning)
LG AI Research———
16.7
——
Qwen3 Next 80B A3B (Reasoning)
Alibaba———
26.7
163 tok/s1.1s
Seed-OSS-36B-Instruct
ByteDance Seed———
25.2
37 tok/s1.7s
GPT-4o mini | OpenAI | $0.15 | $0.6 | 128K | 82% | 200 tok/s | —
INTELLECT-3
Prime Intellect———
22.2
——
Qwen3 VL 32B (Reasoning)
Alibaba———
24.7
96 tok/s1.3s
MiniMax-M2
MiniMax———
36.1
77 tok/s1.2s
Ling-1T
InclusionAI———
19
——
Nova 2.0 Pro Preview (low)
Amazon———
31.9
162 tok/s8.4s
Qwen3 Next 80B A3B Instruct
Alibaba———
20.1
166 tok/s1.1s
DeepSeek V3 0324
DeepSeek———
22.3
——
Gemini 3 Flash | Google DeepMind | $0.075 | $0.3 | 1M | 82% | 250 tok/s | —
GPT-5 (ChatGPT)
OpenAI———
21.8
178 tok/s0.6s
Nova 2.0 Lite (high)
Amazon———
34.5
153 tok/s18.2s
Kimi K2 0905
Kimi———
30.9
16 tok/s1.9s
GPT-5.1 Codex mini (high)
OpenAI———
38.6
198 tok/s4.3s
Qwen3 VL 235B A22B Instruct
Alibaba———
20.8
51 tok/s1.1s
Magistral Medium 1.2
Mistral———
27.1
44 tok/s0.5s
GLM-4.5-Air
Z AI———
23.2
58 tok/s1.5s
Kimi K2
Kimi———
26.3
35 tok/s1.1s
Qwen3 Max Thinking (Preview)
Alibaba———
32.5
42 tok/s1.8s
MiniMax M1 80k
MiniMax———
24.4
——
Nova 2.0 Lite (medium)
Amazon———
29.7
153 tok/s16.9s
K-EXAONE (Non-reasoning)
LG AI Research———
23.4
——
Qwen3 30B A3B 2507 (Reasoning)
Alibaba———
22.4
148 tok/s1.1s
KAT-Coder-Pro V1
KwaiKAT———
36
118 tok/s1.4s
Mi:dm K 2.5 Pro
Korea Telecom———
23.1
——
GPT-5.2 (Non-reasoning)
OpenAI———
33.6
68 tok/s0.7s
Gemini 2.0 Pro Experimental (Feb '25)
Google———
18.1
——
Mistral Large 3
Mistral———
22.8
55 tok/s0.6s
Gemini 2.5 Flash (Non-reasoning)
Google———
20.6
215 tok/s0.5s
Nova 2.0 Omni (medium)
Amazon———
28
——
GPT-5 (minimal)
OpenAI———
23.9
74 tok/s1.0s
MiniMax M1 40k
MiniMax———
20.9
——
Llama 4 Maverick
Meta———
18.4
113 tok/s0.6s
Qwen3 VL 30B A3B (Reasoning)
Alibaba———
19.7
125 tok/s1.1s
Ring-1T
InclusionAI———
22.8
——
Hermes 4 - Llama-3.1 70B (Reasoning)
Nous Research———
16
——
Solar Pro 2 (Reasoning)
Upstage———
14.9
——
Gemini 2.5 Flash-Lite Preview (Sep '25) (Reasoning)
Google———
21.6
——
gpt-oss-120B (high)
OpenAI———
33.3
224 tok/s0.5s
Llama Nemotron Super 49B v1.5 (Reasoning)
NVIDIA———
18.7
51 tok/s0.3s
Mi:dm K 2.5 Pro Preview
Korea Telecom———
81%
——
Claude 4.5 Haiku (Non-reasoning)
Anthropic———
31.1
100 tok/s0.6s
Qwen3 32B (Reasoning)
Alibaba———
16.5
104 tok/s1.0s
GPT-5.1 (Non-reasoning)
OpenAI———
27.4
167 tok/s0.6s
o3-mini (high)
OpenAI———
25.2
164 tok/s27.4s
DeepSeek R1 Distill Llama 70B
DeepSeek———
16
44 tok/s0.3s
Nova 2.0 Omni (low)
Amazon———
23.2
——
Gemini 2.0 Flash Thinking Experimental (Jan '25)
Google———
19.6
——
Gemini 2.5 Flash-Lite Preview (Sep '25) (Non-reasoning)
Google———
19.4
——
Gemini 2.5 Flash Preview (Reasoning)
Google———
24.3
——
Motif-2-12.7B-Reasoning
Motif Technologies———
19.1
——
GLM-4.6V (Reasoning)
Z AI———
23.4
30 tok/s1.4s
Claude 3.7 Sonnet (Non-reasoning)
Anthropic———
30.8
——
GPT-4o (March 2025, chatgpt-4o-latest)
OpenAI———
18.6
——
Qwen3 Omni 30B A3B (Reasoning)
Alibaba———
15.6
97 tok/s1.1s
Qwen3 Coder 480B A35B Instruct
Alibaba———
24.8
66 tok/s1.7s
Nova 2.0 Lite (low)
Amazon———
24.6
157 tok/s6.4s
Grok Code Fast 1
xAI———
28.7
133 tok/s5.7s
NVIDIA Nemotron 3 Nano 30B A3B (Reasoning)
NVIDIA———
24.3
137 tok/s1.2s
Llama 3.3 Nemotron Super 49B v1 (Reasoning)
NVIDIA———
18.5
——
Qwen3 VL 32B Instruct
Alibaba———
17.2
72 tok/s1.1s
K2-V2 (high)
MBZUAI Institute of Foundation Models———
20.6
——
HyperCLOVA X SEED Think (32B)
Naver———
23.7
——
GLM-4.5V (Reasoning)
Z AI———
15.1
47 tok/s1.2s
Apriel-v1.6-15B-Thinker
ServiceNow———
27.6
——
GLM-4.7 (Non-reasoning)
Z AI———
34.2
107 tok/s0.6s
Ring-flash-2.0
InclusionAI———
14
86 tok/s1.5s
o3-mini
OpenAI———
25.9
162 tok/s6.6s
Qwen3 30B A3B (Reasoning)
Alibaba———
15.3
71 tok/s1.3s
gpt-oss-120B (low)
OpenAI———
24.5
220 tok/s0.5s
Gemini 2.5 Flash Preview (Non-reasoning)
Google———
17.8
——
Gemini 2.0 Flash (Feb '25)
Google———
18.5
——
GLM-4.6 (Non-reasoning)
Z AI———
30.2
52 tok/s1.1s
Gemini 2.0 Flash (experimental)
Google———
16.8
——
Command R+ | Cohere | $2.5 | $10 | 128K | 78% | 80 tok/s | —
GPT-5 nano (high)
OpenAI———
26.8
148 tok/s109.5s
Ling-flash-2.0
InclusionAI———
15.7
84 tok/s1.4s
ERNIE 4.5 300B A47B
Baidu———
15
23 tok/s1.5s
Qwen3 30B A3B 2507 Instruct
Alibaba———
15
114 tok/s1.1s
GPT-4.1 mini
OpenAI———
22.9
79 tok/s0.5s
GPT-5 mini (minimal)
OpenAI———
20.7
87 tok/s0.8s
Apriel-v1.5-15B-Thinker
ServiceNow———
28.3
——
Claude 3.5 Sonnet (Oct '24)
Anthropic———
15.9
——
Qwen3 14B (Reasoning)
Alibaba———
16.2
64 tok/s1.0s
GPT-4o (ChatGPT)
OpenAI———
14.1
——
Solar Pro 2 (Preview) (Reasoning)
Upstage———
18.8
——
Nova 2.0 Pro Preview (Non-reasoning)
Amazon———
23.1
155 tok/s0.7s
Magistral Small 1.2
Mistral———
18.2
109 tok/s0.4s
GPT-5 nano (medium)
OpenAI———
25.9
144 tok/s55.2s
EXAONE 4.0 32B (Non-reasoning)
LG AI Research———
11.7
——
Devstral 2
Mistral———
22
64 tok/s0.6s
QwQ 32B
Alibaba———
19.7
31 tok/s0.5s
K2-V2 (medium)
MBZUAI Institute of Foundation Models———
18.7
——
Olmo 3.1 32B Think
Allen Institute for AI———
13.9
——
Gemini 2.5 Flash-Lite (Reasoning)
Google———
17.6
302 tok/s24.7s
Mistral Medium 3
Mistral———
18.8
48 tok/s0.5s
Qwen3 VL 30B A3B Instruct
Alibaba———
16.1
123 tok/s1.0s
Olmo 3 32B Think
Allen Institute for AI———
12.1
——
Qwen3 235B A22B (Non-reasoning)
Alibaba———
17
62 tok/s1.1s
Claude 4.5 Haiku (Reasoning)
Anthropic———
37.1
139 tok/s16.7s
NVIDIA Nemotron Nano 12B v2 VL (Reasoning)
NVIDIA———
14.9
——
Qwen2.5 Max
Alibaba———
16.3
49 tok/s1.2s
Sonar Pro
Perplexity———
15.2
——
Claude Haiku 4.5 | Anthropic | $0.8 | $4 | 200K | 75.2% | 250 tok/s | —
Gemini 1.5 Pro (Sep '24)
Google———
16
——
Magistral Small 1
Mistral———
16.8
——
Solar Pro 2 (Non-reasoning)
Upstage———
13.6
——
Qwen3 VL 8B (Reasoning)
Alibaba———
16.7
131 tok/s1.1s
gpt-oss-20B (high)
OpenAI———
24.5
253 tok/s0.4s
Llama 4 Scout
Meta———
13.5
134 tok/s0.6s
Magistral Medium 1
Mistral———
18.8
——
Claude 3.5 Sonnet (June '24)
Anthropic———
14.2
——
GLM-4.5V (Non-reasoning)
Z AI———
12.7
49 tok/s29.7s
GLM-4.6V (Non-reasoning)
Z AI———
17.1
34 tok/s4.8s
GPT-4o (May '24)
OpenAI———
14.5
132 tok/s0.6s
o1-mini
OpenAI———
20.4
——
DeepSeek R1 0528 Qwen3 8B
DeepSeek———
16.4
——
NVIDIA Nemotron Nano 9B V2 (Non-reasoning)
NVIDIA———
13.2
141 tok/s0.5s
MiMo-V2-Flash (Non-reasoning)
Xiaomi———
30.4
137 tok/s1.2s
NVIDIA Nemotron Nano 9B V2 (Reasoning)
NVIDIA———
14.8
125 tok/s0.2s
DeepSeek R1 Distill Qwen 32B
DeepSeek———
17.2
——
Qwen3 8B (Reasoning)
Alibaba———
13.2
91 tok/s1.0s
Qwen3 4B 2507 (Reasoning)
Alibaba———
18.2
——
Grok 4.1 Fast (Non-reasoning)
xAI———
23.6
75 tok/s0.4s
Nova 2.0 Lite (Non-reasoning)
Amazon———
18
229 tok/s0.8s
DeepSeek R1 Distill Qwen 14B
DeepSeek———
15.8
——
Falcon-H1R-7B
TII UAE———
15.8
——
Llama 3.1 Instruct 405B
Meta———
17.4
66 tok/s0.6s
Qwen3 Omni 30B A3B Instruct
Alibaba———
10.7
108 tok/s0.9s
Grok 4 Fast (Non-reasoning)
xAI———
23.1
85 tok/s0.4s
Qwen3 32B (Non-reasoning)
Alibaba———
14.5
104 tok/s1.1s
Nova Premier
Amazon———
19
28 tok/s1.5s
Hermes 4 - Llama-3.1 405B (Non-reasoning)
Nous Research———
17.6
35 tok/s0.8s
Solar Pro 2 (Preview) (Non-reasoning)
Upstage———
16
——
Command R | Cohere | $0.15 | $0.6 | 128K | 72% | 150 tok/s | —
Gemini 2.5 Flash-Lite (Non-reasoning)
Google———
12.7
268 tok/s1.7s
Gemini 2.0 Flash-Lite (Feb '25)
Google———
14.7
——
Gemini 3.1 Flash-Lite | Google DeepMind | $0.01 | $0.04 | 1M | 72% | 500 tok/s | —
gpt-oss-20B (low)
OpenAI———
20.8
251 tok/s0.4s
Llama 3.1 Tulu3 405B
Allen Institute for AI———
14.1
——
Qwen2.5 Instruct 72B
Alibaba———
15.6
55 tok/s1.3s
Nova 2.0 Omni (Non-reasoning)
Amazon———
16.6
——
Mistral Small | Mistral AI | $0.1 | $0.3 | 32K | 72% | 200 tok/s | —
Qwen3 30B A3B (Non-reasoning)
Alibaba———
12.5
68 tok/s1.1s
Llama 3.3 Instruct 70B
Meta———
14.5
93 tok/s0.6s
Grok 2 (Dec '24)
xAI———
13.9
——
K2-V2 (low)
MBZUAI Institute of Foundation Models———
14.4
——
Qwen3 Coder 30B A3B Instruct
Alibaba———
20
111 tok/s1.5s
Command A
Cohere———
13.5
36 tok/s0.6s
Devstral Medium
Mistral———
18.7
69 tok/s0.5s
Claude 3 Opus
Anthropic———
18
——
Qwen2.5 Instruct 32B
Alibaba———
13.2
——
Llama 3.3 Nemotron Super 49B v1 (Non-reasoning)
NVIDIA———
14.3
——
Qwen3 4B (Reasoning)
Alibaba———
14.2
104 tok/s1.1s
Sarvam M (Reasoning)
Sarvam———
8.4
141 tok/s1.2s
Qwen3 VL 4B (Reasoning)
Alibaba———
13.7
——
Mistral Large 2 (Nov '24)
Mistral———
15.1
31 tok/s0.6s
Grok Beta
xAI———
13.3
——
Pixtral Large
Mistral———
14
55 tok/s0.6s
Llama Nemotron Super 49B v1.5 (Non-reasoning)
NVIDIA———
14.6
52 tok/s0.3s
Qwen3 VL 8B Instruct
Alibaba———
14.3
144 tok/s0.9s
Ministral 3 14B
Mistral———
16
149 tok/s0.4s
GPT-4 Turbo
OpenAI———
13.7
37 tok/s1.0s
Sonar
Perplexity———
15.5
——
Nova Pro
Amazon———
13.5
——
Llama 3.1 Nemotron Instruct 70B
NVIDIA———
13.4
284 tok/s0.3s
Llama 3.1 Instruct 70B
Meta———
12.5
34 tok/s0.6s
Devstral Small 2
Mistral———
19.5
66 tok/s0.5s
Qwen3 14B (Non-reasoning)
Alibaba———
12.8
64 tok/s1.0s
Mistral Large 2 (Jul '24)
Mistral———
13
——
Mistral Small 3.2
Mistral———
15.1
152 tok/s0.4s
Gemini 1.5 Flash (Sep '24)
Google———
13.8
——
Mistral Medium 3.1
Mistral———
21.3
85 tok/s0.5s
Qwen3 4B 2507 Instruct
Alibaba———
12.9
——
Llama 3.2 Instruct 90B (Vision)
Meta———
11.9
48 tok/s0.6s
Ling-mini-2.0
InclusionAI———
9.2
——
Reka Flash 3
Reka AI———
9.5
92 tok/s1.6s
Olmo 3 7B Think
Allen Institute for AI———
9.4
——
Gemini 1.5 Pro (May '24)
Google———
12
——
Mistral Small 3.1
Mistral———
14.5
137 tok/s0.5s
Hermes 4 - Llama-3.1 70B (Non-reasoning)
Nous Research———
12.6
76 tok/s0.6s
GPT-4.1 nano
OpenAI———
13
126 tok/s0.4s
QwQ 32B-Preview
Alibaba———
15.2
——
Mistral Small 3
Mistral———
12.7
138 tok/s0.5s
NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning)
NVIDIA———
10.1
235 tok/s0.7s
Ministral 3 8B
Mistral———
14.8
166 tok/s0.3s
Qwen3 8B (Non-reasoning)
Alibaba———
10.6
84 tok/s1.0s
Qwen2.5 Coder Instruct 32B
Alibaba———
12.9
——
Claude 3.5 Haiku
Anthropic———
18.7
——
Qwen2.5 Turbo
Alibaba———
12
68 tok/s1.2s
Qwen3 VL 4B Instruct
Alibaba———
9.6
——
Devstral Small (May '25)
Mistral———
18
——
Devstral Small (Jul '25)
Mistral———
15.2
196 tok/s0.4s
Qwen2 Instruct 72B
Alibaba———
11.7
——
Granite 4.0 H Small
IBM———
10.8
286 tok/s8.8s
Mistral Saba
Mistral———
12.1
——
Gemma 3 12B Instruct
Google———
8.8
——
Qwen3 4B (Non-reasoning)
Alibaba———
12.5
105 tok/s1.0s
Nova Lite
Amazon———
12.7
200 tok/s0.7s
Exaone 4.0 1.2B (Reasoning)
LG AI Research———
8.3
——
Kimi Linear 48B A3B Instruct
Kimi———
14.4
——
DeepHermes 3 - Mistral 24B Preview (Non-reasoning)
Nous Research———
10.9
——
Jamba 1.7 Large
AI21 Labs———
10.9
50 tok/s1.0s
NVIDIA Nemotron 3 Nano 30B A3B (Non-reasoning)
NVIDIA———
13.2
81 tok/s0.3s
Jamba Reasoning 3B
AI21 Labs———
9.6
——
Claude 3 Sonnet
Anthropic———
10.3
——
Gemini 1.5 Flash (May '24)
Google———
10.5
——
Jamba 1.5 Large
AI21 Labs———
10.7
——
Hermes 3 - Llama-3.1 70B
Nous Research———
10.6
30 tok/s0.4s
Gemini 1.5 Flash-8B
Google———
11.1
——
Qwen3 1.7B (Reasoning)
Alibaba———
8
138 tok/s0.9s
Llama 3 Instruct 70B
Meta———
8.9
46 tok/s0.7s
Llama 3.1 Nemotron Nano 4B v1.1 (Reasoning)
NVIDIA———
14.4
——
Jamba 1.6 Large
AI21 Labs———
10.6
56 tok/s0.9s
GPT-5 nano (minimal)
OpenAI———
13.8
142 tok/s0.8s
DeepSeek R1 Distill Llama 8B
DeepSeek———
12.1
——
Mixtral 8x22B Instruct
Mistral———
9.8
——
Nova Micro
Amazon———
10.3
305 tok/s0.6s
Ministral 3 3B
Mistral———
11.2
297 tok/s0.3s
Olmo 3 7B Instruct
Allen Institute for AI———
8.2
——
LFM2 8B A1B
Liquid AI———
7
——
OLMo 2 32B
Allen Institute for AI———
10.6
——
Exaone 4.0 1.2B (Non-reasoning)
LG AI Research———
8.1
——
Claude 2.1
Anthropic———
9.3
——
Claude 2.0
Anthropic———
9.1
——
Phi-4 Multimodal Instruct
Microsoft———
10
16 tok/s1.8s
Gemma 3n E4B Instruct
Google———
6.4
30 tok/s0.7s
Mistral Medium
Mistral———
9
83 tok/s0.6s
Gemma 3n E4B Instruct Preview (May '25)
Google———
10.1
——
Llama 3.1 Instruct 8B
Meta———
11.8
203 tok/s0.5s
Phi-4 Mini Instruct
Microsoft———
8.4
45 tok/s0.3s
Granite 3.3 8B (Non-reasoning)
IBM———
7
392 tok/s21.9s
Qwen2.5 Coder Instruct 7B
Alibaba———
10
——
Llama 3.2 Instruct 11B (Vision)
Meta———
8.7
86 tok/s0.4s
GPT-3.5 Turbo
OpenAI———
9
93 tok/s0.4s
Granite 4.0 Micro
IBM———
7.7
——
Phi-3 Mini Instruct 3.8B
Microsoft———
10.1
——
Claude Instant
Anthropic———
7.4
——
Gemini 1.0 Pro
Google———
8.5
——
Command-R+ (Apr '24)
Cohere———
8.3
——
LFM 40B
Liquid AI———
8.8
——
DeepSeek Coder V2 Lite Instruct
DeepSeek———
8.5
——
Mistral Small (Feb '24)
Mistral———
9
143 tok/s0.6s
Gemma 3 4B Instruct
Google———
6.3
——
Llama 2 Chat 13B
Meta———
8.4
——
Llama 2 Chat 70B
Meta———
8.4
——
Llama 3 Instruct 8B
Meta———
6.4
82 tok/s0.5s
Qwen3 1.7B (Non-reasoning)
Alibaba———
6.8
139 tok/s1.0s
Jamba 1.7 Mini
AI21 Labs———
8.1
——
Mixtral 8x7B Instruct
Mistral———
7.7
——
Gemma 3n E2B Instruct
Google———
4.8
——
DeepHermes 3 - Llama-3.1 8B Preview (Non-reasoning)
Nous Research———
7.6
——
Molmo 7B-D
Allen Institute for AI———
9.2
——
Jamba 1.5 Mini
AI21 Labs———
8
——
Jamba 1.6 Mini
AI21 Labs———
7.9
180 tok/s0.7s
Llama 3.2 Instruct 3B
Meta———
9.7
52 tok/s0.7s
Qwen3 0.6B (Reasoning)
Alibaba———
6.5
225 tok/s0.9s
Command-R (Mar '24)
Cohere———
7.4
——
Granite 4.0 1B
IBM———
7.3
——
OpenChat 3.5 (1210)
OpenChat———
8.3
——
LFM2 2.6B
Liquid AI———
8
——
OLMo 2 7B
Allen Institute for AI———
9.3
——
Granite 4.0 H 1B
IBM———
8
——
DeepSeek R1 Distill Qwen 1.5B
DeepSeek———
9.1
——
LFM2 1.2B
Liquid AI———
6.3
——
Mistral 7B Instruct
Mistral———
7.4
163 tok/s0.4s
Qwen3 0.6B (Non-reasoning)
Alibaba———
5.7
222 tok/s0.9s
Llama 3.2 Instruct 1B
Meta———
6.3
98 tok/s0.6s
Llama 2 Chat 7B
Meta———
9.7
98 tok/s10.3s
Gemma 3 1B Instruct
Google———
5.5
——
Granite 4.0 H 350M
IBM———
5.4
——
Granite 4.0 350M
IBM———
6.1
——
Gemma 3 270M
Google———
7.7
——
Nemotron 3 Nano Omni 30B A3B Reasoning
NVIDIA———
21.4
307 tok/s0.6s
GPT-4o (Aug '24)
OpenAI———
18.6
106 tok/s0.5s
GPT-5.2 Codex (xhigh)
OpenAI———
49
103 tok/s1.3s
Grok 4.20 0309 (Non-reasoning)
xAI———
29.7
88 tok/s0.6s
GPT-3.5 Turbo (0613)
OpenAI——————
Qwen3 Max Thinking
Alibaba———
39.9
46 tok/s1.5s
Qwen3.5 122B A10B (Non-reasoning)
Alibaba———
35.9
163 tok/s1.1s
MiniMax-M2.5
MiniMax———
41.9
86 tok/s1.2s
Qwen3.5 27B (Reasoning)
Alibaba———
42.1
92 tok/s1.4s
Qwen3.5 2B (Reasoning)
Alibaba———
16.3
——
Qwen3.6 Plus
Alibaba———
50
53 tok/s1.7s
Qwen3.5 27B (Non-reasoning)
Alibaba———
37.2
94 tok/s1.4s
Qwen3.5 9B (Reasoning)
Alibaba———
32.4
71 tok/s0.4s
Qwen3 Coder Next
Alibaba———
28.3
127 tok/s1.0s
Sonar Reasoning Pro
Perplexity———
24.6
——
Qwen3.5 4B (Reasoning)
Alibaba———
27.1
199 tok/s0.2s
Sonar Reasoning
Perplexity———
17.9
——
Qwen3.5 Omni Flash
Alibaba———
25.9
243 tok/s0.9s
Qwen3.5 Omni Plus
Alibaba———
38.6
56 tok/s1.3s
Qwen3.5 35B A3B (Reasoning)
Alibaba———
37.1
118 tok/s1.1s
Qwen3.5 35B A3B (Non-reasoning)
Alibaba———
30.7
134 tok/s1.2s
Qwen3.5 397B A17B (Non-reasoning)
Alibaba———
40.1
53 tok/s1.8s
Qwen3.5 122B A10B (Reasoning)
Alibaba———
41.6
162 tok/s1.1s
Solar Mini
Upstage———
11.9
66 tok/s1.0s
Grok 3 Reasoning Beta
xAI———
21.6
——
GLM-5 (Reasoning)
Z AI———
49.8
68 tok/s0.7s
Grok 4.20 0309 (Reasoning)
xAI———
48.5
86 tok/s34.1s
Apertus 8B Instruct
Swiss AI Initiative———
5.9
——
Qwen3.5 0.8B (Non-reasoning)
Alibaba———
9.9
356 tok/s0.2s
Tiny Aya Global
Cohere———
4.7
124 tok/s0.3s
Grok 4.3
xAI———
53.2
92 tok/s9.6s
GLM 5V Turbo (Reasoning)
Z AI———
42.9
——
GLM-5-Turbo
Z AI———
46.8
——
GLM-4.7-Flash (Reasoning)
Z AI———
30.1
86 tok/s0.9s
GLM-5 (Non-reasoning)
Z AI———
40.6
58 tok/s1.3s
Reka Flash (Sep '24)
Reka AI———
12
84 tok/s1.8s
Nanbeige4.1-3B
Nanbeige———
16.1
——
Tri-21B-think Preview
Trillion Labs———
20
——
GLM-4.7-Flash (Non-reasoning)
Z AI———
22.1
153 tok/s1.0s
Trinity Large Thinking
Arcee AI———
31.9
127 tok/s0.6s
Qwen3.5 2B (Non-reasoning)
Alibaba———
14.7
343 tok/s0.2s
Qwen3.5 397B A17B (Reasoning)
Alibaba———
45
52 tok/s1.7s
Tri-21B-Think
Trillion Labs———
18.6
——
GPT-5.5 (medium)
OpenAI———
56.7
57 tok/s6.1s
Qwen3.5 0.8B (Reasoning)
Alibaba———
10.5
——
LongCat Flash Lite
LongCat———
23.9
122 tok/s4.6s
Qwen3.5 4B (Non-reasoning)
Alibaba———
22.6
201 tok/s0.2s
Apertus 70B Instruct
Swiss AI Initiative———
7.7
——
GLM-5.1 (Reasoning)
Z AI———
51.4
56 tok/s0.9s
Sarvam 105B (high)
Sarvam———
18.2
158 tok/s1.3s
KAT Coder Pro V2
KwaiKAT———
43.8
117 tok/s1.7s
MiMo-V2-Omni-0327
Xiaomi———
44.9
109 tok/s1.3s
MiMo-V2-Omni
Xiaomi———
43.4
107 tok/s1.8s
MiMo-V2-Pro
Xiaomi———
49.2
62 tok/s2.1s
o1-preview
OpenAI———
23.7
——
Sarvam 30B (high)
Sarvam———
12.3
170 tok/s1.2s
K2 Think V2
MBZUAI Institute of Foundation Models———
24.1
——
Granite 4.1 8B
IBM———
12.4
121 tok/s0.5s
Hy3-preview (Non-reasoning)
Tencent———
33.7
114 tok/s2.6s
MiMo-V2-Flash (Feb 2026)
Xiaomi———
41.5
138 tok/s1.4s
DeepSeek V4 Pro (Reasoning, Max Effort)
DeepSeek———
51.5
31 tok/s1.0s
Qwen1.5 Chat 110B
Alibaba———
9.5
——
GPT-5.5 (high)
OpenAI———
58.9
55 tok/s19.6s
GPT-5.5 (low)
OpenAI———
50.8
55 tok/s2.0s
Mistral Medium 3.5
Mistral———
39.2
169 tok/s0.6s
Step 3.5 Flash
StepFun———
37.8
143 tok/s0.9s
Step 3.5 Flash 2603
StepFun———
38.5
159 tok/s0.9s
Kimi K2.5 (Non-reasoning)
Kimi———
37.3
49 tok/s1.3s
Llama 65B
Meta———
7.4
——
Olmo 3.1 32B Instruct
Allen Institute for AI———
12.2
——
NVIDIA Nemotron 3 Nano 4B
NVIDIA———
14.7
——
Step3 VL 10B
StepFun———
15.4
——
Nemotron Cascade 2 30B A3B
NVIDIA———
28.4
——
Mercury 2
Inception———
32.8
812 tok/s4.1s
Qwen Chat 72B
Alibaba———
8.8
——
Eleven v3
ElevenLabs——————
Kimi K2.5 (Reasoning)
Kimi———
46.8
42 tok/s1.1s
Molmo2-8B
Allen Institute for AI———
7.3
——
LFM2 24B A2B
Liquid AI———
10.5
158 tok/s0.2s
Solar Pro 3
Upstage———
25.9
——
LFM2.5-1.2B-Thinking
Liquid AI———
8.1
——
Realtime TTS 1.5 Max
Inworld——————
Granite 4.1 3B
IBM———
8.5
——
Kimi K2.6 (Non-reasoning)
Kimi———
43
40 tok/s1.3s
LFM2.5-VL-1.6B
Liquid AI———
6.2
——
LFM2.5-1.2B-Instruct
Liquid AI———
8
——
NVIDIA Nemotron 3 Super 120B A12B (Reasoning)
NVIDIA———
36
159 tok/s0.9s
GPT-5.5 (xhigh)
OpenAI———
60.2
67 tok/s62.9s
Solar Open 100B (Reasoning)
Upstage———
21.7
——
Inworld TTS 1 Max
Inworld——————
MiniMax-M2.7
MiniMax———
49.6
47 tok/s1.2s
R1 1776
Perplexity———
12
——
Grok 4.20 0309 v2 (Non-reasoning)
xAI———
29
89 tok/s0.5s
Grok 4.20 0309 v2 (Reasoning)
xAI———
49.3
97 tok/s33.2s
Studio
Google——————
Polly Generative
Amazon——————
T2A-01-HD
MiniMax——————
Sonic 3
Cartesia——————
SIMBA 1.6
Speechify——————
Kokoro 82M v1.0
Kokoro——————
Voxtral TTS
Mistral——————
Azure Neural
Microsoft——————
AsyncFlow V2, async
async——————
Maya1
Maya Research——————
GPT-5.5 (Non-reasoning)
OpenAI———
40.9
56 tok/s1.0s
MiMo-V2.5
Xiaomi———
49
96 tok/s1.6s
Step TTS 2 (Mar 2026)
StepFun——————
Speech 2.6 HD
MiniMax——————
Speech 2.8 Turbo
MiniMax——————
Codestral | Mistral AI | $0.3 | $0.9 | 32K | — | 180 tok/s | —
Azure HD 2.5
Microsoft——————
Speech-02-HD
MiniMax——————
Step Audio EditX (Mar 2026)
StepFun——————
Inworld TTS 1
Inworld——————
TTS-1
OpenAI——————
Speech-02-Turbo
MiniMax——————
Flash v2.5
ElevenLabs——————
Turbo v2.5
ElevenLabs——————
TTS-1 HD
OpenAI——————
OpenAudio S1
Fish Audio——————
Multilingual v2
ElevenLabs——————
Speech 2.6 Turbo
MiniMax——————
OpenVoice v2
OpenVoice——————
Murf Speech Gen 2
Murf AI——————
Neuphonic TTS
Neuphonic——————
T2A-01-Turbo
MiniMax——————
VibeVoice 7B
Microsoft——————
Qwen3 TTS Flash
Alibaba——————
StyleTTS 2
StyleTTS ——————
XTTS v2
Coqui——————
WaveNet
Google——————
VibeVoice 1.5B
Microsoft——————
Polly Neural
Amazon——————
Sonic English (Oct 2024)
Cartesia——————
Chatterbox HD
Resemble AI——————
Polly Long-Form
Amazon——————
Journey
Google——————
Lightning v3.1
Smallest.ai——————
SIMBA 1.0
Speechify——————
MAI-Voice-1
Microsoft——————
Gemini 2.5 Pro (Dec 2025)
Google——————
Octave TTS
Hume AI——————
MiMo-V2-TTS
Xiaomi——————
Fish Speech 1.5
Fish Audio——————
Chatterbox
Resemble AI——————
LMNT
LMNT——————
Magpie-Multilingual 357M
NVIDIA——————
Zonos-v0.1
Zyphra——————
Qwen3 TTS
Alibaba——————
Claude Sonnet 4.6 (Non-reasoning, Low Effort)
Anthropic———
42.6
50 tok/s0.9s
Claude Opus 4.7 (Adaptive Reasoning, Max Effort)
Anthropic———
57.3
60 tok/s27.0s
GPT-5.4 (xhigh)
OpenAI———
56.8
94 tok/s179.1s
GLM-5.1 (Non-reasoning)
Z AI———
43.8
45 tok/s1.1s
Qwen3.5 9B (Non-reasoning)
Alibaba———
27.3
——
GPT-5.4 Pro (xhigh)
OpenAI——————
SIMBA 3.0
Speechify——————
Gemma 4 E2B (Reasoning)
Google———
15.2
——
Polly Standard
Amazon——————
MetaVoice v1
MetaVoice——————
Mistral Small 4 (Non-reasoning)
Mistral———
18.6
142 tok/s0.5s
Magpie-Multilingual 357M (Feb 2026)
NVIDIA——————
Gemini 3.1 Pro Preview
Google———
57.2
130 tok/s22.5s
Gemma 4 E4B (Reasoning)
Google———
18.8
44 tok/s1.0s
Qwen Chat 14B
Alibaba———
7.4
——
Mistral Small 4 (Reasoning)
Mistral———
27.8
162 tok/s0.6s
Gemma 4 E4B (Non-reasoning)
Google———
14.8
55 tok/s0.5s
Qwen3.6 27B (Reasoning)
Alibaba———
45.8
64 tok/s1.5s
JT-MINI
China Mobile———
25.4
——
Claude Sonnet 4.6 (Adaptive Reasoning, Max Effort)
Anthropic———
51.7
79 tok/s55.2s
Claude Opus 4.6 (Adaptive Reasoning, Max Effort)
Anthropic———
53
52 tok/s16.5s
Gemma 4 26B A4B (Reasoning)
Google———
31.2
——
Gemma 4 E2B (Non-reasoning)
Google———
12.1
——
Gemini 3.1 Flash-Lite Preview
Google———
33.5
350 tok/s5.1s
Falcon (Beta)
Murf AI——————
Chirp 3: HD
Google——————
GPT-5.4 nano (xhigh)
OpenAI———
44
169 tok/s3.7s
DeepSeek-V2-Chat
DeepSeek———
9.1
——
Grok-1
xAI———
11.7
——
DeepSeek V4 Pro (Reasoning, High Effort)
DeepSeek———
49.8
31 tok/s1.1s
Grok 4.3 (Non-reasoning)
xAI———
31
79 tok/s0.6s
Gemma 4 26B A4B (Non-reasoning)
Google———
27.1
——
Muse Spark
Meta———
52.1
——
Realtime TTS 1.5 Mini
Inworld——————
Mist V2
Rime——————
Standard
Google——————
Claude Opus 4.7 (Non-reasoning, High Effort)
Anthropic———
51.8
47 tok/s1.6s
DeepSeek V4 Flash (Reasoning, High Effort)
DeepSeek———
44.9
——
Kimi K2.6
Kimi———
53.9
41 tok/s1.4s
Qwen3.6 35B A3B (Reasoning)
Alibaba———
43.5
189 tok/s1.5s
Ling-2.6-1T
InclusionAI———
33.6
——
Gemini 3 Deep Think
Google——————
Gemini 2.5 Flash TTS (Dec 2025)
Google——————
Speech 2.8 HD
MiniMax——————
Qwen3.6 35B A3B (Non-reasoning)
Alibaba———
31.5
182 tok/s1.4s
Qwen3.5 Omni Flash
Alibaba——————
EXAONE 4.5 33B
LG AI Research———
30.2
——
GPT-5.4 mini (xhigh)
OpenAI———
48.9
185 tok/s5.7s
DeepSeek V4 Flash (Non-reasoning)
DeepSeek———
36.5
66 tok/s0.8s
Octave 2
Hume AI——————
Neural2
Google——————
DeepSeek V4 Flash (Reasoning, Max Effort)
DeepSeek———
46.5
65 tok/s0.8s
Hy3-preview (Reasoning)
Tencent———
41.9
115 tok/s2.3s
GPT-5.4 nano (medium)
OpenAI———
38.1
168 tok/s3.6s
Qwen3.6 27B (Non-reasoning)
Alibaba———
37.1
61 tok/s1.4s
Ling 2.6 Flash
InclusionAI———
26.2
211 tok/s1.2s
GPT-5.4 (Non-reasoning)
OpenAI———
35.4
68 tok/s0.7s
GPT-5.4 mini (medium)
OpenAI———
37.7
184 tok/s5.4s
Gemini 3.1 Flash TTS
Google——————
GPT-5.4 nano (Non-Reasoning)
OpenAI———
24.4
167 tok/s0.5s
GPT-5.4 mini (Non-Reasoning)
OpenAI———
23.3
170 tok/s0.5s
Arctic Instruct
Snowflake———
8.8
——
MiMo-V2.5-Pro
Xiaomi———
53.8
60 tok/s1.9s
Fish Audio S2 Pro
Fish Audio——————
GPT-5.4 (low)
OpenAI———
47.9
64 tok/s1.6s
Granite 4.1 30B
IBM———
14.7
——
Gemma 4 31B (Reasoning)
Google———
39.2
35 tok/s1.0s
GPT-5.3 Codex (xhigh)
OpenAI———
53.6
96 tok/s67.2s
Arcana v3
Rime——————
Magpie Multilingual
NVIDIA——————
Gemma 4 31B (Non-reasoning)
Google———
32.3
——
Kling Image 3.0 Omni
KlingAI——————
GPT-5.5 Pro (xhigh)
OpenAI——————
DeepSeek V4 Pro (Non-reasoning)
DeepSeek———
39.3
31 tok/s1.1s
MiMo-V2.5-Pro (Non-reasoning)
Xiaomi———
35.6
59 tok/s1.9s
EXAONE 4.5 33B (Non-reasoning)
LG AI Research——————
Gemini 2.5 Flash Lite TTS
Google——————
DeepSeek LLM 67B Chat (V1)
DeepSeek———
8.4
——
MiMo-V2.5-TTS
Xiaomi——————
Claude 4.1 Opus (Non-reasoning)
Anthropic———
36
36 tok/s1.7s
Claude 3 Haiku
Anthropic———
12.3
——
Qwen3.6 Max Preview
Alibaba———
51.8
38 tok/s2.0s
Gemini 2.0 Flash Thinking Experimental (Dec '24)
Google———
12.3
——
PALM-2
Google———
8.6
——
Gemini 1.0 Ultra
Google———
10.1
——
Gemini 2.0 Flash-Lite (Preview)
Google———
14.5
——
GPT-4o Realtime (Dec '24)
OpenAI——————
StepAudio 2.5 TTS
StepFun——————
DeepSeek-V2.5
DeepSeek———
12.3
——
GPT-4
OpenAI———
12.8
30 tok/s1.1s
DeepSeek-V2.5 (Dec '24)
DeepSeek———
12.5
——
GPT-4o mini Realtime (Dec '24)
OpenAI——————
DeepSeek-Coder-V2
DeepSeek———
10.6
——
GPT-4.5 (Preview)
OpenAI———
20
——
Gradium TTS
Gradium——————
o1-pro
OpenAI———
25.8
——
o3-pro
OpenAI———
40.7
21 tok/s84.7s

© 2026 ∞AI. Built for the AI community. everythingai.tech

Estimate Your Monthly Cost

Enter your expected usage to compare costs across models

Input tokens per month: e.g. 1,000,000 = ~750,000 words

Output tokens per month: usually 30–50% of input volume

6 models selected

Model | Provider | Input Cost | Output Cost | Total/Month | vs Cheapest
DeepSeek R2 | DeepSeek | $0.55 | $1.09 | $1.65 | ✓ Best value
GPT-4.1 | OpenAI | $2.00 | $4.00 | $6.00 | 3.6× more
Claude Sonnet 4.6 | Anthropic | $3.00 | $7.50 | $10.50 | 6.4× more
GPT-4o | OpenAI | $5.00 | $7.50 | $12.50 | 7.6× more
o3 | OpenAI | $10.00 | $20.00 | $30.00 | 18.2× more
Claude Opus 4.6 | Anthropic | $15.00 | $37.50 | $52.50 | 31.9× more
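
The figures above are consistent with a usage of 1,000,000 input tokens and 500,000 output tokens per month, priced at the per-1M rates listed in the comparison table. As a rough sketch of that arithmetic (not the site's actual implementation; the names `ModelPrice`, `monthly_cost`, and `compare` are made up for illustration, and the usage figures are inferred from the table rather than stated on the page):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    name: str
    input_per_million: float   # USD per 1M input tokens
    output_per_million: float  # USD per 1M output tokens


def monthly_cost(model: ModelPrice, input_tokens: int, output_tokens: int) -> float:
    """Total monthly cost in USD for the given token volumes."""
    input_cost = model.input_per_million * input_tokens / 1_000_000
    output_cost = model.output_per_million * output_tokens / 1_000_000
    return input_cost + output_cost


def compare(models, input_tokens: int, output_tokens: int):
    """Return (name, cost, ratio_to_cheapest) tuples, cheapest first.

    Assumes every model has a non-zero cost; a free model would need
    special handling to avoid dividing by zero.
    """
    costs = sorted((monthly_cost(m, input_tokens, output_tokens), m.name) for m in models)
    cheapest = costs[0][0]
    return [(name, cost, cost / cheapest) for cost, name in costs]


# Per-1M prices copied from the comparison table above.
PRICES = [
    ModelPrice("DeepSeek R2", 0.55, 2.19),
    ModelPrice("GPT-4.1", 2.00, 8.00),
    ModelPrice("Claude Sonnet 4.6", 3.00, 15.00),
    ModelPrice("GPT-4o", 5.00, 15.00),
    ModelPrice("o3", 10.00, 40.00),
    ModelPrice("Claude Opus 4.6", 15.00, 75.00),
]

if __name__ == "__main__":
    # Inferred usage: 1M input tokens, 500K output tokens per month.
    for name, cost, ratio in compare(PRICES, 1_000_000, 500_000):
        print(f"{name:<20} ${cost:>6.2f}  {ratio:.1f}x")
```

With these inputs the ratios against the cheapest model come out to about 3.6 for GPT-4.1 and 31.9 for Claude Opus 4.6, matching the "vs Cheapest" column. Changing the two token counts reproduces the estimate for other workloads.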

Prices are approximate and may vary. Check provider documentation for current pricing.