Compare pricing, benchmarks, and capabilities across 17 AI models
| Model | Provider | Input ($/1M tokens) | Output ($/1M tokens) | Context | Intelligence | Speed | Latency | API |
|---|---|---|---|---|---|---|---|---|
| Llama 4 Maverick | Meta | — | — | — | 18.4 | 115 tok/s | 0.6s | |
| Llama 4 Scout | Meta | — | — | — | 13.5 | 137 tok/s | 0.5s | |
| Llama 3.1 Instruct 405B | Meta | — | — | — | 17.4 | 31 tok/s | 0.7s | |
| Llama 3.3 Instruct 70B | Meta | — | — | — | 14.5 | 96 tok/s | 0.6s | |
| Llama 3.1 Instruct 70B | Meta | — | — | — | 12.5 | 31 tok/s | 0.8s | |
| Llama 3.2 Instruct 90B (Vision) | Meta | — | — | — | 11.9 | 42 tok/s | 0.5s | |
| Llama 3 Instruct 70B | Meta | — | — | — | 8.9 | 42 tok/s | 0.7s | |
| Llama 3.1 Instruct 8B | Meta | — | — | — | 11.8 | 170 tok/s | 0.4s | |
| Llama 3.2 Instruct 11B (Vision) | Meta | — | — | — | 8.7 | 79 tok/s | 0.5s | |
| Llama 2 Chat 70B | Meta | — | — | — | 8.4 | — | — | |
| Llama 2 Chat 13B | Meta | — | — | — | 8.4 | — | — | |
| Llama 3 Instruct 8B | Meta | — | — | — | 6.4 | 82 tok/s | 0.5s | |
| Llama 3.2 Instruct 3B | Meta | — | — | — | 9.7 | 53 tok/s | 0.6s | |
| Llama 3.2 Instruct 1B | Meta | — | — | — | 6.3 | 88 tok/s | 0.6s | |
| Llama 2 Chat 7B | Meta | — | — | — | 9.7 | 108 tok/s | 12.6s | |
| Muse Spark | Meta | — | — | — | 52.1 | — | — | |
| Llama 65B | Meta | — | — | — | 7.4 | — | — | |
Enter your expected usage to compare costs across models:
Input tokens: e.g. 1,000,000 tokens ≈ 750,000 words
Output tokens: usually 30–50% of input volume
Prices are approximate and may vary. Check provider documentation for current pricing.
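
The cost comparison described above comes down to simple arithmetic: token volume divided by one million, multiplied by the per-1M price, summed over input and output. The following is a minimal sketch under stated assumptions, not the page's actual calculator. The model names and prices in `PRICES` are hypothetical placeholders (the table lists no prices), and `Pricing` and `estimate_cost` are names introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Prices in US dollars per 1M tokens (hypothetical values, not real quotes)."""
    input_per_1m: float
    output_per_1m: float


def estimate_cost(input_tokens: int, output_tokens: int, pricing: Pricing) -> float:
    """Return the estimated cost in dollars for the given token volumes."""
    input_cost = (input_tokens / 1_000_000) * pricing.input_per_1m
    output_cost = (output_tokens / 1_000_000) * pricing.output_per_1m
    return input_cost + output_cost


# Placeholder entries for illustration only; substitute current provider prices.
PRICES = {
    "example-model-a": Pricing(input_per_1m=0.20, output_per_1m=0.60),
    "example-model-b": Pricing(input_per_1m=1.00, output_per_1m=3.00),
}

input_tokens = 1_000_000                  # roughly 750,000 words, per the hint above
output_tokens = input_tokens * 40 // 100  # 40%, inside the 30-50% rule of thumb

# Rank the models from cheapest to most expensive for this usage.
ranked = sorted(
    PRICES.items(),
    key=lambda item: estimate_cost(input_tokens, output_tokens, item[1]),
)

for name, pricing in ranked:
    cost = estimate_cost(input_tokens, output_tokens, pricing)
    print(f"{name}: ${cost:.2f}")
```

Because output tokens are often priced higher than input tokens, the assumed output share can shift the ranking, so it may be worth running the estimate at both ends of the 30–50% range.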