Compare pricing, benchmarks, and capabilities across 7 AI models
| Model | Provider | Input ($/1M tokens) | Output ($/1M tokens) | Context | Intelligence | Speed | Latency |
|---|---|---|---|---|---|---|---|
| Hermes 4 - Llama-3.1 405B (Reasoning) | Nous Research | — | — | — | 18.6 | 32 tok/s | 0.8s |
| Hermes 4 - Llama-3.1 70B (Reasoning) | Nous Research | — | — | — | 16 | 62 tok/s | 0.6s |
| Hermes 4 - Llama-3.1 405B (Non-reasoning) | Nous Research | — | — | — | 17.6 | 32 tok/s | 0.9s |
| Hermes 4 - Llama-3.1 70B (Non-reasoning) | Nous Research | — | — | — | 12.6 | 63 tok/s | 0.6s |
| DeepHermes 3 - Mistral 24B Preview (Non-reasoning) | Nous Research | — | — | — | 10.9 | — | — |
| Hermes 3 - Llama-3.1 70B | Nous Research | — | — | — | 10.6 | 28 tok/s | 0.4s |
| DeepHermes 3 - Llama-3.1 8B Preview (Non-reasoning) | Nous Research | — | — | — | 7.6 | — | — |

Enter your expected usage to compare costs across models. For input, 1,000,000 tokens is roughly 750,000 words. Output is usually 30–50% of input volume.
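
The usage-based comparison described above comes down to simple arithmetic, sketched here in Python. This is a minimal illustration and not the page's actual calculator: the `Pricing` class, the function names, and the example prices are all hypothetical, since the table lists no prices for these models. Only the 0.75 words-per-token ratio and the 30–50% output range come from the text.

```python
from dataclasses import dataclass

# Rule of thumb from the text: 1,000,000 tokens is roughly 750,000 words.
WORDS_PER_TOKEN = 0.75


@dataclass
class Pricing:
    """Hypothetical prices in USD per 1M tokens (placeholders, not from the table)."""
    input_per_million: float
    output_per_million: float


def approx_words(tokens: int) -> int:
    """Convert a token count to an approximate word count."""
    return round(tokens * WORDS_PER_TOKEN)


def estimate_cost(input_tokens: int, output_ratio: float, pricing: Pricing) -> float:
    """Estimate total cost in USD.

    output_ratio is output volume as a fraction of input volume
    (the text suggests 0.3 to 0.5 as typical).
    """
    output_tokens = input_tokens * output_ratio
    total = (input_tokens * pricing.input_per_million
             + output_tokens * pricing.output_per_million)
    return total / 1_000_000


# Placeholder prices, chosen only to make the arithmetic easy to follow.
example = Pricing(input_per_million=1.0, output_per_million=3.0)
cost = estimate_cost(1_000_000, 0.4, example)
print(f"~{approx_words(1_000_000):,} words of input, estimated cost ${cost:.2f}")
```

With real per-model prices substituted for the placeholders, running `estimate_cost` once per model gives the side-by-side comparison.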
Prices are approximate and may vary. Check provider documentation for current pricing.