Compare pricing, benchmarks, and capabilities across 7 AI models
| Model | Provider | Input ($/1M tokens) | Output ($/1M tokens) | Context | Intelligence | Speed | Latency | API |
|---|---|---|---|---|---|---|---|---|
| Granite 4.0 H Small | IBM | — | — | — | 10.8 | 453 tok/s | 8.7s | |
| Granite 3.3 8B (Non-reasoning) | IBM | — | — | — | 7 | 427 tok/s | 7.3s | |
| Granite 4.0 Micro | IBM | — | — | — | 7.7 | — | — | |
| Granite 4.0 1B | IBM | — | — | — | 7.3 | — | — | |
| Granite 4.0 H 1B | IBM | — | — | — | 8 | — | — | |
| Granite 4.0 H 350M | IBM | — | — | — | 5.4 | — | — | |
| Granite 4.0 350M | IBM | — | — | — | 6.1 | — | — | |
Enter your expected usage to compare costs across models:
Input tokens: e.g. 1,000,000 tokens ≈ 750,000 words
Output tokens: usually 30–50% of input volume
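
The usage estimate above comes down to simple arithmetic, sketched here in Python. The function names, the default output ratio of 0.4 (the midpoint of the 30–50% range), and the example prices are illustrative assumptions. The table lists no prices for these models, so substitute real per-1M-token rates from the provider.

```python
WORDS_PER_TOKEN = 0.75  # from the rule of thumb: 1,000,000 tokens ~ 750,000 words


def tokens_to_words(tokens: float) -> float:
    """Approximate word count for a given number of tokens."""
    return tokens * WORDS_PER_TOKEN


def estimate_cost(
    input_tokens: float,
    input_price_per_m: float,   # hypothetical price in $ per 1M input tokens
    output_price_per_m: float,  # hypothetical price in $ per 1M output tokens
    output_ratio: float = 0.4,  # assumed: output volume as a fraction of input
) -> float:
    """Estimate total cost in dollars for a given input volume."""
    output_tokens = input_tokens * output_ratio
    input_cost = input_tokens / 1_000_000 * input_price_per_m
    output_cost = output_tokens / 1_000_000 * output_price_per_m
    return input_cost + output_cost


if __name__ == "__main__":
    # Placeholder prices only; not taken from any provider's price list.
    cost = estimate_cost(1_000_000, input_price_per_m=1.0, output_price_per_m=2.0)
    print(f"Estimated cost: ${cost:.2f}")
    print(f"Approximate input words: {tokens_to_words(1_000_000):,.0f}")
```

Because output tokens are often priced higher than input tokens, it can be worth running the estimate at both ends of the 30–50% range to see how sensitive the total is.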
Prices are approximate and may vary. Check provider documentation for current pricing.