731 models across 9 providers. Prices per 1M tokens in USD. Updated weekly.
Providers: anthropic, deepseek, fireworks, gemini, groq, mistral, openai, together, xai. Data sourced from official pricing pages. Use the free API for programmatic access.
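Because every price is quoted per 1M tokens, the cost of a single request is the token count times the price, divided by one million. Below is a minimal sketch of that arithmetic. The function name `request_cost` and the token counts are illustrative assumptions, not part of any API. The $1 input and $5 output prices are the claude-haiku-4-5 row from the tables on this page.

```python
from decimal import Decimal

TOKENS_PER_UNIT = Decimal(1_000_000)  # prices on this page are per 1M tokens


def request_cost(input_tokens, output_tokens, input_price, output_price):
    """Return the USD cost of one request (hypothetical helper).

    Prices are passed as strings so Decimal keeps them exact.
    """
    input_cost = Decimal(input_tokens) * Decimal(input_price)
    output_cost = Decimal(output_tokens) * Decimal(output_price)
    return (input_cost + output_cost) / TOKENS_PER_UNIT


# Example: 12,000 input tokens and 800 output tokens at $1 / $5 per 1M.
cost = request_cost(12_000, 800, "1", "5")
print(cost)  # 0.016
```

Decimal is used instead of float so that small per-request amounts add up without rounding drift. The Cache column is left out of this sketch, because how cached tokens are billed differs by provider.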
anthropic: 29 models
| Model | Input $/1M | Output $/1M | Cache $/1M |
|---|---|---|---|
| claude-3-haiku-20240307 | $0.25 | $1.25 | $0.03 |
| claude-3-haiku | $0.25 | $1.25 | $0.03 |
| claude-3-5-haiku-20241022 | $0.8 | $4 | $0.08 |
| claude-3-5-haiku-latest | $0.8 | $4 | $0.08 |
| claude-haiku-4-5 | $1 | $5 | $0.1 |
| claude-haiku-4-5-20251001 | $1 | $5 | $0.1 |
| claude-sonnet-4-6 | $3 | $15 | $0.3 |
| claude-sonnet-4-5 | $3 | $15 | $0.3 |
| claude-sonnet-4-20250514 | $3 | $15 | $0.3 |
| claude-3-5-sonnet | $3 | $15 | $0.3 |
| claude-3-7-sonnet-latest | $3 | $15 | $0.3 |
| claude-3-sonnet | $3 | $15 | $0.3 |
| claude-sonnet-4-0 | $3 | $15 | $0.3 |
| claude-3-7-sonnet-20250219 | $3 | $15 | $0.3 |
| claude-4-sonnet-20250514 | $3 | $15 | $0.3 |
| claude-sonnet-4-5-20250929 | $3 | $15 | $0.3 |
| claude-opus-4-6 | $5 | $25 | $0.5 |
| claude-opus-4-5 | $5 | $25 | $0.5 |
| claude-opus-4-5-20251101 | $5 | $25 | $0.5 |
| claude-opus-4-6-20260205 | $5 | $25 | $0.5 |
| claude-2 | $8 | $24 | - |
| claude-v1 | $8 | $24 | - |
| claude-opus-4-20250514 | $15 | $75 | $1.5 |
| claude-3-opus-latest | $15 | $75 | $1.5 |
| claude-opus-4-0 | $15 | $75 | $1.5 |
| claude-opus-4-1 | $15 | $75 | $1.5 |
| claude-3-opus-20240229 | $15 | $75 | $1.5 |
| claude-4-opus-20250514 | $15 | $75 | $1.5 |
| claude-opus-4-1-20250805 | $15 | $75 | $1.5 |
deepseek: 6 models
| Model | Input $/1M | Output $/1M | Cache $/1M |
|---|---|---|---|
| deepseek-coder | $0.14 | $0.28 | - |
| deepseek-v3 | $0.27 | $1.1 | $0.07 |
| deepseek-chat | $0.28 | $0.42 | $0.028 |
| deepseek-reasoner | $0.28 | $0.42 | $0.028 |
| deepseek-v3.2 | $0.28 | $0.42 | $0.028 |
| deepseek-r1 | $0.55 | $2.19 | - |
fireworks: 257 models
| Model | Input $/1M | Output $/1M | Cache $/1M |
|---|---|---|---|
| SSD-1B | $0.0001 | $0.0001 | - |
| japanese-stable-diffusion-xl | $0.0001 | $0.0001 | - |
| playground-v2-1024px-aesthetic | $0.0001 | $0.0001 | - |
| playground-v2-5-1024px-aesthetic | $0.0001 | $0.0001 | - |
| stable-diffusion-xl-1024-v1-0 | $0.0001 | $0.0001 | - |
| flux-1-schnell-fp8 | $0.0003 | $0.0003 | - |
| flux-1-dev-fp8 | $0.0005 | $0.0005 | - |
| flux-1-dev-controlnet-union | $0.001 | $0.001 | - |
| flux-kontext-pro | $0.04 | $0.04 | - |
| gpt-oss-20b | $0.07 | $0.3 | $0.04 |
| flux-kontext-max | $0.08 | $0.08 | - |
| gemma-3-27b-it | $0.1 | $0.1 | - |
| llama-v3p2-1b-instruct | $0.1 | $0.1 | - |
| llama-v3p2-3b-instruct | $0.1 | $0.1 | - |
| codegemma-2b | $0.1 | $0.1 | - |
| cogito-v1-preview-llama-3b | $0.1 | $0.1 | - |
| deepseek-coder-1b-base | $0.1 | $0.1 | - |
| deepseek-r1-distill-qwen-1p5b | $0.1 | $0.1 | - |
| ernie-4p5-21b-a3b-pt | $0.1 | $0.1 | - |
| ernie-4p5-300b-a47b-pt | $0.1 | $0.1 | - |
| flux-1-dev | $0.1 | $0.1 | - |
| flux-1-schnell | $0.1 | $0.1 | - |
| gemma-2b-it | $0.1 | $0.1 | - |
| llama-guard-3-1b | $0.1 | $0.1 | - |
| llama-v2-70b | $0.1 | $0.1 | - |
| llama-v3p1-405b-instruct-long | $0.1 | $0.1 | - |
| llama-v3p1-70b-instruct-1b | $0.1 | $0.1 | - |
| llama-v3p2-1b | $0.1 | $0.1 | - |
| llama-v3p2-3b | $0.1 | $0.1 | - |
| minimax-m1-80k | $0.1 | $0.1 | - |
| ministral-3-3b-instruct-2512 | $0.1 | $0.1 | - |
| nemotron-nano-v2-12b-vl | $0.1 | $0.1 | - |
| phi-2-3b | $0.1 | $0.1 | - |
| phi-3-mini-128k-instruct | $0.1 | $0.1 | - |
| qwen2-vl-2b-instruct | $0.1 | $0.1 | - |
| qwen2p5-0p5b-instruct | $0.1 | $0.1 | - |
| qwen2p5-1p5b-instruct | $0.1 | $0.1 | - |
| qwen2p5-coder-0p5b | $0.1 | $0.1 | - |
| qwen2p5-coder-0p5b-instruct | $0.1 | $0.1 | - |
| qwen2p5-coder-1p5b | $0.1 | $0.1 | - |
| qwen2p5-coder-1p5b-instruct | $0.1 | $0.1 | - |
| qwen2p5-coder-3b | $0.1 | $0.1 | - |
| qwen2p5-coder-3b-instruct | $0.1 | $0.1 | - |
| qwen3-0p6b | $0.1 | $0.1 | - |
| qwen3-1p7b | $0.1 | $0.1 | - |
| qwen3-1p7b-fp8-draft | $0.1 | $0.1 | - |
| qwen3-1p7b-fp8-draft-131072 | $0.1 | $0.1 | - |
| qwen3-1p7b-fp8-draft-40960 | $0.1 | $0.1 | - |
| stablecode-3b | $0.1 | $0.1 | - |
| starcoder2-3b | $0.1 | $0.1 | - |
| gpt-oss-120b | $0.15 | $0.6 | $0.07 |
| llama4-scout-instruct-basic | $0.15 | $0.6 | - |
| qwen3-30b-a3b | $0.15 | $0.6 | - |
| qwen3-coder-30b-a3b-instruct | $0.15 | $0.6 | - |
| qwen3-vl-30b-a3b-instruct | $0.15 | $0.6 | - |
| qwen3-vl-30b-a3b-thinking | $0.15 | $0.6 | - |
| accounts/fireworks/models/llama-v3p1-8b-instruct | $0.2 | $0.2 | $0.1 |
| llama-v3p1-8b-instruct | $0.2 | $0.2 | - |
| fireworks-ai-4.1b-to-16b | $0.2 | $0.2 | - |
| fireworks-ai-up-to-4b | $0.2 | $0.2 | - |
| llama-v3p2-11b-vision-instruct | $0.2 | $0.2 | - |
| chronos-hermes-13b-v2 | $0.2 | $0.2 | - |
| code-llama-13b | $0.2 | $0.2 | - |
| code-llama-13b-instruct | $0.2 | $0.2 | - |
| code-llama-13b-python | $0.2 | $0.2 | - |
| code-llama-7b | $0.2 | $0.2 | - |
| code-llama-7b-instruct | $0.2 | $0.2 | - |
| code-llama-7b-python | $0.2 | $0.2 | - |
| code-qwen-1p5-7b | $0.2 | $0.2 | - |
| codegemma-7b | $0.2 | $0.2 | - |
| cogito-v1-preview-llama-8b | $0.2 | $0.2 | - |
| cogito-v1-preview-qwen-14b | $0.2 | $0.2 | - |
| deepseek-coder-7b-base | $0.2 | $0.2 | - |
| deepseek-coder-7b-base-v1p5 | $0.2 | $0.2 | - |
| deepseek-coder-7b-instruct-v1p5 | $0.2 | $0.2 | - |
| deepseek-r1-0528-distill-qwen3-8b | $0.2 | $0.2 | - |
| deepseek-r1-distill-llama-8b | $0.2 | $0.2 | - |
| deepseek-r1-distill-qwen-14b | $0.2 | $0.2 | - |
| deepseek-r1-distill-qwen-7b | $0.2 | $0.2 | - |
| dobby-mini-unhinged-plus-llama-3-1-8b | $0.2 | $0.2 | - |
| firellava-13b | $0.2 | $0.2 | - |
| firesearch-ocr-v6 | $0.2 | $0.2 | - |
| gemma-7b | $0.2 | $0.2 | - |
| gemma-7b-it | $0.2 | $0.2 | - |
| gemma2-9b-it | $0.2 | $0.2 | - |
| hermes-2-pro-mistral-7b | $0.2 | $0.2 | - |
| internvl3-8b | $0.2 | $0.2 | - |
| llama-guard-2-8b | $0.2 | $0.2 | - |
| llama-guard-3-8b | $0.2 | $0.2 | - |
| llama-v2-13b | $0.2 | $0.2 | - |
| llama-v2-13b-chat | $0.2 | $0.2 | - |
| llama-v2-7b | $0.2 | $0.2 | - |
| llama-v2-7b-chat | $0.2 | $0.2 | - |
| llama-v3-8b | $0.2 | $0.2 | - |
| llama-v3-8b-instruct-hf | $0.2 | $0.2 | - |
| llamaguard-7b | $0.2 | $0.2 | - |
| ministral-3-14b-instruct-2512 | $0.2 | $0.2 | - |
| ministral-3-8b-instruct-2512 | $0.2 | $0.2 | - |
| mistral-7b | $0.2 | $0.2 | - |
| mistral-7b-instruct-4k | $0.2 | $0.2 | - |
| mistral-7b-instruct-v0p2 | $0.2 | $0.2 | - |
| mistral-7b-instruct-v3 | $0.2 | $0.2 | - |
| mistral-7b-v0p2 | $0.2 | $0.2 | - |
| mistral-nemo-base-2407 | $0.2 | $0.2 | - |
| mistral-nemo-instruct-2407 | $0.2 | $0.2 | - |
| mythomax-l2-13b | $0.2 | $0.2 | - |
| nous-capybara-7b-v1p9 | $0.2 | $0.2 | - |
| nous-hermes-llama2-13b | $0.2 | $0.2 | - |
| nous-hermes-llama2-7b | $0.2 | $0.2 | - |
| nvidia-nemotron-nano-12b-v2 | $0.2 | $0.2 | - |
| nvidia-nemotron-nano-9b-v2 | $0.2 | $0.2 | - |
| openchat-3p5-0106-7b | $0.2 | $0.2 | - |
| openhermes-2-mistral-7b | $0.2 | $0.2 | - |
| openhermes-2p5-mistral-7b | $0.2 | $0.2 | - |
| openorca-7b | $0.2 | $0.2 | - |
| phi-3-vision-128k-instruct | $0.2 | $0.2 | - |
| pythia-12b | $0.2 | $0.2 | - |
| qwen-v2p5-14b-instruct | $0.2 | $0.2 | - |
| qwen-v2p5-7b | $0.2 | $0.2 | - |
| qwen2-7b-instruct | $0.2 | $0.2 | - |
| qwen2-vl-7b-instruct | $0.2 | $0.2 | - |
| qwen2p5-14b | $0.2 | $0.2 | - |
| qwen2p5-7b-instruct | $0.2 | $0.2 | - |
| qwen2p5-coder-14b | $0.2 | $0.2 | - |
| qwen2p5-coder-14b-instruct | $0.2 | $0.2 | - |
| qwen2p5-coder-7b | $0.2 | $0.2 | - |
| qwen2p5-coder-7b-instruct | $0.2 | $0.2 | - |
| qwen2p5-vl-3b-instruct | $0.2 | $0.2 | - |
| qwen2p5-vl-7b-instruct | $0.2 | $0.2 | - |
| qwen3-14b | $0.2 | $0.2 | - |
| qwen3-4b | $0.2 | $0.2 | - |
| qwen3-4b-instruct-2507 | $0.2 | $0.2 | - |
| qwen3-8b | $0.2 | $0.2 | - |
| qwen3-vl-8b-instruct | $0.2 | $0.2 | - |
| rolm-ocr | $0.2 | $0.2 | - |
| snorkel-mistral-7b-pairrm-dpo | $0.2 | $0.2 | - |
| starcoder-16b | $0.2 | $0.2 | - |
| starcoder-7b | $0.2 | $0.2 | - |
| starcoder2-15b | $0.2 | $0.2 | - |
| starcoder2-7b | $0.2 | $0.2 | - |
| toppy-m-7b | $0.2 | $0.2 | - |
| yi-6b | $0.2 | $0.2 | - |
| zephyr-7b-beta | $0.2 | $0.2 | - |
| llama4-maverick-instruct-basic | $0.22 | $0.88 | - |
| qwen3-235b-a22b | $0.22 | $0.88 | - |
| glm-4p5-air | $0.22 | $0.88 | - |
| qwen3-235b-a22b-instruct-2507 | $0.22 | $0.88 | - |
| qwen3-235b-a22b-thinking-2507 | $0.22 | $0.88 | - |
| qwen3-vl-235b-a22b-instruct | $0.22 | $0.88 | - |
| qwen3-vl-235b-a22b-thinking | $0.22 | $0.88 | - |
| minimax-m2p1 | $0.3 | $1.2 | - |
| minimax-m2 | $0.3 | $1.2 | - |
| qwen3-coder-480b-a35b-instruct | $0.45 | $1.8 | - |
| fireworks-ai-moe-up-to-56b | $0.5 | $0.5 | - |
| deepseek-coder-v2-lite-base | $0.5 | $0.5 | - |
| deepseek-coder-v2-lite-instruct | $0.5 | $0.5 | - |
| deepseek-v2-lite-chat | $0.5 | $0.5 | - |
| dolphin-2p6-mixtral-8x7b | $0.5 | $0.5 | - |
| firefunction-v1 | $0.5 | $0.5 | - |
| gpt-oss-safeguard-20b | $0.5 | $0.5 | - |
| mixtral-8x7b | $0.5 | $0.5 | - |
| mixtral-8x7b-instruct | $0.5 | $0.5 | - |
| mixtral-8x7b-instruct-hf | $0.5 | $0.5 | - |
| nous-hermes-2-mixtral-8x7b-dpo | $0.5 | $0.5 | - |
| qwen3-30b-a3b-instruct-2507 | $0.5 | $0.5 | - |
| deepseek-r1-basic | $0.55 | $2.19 | - |
| glm-4p5 | $0.55 | $2.19 | - |
| glm-4p6 | $0.55 | $2.19 | - |
| deepseek-v3p2 | $0.56 | $1.68 | $0.28 |
| deepseek-v3p1 | $0.56 | $1.68 | - |
| deepseek-v3p1-terminus | $0.56 | $1.68 | - |
| glm-4p7 | $0.6 | $2.2 | - |
| kimi-k2p5 | $0.6 | $3 | $0.1 |
| kimi-k2-instruct | $0.6 | $2.5 | - |
| kimi-k2-instruct-0905 | $0.6 | $2.5 | - |
| kimi-k2-thinking | $0.6 | $2.5 | - |
| accounts/fireworks/models/llama-v3p3-70b-instruct | $0.9 | $0.9 | $0.45 |
| deepseek-v3-0324 | $0.9 | $0.9 | - |
| qwen2p5-vl-72b-instruct | $0.9 | $0.9 | - |
| fireworks-ai-above-16b | $0.9 | $0.9 | - |
| deepseek-v3 | $0.9 | $0.9 | - |
| firefunction-v2 | $0.9 | $0.9 | - |
| llama-v3p2-90b-vision-instruct | $0.9 | $0.9 | - |
| qwen2-72b-instruct | $0.9 | $0.9 | - |
| qwen2p5-coder-32b-instruct | $0.9 | $0.9 | - |
| code-llama-34b | $0.9 | $0.9 | - |
| code-llama-34b-instruct | $0.9 | $0.9 | - |
| code-llama-34b-python | $0.9 | $0.9 | - |
| code-llama-70b | $0.9 | $0.9 | - |
| code-llama-70b-instruct | $0.9 | $0.9 | - |
| code-llama-70b-python | $0.9 | $0.9 | - |
| cogito-v1-preview-llama-70b | $0.9 | $0.9 | - |
| cogito-v1-preview-qwen-32b | $0.9 | $0.9 | - |
| deepseek-coder-33b-instruct | $0.9 | $0.9 | - |
| deepseek-r1-distill-llama-70b | $0.9 | $0.9 | - |
| deepseek-r1-distill-qwen-32b | $0.9 | $0.9 | - |
| devstral-small-2505 | $0.9 | $0.9 | - |
| dobby-unhinged-llama-3-3-70b-new | $0.9 | $0.9 | - |
| dolphin-2-9-2-qwen2-72b | $0.9 | $0.9 | - |
| fare-20b | $0.9 | $0.9 | - |
| internvl3-38b | $0.9 | $0.9 | - |
| internvl3-78b | $0.9 | $0.9 | - |
| kat-coder | $0.9 | $0.9 | - |
| kat-dev-32b | $0.9 | $0.9 | - |
| kat-dev-72b-exp | $0.9 | $0.9 | - |
| llama-v2-70b-chat | $0.9 | $0.9 | - |
| llama-v3-70b-instruct | $0.9 | $0.9 | - |
| llama-v3-70b-instruct-hf | $0.9 | $0.9 | - |
| llama-v3p1-70b-instruct | $0.9 | $0.9 | - |
| llama-v3p1-nemotron-70b-instruct | $0.9 | $0.9 | - |
| llama-v3p3-70b-instruct | $0.9 | $0.9 | - |
| llava-yi-34b | $0.9 | $0.9 | - |
| mistral-small-24b-instruct-2501 | $0.9 | $0.9 | - |
| nous-hermes-2-yi-34b | $0.9 | $0.9 | - |
| nous-hermes-llama2-70b | $0.9 | $0.9 | - |
| phind-code-llama-34b-python-v1 | $0.9 | $0.9 | - |
| phind-code-llama-34b-v1 | $0.9 | $0.9 | - |
| phind-code-llama-34b-v2 | $0.9 | $0.9 | - |
| qwen-qwq-32b-preview | $0.9 | $0.9 | - |
| qwen1p5-72b-chat | $0.9 | $0.9 | - |
| qwen2-vl-72b-instruct | $0.9 | $0.9 | - |
| qwen2p5-32b | $0.9 | $0.9 | - |
| qwen2p5-32b-instruct | $0.9 | $0.9 | - |
| qwen2p5-72b | $0.9 | $0.9 | - |
| qwen2p5-72b-instruct | $0.9 | $0.9 | - |
| qwen2p5-coder-32b | $0.9 | $0.9 | - |
| qwen2p5-coder-32b-instruct-128k | $0.9 | $0.9 | - |
| qwen2p5-coder-32b-instruct-32k-rope | $0.9 | $0.9 | - |
| qwen2p5-coder-32b-instruct-64k | $0.9 | $0.9 | - |
| qwen2p5-math-72b-instruct | $0.9 | $0.9 | - |
| qwen2p5-vl-32b-instruct | $0.9 | $0.9 | - |
| qwen3-30b-a3b-thinking-2507 | $0.9 | $0.9 | - |
| qwen3-32b | $0.9 | $0.9 | - |
| qwen3-coder-480b-instruct-bf16 | $0.9 | $0.9 | - |
| qwen3-next-80b-a3b-instruct | $0.9 | $0.9 | - |
| qwen3-next-80b-a3b-thinking | $0.9 | $0.9 | - |
| qwen3-vl-32b-instruct | $0.9 | $0.9 | - |
| qwq-32b | $0.9 | $0.9 | - |
| yi-34b | $0.9 | $0.9 | - |
| yi-34b-200k-capybara | $0.9 | $0.9 | - |
| yi-34b-chat | $0.9 | $0.9 | - |
| fireworks-ai-56b-to-176b | $1.2 | $1.2 | - |
| deepseek-coder-v2-instruct | $1.2 | $1.2 | - |
| mixtral-8x22b-instruct-hf | $1.2 | $1.2 | - |
| cogito-671b-v2-p1 | $1.2 | $1.2 | - |
| dbrx-instruct | $1.2 | $1.2 | - |
| deepseek-prover-v2 | $1.2 | $1.2 | - |
| deepseek-v2p5 | $1.2 | $1.2 | - |
| glm-4p5v | $1.2 | $1.2 | - |
| gpt-oss-safeguard-120b | $1.2 | $1.2 | - |
| mistral-large-3-fp8 | $1.2 | $1.2 | - |
| mixtral-8x22b | $1.2 | $1.2 | - |
| mixtral-8x22b-instruct | $1.2 | $1.2 | - |
| deepseek-r1-0528 | $3 | $8 | - |
| deepseek-r1 | $3 | $8 | - |
| llama-v3p1-405b-instruct | $3 | $3 | - |
| yi-large | $3 | $3 | - |
gemini: 50 models
| Model | Input $/1M | Output $/1M | Cache $/1M |
|---|---|---|---|
| gemini-flash-1.5-8b | $0.0375 | $0.15 | $0.01 |
| gemini-1.5-flash | $0.075 | $0.3 | $0.01875 |
| gemini-2.0-flash-lite | $0.075 | $0.3 | - |
| gemini-flash-1.5 | $0.075 | $0.3 | $0.01875 |
| gemini-2.0-flash-lite-001 | $0.075 | $0.3 | $0.0187 |
| gemini-2.0-flash | $0.1 | $0.4 | $0.025 |
| gemini-2.5-flash-lite | $0.1 | $0.4 | $0.01 |
| gemini-2.0-flash-001 | $0.1 | $0.4 | $0.025 |
| gemini-2.5-flash-lite-preview-09-2025 | $0.1 | $0.4 | $0.01 |
| gemini-flash-lite-latest | $0.1 | $0.4 | $0.025 |
| gemini-2.5-flash-lite-preview-06-17 | $0.1 | $0.4 | $0.025 |
| gemini-1.0-pro-vision-001 | $0.125 | $0.375 | - |
| gemini-pro | $0.125 | $0.375 | - |
| gemini-2.5-flash-preview | $0.15 | $0.6 | - |
| claude-3-haiku | $0.25 | $1.25 | $0.03 |
| gemini-3.1-flash-lite-preview | $0.25 | $1.5 | $0.025 |
| gemini-2.5-flash | $0.3 | $2.5 | $0.03 |
| gemini-2.5-flash-image | $0.3 | $30 | - |
| gemini-live-2.5-flash-preview-native-audio-09-2025 | $0.3 | $2 | $0.075 |
| gemini-robotics-er-1.5-preview | $0.3 | $2.5 | - |
| gemini-2.5-flash-preview-09-2025 | $0.3 | $2.5 | $0.075 |
| gemini-flash-latest | $0.3 | $2.5 | $0.075 |
| gemini-2.5-flash-preview-tts | $0.3 | $2.5 | - |
| gemini-2.5-flash-native-audio-latest | $0.3 | $2.5 | - |
| gemini-2.5-flash-native-audio-preview-09-2025 | $0.3 | $2.5 | - |
| gemini-2.5-flash-native-audio-preview-12-2025 | $0.3 | $2.5 | - |
| gemini-exp-1206 | $0.3 | $2.5 | $0.03 |
| gemini-gemma-2-27b-it | $0.35 | $1.05 | - |
| gemini-gemma-2-9b-it | $0.35 | $1.05 | - |
| gemini-3-flash-preview | $0.5 | $3 | $0.05 |
| gemini-3.1-flash-image-preview | $0.5 | $60 | - |
| gemini-live-2.5-flash-preview | $0.5 | $2 | - |
| claude-3-5-haiku | $0.8 | $4 | $0.08 |
| gemini-2.5-pro | $1.25 | $10 | $0.125 |
| gemini-1.5-pro | $1.25 | $5 | - |
| gemini-pro-1.5 | $1.25 | $5 | $0.3125 |
| gemini-2.5-computer-use-preview-10-2025 | $1.25 | $10 | - |
| gemini-2.5-pro-preview-tts | $1.25 | $10 | $0.125 |
| gemini-pro-latest | $1.25 | $10 | $0.125 |
| gemini-3-pro-image-preview | $2 | $120 | - |
| gemini-3-pro-preview | $2 | $12 | $0.2 |
| gemini-3.1-pro-preview | $2 | $12 | $0.2 |
| deep-research-pro-preview-12-2025 | $2 | $12 | - |
| gemini-3.1-pro-preview-customtools | $2 | $12 | $0.2 |
| claude-3-5-sonnet | $3 | $15 | $0.3 |
| claude-3-7-sonnet | $3 | $15 | $0.3 |
| claude-4-sonnet | $3 | $15 | $0.3 |
| claude-opus-4-6 | $5 | $25 | $0.5 |
| claude-3-opus | $15 | $75 | $1.5 |
| claude-4-opus | $15 | $75 | $1.5 |
groq: 37 models
| Model | Input $/1M | Output $/1M | Cache $/1M |
|---|---|---|---|
| llama-3.2-1b-preview | $0.04 | $0.04 | - |
| llama-3.1-8b-instant | $0.05 | $0.08 | - |
| llama3-8b-8192 | $0.05 | $0.08 | - |
| llama-3.2-3b-preview | $0.06 | $0.06 | - |
| gemma-7b-it | $0.07 | $0.07 | - |
| openai/gpt-oss-20b | $0.075 | $0.3 | $0.0375 |
| gpt-oss-20b | $0.075 | $0.3 | $0.0375 |
| gpt-oss-safeguard-20b | $0.075 | $0.3 | $0.037 |
| meta-llama/llama-4-scout-17b-16e-instruct | $0.11 | $0.34 | - |
| llama-4-scout-17b-16e-instruct | $0.11 | $0.34 | - |
| openai/gpt-oss-120b | $0.15 | $0.6 | $0.075 |
| gpt-oss-120b | $0.15 | $0.6 | $0.075 |
| llama-3.2-11b-text-preview | $0.18 | $0.18 | - |
| llama-3.2-11b-vision-preview | $0.18 | $0.18 | - |
| llama3-groq-8b-8192-tool-use-preview | $0.19 | $0.19 | - |
| gemma2-9b-it | $0.2 | $0.2 | - |
| llama-guard-3-8b | $0.2 | $0.2 | - |
| meta-llama/llama-4-maverick-17b-128e-instruct | $0.2 | $0.6 | - |
| meta-llama/llama-guard-4-12b | $0.2 | $0.2 | - |
| llama-guard-4-12b | $0.2 | $0.2 | - |
| llama-4-maverick-17b-128e-instruct | $0.2 | $0.6 | - |
| mixtral-8x7b-32768 | $0.24 | $0.24 | - |
| qwen/qwen3-32b | $0.29 | $0.59 | - |
| qwen3-32b | $0.29 | $0.59 | - |
| llama-3.3-70b-versatile | $0.59 | $0.79 | - |
| llama-3.1-405b-reasoning | $0.59 | $0.79 | - |
| llama-3.1-70b-versatile | $0.59 | $0.79 | - |
| llama-3.3-70b-specdec | $0.59 | $0.99 | - |
| llama3-70b-8192 | $0.59 | $0.79 | - |
| llama2-70b-4096 | $0.7 | $0.8 | - |
| deepseek-r1-distill-llama-70b | $0.75 | $0.99 | - |
| mistral-saba-24b | $0.79 | $0.79 | - |
| llama3-groq-70b-8192-tool-use-preview | $0.89 | $0.89 | - |
| llama-3.2-90b-text-preview | $0.9 | $0.9 | - |
| llama-3.2-90b-vision-preview | $0.9 | $0.9 | - |
| moonshotai/kimi-k2-instruct | $1 | $3 | $0.5 |
| kimi-k2-instruct-0905 | $1 | $3 | $0.5 |
mistral: 63 models
| Model | Input $/1M | Output $/1M | Cache $/1M |
|---|---|---|---|
| ministral-3b | $0.04 | $0.04 | - |
| mistral-small-24b-instruct-2501 | $0.05 | $0.08 | - |
| devstral-small | $0.06 | $0.12 | - |
| mistral-small-3-2-2506 | $0.06 | $0.18 | - |
| mistral-small-latest | $0.1 | $0.3 | - |
| ministral-8b | $0.1 | $1 | - |
| mistral-embed | $0.1 | $0.1 | - |
| devstral-small-2505 | $0.1 | $0.3 | - |
| devstral-small-2507 | $0.1 | $0.3 | - |
| devstral-small-latest | $0.1 | $0.3 | - |
| labs-devstral-small-2512 | $0.1 | $0.3 | - |
| mistral-small | $0.1 | $0.3 | - |
| ministral-3-3b-2512 | $0.1 | $0.1 | - |
| mistral-nemo | $0.15 | $0.15 | - |
| pixtral-12b | $0.15 | $0.15 | - |
| ministral-3-8b-2512 | $0.15 | $0.15 | - |
| pixtral-12b-2409 | $0.15 | $0.15 | - |
| mistral-saba | $0.2 | $0.6 | - |
| ministral-3-14b-2512 | $0.2 | $0.2 | - |
| mistral-7b | $0.25 | $0.25 | - |
| mistral-tiny | $0.25 | $0.25 | - |
| codestral-mamba-latest | $0.25 | $0.25 | - |
| open-codestral-mamba | $0.25 | $0.25 | - |
| open-mistral-7b | $0.25 | $0.25 | - |
| codestral-latest | $0.3 | $0.9 | - |
| codestral | $0.3 | $0.9 | - |
| codestral-2508 | $0.3 | $0.9 | - |
| open-mistral-nemo | $0.3 | $0.3 | - |
| open-mistral-nemo-2407 | $0.3 | $0.3 | - |
| mistral-medium-3 | $0.4 | $2 | - |
| devstral-medium-2507 | $0.4 | $2 | - |
| devstral-latest | $0.4 | $2 | - |
| devstral-medium-latest | $0.4 | $2 | - |
| devstral-2512 | $0.4 | $2 | - |
| mistral-medium-2505 | $0.4 | $2 | - |
| mistral-medium-latest | $0.4 | $2 | - |
| mistral-medium-3-1-2508 | $0.4 | $2 | - |
| magistral-small | $0.5 | $1.5 | - |
| magistral-small-2506 | $0.5 | $1.5 | - |
| magistral-small-latest | $0.5 | $1.5 | - |
| magistral-small-1-2-2509 | $0.5 | $1.5 | - |
| mistral-large-3 | $0.5 | $1.5 | - |
| mistral-large-2512 | $0.5 | $1.5 | - |
| mixtral-8x7b | $0.7 | $0.7 | - |
| open-mixtral-8x7b | $0.7 | $0.7 | - |
| mixtral-8x22b-instruct | $0.9 | $0.9 | - |
| codestral-2405 | $1 | $3 | - |
| mistral-large-latest | $2 | $6 | - |
| magistral-medium | $2 | $5 | - |
| mistral-large | $2 | $6 | - |
| pixtral-large | $2 | $6 | - |
| magistral-medium-2506 | $2 | $5 | - |
| magistral-medium-2509 | $2 | $5 | - |
| magistral-medium-1-2-2509 | $2 | $5 | - |
| magistral-medium-latest | $2 | $5 | - |
| mistral-large-2411 | $2 | $6 | - |
| open-mixtral-8x22b | $2 | $6 | - |
| pixtral-large-2411 | $2 | $6 | - |
| pixtral-large-latest | $2 | $6 | - |
| mistral-medium | $2.7 | $8.1 | - |
| mistral-medium-2312 | $2.7 | $8.1 | - |
| mistral-large-2407 | $3 | $9 | - |
| mistral-large-2402 | $4 | $12 | - |
openai: 145 models
| Model | Input $/1M | Output $/1M | Cache $/1M |
|---|---|---|---|
| gpt-5-nano | $0.05 | $0.4 | $0.005 |
| gpt-5-nano-2025-08-07 | $0.05 | $0.4 | $0.005 |
| gpt-4.1-nano | $0.1 | $0.4 | $0.025 |
| gpt-4.1-nano-2025-04-14 | $0.1 | $0.4 | $0.025 |
| gpt-4o-mini | $0.15 | $0.6 | $0.075 |
| gpt-4o-mini-2024-07-18 | $0.15 | $0.6 | $0.075 |
| gpt-4o-mini-audio-preview | $0.15 | $0.6 | - |
| gpt-4o-mini-audio-preview-2024-12-17 | $0.15 | $0.6 | - |
| gpt-4o-mini-search-preview | $0.15 | $0.6 | $0.075 |
| gpt-4o-mini-search-preview-2025-03-11 | $0.15 | $0.6 | $0.075 |
| gpt-5.4-nano | $0.2 | $1.25 | $0.02 |
| ft:gpt-4.1-nano-2025-04-14 | $0.2 | $0.8 | $0.05 |
| gpt-5-mini | $0.25 | $2 | $0.025 |
| gpt-5.1-codex-mini | $0.25 | $2 | $0.025 |
| gpt-5-mini-2025-08-07 | $0.25 | $2 | $0.025 |
| ft:gpt-4o-mini | $0.3 | $1.2 | - |
| gpt-4o-mini-2024-07-18.ft- | $0.3 | $1.2 | - |
| ft:gpt-4o-mini-2024-07-18 | $0.3 | $1.2 | $0.15 |
| gpt-4.1-mini | $0.4 | $1.6 | $0.1 |
| ada | $0.4 | $0.4 | - |
| gpt-4.1-mini-2025-04-14 | $0.4 | $1.6 | $0.1 |
| babbage | $0.5 | $0.5 | - |
| gpt-3.5-turbo | $0.5 | $1.5 | - |
| gpt-3.5-turbo-0125 | $0.5 | $1.5 | - |
| gpt-4o-mini-realtime-preview | $0.6 | $2.4 | $0.3 |
| gpt-realtime-mini | $0.6 | $2.4 | $0.06 |
| gpt-audio-mini | $0.6 | $2.4 | - |
| gpt-audio-mini-2025-10-06 | $0.6 | $2.4 | - |
| gpt-audio-mini-2025-12-15 | $0.6 | $2.4 | - |
| gpt-4o-mini-realtime-preview-2024-12-17 | $0.6 | $2.4 | $0.3 |
| gpt-realtime-mini-2025-10-06 | $0.6 | $2.4 | $0.06 |
| gpt-realtime-mini-2025-12-15 | $0.6 | $2.4 | $0.06 |
| gpt-5.4-mini | $0.75 | $4.5 | $0.075 |
| ft:gpt-4.1-mini-2025-04-14 | $0.8 | $3.2 | $0.2 |
| gpt-3.5-turbo-1106 | $1 | $2 | - |
| o4-mini | $1.1 | $4.4 | $0.275 |
| o3-mini | $1.1 | $4.4 | $0.55 |
| o3-mini-2025-01-31 | $1.1 | $4.4 | $0.55 |
| o4-mini-2025-04-16 | $1.1 | $4.4 | $0.275 |
| gpt-4o-mini-transcribe | $1.25 | $5 | - |
| gpt-5 | $1.25 | $10 | $0.125 |
| gpt-5.1 | $1.25 | $10 | $0.125 |
| gpt-5.1-2025-11-13 | $1.25 | $10 | $0.125 |
| gpt-5.1-chat-latest | $1.25 | $10 | $0.125 |
| gpt-5-2025-08-07 | $1.25 | $10 | $0.125 |
| gpt-5-chat | $1.25 | $10 | $0.125 |
| gpt-5-chat-latest | $1.25 | $10 | $0.125 |
| gpt-5-codex | $1.25 | $10 | $0.125 |
| gpt-5.1-codex | $1.25 | $10 | $0.125 |
| gpt-5.1-codex-max | $1.25 | $10 | $0.125 |
| gpt-4o-mini-transcribe-2025-03-20 | $1.25 | $5 | - |
| gpt-4o-mini-transcribe-2025-12-15 | $1.25 | $5 | - |
| gpt-5-search-api | $1.25 | $10 | $0.125 |
| gpt-5-search-api-2025-10-14 | $1.25 | $10 | $0.125 |
| codex-mini | $1.5 | $6 | $0.375 |
| gpt-3.5-0301 | $1.5 | $2 | - |
| gpt-3.5-turbo-0613 | $1.5 | $2 | - |
| gpt-3.5-turbo-instruct | $1.5 | $2 | - |
| codex-mini-latest | $1.5 | $6 | $0.375 |
| gpt-5.2 | $1.75 | $14 | $0.175 |
| gpt-5.3-codex | $1.75 | $14 | $0.175 |
| gpt-5.2-2025-12-11 | $1.75 | $14 | $0.175 |
| gpt-5.2-chat-latest | $1.75 | $14 | $0.175 |
| gpt-5.3-chat-latest | $1.75 | $14 | $0.175 |
| gpt-5.2-codex | $1.75 | $14 | $0.175 |
| gpt-4.1 | $2 | $8 | $0.5 |
| o3 | $2 | $8 | $0.5 |
| curie | $2 | $2 | - |
| o4-mini-deep-research | $2 | $8 | $0.5 |
| gpt-4.1-2025-04-14 | $2 | $8 | $0.5 |
| o3-2025-04-16 | $2 | $8 | $0.5 |
| o4-mini-deep-research-2025-06-26 | $2 | $8 | $0.5 |
| gpt-4o | $2.5 | $10 | $1.25 |
| gpt-4o-search-preview | $2.5 | $10 | - |
| gpt-4o-transcribe | $2.5 | $10 | - |
| gpt-5-image-mini | $2.5 | $2 | $0.25 |
| gpt-5.4 | $2.5 | $15 | $0.25 |
| gpt-4o-transcribe-diarize | $2.5 | $10 | - |
| gpt-4o-2024-08-06 | $2.5 | $10 | $1.25 |
| gpt-4o-2024-11-20 | $2.5 | $10 | $1.25 |
| gpt-4o-audio-preview | $2.5 | $10 | - |
| gpt-4o-audio-preview-2024-12-17 | $2.5 | $10 | - |
| gpt-4o-audio-preview-2025-06-03 | $2.5 | $10 | - |
| gpt-audio | $2.5 | $10 | - |
| gpt-audio-1.5 | $2.5 | $10 | - |
| gpt-audio-2025-08-28 | $2.5 | $10 | - |
| gpt-4o-mini-tts | $2.5 | $10 | - |
| gpt-4o-search-preview-2025-03-11 | $2.5 | $10 | $1.25 |
| gpt-5.4-2026-03-05 | $2.5 | $15 | $0.25 |
| gpt-4o-mini-tts-2025-03-20 | $2.5 | $10 | - |
| gpt-4o-mini-tts-2025-12-15 | $2.5 | $10 | - |
| computer-use | $3 | $12 | - |
| ft:gpt-3.5-turbo- | $3 | $6 | - |
| gpt-3.5-turbo-16k | $3 | $4 | - |
| o1-mini | $3 | $12 | $1.5 |
| ft:gpt-3.5-turbo | $3 | $6 | - |
| ft:gpt-3.5-turbo-0125 | $3 | $6 | - |
| ft:gpt-3.5-turbo-0613 | $3 | $6 | - |
| ft:gpt-3.5-turbo-1106 | $3 | $6 | - |
| ft:gpt-4.1-2025-04-14 | $3 | $12 | $0.75 |
| ft:gpt-4o | $3.75 | $15 | - |
| ft:gpt-4o-2024-08-06 | $3.75 | $15 | $1.875 |
| ft:gpt-4o-2024-11-20 | $3.75 | $15 | - |
| gpt-realtime | $4 | $16 | $0.4 |
| ft:o4-mini-2025-04-16 | $4 | $16 | $1 |
| gpt-realtime-1.5 | $4 | $16 | $0.4 |
| gpt-realtime-2025-08-28 | $4 | $16 | $0.4 |
| chatgpt-4o-latest | $5 | $15 | - |
| gpt-4o-realtime-preview | $5 | $20 | $2.5 |
| gpt-4o-2024-05-13 | $5 | $15 | - |
| gpt-4o-realtime-preview-2024-12-17 | $5 | $20 | $2.5 |
| gpt-4o-realtime-preview-2025-06-03 | $5 | $20 | $2.5 |
| gpt-image-1.5 | $5 | $10 | $1.25 |
| gpt-image-1.5-2025-12-16 | $5 | $10 | $1.25 |
| gpt-4o:extended | $6 | $18 | - |
| gpt-4-turbo | $10 | $30 | - |
| gpt-4-vision-preview | $10 | $30 | - |
| gpt-5-image | $10 | $10 | $1.25 |
| o3-deep-research | $10 | $40 | $2.5 |
| gpt-4-0125-preview | $10 | $30 | - |
| gpt-4-1106-preview | $10 | $30 | - |
| gpt-4-turbo-2024-04-09 | $10 | $30 | - |
| gpt-4-turbo-preview | $10 | $30 | - |
| o3-deep-research-2025-06-26 | $10 | $40 | $2.5 |
| gpt-5-pro | $15 | $120 | - |
| o1 | $15 | $60 | $7.5 |
| gpt-5-pro-2025-10-06 | $15 | $120 | - |
| o1-2024-12-17 | $15 | $60 | $7.5 |
| davinci | $20 | $20 | - |
| o3-pro | $20 | $80 | - |
| text-davinci-002 | $20 | $20 | - |
| text-davinci-003 | $20 | $20 | - |
| o3-pro-2025-06-10 | $20 | $80 | - |
| gpt-5.2-pro | $21 | $168 | - |
| gpt-5.2-pro-2025-12-11 | $21 | $168 | - |
| gpt-4 | $30 | $60 | - |
| gpt-5.4-pro | $30 | $180 | - |
| ft:gpt-4-0613 | $30 | $60 | - |
| gpt-4-0314 | $30 | $60 | - |
| gpt-4-0613 | $30 | $60 | - |
| gpt-5.4-pro-2026-03-05 | $30 | $180 | $3 |
| gpt-4-32k | $60 | $120 | - |
| gpt-4.5-preview | $75 | $150 | $37.5 |
| o1-pro | $150 | $600 | - |
| o1-pro-2025-03-19 | $150 | $600 | - |
together: 105 models
| Model | Input $/1M | Output $/1M | Cache $/1M |
|---|---|---|---|
| gpt-oss-20b | $0.05 | $0.2 | - |
| Qwen/Qwen1.5-0.5B | $0.1 | $0.1 | - |
| Qwen/Qwen1.5-1.8B | $0.1 | $0.1 | - |
| Qwen/Qwen1.5-4B | $0.1 | $0.1 | - |
| google/gemma-2b | $0.1 | $0.1 | - |
| meta-llama/Meta-Llama-3-8B-Instruct-Lite | $0.1 | $0.1 | - |
| microsoft/phi-2 | $0.1 | $0.1 | - |
| togethercomputer/RedPajama-INCITE-Base-3B-v1 | $0.1 | $0.1 | - |
| togethercomputer/RedPajama-INCITE-Chat-3B-v1 | $0.1 | $0.1 | - |
| togethercomputer/RedPajama-INCITE-Instruct-3B-v1 | $0.1 | $0.1 | - |
| together-ai-up-to-4b | $0.1 | $0.1 | - |
| gpt-oss-120b | $0.15 | $0.6 | - |
| Qwen3-Next-80B-A3B-Instruct | $0.15 | $1.5 | - |
| Qwen3-Next-80B-A3B-Thinking | $0.15 | $1.5 | - |
| meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo | $0.18 | $0.18 | - |
| meta-llama/Llama-4-Scout-17B-16E-Instruct | $0.18 | $0.59 | - |
| meta-llama/Meta-Llama-3-8B-Instruct-Turbo | $0.18 | $0.18 | - |
| Llama-4-Scout-17B-16E-Instruct | $0.18 | $0.59 | - |
| Meta-Llama-3.1-8B-Instruct-Turbo | $0.18 | $0.18 | - |
| NousResearch/Nous-Capybara-7B-V1p9 | $0.2 | $0.2 | - |
| NousResearch/Nous-Hermes-llama-2-7b | $0.2 | $0.2 | - |
| Open-Orca/Mistral-7B-OpenOrca | $0.2 | $0.2 | - |
| Qwen/Qwen1.5-7B | $0.2 | $0.2 | - |
| Undi95/Toppy-M-7B | $0.2 | $0.2 | - |
| allenai/OLMo-7B | $0.2 | $0.2 | - |
| codellama/CodeLlama-7b-Instruct-hf | $0.2 | $0.2 | - |
| google/gemma-7b | $0.2 | $0.2 | - |
| lmsys/vicuna-7b-v1.5 | $0.2 | $0.2 | - |
| meta-llama/Llama-2-7b-chat-hf | $0.2 | $0.2 | - |
| meta-llama/Llama-3-8b-chat-hf | $0.2 | $0.2 | - |
| mistralai/Mistral-7B-Instruct-v0.1 | $0.2 | $0.2 | - |
| mistralai/Mistral-7B-Instruct-v0.2 | $0.2 | $0.2 | - |
| mistralai/Mistral-7B-v0.1 | $0.2 | $0.2 | - |
| openchat/openchat-3.5-1210 | $0.2 | $0.2 | - |
| snorkelai/Snorkel-Mistral-PairRM-DPO | $0.2 | $0.2 | - |
| teknium/OpenHermes-2-Mistral-7B | $0.2 | $0.2 | - |
| teknium/OpenHermes-2p5-Mistral-7B | $0.2 | $0.2 | - |
| togethercomputer/GPT-JT-Moderation-6B | $0.2 | $0.2 | - |
| togethercomputer/Llama-2-7B-32K-Instruct | $0.2 | $0.2 | - |
| togethercomputer/RedPajama-INCITE-7B-Base | $0.2 | $0.2 | - |
| togethercomputer/RedPajama-INCITE-7B-Chat | $0.2 | $0.2 | - |
| togethercomputer/RedPajama-INCITE-7B-Instruct | $0.2 | $0.2 | - |
| togethercomputer/StripedHyena-Hessian-7B | $0.2 | $0.2 | - |
| togethercomputer/StripedHyena-Nous-7B | $0.2 | $0.2 | - |
| togethercomputer/alpaca-7b | $0.2 | $0.2 | - |
| zero-one-ai/Yi-6B | $0.2 | $0.2 | - |
| together-ai-4.1b-8b | $0.2 | $0.2 | - |
| Qwen3-235B-A22B-Instruct-2507-tput | $0.2 | $6 | - |
| Qwen3-235B-A22B-fp8-tput | $0.2 | $0.6 | - |
| GLM-4.5-Air-FP8 | $0.2 | $1.1 | - |
| NousResearch/Nous-Hermes-Llama2-13b | $0.225 | $0.225 | - |
| codellama/CodeLlama-13b-Instruct-hf | $0.225 | $0.225 | - |
| meta-llama/Llama-2-13b-chat-hf | $0.225 | $0.225 | - |
| meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8 | $0.27 | $0.85 | - |
| Llama-4-Maverick-17B-128E-Instruct-FP8 | $0.27 | $0.85 | - |
| Austism/chronos-hermes-13b | $0.3 | $0.3 | - |
| Gryphe/MythoMax-L2-13b | $0.3 | $0.3 | - |
| Nexusflow/NexusRaven-V2-13B | $0.3 | $0.3 | - |
| Qwen/Qwen1.5-14B | $0.3 | $0.3 | - |
| Undi95/ReMM-SLERP-L2-13B | $0.3 | $0.3 | - |
| WizardLM/WizardLM-13B-V1.2 | $0.3 | $0.3 | - |
| lmsys/vicuna-13b-v1.5 | $0.3 | $0.3 | - |
| upstage/SOLAR-10.7B-Instruct-v1.0 | $0.3 | $0.3 | - |
| together-ai-8.1b-21b | $0.3 | $0.3 | - |
| GLM-4.7 | $0.45 | $2 | - |
| Kimi-K2.5 | $0.5 | $2.8 | - |
| meta-llama/Meta-Llama-3-70B-Instruct-Lite | $0.54 | $0.54 | - |
| DeepSeek-R1-0528-tput | $0.55 | $2.19 | - |
| DeepSeek-V3.1 | $0.6 | $1.7 | - |
| Mixtral-8x7B-Instruct-v0.1 | $0.6 | $0.6 | - |
| GLM-4.6 | $0.6 | $2.2 | - |
| Qwen3.5-397B-A17B | $0.6 | $3.6 | - |
| Qwen3-235B-A22B-Thinking-2507 | $0.65 | $3 | - |
| codellama/CodeLlama-34b-Instruct-hf | $0.776 | $0.776 | - |
| NousResearch/Nous-Hermes-2-Yi-34B | $0.8 | $0.8 | - |
| deepseek-ai/deepseek-coder-33b-instruct | $0.8 | $0.8 | - |
| zero-one-ai/Yi-34B | $0.8 | $0.8 | - |
| together-ai-21.1b-41b | $0.8 | $0.8 | - |
| meta-llama/Meta-Llama-3.3-70B-Instruct-Turbo | $0.88 | $0.88 | - |
| meta-llama/Llama-3.3-70B-Instruct-Turbo | $0.88 | $0.88 | - |
| meta-llama/Meta-Llama-3-70B-Instruct-Turbo | $0.88 | $0.88 | - |
| meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo | $0.88 | $0.88 | - |
| Llama-3.3-70B-Instruct-Turbo | $0.88 | $0.88 | - |
| Meta-Llama-3.1-70B-Instruct-Turbo | $0.88 | $0.88 | - |
| mistralai/Mixtral-8x7B-Instruct-v0.1 | $0.9 | $0.9 | - |
| NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO | $0.9 | $0.9 | - |
| NousResearch/Nous-Hermes-2-Mixtral-8x7B-SFT | $0.9 | $0.9 | - |
| Qwen/Qwen1.5-72B | $0.9 | $0.9 | - |
| codellama/CodeLlama-70b-Instruct-hf | $0.9 | $0.9 | - |
| garage-bAInd/Platypus2-70B-instruct | $0.9 | $0.9 | - |
| meta-llama/Llama-2-70b-chat-hf | $0.9 | $0.9 | - |
| meta-llama/Llama-3-70b-chat-hf | $0.9 | $0.9 | - |
| mistralai/Mixtral-8x7B-v0.1 | $0.9 | $0.9 | - |
| together-ai-41.1b-80b | $0.9 | $0.9 | - |
| Kimi-K2-Instruct | $1 | $3 | - |
| Kimi-K2-Instruct-0905 | $1 | $3 | - |
| Qwen/Qwen2.5-72B-Instruct-Turbo | $1.2 | $1.2 | - |
| microsoft/WizardLM-2-8x22B | $1.2 | $1.2 | - |
| DeepSeek-V3 | $1.25 | $1.25 | - |
| together-ai-81.1b-110b | $1.8 | $1.8 | - |
| Qwen3-Coder-480B-A35B-Instruct-FP8 | $2 | $2 | - |
| mistralai/Mixtral-8x22B-Instruct-v0.1 | $2.4 | $2.4 | - |
| DeepSeek-R1 | $3 | $7 | - |
| meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo | $3.5 | $3.5 | - |
| Meta-Llama-3.1-405B-Instruct-Turbo | $3.5 | $3.5 | - |
xai: 39 models
| Model | Input $/1M | Output $/1M | Cache $/1M |
|---|---|---|---|
| grok-4-1-fast-reasoning | $0.2 | $0.5 | $0.05 |
| grok-4-1-fast-non-reasoning | $0.2 | $0.5 | $0.05 |
| grok-4-fast-non-reasoning | $0.2 | $0.5 | $0.05 |
| grok-4-fast-reasoning | $0.2 | $0.5 | $0.05 |
| grok-code-fast-1 | $0.2 | $1.5 | $0.02 |
| grok-4-1-fast | $0.2 | $0.5 | $0.05 |
| grok-4-1-fast-reasoning-latest | $0.2 | $0.5 | $0.05 |
| grok-4-1-fast-non-reasoning-latest | $0.2 | $0.5 | $0.05 |
| grok-code-fast | $0.2 | $1.5 | $0.02 |
| grok-code-fast-1-0825 | $0.2 | $1.5 | $0.02 |
| grok-3-mini | $0.3 | $0.5 | $0.075 |
| grok-3-mini-beta | $0.3 | $0.5 | $0.075 |
| grok-3-mini-latest | $0.3 | $0.5 | $0.075 |
| grok-3-mini-fast | $0.6 | $4 | $0.15 |
| grok-3-mini-fast-beta | $0.6 | $4 | $0.15 |
| grok-3-mini-fast-latest | $0.6 | $4 | $0.15 |
| grok-4.20-0309-reasoning | $2 | $6 | $0.2 |
| grok-4.20-0309-non-reasoning | $2 | $6 | $0.2 |
| grok-4.20-multi-agent-0309 | $2 | $6 | $0.2 |
| grok-4 | $2 | $6 | $0.2 |
| grok-2 | $2 | $10 | - |
| grok-2-1212 | $2 | $10 | - |
| grok-2-vision-1212 | $2 | $10 | - |
| grok-2-latest | $2 | $10 | - |
| grok-2-vision | $2 | $10 | - |
| grok-2-vision-latest | $2 | $10 | - |
| grok-4-latest | $2 | $6 | $0.2 |
| grok-4.20-multi-agent-beta-0309 | $2 | $6 | $0.2 |
| grok-4.20-beta-0309-reasoning | $2 | $6 | $0.2 |
| grok-4.20-beta-0309-non-reasoning | $2 | $6 | $0.2 |
| grok-3 | $3 | $15 | $0.75 |
| grok-4-0709 | $3 | $15 | $0.75 |
| grok-3-beta | $3 | $15 | $0.75 |
| grok-3-latest | $3 | $15 | $0.75 |
| grok-3-fast | $5 | $25 | $1.25 |
| grok-3-fast-beta | $5 | $25 | $1.25 |
| grok-3-fast-latest | $5 | $25 | $1.25 |
| grok-beta | $5 | $15 | - |
| grok-vision-beta | $5 | $15 | - |
LLMKit sits between your app and AI providers. Every request gets logged with token counts and dollar costs. Budget limits reject requests before they reach the provider.
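The idea of rejecting a request before it reaches the provider can be sketched as a pre-flight check against a running spend total. This is only an illustration of the concept: `BudgetGuard`, `BudgetExceeded`, and their methods are hypothetical names, not LLMKit's actual API, and the sketch assumes the caller supplies a cost estimate up front.

```python
from dataclasses import dataclass
from decimal import Decimal


class BudgetExceeded(Exception):
    """Raised when a request would push spend past the limit (hypothetical)."""


@dataclass
class BudgetGuard:
    """Tracks spend in USD and blocks requests that would exceed the limit."""

    limit_usd: Decimal
    spent_usd: Decimal = Decimal("0")

    def check(self, estimated_cost_usd: Decimal) -> None:
        # Called before forwarding: fail fast so no provider call is made.
        if self.spent_usd + estimated_cost_usd > self.limit_usd:
            raise BudgetExceeded(
                f"spent {self.spent_usd} + estimate {estimated_cost_usd} "
                f"exceeds limit {self.limit_usd}"
            )

    def record(self, actual_cost_usd: Decimal) -> None:
        # Called after a response: log the real cost from token counts.
        self.spent_usd += actual_cost_usd


guard = BudgetGuard(limit_usd=Decimal("1.00"))
guard.record(Decimal("0.99"))   # earlier requests already cost $0.99
guard.check(Decimal("0.01"))    # fits exactly within the limit, so it passes

try:
    guard.check(Decimal("0.02"))  # would exceed $1.00, so it is rejected
except BudgetExceeded as err:
    print(f"rejected: {err}")
```

Checking an estimate before the call and recording the actual cost afterwards keeps the two concerns separate. A conservative estimate (for example, input tokens plus the maximum allowed output tokens) avoids overshooting the limit.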
MIT licensed. Built with Claude Code. Source on GitHub.