LLM API Pricing Comparison

731 models across 9 providers. Prices are in USD per 1M tokens. Updated weekly.

Providers: anthropic, deepseek, fireworks, gemini, groq, mistral, openai, together, xai. Data sourced from official pricing pages. Use the free API for programmatic access.
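
Because every price is quoted per 1M tokens, the cost of a single request is each token count times its price, divided by 1,000,000. The sketch below shows that arithmetic. The function name `request_cost` and its parameters are illustrative and are not part of any API mentioned on this page. It also assumes that cached tokens are a subset of input tokens and are billed at the Cache price instead of the Input price; providers differ on this, so check the official pricing page before relying on it.

```python
from decimal import Decimal

PER_MILLION = Decimal(1_000_000)


def request_cost(input_tokens, output_tokens, input_price, output_price,
                 cached_tokens=0, cache_price=None):
    """Return the USD cost of one request as a Decimal.

    Prices are USD per 1M tokens, as listed in the tables.
    Hypothetical helper: names and billing assumptions are illustrative.
    """
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    in_price = Decimal(str(input_price))
    out_price = Decimal(str(output_price))
    # Assumption: with no cache price listed ("-"), cached tokens
    # are billed at the normal input rate.
    c_price = in_price if cache_price is None else Decimal(str(cache_price))
    uncached = input_tokens - cached_tokens
    total = (uncached * in_price
             + cached_tokens * c_price
             + output_tokens * out_price)
    return total / PER_MILLION


# claude-haiku-4-5 is listed at $1 input, $5 output, $0.1 cache.
plain = request_cost(10_000, 2_000, "1", "5")
with_cache = request_cost(10_000, 2_000, "1", "5",
                          cached_tokens=4_000, cache_price="0.1")
print(plain, with_cache)
```

For the listed claude-haiku-4-5 prices, 10,000 input tokens and 2,000 output tokens come to $0.02. If 4,000 of those input tokens are served from cache, the total drops to $0.0164 under the assumption above. `Decimal` is used rather than `float` so that small per-request amounts add up without rounding drift.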

anthropic

29 models

Model	Input $/1M	Output $/1M	Cache $/1M
claude-3-haiku-20240307	$0.25	$1.25	$0.03
claude-3-haiku	$0.25	$1.25	$0.03
claude-3-5-haiku-20241022	$0.8	$4	$0.08
claude-3-5-haiku-latest	$0.8	$4	$0.08
claude-haiku-4-5	$1	$5	$0.1
claude-haiku-4-5-20251001	$1	$5	$0.1
claude-sonnet-4-6	$3	$15	$0.3
claude-sonnet-4-5	$3	$15	$0.3
claude-sonnet-4-20250514	$3	$15	$0.3
claude-3-5-sonnet	$3	$15	$0.3
claude-3-7-sonnet-latest	$3	$15	$0.3
claude-3-sonnet	$3	$15	$0.3
claude-sonnet-4-0	$3	$15	$0.3
claude-3-7-sonnet-20250219	$3	$15	$0.3
claude-4-sonnet-20250514	$3	$15	$0.3
claude-sonnet-4-5-20250929	$3	$15	$0.3
claude-opus-4-6	$5	$25	$0.5
claude-opus-4-5	$5	$25	$0.5
claude-opus-4-5-20251101	$5	$25	$0.5
claude-opus-4-6-20260205	$5	$25	$0.5
claude-2	$8	$24	-
claude-v1	$8	$24	-
claude-opus-4-20250514	$15	$75	$1.5
claude-3-opus-latest	$15	$75	$1.5
claude-opus-4-0	$15	$75	$1.5
claude-opus-4-1	$15	$75	$1.5
claude-3-opus-20240229	$15	$75	$1.5
claude-4-opus-20250514	$15	$75	$1.5
claude-opus-4-1-20250805	$15	$75	$1.5

deepseek

6 models

Model	Input $/1M	Output $/1M	Cache $/1M
deepseek-coder	$0.14	$0.28	-
deepseek-v3	$0.27	$1.1	$0.07
deepseek-chat	$0.28	$0.42	$0.028
deepseek-reasoner	$0.28	$0.42	$0.028
deepseek-v3.2	$0.28	$0.42	$0.028
deepseek-r1	$0.55	$2.19	-
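
The tables in this listing share one layout: a header row followed by one row per model with Input, Output, and Cache prices. Here is a minimal sketch of loading such a table into records, assuming the rows are copied as tab-separated text and that a `-` in a price cell means no price is listed. The helper names `parse_price` and `parse_table` are hypothetical, and the sample uses three rows from the deepseek table.

```python
from decimal import Decimal

# Three rows copied from the deepseek table; "\t" marks the tab separators.
SAMPLE = (
    "Model\tInput $/1M\tOutput $/1M\tCache $/1M\n"
    "deepseek-coder\t$0.14\t$0.28\t-\n"
    "deepseek-v3\t$0.27\t$1.1\t$0.07\n"
    "deepseek-r1\t$0.55\t$2.19\t-\n"
)


def parse_price(cell):
    """Convert a cell like "$0.27" to Decimal; "-" becomes None."""
    cell = cell.strip()
    if cell == "-":
        return None  # Assumption: a dash means no price is listed.
    return Decimal(cell.lstrip("$"))


def parse_table(text):
    """Parse a tab-separated pricing table into a list of dicts."""
    rows = []
    for line in text.splitlines():
        # Skip blank lines and the header row.
        if not line.strip() or line.startswith("Model\t"):
            continue
        model, inp, out, cache = line.split("\t")
        rows.append({
            "model": model,
            "input": parse_price(inp),
            "output": parse_price(out),
            "cache": parse_price(cache),
        })
    return rows


rows = parse_table(SAMPLE)
cheapest_output = min(rows, key=lambda r: r["output"])
print(cheapest_output["model"])  # deepseek-coder
```

Keeping a missing cache price as `None` rather than zero avoids treating models without a listed cache price as if caching were free.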

fireworks

257 models

Model	Input $/1M	Output $/1M	Cache $/1M
SSD-1B	$0.0001	$0.0001	-
japanese-stable-diffusion-xl	$0.0001	$0.0001	-
playground-v2-1024px-aesthetic	$0.0001	$0.0001	-
playground-v2-5-1024px-aesthetic	$0.0001	$0.0001	-
stable-diffusion-xl-1024-v1-0	$0.0001	$0.0001	-
flux-1-schnell-fp8	$0.0003	$0.0003	-
flux-1-dev-fp8	$0.0005	$0.0005	-
flux-1-dev-controlnet-union	$0.001	$0.001	-
flux-kontext-pro	$0.04	$0.04	-
gpt-oss-20b	$0.07	$0.3	$0.04
flux-kontext-max	$0.08	$0.08	-
gemma-3-27b-it	$0.1	$0.1	-
llama-v3p2-1b-instruct	$0.1	$0.1	-
llama-v3p2-3b-instruct	$0.1	$0.1	-
codegemma-2b	$0.1	$0.1	-
cogito-v1-preview-llama-3b	$0.1	$0.1	-
deepseek-coder-1b-base	$0.1	$0.1	-
deepseek-r1-distill-qwen-1p5b	$0.1	$0.1	-
ernie-4p5-21b-a3b-pt	$0.1	$0.1	-
ernie-4p5-300b-a47b-pt	$0.1	$0.1	-
flux-1-dev	$0.1	$0.1	-
flux-1-schnell	$0.1	$0.1	-
gemma-2b-it	$0.1	$0.1	-
llama-guard-3-1b	$0.1	$0.1	-
llama-v2-70b	$0.1	$0.1	-
llama-v3p1-405b-instruct-long	$0.1	$0.1	-
llama-v3p1-70b-instruct-1b	$0.1	$0.1	-
llama-v3p2-1b	$0.1	$0.1	-
llama-v3p2-3b	$0.1	$0.1	-
minimax-m1-80k	$0.1	$0.1	-
ministral-3-3b-instruct-2512	$0.1	$0.1	-
nemotron-nano-v2-12b-vl	$0.1	$0.1	-
phi-2-3b	$0.1	$0.1	-
phi-3-mini-128k-instruct	$0.1	$0.1	-
qwen2-vl-2b-instruct	$0.1	$0.1	-
qwen2p5-0p5b-instruct	$0.1	$0.1	-
qwen2p5-1p5b-instruct	$0.1	$0.1	-
qwen2p5-coder-0p5b	$0.1	$0.1	-
qwen2p5-coder-0p5b-instruct	$0.1	$0.1	-
qwen2p5-coder-1p5b	$0.1	$0.1	-
qwen2p5-coder-1p5b-instruct	$0.1	$0.1	-
qwen2p5-coder-3b	$0.1	$0.1	-
qwen2p5-coder-3b-instruct	$0.1	$0.1	-
qwen3-0p6b	$0.1	$0.1	-
qwen3-1p7b	$0.1	$0.1	-
qwen3-1p7b-fp8-draft	$0.1	$0.1	-
qwen3-1p7b-fp8-draft-131072	$0.1	$0.1	-
qwen3-1p7b-fp8-draft-40960	$0.1	$0.1	-
stablecode-3b	$0.1	$0.1	-
starcoder2-3b	$0.1	$0.1	-
gpt-oss-120b	$0.15	$0.6	$0.07
llama4-scout-instruct-basic	$0.15	$0.6	-
qwen3-30b-a3b	$0.15	$0.6	-
qwen3-coder-30b-a3b-instruct	$0.15	$0.6	-
qwen3-vl-30b-a3b-instruct	$0.15	$0.6	-
qwen3-vl-30b-a3b-thinking	$0.15	$0.6	-
accounts/fireworks/models/llama-v3p1-8b-instruct	$0.2	$0.2	$0.1
llama-v3p1-8b-instruct	$0.2	$0.2	-
fireworks-ai-4.1b-to-16b	$0.2	$0.2	-
fireworks-ai-up-to-4b	$0.2	$0.2	-
llama-v3p2-11b-vision-instruct	$0.2	$0.2	-
chronos-hermes-13b-v2	$0.2	$0.2	-
code-llama-13b	$0.2	$0.2	-
code-llama-13b-instruct	$0.2	$0.2	-
code-llama-13b-python	$0.2	$0.2	-
code-llama-7b	$0.2	$0.2	-
code-llama-7b-instruct	$0.2	$0.2	-
code-llama-7b-python	$0.2	$0.2	-
code-qwen-1p5-7b	$0.2	$0.2	-
codegemma-7b	$0.2	$0.2	-
cogito-v1-preview-llama-8b	$0.2	$0.2	-
cogito-v1-preview-qwen-14b	$0.2	$0.2	-
deepseek-coder-7b-base	$0.2	$0.2	-
deepseek-coder-7b-base-v1p5	$0.2	$0.2	-
deepseek-coder-7b-instruct-v1p5	$0.2	$0.2	-
deepseek-r1-0528-distill-qwen3-8b	$0.2	$0.2	-
deepseek-r1-distill-llama-8b	$0.2	$0.2	-
deepseek-r1-distill-qwen-14b	$0.2	$0.2	-
deepseek-r1-distill-qwen-7b	$0.2	$0.2	-
dobby-mini-unhinged-plus-llama-3-1-8b	$0.2	$0.2	-
firellava-13b	$0.2	$0.2	-
firesearch-ocr-v6	$0.2	$0.2	-
gemma-7b	$0.2	$0.2	-
gemma-7b-it	$0.2	$0.2	-
gemma2-9b-it	$0.2	$0.2	-
hermes-2-pro-mistral-7b	$0.2	$0.2	-
internvl3-8b	$0.2	$0.2	-
llama-guard-2-8b	$0.2	$0.2	-
llama-guard-3-8b	$0.2	$0.2	-
llama-v2-13b	$0.2	$0.2	-
llama-v2-13b-chat	$0.2	$0.2	-
llama-v2-7b	$0.2	$0.2	-
llama-v2-7b-chat	$0.2	$0.2	-
llama-v3-8b	$0.2	$0.2	-
llama-v3-8b-instruct-hf	$0.2	$0.2	-
llamaguard-7b	$0.2	$0.2	-
ministral-3-14b-instruct-2512	$0.2	$0.2	-
ministral-3-8b-instruct-2512	$0.2	$0.2	-
mistral-7b	$0.2	$0.2	-
mistral-7b-instruct-4k	$0.2	$0.2	-
mistral-7b-instruct-v0p2	$0.2	$0.2	-
mistral-7b-instruct-v3	$0.2	$0.2	-
mistral-7b-v0p2	$0.2	$0.2	-
mistral-nemo-base-2407	$0.2	$0.2	-
mistral-nemo-instruct-2407	$0.2	$0.2	-
mythomax-l2-13b	$0.2	$0.2	-
nous-capybara-7b-v1p9	$0.2	$0.2	-
nous-hermes-llama2-13b	$0.2	$0.2	-
nous-hermes-llama2-7b	$0.2	$0.2	-
nvidia-nemotron-nano-12b-v2	$0.2	$0.2	-
nvidia-nemotron-nano-9b-v2	$0.2	$0.2	-
openchat-3p5-0106-7b	$0.2	$0.2	-
openhermes-2-mistral-7b	$0.2	$0.2	-
openhermes-2p5-mistral-7b	$0.2	$0.2	-
openorca-7b	$0.2	$0.2	-
phi-3-vision-128k-instruct	$0.2	$0.2	-
pythia-12b	$0.2	$0.2	-
qwen-v2p5-14b-instruct	$0.2	$0.2	-
qwen-v2p5-7b	$0.2	$0.2	-
qwen2-7b-instruct	$0.2	$0.2	-
qwen2-vl-7b-instruct	$0.2	$0.2	-
qwen2p5-14b	$0.2	$0.2	-
qwen2p5-7b-instruct	$0.2	$0.2	-
qwen2p5-coder-14b	$0.2	$0.2	-
qwen2p5-coder-14b-instruct	$0.2	$0.2	-
qwen2p5-coder-7b	$0.2	$0.2	-
qwen2p5-coder-7b-instruct	$0.2	$0.2	-
qwen2p5-vl-3b-instruct	$0.2	$0.2	-
qwen2p5-vl-7b-instruct	$0.2	$0.2	-
qwen3-14b	$0.2	$0.2	-
qwen3-4b	$0.2	$0.2	-
qwen3-4b-instruct-2507	$0.2	$0.2	-
qwen3-8b	$0.2	$0.2	-
qwen3-vl-8b-instruct	$0.2	$0.2	-
rolm-ocr	$0.2	$0.2	-
snorkel-mistral-7b-pairrm-dpo	$0.2	$0.2	-
starcoder-16b	$0.2	$0.2	-
starcoder-7b	$0.2	$0.2	-
starcoder2-15b	$0.2	$0.2	-
starcoder2-7b	$0.2	$0.2	-
toppy-m-7b	$0.2	$0.2	-
yi-6b	$0.2	$0.2	-
zephyr-7b-beta	$0.2	$0.2	-
llama4-maverick-instruct-basic	$0.22	$0.88	-
qwen3-235b-a22b	$0.22	$0.88	-
glm-4p5-air	$0.22	$0.88	-
qwen3-235b-a22b-instruct-2507	$0.22	$0.88	-
qwen3-235b-a22b-thinking-2507	$0.22	$0.88	-
qwen3-vl-235b-a22b-instruct	$0.22	$0.88	-
qwen3-vl-235b-a22b-thinking	$0.22	$0.88	-
minimax-m2p1	$0.3	$1.2	-
minimax-m2	$0.3	$1.2	-
qwen3-coder-480b-a35b-instruct	$0.45	$1.8	-
fireworks-ai-moe-up-to-56b	$0.5	$0.5	-
deepseek-coder-v2-lite-base	$0.5	$0.5	-
deepseek-coder-v2-lite-instruct	$0.5	$0.5	-
deepseek-v2-lite-chat	$0.5	$0.5	-
dolphin-2p6-mixtral-8x7b	$0.5	$0.5	-
firefunction-v1	$0.5	$0.5	-
gpt-oss-safeguard-20b	$0.5	$0.5	-
mixtral-8x7b	$0.5	$0.5	-
mixtral-8x7b-instruct	$0.5	$0.5	-
mixtral-8x7b-instruct-hf	$0.5	$0.5	-
nous-hermes-2-mixtral-8x7b-dpo	$0.5	$0.5	-
qwen3-30b-a3b-instruct-2507	$0.5	$0.5	-
deepseek-r1-basic	$0.55	$2.19	-
glm-4p5	$0.55	$2.19	-
glm-4p6	$0.55	$2.19	-
deepseek-v3p2	$0.56	$1.68	$0.28
deepseek-v3p1	$0.56	$1.68	-
deepseek-v3p1-terminus	$0.56	$1.68	-
glm-4p7	$0.6	$2.2	-
kimi-k2p5	$0.6	$3	$0.1
kimi-k2-instruct	$0.6	$2.5	-
kimi-k2-instruct-0905	$0.6	$2.5	-
kimi-k2-thinking	$0.6	$2.5	-
accounts/fireworks/models/llama-v3p3-70b-instruct	$0.9	$0.9	$0.45
deepseek-v3-0324	$0.9	$0.9	-
qwen2p5-vl-72b-instruct	$0.9	$0.9	-
fireworks-ai-above-16b	$0.9	$0.9	-
deepseek-v3	$0.9	$0.9	-
firefunction-v2	$0.9	$0.9	-
llama-v3p2-90b-vision-instruct	$0.9	$0.9	-
qwen2-72b-instruct	$0.9	$0.9	-
qwen2p5-coder-32b-instruct	$0.9	$0.9	-
code-llama-34b	$0.9	$0.9	-
code-llama-34b-instruct	$0.9	$0.9	-
code-llama-34b-python	$0.9	$0.9	-
code-llama-70b	$0.9	$0.9	-
code-llama-70b-instruct	$0.9	$0.9	-
code-llama-70b-python	$0.9	$0.9	-
cogito-v1-preview-llama-70b	$0.9	$0.9	-
cogito-v1-preview-qwen-32b	$0.9	$0.9	-
deepseek-coder-33b-instruct	$0.9	$0.9	-
deepseek-r1-distill-llama-70b	$0.9	$0.9	-
deepseek-r1-distill-qwen-32b	$0.9	$0.9	-
devstral-small-2505	$0.9	$0.9	-
dobby-unhinged-llama-3-3-70b-new	$0.9	$0.9	-
dolphin-2-9-2-qwen2-72b	$0.9	$0.9	-
fare-20b	$0.9	$0.9	-
internvl3-38b	$0.9	$0.9	-
internvl3-78b	$0.9	$0.9	-
kat-coder	$0.9	$0.9	-
kat-dev-32b	$0.9	$0.9	-
kat-dev-72b-exp	$0.9	$0.9	-
llama-v2-70b-chat	$0.9	$0.9	-
llama-v3-70b-instruct	$0.9	$0.9	-
llama-v3-70b-instruct-hf	$0.9	$0.9	-
llama-v3p1-70b-instruct	$0.9	$0.9	-
llama-v3p1-nemotron-70b-instruct	$0.9	$0.9	-
llama-v3p3-70b-instruct	$0.9	$0.9	-
llava-yi-34b	$0.9	$0.9	-
mistral-small-24b-instruct-2501	$0.9	$0.9	-
nous-hermes-2-yi-34b	$0.9	$0.9	-
nous-hermes-llama2-70b	$0.9	$0.9	-
phind-code-llama-34b-python-v1	$0.9	$0.9	-
phind-code-llama-34b-v1	$0.9	$0.9	-
phind-code-llama-34b-v2	$0.9	$0.9	-
qwen-qwq-32b-preview	$0.9	$0.9	-
qwen1p5-72b-chat	$0.9	$0.9	-
qwen2-vl-72b-instruct	$0.9	$0.9	-
qwen2p5-32b	$0.9	$0.9	-
qwen2p5-32b-instruct	$0.9	$0.9	-
qwen2p5-72b	$0.9	$0.9	-
qwen2p5-72b-instruct	$0.9	$0.9	-
qwen2p5-coder-32b	$0.9	$0.9	-
qwen2p5-coder-32b-instruct-128k	$0.9	$0.9	-
qwen2p5-coder-32b-instruct-32k-rope	$0.9	$0.9	-
qwen2p5-coder-32b-instruct-64k	$0.9	$0.9	-
qwen2p5-math-72b-instruct	$0.9	$0.9	-
qwen2p5-vl-32b-instruct	$0.9	$0.9	-
qwen3-30b-a3b-thinking-2507	$0.9	$0.9	-
qwen3-32b	$0.9	$0.9	-
qwen3-coder-480b-instruct-bf16	$0.9	$0.9	-
qwen3-next-80b-a3b-instruct	$0.9	$0.9	-
qwen3-next-80b-a3b-thinking	$0.9	$0.9	-
qwen3-vl-32b-instruct	$0.9	$0.9	-
qwq-32b	$0.9	$0.9	-
yi-34b	$0.9	$0.9	-
yi-34b-200k-capybara	$0.9	$0.9	-
yi-34b-chat	$0.9	$0.9	-
fireworks-ai-56b-to-176b	$1.2	$1.2	-
deepseek-coder-v2-instruct	$1.2	$1.2	-
mixtral-8x22b-instruct-hf	$1.2	$1.2	-
cogito-671b-v2-p1	$1.2	$1.2	-
dbrx-instruct	$1.2	$1.2	-
deepseek-prover-v2	$1.2	$1.2	-
deepseek-v2p5	$1.2	$1.2	-
glm-4p5v	$1.2	$1.2	-
gpt-oss-safeguard-120b	$1.2	$1.2	-
mistral-large-3-fp8	$1.2	$1.2	-
mixtral-8x22b	$1.2	$1.2	-
mixtral-8x22b-instruct	$1.2	$1.2	-
deepseek-r1-0528	$3	$8	-
deepseek-r1	$3	$8	-
llama-v3p1-405b-instruct	$3	$3	-
yi-large	$3	$3	-

gemini

50 models

Model	Input $/1M	Output $/1M	Cache $/1M
gemini-flash-1.5-8b	$0.0375	$0.15	$0.01
gemini-1.5-flash	$0.075	$0.3	$0.01875
gemini-2.0-flash-lite	$0.075	$0.3	-
gemini-flash-1.5	$0.075	$0.3	$0.01875
gemini-2.0-flash-lite-001	$0.075	$0.3	$0.0187
gemini-2.0-flash	$0.1	$0.4	$0.025
gemini-2.5-flash-lite	$0.1	$0.4	$0.01
gemini-2.0-flash-001	$0.1	$0.4	$0.025
gemini-2.5-flash-lite-preview-09-2025	$0.1	$0.4	$0.01
gemini-flash-lite-latest	$0.1	$0.4	$0.025
gemini-2.5-flash-lite-preview-06-17	$0.1	$0.4	$0.025
gemini-1.0-pro-vision-001	$0.125	$0.375	-
gemini-pro	$0.125	$0.375	-
gemini-2.5-flash-preview	$0.15	$0.6	-
claude-3-haiku	$0.25	$1.25	$0.03
gemini-3.1-flash-lite-preview	$0.25	$1.5	$0.025
gemini-2.5-flash	$0.3	$2.5	$0.03
gemini-2.5-flash-image	$0.3	$30	-
gemini-live-2.5-flash-preview-native-audio-09-2025	$0.3	$2	$0.075
gemini-robotics-er-1.5-preview	$0.3	$2.5	-
gemini-2.5-flash-preview-09-2025	$0.3	$2.5	$0.075
gemini-flash-latest	$0.3	$2.5	$0.075
gemini-2.5-flash-preview-tts	$0.3	$2.5	-
gemini-2.5-flash-native-audio-latest	$0.3	$2.5	-
gemini-2.5-flash-native-audio-preview-09-2025	$0.3	$2.5	-
gemini-2.5-flash-native-audio-preview-12-2025	$0.3	$2.5	-
gemini-exp-1206	$0.3	$2.5	$0.03
gemini-gemma-2-27b-it	$0.35	$1.05	-
gemini-gemma-2-9b-it	$0.35	$1.05	-
gemini-3-flash-preview	$0.5	$3	$0.05
gemini-3.1-flash-image-preview	$0.5	$60	-
gemini-live-2.5-flash-preview	$0.5	$2	-
claude-3-5-haiku	$0.8	$4	$0.08
gemini-2.5-pro	$1.25	$10	$0.125
gemini-1.5-pro	$1.25	$5	-
gemini-pro-1.5	$1.25	$5	$0.3125
gemini-2.5-computer-use-preview-10-2025	$1.25	$10	-
gemini-2.5-pro-preview-tts	$1.25	$10	$0.125
gemini-pro-latest	$1.25	$10	$0.125
gemini-3-pro-image-preview	$2	$120	-
gemini-3-pro-preview	$2	$12	$0.2
gemini-3.1-pro-preview	$2	$12	$0.2
deep-research-pro-preview-12-2025	$2	$12	-
gemini-3.1-pro-preview-customtools	$2	$12	$0.2
claude-3-5-sonnet	$3	$15	$0.3
claude-3-7-sonnet	$3	$15	$0.3
claude-4-sonnet	$3	$15	$0.3
claude-opus-4-6	$5	$25	$0.5
claude-3-opus	$15	$75	$1.5
claude-4-opus	$15	$75	$1.5

groq

37 models

Model	Input $/1M	Output $/1M	Cache $/1M
llama-3.2-1b-preview	$0.04	$0.04	-
llama-3.1-8b-instant	$0.05	$0.08	-
llama3-8b-8192	$0.05	$0.08	-
llama-3.2-3b-preview	$0.06	$0.06	-
gemma-7b-it	$0.07	$0.07	-
openai/gpt-oss-20b	$0.075	$0.3	$0.0375
gpt-oss-20b	$0.075	$0.3	$0.0375
gpt-oss-safeguard-20b	$0.075	$0.3	$0.037
meta-llama/llama-4-scout-17b-16e-instruct	$0.11	$0.34	-
llama-4-scout-17b-16e-instruct	$0.11	$0.34	-
openai/gpt-oss-120b	$0.15	$0.6	$0.075
gpt-oss-120b	$0.15	$0.6	$0.075
llama-3.2-11b-text-preview	$0.18	$0.18	-
llama-3.2-11b-vision-preview	$0.18	$0.18	-
llama3-groq-8b-8192-tool-use-preview	$0.19	$0.19	-
gemma2-9b-it	$0.2	$0.2	-
llama-guard-3-8b	$0.2	$0.2	-
meta-llama/llama-4-maverick-17b-128e-instruct	$0.2	$0.6	-
meta-llama/llama-guard-4-12b	$0.2	$0.2	-
llama-guard-4-12b	$0.2	$0.2	-
llama-4-maverick-17b-128e-instruct	$0.2	$0.6	-
mixtral-8x7b-32768	$0.24	$0.24	-
qwen/qwen3-32b	$0.29	$0.59	-
qwen3-32b	$0.29	$0.59	-
llama-3.3-70b-versatile	$0.59	$0.79	-
llama-3.1-405b-reasoning	$0.59	$0.79	-
llama-3.1-70b-versatile	$0.59	$0.79	-
llama-3.3-70b-specdec	$0.59	$0.99	-
llama3-70b-8192	$0.59	$0.79	-
llama2-70b-4096	$0.7	$0.8	-
deepseek-r1-distill-llama-70b	$0.75	$0.99	-
mistral-saba-24b	$0.79	$0.79	-
llama3-groq-70b-8192-tool-use-preview	$0.89	$0.89	-
llama-3.2-90b-text-preview	$0.9	$0.9	-
llama-3.2-90b-vision-preview	$0.9	$0.9	-
moonshotai/kimi-k2-instruct	$1	$3	$0.5
kimi-k2-instruct-0905	$1	$3	$0.5

mistral

63 models

Model	Input $/1M	Output $/1M	Cache $/1M
ministral-3b	$0.04	$0.04	-
mistral-small-24b-instruct-2501	$0.05	$0.08	-
devstral-small	$0.06	$0.12	-
mistral-small-3-2-2506	$0.06	$0.18	-
mistral-small-latest	$0.1	$0.3	-
ministral-8b	$0.1	$1	-
mistral-embed	$0.1	$0.1	-
devstral-small-2505	$0.1	$0.3	-
devstral-small-2507	$0.1	$0.3	-
devstral-small-latest	$0.1	$0.3	-
labs-devstral-small-2512	$0.1	$0.3	-
mistral-small	$0.1	$0.3	-
ministral-3-3b-2512	$0.1	$0.1	-
mistral-nemo	$0.15	$0.15	-
pixtral-12b	$0.15	$0.15	-
ministral-3-8b-2512	$0.15	$0.15	-
pixtral-12b-2409	$0.15	$0.15	-
mistral-saba	$0.2	$0.6	-
ministral-3-14b-2512	$0.2	$0.2	-
mistral-7b	$0.25	$0.25	-
mistral-tiny	$0.25	$0.25	-
codestral-mamba-latest	$0.25	$0.25	-
open-codestral-mamba	$0.25	$0.25	-
open-mistral-7b	$0.25	$0.25	-
codestral-latest	$0.3	$0.9	-
codestral	$0.3	$0.9	-
codestral-2508	$0.3	$0.9	-
open-mistral-nemo	$0.3	$0.3	-
open-mistral-nemo-2407	$0.3	$0.3	-
mistral-medium-3	$0.4	$2	-
devstral-medium-2507	$0.4	$2	-
devstral-latest	$0.4	$2	-
devstral-medium-latest	$0.4	$2	-
devstral-2512	$0.4	$2	-
mistral-medium-2505	$0.4	$2	-
mistral-medium-latest	$0.4	$2	-
mistral-medium-3-1-2508	$0.4	$2	-
magistral-small	$0.5	$1.5	-
magistral-small-2506	$0.5	$1.5	-
magistral-small-latest	$0.5	$1.5	-
magistral-small-1-2-2509	$0.5	$1.5	-
mistral-large-3	$0.5	$1.5	-
mistral-large-2512	$0.5	$1.5	-
mixtral-8x7b	$0.7	$0.7	-
open-mixtral-8x7b	$0.7	$0.7	-
mixtral-8x22b-instruct	$0.9	$0.9	-
codestral-2405	$1	$3	-
mistral-large-latest	$2	$6	-
magistral-medium	$2	$5	-
mistral-large	$2	$6	-
pixtral-large	$2	$6	-
magistral-medium-2506	$2	$5	-
magistral-medium-2509	$2	$5	-
magistral-medium-1-2-2509	$2	$5	-
magistral-medium-latest	$2	$5	-
mistral-large-2411	$2	$6	-
open-mixtral-8x22b	$2	$6	-
pixtral-large-2411	$2	$6	-
pixtral-large-latest	$2	$6	-
mistral-medium	$2.7	$8.1	-
mistral-medium-2312	$2.7	$8.1	-
mistral-large-2407	$3	$9	-
mistral-large-2402	$4	$12	-

openai

145 models

Model	Input $/1M	Output $/1M	Cache $/1M
gpt-5-nano	$0.05	$0.4	$0.005
gpt-5-nano-2025-08-07	$0.05	$0.4	$0.005
gpt-4.1-nano	$0.1	$0.4	$0.025
gpt-4.1-nano-2025-04-14	$0.1	$0.4	$0.025
gpt-4o-mini	$0.15	$0.6	$0.075
gpt-4o-mini-2024-07-18	$0.15	$0.6	$0.075
gpt-4o-mini-audio-preview	$0.15	$0.6	-
gpt-4o-mini-audio-preview-2024-12-17	$0.15	$0.6	-
gpt-4o-mini-search-preview	$0.15	$0.6	$0.075
gpt-4o-mini-search-preview-2025-03-11	$0.15	$0.6	$0.075
gpt-5.4-nano	$0.2	$1.25	$0.02
ft:gpt-4.1-nano-2025-04-14	$0.2	$0.8	$0.05
gpt-5-mini	$0.25	$2	$0.025
gpt-5.1-codex-mini	$0.25	$2	$0.025
gpt-5-mini-2025-08-07	$0.25	$2	$0.025
ft:gpt-4o-mini	$0.3	$1.2	-
gpt-4o-mini-2024-07-18.ft-	$0.3	$1.2	-
ft:gpt-4o-mini-2024-07-18	$0.3	$1.2	$0.15
gpt-4.1-mini	$0.4	$1.6	$0.1
ada	$0.4	$0.4	-
gpt-4.1-mini-2025-04-14	$0.4	$1.6	$0.1
babbage	$0.5	$0.5	-
gpt-3.5-turbo	$0.5	$1.5	-
gpt-3.5-turbo-0125	$0.5	$1.5	-
gpt-4o-mini-realtime-preview	$0.6	$2.4	$0.3
gpt-realtime-mini	$0.6	$2.4	$0.06
gpt-audio-mini	$0.6	$2.4	-
gpt-audio-mini-2025-10-06	$0.6	$2.4	-
gpt-audio-mini-2025-12-15	$0.6	$2.4	-
gpt-4o-mini-realtime-preview-2024-12-17	$0.6	$2.4	$0.3
gpt-realtime-mini-2025-10-06	$0.6	$2.4	$0.06
gpt-realtime-mini-2025-12-15	$0.6	$2.4	$0.06
gpt-5.4-mini	$0.75	$4.5	$0.075
ft:gpt-4.1-mini-2025-04-14	$0.8	$3.2	$0.2
gpt-3.5-turbo-1106	$1	$2	-
o4-mini	$1.1	$4.4	$0.275
o3-mini	$1.1	$4.4	$0.55
o3-mini-2025-01-31	$1.1	$4.4	$0.55
o4-mini-2025-04-16	$1.1	$4.4	$0.275
gpt-4o-mini-transcribe	$1.25	$5	-
gpt-5	$1.25	$10	$0.125
gpt-5.1	$1.25	$10	$0.125
gpt-5.1-2025-11-13	$1.25	$10	$0.125
gpt-5.1-chat-latest	$1.25	$10	$0.125
gpt-5-2025-08-07	$1.25	$10	$0.125
gpt-5-chat	$1.25	$10	$0.125
gpt-5-chat-latest	$1.25	$10	$0.125
gpt-5-codex	$1.25	$10	$0.125
gpt-5.1-codex	$1.25	$10	$0.125
gpt-5.1-codex-max	$1.25	$10	$0.125
gpt-4o-mini-transcribe-2025-03-20	$1.25	$5	-
gpt-4o-mini-transcribe-2025-12-15	$1.25	$5	-
gpt-5-search-api	$1.25	$10	$0.125
gpt-5-search-api-2025-10-14	$1.25	$10	$0.125
codex-mini	$1.5	$6	$0.375
gpt-3.5-0301	$1.5	$2	-
gpt-3.5-turbo-0613	$1.5	$2	-
gpt-3.5-turbo-instruct	$1.5	$2	-
codex-mini-latest	$1.5	$6	$0.375
gpt-5.2	$1.75	$14	$0.175
gpt-5.3-codex	$1.75	$14	$0.175
gpt-5.2-2025-12-11	$1.75	$14	$0.175
gpt-5.2-chat-latest	$1.75	$14	$0.175
gpt-5.3-chat-latest	$1.75	$14	$0.175
gpt-5.2-codex	$1.75	$14	$0.175
gpt-4.1	$2	$8	$0.5
o3	$2	$8	$0.5
curie	$2	$2	-
o4-mini-deep-research	$2	$8	$0.5
gpt-4.1-2025-04-14	$2	$8	$0.5
o3-2025-04-16	$2	$8	$0.5
o4-mini-deep-research-2025-06-26	$2	$8	$0.5
gpt-4o	$2.5	$10	$1.25
gpt-4o-search-preview	$2.5	$10	-
gpt-4o-transcribe	$2.5	$10	-
gpt-5-image-mini	$2.5	$2	$0.25
gpt-5.4	$2.5	$15	$0.25
gpt-4o-transcribe-diarize	$2.5	$10	-
gpt-4o-2024-08-06	$2.5	$10	$1.25
gpt-4o-2024-11-20	$2.5	$10	$1.25
gpt-4o-audio-preview	$2.5	$10	-
gpt-4o-audio-preview-2024-12-17	$2.5	$10	-
gpt-4o-audio-preview-2025-06-03	$2.5	$10	-
gpt-audio	$2.5	$10	-
gpt-audio-1.5	$2.5	$10	-
gpt-audio-2025-08-28	$2.5	$10	-
gpt-4o-mini-tts	$2.5	$10	-
gpt-4o-search-preview-2025-03-11	$2.5	$10	$1.25
gpt-5.4-2026-03-05	$2.5	$15	$0.25
gpt-4o-mini-tts-2025-03-20	$2.5	$10	-
gpt-4o-mini-tts-2025-12-15	$2.5	$10	-
computer-use	$3	$12	-
ft:gpt-3.5-turbo-	$3	$6	-
gpt-3.5-turbo-16k	$3	$4	-
o1-mini	$3	$12	$1.5
ft:gpt-3.5-turbo	$3	$6	-
ft:gpt-3.5-turbo-0125	$3	$6	-
ft:gpt-3.5-turbo-0613	$3	$6	-
ft:gpt-3.5-turbo-1106	$3	$6	-
ft:gpt-4.1-2025-04-14	$3	$12	$0.75
ft:gpt-4o	$3.75	$15	-
ft:gpt-4o-2024-08-06	$3.75	$15	$1.875
ft:gpt-4o-2024-11-20	$3.75	$15	-
gpt-realtime	$4	$16	$0.4
ft:o4-mini-2025-04-16	$4	$16	$1
gpt-realtime-1.5	$4	$16	$0.4
gpt-realtime-2025-08-28	$4	$16	$0.4
chatgpt-4o-latest	$5	$15	-
gpt-4o-realtime-preview	$5	$20	$2.5
gpt-4o-2024-05-13	$5	$15	-
gpt-4o-realtime-preview-2024-12-17	$5	$20	$2.5
gpt-4o-realtime-preview-2025-06-03	$5	$20	$2.5
gpt-image-1.5	$5	$10	$1.25
gpt-image-1.5-2025-12-16	$5	$10	$1.25
gpt-4o:extended	$6	$18	-
gpt-4-turbo	$10	$30	-
gpt-4-vision-preview	$10	$30	-
gpt-5-image	$10	$10	$1.25
o3-deep-research	$10	$40	$2.5
gpt-4-0125-preview	$10	$30	-
gpt-4-1106-preview	$10	$30	-
gpt-4-turbo-2024-04-09	$10	$30	-
gpt-4-turbo-preview	$10	$30	-
o3-deep-research-2025-06-26	$10	$40	$2.5
gpt-5-pro	$15	$120	-
o1	$15	$60	$7.5
gpt-5-pro-2025-10-06	$15	$120	-
o1-2024-12-17	$15	$60	$7.5
davinci	$20	$20	-
o3-pro	$20	$80	-
text-davinci-002	$20	$20	-
text-davinci-003	$20	$20	-
o3-pro-2025-06-10	$20	$80	-
gpt-5.2-pro	$21	$168	-
gpt-5.2-pro-2025-12-11	$21	$168	-
gpt-4	$30	$60	-
gpt-5.4-pro	$30	$180	-
ft:gpt-4-0613	$30	$60	-
gpt-4-0314	$30	$60	-
gpt-4-0613	$30	$60	-
gpt-5.4-pro-2026-03-05	$30	$180	$3
gpt-4-32k	$60	$120	-
gpt-4.5-preview	$75	$150	$37.5
o1-pro	$150	$600	-
o1-pro-2025-03-19	$150	$600	-

together

105 models

Model	Input $/1M	Output $/1M	Cache $/1M
gpt-oss-20b	$0.05	$0.2	-
Qwen/Qwen1.5-0.5B	$0.1	$0.1	-
Qwen/Qwen1.5-1.8B	$0.1	$0.1	-
Qwen/Qwen1.5-4B	$0.1	$0.1	-
google/gemma-2b	$0.1	$0.1	-
meta-llama/Meta-Llama-3-8B-Instruct-Lite	$0.1	$0.1	-
microsoft/phi-2	$0.1	$0.1	-
togethercomputer/RedPajama-INCITE-Base-3B-v1	$0.1	$0.1	-
togethercomputer/RedPajama-INCITE-Chat-3B-v1	$0.1	$0.1	-
togethercomputer/RedPajama-INCITE-Instruct-3B-v1	$0.1	$0.1	-
together-ai-up-to-4b	$0.1	$0.1	-
gpt-oss-120b	$0.15	$0.6	-
Qwen3-Next-80B-A3B-Instruct	$0.15	$1.5	-
Qwen3-Next-80B-A3B-Thinking	$0.15	$1.5	-
meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo	$0.18	$0.18	-
meta-llama/Llama-4-Scout-17B-16E-Instruct	$0.18	$0.59	-
meta-llama/Meta-Llama-3-8B-Instruct-Turbo	$0.18	$0.18	-
Llama-4-Scout-17B-16E-Instruct	$0.18	$0.59	-
Meta-Llama-3.1-8B-Instruct-Turbo	$0.18	$0.18	-
NousResearch/Nous-Capybara-7B-V1p9	$0.2	$0.2	-
NousResearch/Nous-Hermes-llama-2-7b	$0.2	$0.2	-
Open-Orca/Mistral-7B-OpenOrca	$0.2	$0.2	-
Qwen/Qwen1.5-7B	$0.2	$0.2	-
Undi95/Toppy-M-7B	$0.2	$0.2	-
allenai/OLMo-7B	$0.2	$0.2	-
codellama/CodeLlama-7b-Instruct-hf	$0.2	$0.2	-
google/gemma-7b	$0.2	$0.2	-
lmsys/vicuna-7b-v1.5	$0.2	$0.2	-
meta-llama/Llama-2-7b-chat-hf	$0.2	$0.2	-
meta-llama/Llama-3-8b-chat-hf	$0.2	$0.2	-
mistralai/Mistral-7B-Instruct-v0.1	$0.2	$0.2	-
mistralai/Mistral-7B-Instruct-v0.2	$0.2	$0.2	-
mistralai/Mistral-7B-v0.1	$0.2	$0.2	-
openchat/openchat-3.5-1210	$0.2	$0.2	-
snorkelai/Snorkel-Mistral-PairRM-DPO	$0.2	$0.2	-
teknium/OpenHermes-2-Mistral-7B	$0.2	$0.2	-
teknium/OpenHermes-2p5-Mistral-7B	$0.2	$0.2	-
togethercomputer/GPT-JT-Moderation-6B	$0.2	$0.2	-
togethercomputer/Llama-2-7B-32K-Instruct	$0.2	$0.2	-
togethercomputer/RedPajama-INCITE-7B-Base	$0.2	$0.2	-
togethercomputer/RedPajama-INCITE-7B-Chat	$0.2	$0.2	-
togethercomputer/RedPajama-INCITE-7B-Instruct	$0.2	$0.2	-
togethercomputer/StripedHyena-Hessian-7B	$0.2	$0.2	-
togethercomputer/StripedHyena-Nous-7B	$0.2	$0.2	-
togethercomputer/alpaca-7b	$0.2	$0.2	-
zero-one-ai/Yi-6B	$0.2	$0.2	-
together-ai-4.1b-8b	$0.2	$0.2	-
Qwen3-235B-A22B-Instruct-2507-tput	$0.2	$6	-
Qwen3-235B-A22B-fp8-tput	$0.2	$0.6	-
GLM-4.5-Air-FP8	$0.2	$1.1	-
NousResearch/Nous-Hermes-Llama2-13b	$0.225	$0.225	-
codellama/CodeLlama-13b-Instruct-hf	$0.225	$0.225	-
meta-llama/Llama-2-13b-chat-hf	$0.225	$0.225	-
meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8	$0.27	$0.85	-
Llama-4-Maverick-17B-128E-Instruct-FP8	$0.27	$0.85	-
Austism/chronos-hermes-13b	$0.3	$0.3	-
Gryphe/MythoMax-L2-13b	$0.3	$0.3	-
Nexusflow/NexusRaven-V2-13B	$0.3	$0.3	-
Qwen/Qwen1.5-14B	$0.3	$0.3	-
Undi95/ReMM-SLERP-L2-13B	$0.3	$0.3	-
WizardLM/WizardLM-13B-V1.2	$0.3	$0.3	-
lmsys/vicuna-13b-v1.5	$0.3	$0.3	-
upstage/SOLAR-10.7B-Instruct-v1.0	$0.3	$0.3	-
together-ai-8.1b-21b	$0.3	$0.3	-
GLM-4.7	$0.45	$2	-
Kimi-K2.5	$0.5	$2.8	-
meta-llama/Meta-Llama-3-70B-Instruct-Lite	$0.54	$0.54	-
DeepSeek-R1-0528-tput	$0.55	$2.19	-
DeepSeek-V3.1	$0.6	$1.7	-
Mixtral-8x7B-Instruct-v0.1	$0.6	$0.6	-
GLM-4.6	$0.6	$2.2	-
Qwen3.5-397B-A17B	$0.6	$3.6	-
Qwen3-235B-A22B-Thinking-2507	$0.65	$3	-
codellama/CodeLlama-34b-Instruct-hf	$0.776	$0.776	-
NousResearch/Nous-Hermes-2-Yi-34B	$0.8	$0.8	-
deepseek-ai/deepseek-coder-33b-instruct	$0.8	$0.8	-
zero-one-ai/Yi-34B	$0.8	$0.8	-
together-ai-21.1b-41b	$0.8	$0.8	-
meta-llama/Meta-Llama-3.3-70B-Instruct-Turbo	$0.88	$0.88	-
meta-llama/Llama-3.3-70B-Instruct-Turbo	$0.88	$0.88	-
meta-llama/Meta-Llama-3-70B-Instruct-Turbo	$0.88	$0.88	-
meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo	$0.88	$0.88	-
Llama-3.3-70B-Instruct-Turbo	$0.88	$0.88	-
Meta-Llama-3.1-70B-Instruct-Turbo	$0.88	$0.88	-
mistralai/Mixtral-8x7B-Instruct-v0.1	$0.9	$0.9	-
NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO	$0.9	$0.9	-
NousResearch/Nous-Hermes-2-Mixtral-8x7B-SFT	$0.9	$0.9	-
Qwen/Qwen1.5-72B	$0.9	$0.9	-
codellama/CodeLlama-70b-Instruct-hf	$0.9	$0.9	-
garage-bAInd/Platypus2-70B-instruct	$0.9	$0.9	-
meta-llama/Llama-2-70b-chat-hf	$0.9	$0.9	-
meta-llama/Llama-3-70b-chat-hf	$0.9	$0.9	-
mistralai/Mixtral-8x7B-v0.1	$0.9	$0.9	-
together-ai-41.1b-80b	$0.9	$0.9	-
Kimi-K2-Instruct	$1	$3	-
Kimi-K2-Instruct-0905	$1	$3	-
Qwen/Qwen2.5-72B-Instruct-Turbo	$1.2	$1.2	-
microsoft/WizardLM-2-8x22B	$1.2	$1.2	-
DeepSeek-V3	$1.25	$1.25	-
together-ai-81.1b-110b	$1.8	$1.8	-
Qwen3-Coder-480B-A35B-Instruct-FP8	$2	$2	-
mistralai/Mixtral-8x22B-Instruct-v0.1	$2.4	$2.4	-
DeepSeek-R1	$3	$7	-
meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo	$3.5	$3.5	-
Meta-Llama-3.1-405B-Instruct-Turbo	$3.5	$3.5	-

xai

39 models

Model	Input $/1M	Output $/1M	Cache $/1M
grok-4-1-fast-reasoning	$0.2	$0.5	$0.05
grok-4-1-fast-non-reasoning	$0.2	$0.5	$0.05
grok-4-fast-non-reasoning	$0.2	$0.5	$0.05
grok-4-fast-reasoning	$0.2	$0.5	$0.05
grok-code-fast-1	$0.2	$1.5	$0.02
grok-4-1-fast	$0.2	$0.5	$0.05
grok-4-1-fast-reasoning-latest	$0.2	$0.5	$0.05
grok-4-1-fast-non-reasoning-latest	$0.2	$0.5	$0.05
grok-code-fast	$0.2	$1.5	$0.02
grok-code-fast-1-0825	$0.2	$1.5	$0.02
grok-3-mini	$0.3	$0.5	$0.075
grok-3-mini-beta	$0.3	$0.5	$0.075
grok-3-mini-latest	$0.3	$0.5	$0.075
grok-3-mini-fast	$0.6	$4	$0.15
grok-3-mini-fast-beta	$0.6	$4	$0.15
grok-3-mini-fast-latest	$0.6	$4	$0.15
grok-4.20-0309-reasoning	$2	$6	$0.2
grok-4.20-0309-non-reasoning	$2	$6	$0.2
grok-4.20-multi-agent-0309	$2	$6	$0.2
grok-4	$2	$6	$0.2
grok-2	$2	$10	-
grok-2-1212	$2	$10	-
grok-2-vision-1212	$2	$10	-
grok-2-latest	$2	$10	-
grok-2-vision	$2	$10	-
grok-2-vision-latest	$2	$10	-
grok-4-latest	$2	$6	$0.2
grok-4.20-multi-agent-beta-0309	$2	$6	$0.2
grok-4.20-beta-0309-reasoning	$2	$6	$0.2
grok-4.20-beta-0309-non-reasoning	$2	$6	$0.2
grok-3	$3	$15	$0.75
grok-4-0709	$3	$15	$0.75
grok-3-beta	$3	$15	$0.75
grok-3-latest	$3	$15	$0.75
grok-3-fast	$5	$25	$1.25
grok-3-fast-beta	$5	$25	$1.25
grok-3-fast-latest	$5	$25	$1.25
grok-beta	$5	$15	-
grok-vision-beta	$5	$15	-

Track what your AI agents actually cost

LLMKit sits between your app and AI providers. Every request gets logged with token counts and dollar costs. Budget limits reject requests before they reach the provider.
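
To illustrate the idea of a budget limit that rejects a request before it is sent, here is a generic sketch. It is not LLMKit's API: the `BudgetGuard` class, the `BudgetExceeded` exception, and their methods are hypothetical names invented for this example, and the estimated cost is assumed to come from token counts and the per-1M prices listed above.

```python
from decimal import Decimal


class BudgetExceeded(Exception):
    """Raised when a request would push spend past the limit."""


class BudgetGuard:
    """Hypothetical in-memory budget tracker (illustrative only)."""

    def __init__(self, limit_usd):
        self.limit = Decimal(str(limit_usd))
        self.spent = Decimal("0")

    def check(self, estimated_cost):
        """Call before sending a request; raises if it would exceed the limit."""
        estimate = Decimal(str(estimated_cost))
        if self.spent + estimate > self.limit:
            raise BudgetExceeded(
                f"estimated {estimate} USD would exceed limit {self.limit} USD "
                f"(already spent {self.spent} USD)"
            )

    def record(self, actual_cost):
        """Call after a response arrives, with the cost actually incurred."""
        self.spent += Decimal(str(actual_cost))


guard = BudgetGuard("0.05")
for _ in range(3):
    try:
        guard.check("0.02")   # reject before any provider call is made
    except BudgetExceeded as exc:
        print("rejected:", exc)
        break
    # ... the provider call would happen here ...
    guard.record("0.02")
```

With a $0.05 limit and $0.02 requests, the first two pass and the third is rejected, because $0.04 already spent plus $0.02 would exceed the limit. A real gateway would also need persistent storage and handling for concurrent requests, which this sketch leaves out.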


MIT licensed. Built with Claude Code. Source on GitHub