
Fireworks AI API Pricing

257 models: LLaMA, Mixtral, and other open models hosted on Fireworks AI. All prices are per 1M tokens in USD.

- Cheapest input: $0.0001/1M (SSD-1B)
- Most expensive input: $3/1M (yi-large)
- Models with cache pricing: 6 of 257
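Since all prices are quoted per 1M tokens, the dollar cost of a single request is simple arithmetic. A minimal sketch (the token counts are made up for illustration; the prices are gpt-oss-120b's from the table below):

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float, output_price: float) -> float:
    """USD cost of one request, with prices quoted per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

# gpt-oss-120b: $0.15 input / $0.6 output per 1M tokens.
# A request with 10k input tokens and 2k output tokens:
cost = request_cost(10_000, 2_000, 0.15, 0.60)
print(f"${cost:.4f}")  # → $0.0027
```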

| Model | Input $/1M | Output $/1M | Cache $/1M |
|---|---|---|---|
| SSD-1B | $0.0001 | $0.0001 | - |
| japanese-stable-diffusion-xl | $0.0001 | $0.0001 | - |
| playground-v2-1024px-aesthetic | $0.0001 | $0.0001 | - |
| playground-v2-5-1024px-aesthetic | $0.0001 | $0.0001 | - |
| stable-diffusion-xl-1024-v1-0 | $0.0001 | $0.0001 | - |
| flux-1-schnell-fp8 | $0.0003 | $0.0003 | - |
| flux-1-dev-fp8 | $0.0005 | $0.0005 | - |
| flux-1-dev-controlnet-union | $0.001 | $0.001 | - |
| flux-kontext-pro | $0.04 | $0.04 | - |
| gpt-oss-20b | $0.07 | $0.3 | $0.04 |
| flux-kontext-max | $0.08 | $0.08 | - |
| gemma-3-27b-it | $0.1 | $0.1 | - |
| llama-v3p2-1b-instruct | $0.1 | $0.1 | - |
| llama-v3p2-3b-instruct | $0.1 | $0.1 | - |
| codegemma-2b | $0.1 | $0.1 | - |
| cogito-v1-preview-llama-3b | $0.1 | $0.1 | - |
| deepseek-coder-1b-base | $0.1 | $0.1 | - |
| deepseek-r1-distill-qwen-1p5b | $0.1 | $0.1 | - |
| ernie-4p5-21b-a3b-pt | $0.1 | $0.1 | - |
| ernie-4p5-300b-a47b-pt | $0.1 | $0.1 | - |
| flux-1-dev | $0.1 | $0.1 | - |
| flux-1-schnell | $0.1 | $0.1 | - |
| gemma-2b-it | $0.1 | $0.1 | - |
| llama-guard-3-1b | $0.1 | $0.1 | - |
| llama-v2-70b | $0.1 | $0.1 | - |
| llama-v3p1-405b-instruct-long | $0.1 | $0.1 | - |
| llama-v3p1-70b-instruct-1b | $0.1 | $0.1 | - |
| llama-v3p2-1b | $0.1 | $0.1 | - |
| llama-v3p2-3b | $0.1 | $0.1 | - |
| minimax-m1-80k | $0.1 | $0.1 | - |
| ministral-3-3b-instruct-2512 | $0.1 | $0.1 | - |
| nemotron-nano-v2-12b-vl | $0.1 | $0.1 | - |
| phi-2-3b | $0.1 | $0.1 | - |
| phi-3-mini-128k-instruct | $0.1 | $0.1 | - |
| qwen2-vl-2b-instruct | $0.1 | $0.1 | - |
| qwen2p5-0p5b-instruct | $0.1 | $0.1 | - |
| qwen2p5-1p5b-instruct | $0.1 | $0.1 | - |
| qwen2p5-coder-0p5b | $0.1 | $0.1 | - |
| qwen2p5-coder-0p5b-instruct | $0.1 | $0.1 | - |
| qwen2p5-coder-1p5b | $0.1 | $0.1 | - |
| qwen2p5-coder-1p5b-instruct | $0.1 | $0.1 | - |
| qwen2p5-coder-3b | $0.1 | $0.1 | - |
| qwen2p5-coder-3b-instruct | $0.1 | $0.1 | - |
| qwen3-0p6b | $0.1 | $0.1 | - |
| qwen3-1p7b | $0.1 | $0.1 | - |
| qwen3-1p7b-fp8-draft | $0.1 | $0.1 | - |
| qwen3-1p7b-fp8-draft-131072 | $0.1 | $0.1 | - |
| qwen3-1p7b-fp8-draft-40960 | $0.1 | $0.1 | - |
| stablecode-3b | $0.1 | $0.1 | - |
| starcoder2-3b | $0.1 | $0.1 | - |
| gpt-oss-120b | $0.15 | $0.6 | $0.07 |
| llama4-scout-instruct-basic | $0.15 | $0.6 | - |
| qwen3-30b-a3b | $0.15 | $0.6 | - |
| qwen3-coder-30b-a3b-instruct | $0.15 | $0.6 | - |
| qwen3-vl-30b-a3b-instruct | $0.15 | $0.6 | - |
| qwen3-vl-30b-a3b-thinking | $0.15 | $0.6 | - |
| accounts/fireworks/models/llama-v3p1-8b-instruct | $0.2 | $0.2 | $0.1 |
| llama-v3p1-8b-instruct | $0.2 | $0.2 | - |
| fireworks-ai-4.1b-to-16b | $0.2 | $0.2 | - |
| fireworks-ai-up-to-4b | $0.2 | $0.2 | - |
| llama-v3p2-11b-vision-instruct | $0.2 | $0.2 | - |
| chronos-hermes-13b-v2 | $0.2 | $0.2 | - |
| code-llama-13b | $0.2 | $0.2 | - |
| code-llama-13b-instruct | $0.2 | $0.2 | - |
| code-llama-13b-python | $0.2 | $0.2 | - |
| code-llama-7b | $0.2 | $0.2 | - |
| code-llama-7b-instruct | $0.2 | $0.2 | - |
| code-llama-7b-python | $0.2 | $0.2 | - |
| code-qwen-1p5-7b | $0.2 | $0.2 | - |
| codegemma-7b | $0.2 | $0.2 | - |
| cogito-v1-preview-llama-8b | $0.2 | $0.2 | - |
| cogito-v1-preview-qwen-14b | $0.2 | $0.2 | - |
| deepseek-coder-7b-base | $0.2 | $0.2 | - |
| deepseek-coder-7b-base-v1p5 | $0.2 | $0.2 | - |
| deepseek-coder-7b-instruct-v1p5 | $0.2 | $0.2 | - |
| deepseek-r1-0528-distill-qwen3-8b | $0.2 | $0.2 | - |
| deepseek-r1-distill-llama-8b | $0.2 | $0.2 | - |
| deepseek-r1-distill-qwen-14b | $0.2 | $0.2 | - |
| deepseek-r1-distill-qwen-7b | $0.2 | $0.2 | - |
| dobby-mini-unhinged-plus-llama-3-1-8b | $0.2 | $0.2 | - |
| firellava-13b | $0.2 | $0.2 | - |
| firesearch-ocr-v6 | $0.2 | $0.2 | - |
| gemma-7b | $0.2 | $0.2 | - |
| gemma-7b-it | $0.2 | $0.2 | - |
| gemma2-9b-it | $0.2 | $0.2 | - |
| hermes-2-pro-mistral-7b | $0.2 | $0.2 | - |
| internvl3-8b | $0.2 | $0.2 | - |
| llama-guard-2-8b | $0.2 | $0.2 | - |
| llama-guard-3-8b | $0.2 | $0.2 | - |
| llama-v2-13b | $0.2 | $0.2 | - |
| llama-v2-13b-chat | $0.2 | $0.2 | - |
| llama-v2-7b | $0.2 | $0.2 | - |
| llama-v2-7b-chat | $0.2 | $0.2 | - |
| llama-v3-8b | $0.2 | $0.2 | - |
| llama-v3-8b-instruct-hf | $0.2 | $0.2 | - |
| llamaguard-7b | $0.2 | $0.2 | - |
| ministral-3-14b-instruct-2512 | $0.2 | $0.2 | - |
| ministral-3-8b-instruct-2512 | $0.2 | $0.2 | - |
| mistral-7b | $0.2 | $0.2 | - |
| mistral-7b-instruct-4k | $0.2 | $0.2 | - |
| mistral-7b-instruct-v0p2 | $0.2 | $0.2 | - |
| mistral-7b-instruct-v3 | $0.2 | $0.2 | - |
| mistral-7b-v0p2 | $0.2 | $0.2 | - |
| mistral-nemo-base-2407 | $0.2 | $0.2 | - |
| mistral-nemo-instruct-2407 | $0.2 | $0.2 | - |
| mythomax-l2-13b | $0.2 | $0.2 | - |
| nous-capybara-7b-v1p9 | $0.2 | $0.2 | - |
| nous-hermes-llama2-13b | $0.2 | $0.2 | - |
| nous-hermes-llama2-7b | $0.2 | $0.2 | - |
| nvidia-nemotron-nano-12b-v2 | $0.2 | $0.2 | - |
| nvidia-nemotron-nano-9b-v2 | $0.2 | $0.2 | - |
| openchat-3p5-0106-7b | $0.2 | $0.2 | - |
| openhermes-2-mistral-7b | $0.2 | $0.2 | - |
| openhermes-2p5-mistral-7b | $0.2 | $0.2 | - |
| openorca-7b | $0.2 | $0.2 | - |
| phi-3-vision-128k-instruct | $0.2 | $0.2 | - |
| pythia-12b | $0.2 | $0.2 | - |
| qwen-v2p5-14b-instruct | $0.2 | $0.2 | - |
| qwen-v2p5-7b | $0.2 | $0.2 | - |
| qwen2-7b-instruct | $0.2 | $0.2 | - |
| qwen2-vl-7b-instruct | $0.2 | $0.2 | - |
| qwen2p5-14b | $0.2 | $0.2 | - |
| qwen2p5-7b-instruct | $0.2 | $0.2 | - |
| qwen2p5-coder-14b | $0.2 | $0.2 | - |
| qwen2p5-coder-14b-instruct | $0.2 | $0.2 | - |
| qwen2p5-coder-7b | $0.2 | $0.2 | - |
| qwen2p5-coder-7b-instruct | $0.2 | $0.2 | - |
| qwen2p5-vl-3b-instruct | $0.2 | $0.2 | - |
| qwen2p5-vl-7b-instruct | $0.2 | $0.2 | - |
| qwen3-14b | $0.2 | $0.2 | - |
| qwen3-4b | $0.2 | $0.2 | - |
| qwen3-4b-instruct-2507 | $0.2 | $0.2 | - |
| qwen3-8b | $0.2 | $0.2 | - |
| qwen3-vl-8b-instruct | $0.2 | $0.2 | - |
| rolm-ocr | $0.2 | $0.2 | - |
| snorkel-mistral-7b-pairrm-dpo | $0.2 | $0.2 | - |
| starcoder-16b | $0.2 | $0.2 | - |
| starcoder-7b | $0.2 | $0.2 | - |
| starcoder2-15b | $0.2 | $0.2 | - |
| starcoder2-7b | $0.2 | $0.2 | - |
| toppy-m-7b | $0.2 | $0.2 | - |
| yi-6b | $0.2 | $0.2 | - |
| zephyr-7b-beta | $0.2 | $0.2 | - |
| llama4-maverick-instruct-basic | $0.22 | $0.88 | - |
| qwen3-235b-a22b | $0.22 | $0.88 | - |
| glm-4p5-air | $0.22 | $0.88 | - |
| qwen3-235b-a22b-instruct-2507 | $0.22 | $0.88 | - |
| qwen3-235b-a22b-thinking-2507 | $0.22 | $0.88 | - |
| qwen3-vl-235b-a22b-instruct | $0.22 | $0.88 | - |
| qwen3-vl-235b-a22b-thinking | $0.22 | $0.88 | - |
| minimax-m2p1 | $0.3 | $1.2 | - |
| minimax-m2 | $0.3 | $1.2 | - |
| qwen3-coder-480b-a35b-instruct | $0.45 | $1.8 | - |
| fireworks-ai-moe-up-to-56b | $0.5 | $0.5 | - |
| deepseek-coder-v2-lite-base | $0.5 | $0.5 | - |
| deepseek-coder-v2-lite-instruct | $0.5 | $0.5 | - |
| deepseek-v2-lite-chat | $0.5 | $0.5 | - |
| dolphin-2p6-mixtral-8x7b | $0.5 | $0.5 | - |
| firefunction-v1 | $0.5 | $0.5 | - |
| gpt-oss-safeguard-20b | $0.5 | $0.5 | - |
| mixtral-8x7b | $0.5 | $0.5 | - |
| mixtral-8x7b-instruct | $0.5 | $0.5 | - |
| mixtral-8x7b-instruct-hf | $0.5 | $0.5 | - |
| nous-hermes-2-mixtral-8x7b-dpo | $0.5 | $0.5 | - |
| qwen3-30b-a3b-instruct-2507 | $0.5 | $0.5 | - |
| deepseek-r1-basic | $0.55 | $2.19 | - |
| glm-4p5 | $0.55 | $2.19 | - |
| glm-4p6 | $0.55 | $2.19 | - |
| deepseek-v3p2 | $0.56 | $1.68 | $0.28 |
| deepseek-v3p1 | $0.56 | $1.68 | - |
| deepseek-v3p1-terminus | $0.56 | $1.68 | - |
| glm-4p7 | $0.6 | $2.2 | - |
| kimi-k2p5 | $0.6 | $3 | $0.1 |
| kimi-k2-instruct | $0.6 | $2.5 | - |
| kimi-k2-instruct-0905 | $0.6 | $2.5 | - |
| kimi-k2-thinking | $0.6 | $2.5 | - |
| accounts/fireworks/models/llama-v3p3-70b-instruct | $0.9 | $0.9 | $0.45 |
| deepseek-v3-0324 | $0.9 | $0.9 | - |
| qwen2p5-vl-72b-instruct | $0.9 | $0.9 | - |
| fireworks-ai-above-16b | $0.9 | $0.9 | - |
| deepseek-v3 | $0.9 | $0.9 | - |
| firefunction-v2 | $0.9 | $0.9 | - |
| llama-v3p2-90b-vision-instruct | $0.9 | $0.9 | - |
| qwen2-72b-instruct | $0.9 | $0.9 | - |
| qwen2p5-coder-32b-instruct | $0.9 | $0.9 | - |
| code-llama-34b | $0.9 | $0.9 | - |
| code-llama-34b-instruct | $0.9 | $0.9 | - |
| code-llama-34b-python | $0.9 | $0.9 | - |
| code-llama-70b | $0.9 | $0.9 | - |
| code-llama-70b-instruct | $0.9 | $0.9 | - |
| code-llama-70b-python | $0.9 | $0.9 | - |
| cogito-v1-preview-llama-70b | $0.9 | $0.9 | - |
| cogito-v1-preview-qwen-32b | $0.9 | $0.9 | - |
| deepseek-coder-33b-instruct | $0.9 | $0.9 | - |
| deepseek-r1-distill-llama-70b | $0.9 | $0.9 | - |
| deepseek-r1-distill-qwen-32b | $0.9 | $0.9 | - |
| devstral-small-2505 | $0.9 | $0.9 | - |
| dobby-unhinged-llama-3-3-70b-new | $0.9 | $0.9 | - |
| dolphin-2-9-2-qwen2-72b | $0.9 | $0.9 | - |
| fare-20b | $0.9 | $0.9 | - |
| internvl3-38b | $0.9 | $0.9 | - |
| internvl3-78b | $0.9 | $0.9 | - |
| kat-coder | $0.9 | $0.9 | - |
| kat-dev-32b | $0.9 | $0.9 | - |
| kat-dev-72b-exp | $0.9 | $0.9 | - |
| llama-v2-70b-chat | $0.9 | $0.9 | - |
| llama-v3-70b-instruct | $0.9 | $0.9 | - |
| llama-v3-70b-instruct-hf | $0.9 | $0.9 | - |
| llama-v3p1-70b-instruct | $0.9 | $0.9 | - |
| llama-v3p1-nemotron-70b-instruct | $0.9 | $0.9 | - |
| llama-v3p3-70b-instruct | $0.9 | $0.9 | - |
| llava-yi-34b | $0.9 | $0.9 | - |
| mistral-small-24b-instruct-2501 | $0.9 | $0.9 | - |
| nous-hermes-2-yi-34b | $0.9 | $0.9 | - |
| nous-hermes-llama2-70b | $0.9 | $0.9 | - |
| phind-code-llama-34b-python-v1 | $0.9 | $0.9 | - |
| phind-code-llama-34b-v1 | $0.9 | $0.9 | - |
| phind-code-llama-34b-v2 | $0.9 | $0.9 | - |
| qwen-qwq-32b-preview | $0.9 | $0.9 | - |
| qwen1p5-72b-chat | $0.9 | $0.9 | - |
| qwen2-vl-72b-instruct | $0.9 | $0.9 | - |
| qwen2p5-32b | $0.9 | $0.9 | - |
| qwen2p5-32b-instruct | $0.9 | $0.9 | - |
| qwen2p5-72b | $0.9 | $0.9 | - |
| qwen2p5-72b-instruct | $0.9 | $0.9 | - |
| qwen2p5-coder-32b | $0.9 | $0.9 | - |
| qwen2p5-coder-32b-instruct-128k | $0.9 | $0.9 | - |
| qwen2p5-coder-32b-instruct-32k-rope | $0.9 | $0.9 | - |
| qwen2p5-coder-32b-instruct-64k | $0.9 | $0.9 | - |
| qwen2p5-math-72b-instruct | $0.9 | $0.9 | - |
| qwen2p5-vl-32b-instruct | $0.9 | $0.9 | - |
| qwen3-30b-a3b-thinking-2507 | $0.9 | $0.9 | - |
| qwen3-32b | $0.9 | $0.9 | - |
| qwen3-coder-480b-instruct-bf16 | $0.9 | $0.9 | - |
| qwen3-next-80b-a3b-instruct | $0.9 | $0.9 | - |
| qwen3-next-80b-a3b-thinking | $0.9 | $0.9 | - |
| qwen3-vl-32b-instruct | $0.9 | $0.9 | - |
| qwq-32b | $0.9 | $0.9 | - |
| yi-34b | $0.9 | $0.9 | - |
| yi-34b-200k-capybara | $0.9 | $0.9 | - |
| yi-34b-chat | $0.9 | $0.9 | - |
| fireworks-ai-56b-to-176b | $1.2 | $1.2 | - |
| deepseek-coder-v2-instruct | $1.2 | $1.2 | - |
| mixtral-8x22b-instruct-hf | $1.2 | $1.2 | - |
| cogito-671b-v2-p1 | $1.2 | $1.2 | - |
| dbrx-instruct | $1.2 | $1.2 | - |
| deepseek-prover-v2 | $1.2 | $1.2 | - |
| deepseek-v2p5 | $1.2 | $1.2 | - |
| glm-4p5v | $1.2 | $1.2 | - |
| gpt-oss-safeguard-120b | $1.2 | $1.2 | - |
| mistral-large-3-fp8 | $1.2 | $1.2 | - |
| mixtral-8x22b | $1.2 | $1.2 | - |
| mixtral-8x22b-instruct | $1.2 | $1.2 | - |
| deepseek-r1-0528 | $3 | $8 | - |
| deepseek-r1 | $3 | $8 | - |
| llama-v3p1-405b-instruct | $3 | $3 | - |
| yi-large | $3 | $3 | - |
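For the six models with a cache price, input tokens served from the prompt cache bill at the discounted cache rate instead of the full input rate. A sketch of the adjusted calculation, using deepseek-v3p2's prices from the table above (the token counts are illustrative):

```python
def cached_request_cost(input_tokens: int, cached_tokens: int, output_tokens: int,
                        input_price: float, cache_price: float,
                        output_price: float) -> float:
    """USD cost when `cached_tokens` of the input hit the prompt cache.

    All prices are per 1M tokens; cached tokens bill at cache_price,
    the rest of the input at input_price."""
    fresh_tokens = input_tokens - cached_tokens
    return (fresh_tokens * input_price
            + cached_tokens * cache_price
            + output_tokens * output_price) / 1_000_000

# deepseek-v3p2: $0.56 input, $1.68 output, $0.28 cache per 1M tokens.
# A 100k-token prompt with 80k cached, producing 10k output tokens:
cost = cached_request_cost(100_000, 80_000, 10_000, 0.56, 0.28, 1.68)
print(f"${cost:.4f}")  # → $0.0504
```

With an 80% cache hit rate here, the input portion drops from $0.0560 to $0.0336, a 40% saving on input cost at half-price cache tokens.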

Track Fireworks AI costs with LLMKit

Proxy your Fireworks AI requests through LLMKit. Every call gets logged with token counts, dollar costs, and session attribution. Set budget limits that actually reject requests before they hit the provider.
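LLMKit's own enforcement logic isn't reproduced here, but the pre-flight idea — reject a request before it reaches the provider, rather than discovering the overage on the invoice — can be sketched as follows (class and method names are illustrative, not LLMKit's actual API):

```python
class BudgetGuard:
    """Reject requests once estimated spend would exceed a hard USD limit."""

    def __init__(self, limit_usd: float):
        self.limit_usd = limit_usd
        self.spent_usd = 0.0

    def preflight(self, estimated_cost_usd: float) -> None:
        """Raise before the request is sent if it would blow the budget."""
        if self.spent_usd + estimated_cost_usd > self.limit_usd:
            raise RuntimeError(
                f"budget exceeded: ${self.spent_usd:.2f} already spent "
                f"of ${self.limit_usd:.2f} limit"
            )

    def record(self, actual_cost_usd: float) -> None:
        """Accumulate the real cost after the provider responds."""
        self.spent_usd += actual_cost_usd
```

The key design point is estimating cost from the prompt's token count and the model's table price before dispatch, so an over-budget request fails fast and costs nothing.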

MIT licensed. Built with Claude Code. Source on GitHub.