Mistral API Pricing

63 models, including the Mistral Large, Medium, Small, and Codestral families. All prices are in USD per 1M tokens.

Cheapest input: $0.04/1M (ministral-3b)

Most expensive input: $4/1M (mistral-large-2402)

Models with cache pricing: 0 of 63

Model                            Input $/1M  Output $/1M  Cache $/1M
ministral-3b                     $0.04       $0.04        -
mistral-small-24b-instruct-2501  $0.05       $0.08        -
devstral-small                   $0.06       $0.12        -
mistral-small-3-2-2506           $0.06       $0.18        -
mistral-small-latest             $0.1        $0.3         -
ministral-8b                     $0.1        $1           -
mistral-embed                    $0.1        $0.1         -
devstral-small-2505              $0.1        $0.3         -
devstral-small-2507              $0.1        $0.3         -
devstral-small-latest            $0.1        $0.3         -
labs-devstral-small-2512         $0.1        $0.3         -
mistral-small                    $0.1        $0.3         -
ministral-3-3b-2512              $0.1        $0.1         -
mistral-nemo                     $0.15       $0.15        -
pixtral-12b                      $0.15       $0.15        -
ministral-3-8b-2512              $0.15       $0.15        -
pixtral-12b-2409                 $0.15       $0.15        -
mistral-saba                     $0.2        $0.6         -
ministral-3-14b-2512             $0.2        $0.2         -
mistral-7b                       $0.25       $0.25        -
mistral-tiny                     $0.25       $0.25        -
codestral-mamba-latest           $0.25       $0.25        -
open-codestral-mamba             $0.25       $0.25        -
open-mistral-7b                  $0.25       $0.25        -
codestral-latest                 $0.3        $0.9         -
codestral                        $0.3        $0.9         -
codestral-2508                   $0.3        $0.9         -
open-mistral-nemo                $0.3        $0.3         -
open-mistral-nemo-2407           $0.3        $0.3         -
mistral-medium-3                 $0.4        $2           -
devstral-medium-2507             $0.4        $2           -
devstral-latest                  $0.4        $2           -
devstral-medium-latest           $0.4        $2           -
devstral-2512                    $0.4        $2           -
mistral-medium-2505              $0.4        $2           -
mistral-medium-latest            $0.4        $2           -
mistral-medium-3-1-2508          $0.4        $2           -
magistral-small                  $0.5        $1.5         -
magistral-small-2506             $0.5        $1.5         -
magistral-small-latest           $0.5        $1.5         -
magistral-small-1-2-2509         $0.5        $1.5         -
mistral-large-3                  $0.5        $1.5         -
mistral-large-2512               $0.5        $1.5         -
mixtral-8x7b                     $0.7        $0.7         -
open-mixtral-8x7b                $0.7        $0.7         -
mixtral-8x22b-instruct           $0.9        $0.9         -
codestral-2405                   $1          $3           -
mistral-large-latest             $2          $6           -
magistral-medium                 $2          $5           -
mistral-large                    $2          $6           -
pixtral-large                    $2          $6           -
magistral-medium-2506            $2          $5           -
magistral-medium-2509            $2          $5           -
magistral-medium-1-2-2509        $2          $5           -
magistral-medium-latest          $2          $5           -
mistral-large-2411               $2          $6           -
open-mixtral-8x22b               $2          $6           -
pixtral-large-2411               $2          $6           -
pixtral-large-latest             $2          $6           -
mistral-medium                   $2.7        $8.1         -
mistral-medium-2312              $2.7        $8.1         -
mistral-large-2407               $3          $9           -
mistral-large-2402               $4          $12          -
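
Because every price is quoted per 1M tokens, the cost of a single request is the token count times the listed price, divided by one million, summed over input and output. A minimal Python sketch of that arithmetic follows. The PRICES dictionary holds a few rows copied from the table, and the request_cost function and its parameter names are illustrative choices, not part of any Mistral or LLMKit API.

```python
# Prices in USD per 1M tokens: (input, output), copied from the pricing table.
# Only a few rows are included here for illustration.
PRICES = {
    "ministral-3b": (0.04, 0.04),
    "mistral-small-latest": (0.10, 0.30),
    "mistral-large-latest": (2.00, 6.00),
}

TOKENS_PER_PRICE_UNIT = 1_000_000


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    input_price, output_price = PRICES[model]
    total = input_tokens * input_price + output_tokens * output_price
    return total / TOKENS_PER_PRICE_UNIT


if __name__ == "__main__":
    # 2,000 input tokens and 500 output tokens on mistral-small-latest.
    cost = request_cost("mistral-small-latest", 2_000, 500)
    print(f"${cost:.6f}")
```

Since no model in the table lists a cache price, the sketch leaves cached tokens out entirely.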

Track Mistral costs with LLMKit

Proxy your Mistral requests through LLMKit. Every call gets logged with token counts, dollar costs, and session attribution. Set budget limits that actually reject requests before they hit the provider.
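
The idea of rejecting a request before it reaches the provider can be sketched as a pre-flight check against a running total. The Python below is a hypothetical illustration only: BudgetGuard, BudgetExceeded, and their methods are invented names for this example and do not reflect LLMKit's actual interface or implementation.

```python
class BudgetExceeded(Exception):
    """Raised when a request would push spend past the limit (hypothetical)."""


class BudgetGuard:
    """Tracks spend in USD and refuses requests that would exceed a limit."""

    def __init__(self, limit_usd: float) -> None:
        self.limit_usd = limit_usd
        self.spent_usd = 0.0

    def check(self, estimated_cost_usd: float) -> None:
        # Called before forwarding a request: reject if it would overshoot.
        if self.spent_usd + estimated_cost_usd > self.limit_usd:
            raise BudgetExceeded(
                f"estimated ${estimated_cost_usd:.4f} would exceed "
                f"limit ${self.limit_usd:.2f} (spent ${self.spent_usd:.4f})"
            )

    def record(self, actual_cost_usd: float) -> None:
        # Called after the provider responds, with the cost actually incurred.
        self.spent_usd += actual_cost_usd


if __name__ == "__main__":
    guard = BudgetGuard(limit_usd=1.00)
    guard.check(0.50)   # within budget, the request may proceed
    guard.record(0.50)
    try:
        guard.check(0.60)  # 0.50 + 0.60 > 1.00, so this is rejected
    except BudgetExceeded as err:
        print("rejected:", err)
```

Checking an estimate first and recording the actual cost afterwards keeps the rejection decision local, so an over-budget call never generates provider charges.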

MIT licensed. Built with Claude Code. Source on GitHub