Fireworks AI
Fireworks AI is a high-performance AI inference platform that provides fast, affordable access to over 200 open-source and proprietary AI models. The platform specializes in production-grade inference with ultra-low latency, offering models including Llama, Qwen, DeepSeek, Mistral, Google Gemma, FLUX image models, and more. Fireworks features serverless deployment, custom fine-tuning capabilities, and competitive pricing per 1M tokens. The platform is known for its speed and reliability, with models available through OpenAI-compatible APIs and dedicated instances for enterprise workloads.
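Because the models are served through an OpenAI-compatible API, the standard `openai` Python client can be pointed at the Fireworks endpoint. A minimal sketch, assuming the `openai` package is installed, a `FIREWORKS_API_KEY` environment variable is set, and the `accounts/fireworks/models/<slug>` identifier convention (the slugs appear in the table below):

```python
import os


def model_id(slug: str) -> str:
    """Expand a table slug (e.g. "llama-v3p1-8b-instruct") into the full
    Fireworks model identifier used in API requests."""
    return f"accounts/fireworks/models/{slug}"


def ask(prompt: str, slug: str = "llama-v3p1-8b-instruct") -> str:
    """Send one chat completion request to the OpenAI-compatible endpoint."""
    from openai import OpenAI  # imported lazily; requires `pip install openai`

    client = OpenAI(
        base_url="https://api.fireworks.ai/inference/v1",
        api_key=os.environ["FIREWORKS_API_KEY"],
    )
    resp = client.chat.completions.create(
        model=model_id(slug),
        messages=[{"role": "user", "content": prompt}],
        max_tokens=128,
    )
    return resp.choices[0].message.content


if __name__ == "__main__" and "FIREWORKS_API_KEY" in os.environ:
    print(ask("In one sentence, what is a mixture-of-experts model?"))
```

Swapping models is just a matter of changing the slug; the request shape stays identical across all of them.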
Browse 122 LLM models available from Fireworks AI. Compare prices and features.
Models (122)
Prices are USD per 1M tokens. A dash means no per-token price is listed (image, embedding, and reranker models); free-tier variants are marked "(free)" in the model name.

| Organization | Model Name | Model ID | Input ($/1M) | Output ($/1M) |
|---|---|---|---|---|
| Moonshot AI | Kimi K2.5 | kimi-k2p5 | $0.60 | $3.00 |
| Z.ai | GLM-4.7 | glm-4p7 | $0.60 | $2.20 |
| qwen | Qwen3-235B-A22B-Thinking-2507 | qwen3-235b-a22b-thinking-2507 | $0.22 | $0.88 |
| Minimax | MiniMax M2.1 | minimax-m2p1 | $0.30 | $1.20 |
| Z.ai | GLM-4.6 | glm-4p6 | $0.55 | $2.19 |
| DeepSeek | DeepSeek-R1-0528 | deepseek-r1-0528 | $1.35 | $5.40 |
| OpenAI | GPT OSS 120B | gpt-oss-120b | $0.15 | $0.60 |
| Z.ai | GLM-4.5 | glm-4p5 | $0.55 | $2.19 |
| Minimax | MiniMax M2 | minimax-m2 | $0.30 | $1.20 |
| qwen | Qwen3-235B-A22B-Instruct-2507 | qwen3-235b-a22b-instruct-2507 | $0.22 | $0.88 |
| qwen | Qwen3-Next-80B-A3B-Thinking | qwen3-next-80b-a3b-thinking | $0.90 | $0.90 |
| Z.ai | GLM-4.7-Flash | glm-4p7-flash | $0.50 | $0.50 |
| Moonshot AI | Kimi K2-Instruct-0905 | kimi-k2-instruct-0905 | $0.60 | $2.50 |
| Moonshot AI | Kimi K2 Instruct | kimi-k2-instruct | $0.60 | $2.50 |
| Z.ai | GLM-4.5-Air | glm-4p5-air | $0.22 | $0.88 |
| DeepSeek | DeepSeek-V3.1 | deepseek-v3p1 | $0.56 | $1.68 |
| qwen | Qwen3 VL 30B A3B Thinking | qwen3-vl-30b-a3b-thinking | $0.15 | $0.60 |
| qwen | Qwen3-Next-80B-A3B-Instruct | qwen3-next-80b-a3b-instruct | $0.90 | $0.90 |
| OpenAI | GPT OSS 20B | gpt-oss-20b | $0.07 | $0.30 |
| qwen | Qwen3 VL 30B A3B Instruct | qwen3-vl-30b-a3b-instruct | $0.15 | $0.60 |
| Minimax | MiniMax M1 80K | minimax-m1-80k | $0.10 | $0.10 |
| qwen | Qwen3 VL 32B Instruct | qwen3-vl-32b-instruct | $0.90 | $0.90 |
| DeepSeek | DeepSeek-V3 0324 | deepseek-v3-0324 | $0.90 | $0.90 |
| qwen | Qwen3 30B A3B | qwen3-30b-a3b | $0.15 | $0.60 |
| qwen | QwQ-32B | qwq-32b | $0.90 | $0.90 |
| DeepSeek | DeepSeek R1 Distill Llama 70B | deepseek-r1-distill-llama-70b | $0.90 | $0.90 |
| Nvidia | Nemotron Nano 9B v2 | nvidia-nemotron-nano-9b-v2 | $0.20 | $0.20 |
| DeepSeek | DeepSeek R1 Distill Qwen 32B | deepseek-r1-distill-qwen-32b | $0.90 | $0.90 |
| DeepSeek | DeepSeek R1 Distill Qwen 14B | deepseek-r1-distill-qwen-14b | $0.20 | $0.20 |
| DeepSeek | DeepSeek-V3 | deepseek-v3 | $0.90 | $0.90 |
| Meta | Llama 3.1 405B Instruct | llama-v3p1-405b-instruct | $3.00 | $3.00 |
| qwen | Qwen2.5 32B Instruct | qwen2p5-32b-instruct | $0.90 | $0.90 |
| DeepSeek | DeepSeek R1 Distill Qwen 7B | deepseek-r1-distill-qwen-7b | $0.20 | $0.20 |
| DeepSeek | DeepSeek R1 Distill Llama 8B | deepseek-r1-distill-llama-8b | $0.20 | $0.20 |
| qwen | Qwen2.5 72B Instruct | qwen2p5-72b-instruct | $0.90 | $0.90 |
| qwen | Qwen3 235B A22B | qwen3-235b-a22b | $0.22 | $0.88 |
| qwen | Qwen2.5 14B Instruct | qwen2p5-14b-instruct | $0.20 | $0.20 |
| qwen | Qwen2.5 14B Instruct | qwen-v2p5-14b-instruct | $0.20 | $0.20 |
| Mistral | Mistral Small 3 24B Instruct | mistral-small-24b-instruct-2501 | $0.90 | $0.90 |
| Google | Gemma 3 27B | gemma-3-27b-it | $0.90 | $0.90 |
| qwen | Qwen2 72B Instruct | qwen2-72b-instruct | $0.90 | $0.90 |
| Meta | Llama 3.1 70B Instruct | llama-v3p1-70b-instruct | $0.90 | $0.90 |
| Google | Gemma 3 12B | gemma-3-12b-it | $0.20 | $0.20 |
| DeepSeek | DeepSeek R1 Distill Qwen 1.5B | deepseek-r1-distill-qwen-1p5b | $0.10 | $0.10 |
| Meta | Llama 3.2 3B Instruct | llama-v3p2-3b-instruct | $0.10 | $0.10 |
| Google | Gemma 3 4B | gemma-3-4b-it | $0.20 | $0.20 |
| Meta | Llama 3.1 8B Instruct | llama-v3p1-8b-instruct | $0.20 | $0.20 |
| qwen | Qwen2 7B Instruct | qwen2-7b-instruct | $0.20 | $0.20 |
| Black Forest Labs | Flux Kontext Pro | flux-kontext-pro | - | - |
| Black Forest Labs | Flux Kontext Max | flux-kontext-max | - | - |
| DeepSeek | DeepSeek-V3.2 | deepseek-v3p2 | $0.56 | $1.68 |
| Black Forest Labs | FLUX.1-dev (FP8) | flux-1-dev-fp8 | - | - |
| Moonshot AI | Kimi K2 Thinking | kimi-k2-thinking | $0.60 | $2.50 |
| Minimax | MiniMax M2.5 | minimax-m2p5 | $0.30 | $1.20 |
| Alibaba | Qwen3 8B | qwen3-8b | $0.20 | $0.20 |
| qwen | Qwen3 Embedding 8B | qwen3-embedding-8b | - | - |
| qwen | Qwen3-Reranker-8B | qwen3-reranker-8b | - | - |
| Z.ai | GLM-5 | glm-5 | $1.00 | $3.20 |
| qwen | Qwen3 VL 235B A22B Instruct | qwen3-vl-235b-a22b-instruct | $0.22 | $0.88 |
| qwen | Qwen3-Coder 480B A35B Instruct | qwen3-coder-480b-a35b-instruct | $0.45 | $1.80 |
| DeepSeek | DeepSeek-R1 | deepseek-r1 | $1.35 | $5.40 |
| Allen Institute for AI | Molmo2 8B (free) | molmo2-8b | $0.20 | $0.20 |
| SiliconFlow | Qwen/Qwen3-Omni-30B-A3B-Instruct | qwen3-omni-30b-a3b-instruct | $0.50 | $0.50 |
| SiliconFlow | ByteDance-Seed/Seed-OSS-36B-Instruct | seed-oss-36b-instruct | $0.90 | $0.90 |
| qwen | Qwen3 VL 8B Instruct | qwen3-vl-8b-instruct | $0.20 | $0.20 |
| Nvidia | Mistral Large 3 675B Instruct 2512 | mistral-large-3-fp8 | $1.20 | $1.20 |
| Mistral | Ministral 3 (14B Instruct 2512) | ministral-3-14b-instruct-2512 | $0.20 | $0.20 |
| Mistral | Ministral 3 (8B Instruct 2512) | ministral-3-8b-instruct-2512 | $0.20 | $0.20 |
| Mistral | Ministral 3 (3B Instruct 2512) | ministral-3-3b-instruct-2512 | $0.10 | $0.10 |
| OpenAI | GPT OSS Safeguard 20B | gpt-oss-safeguard-20b | $0.50 | $0.50 |
| qwen | Qwen3 Embedding 0.6B | qwen3-embedding-0p6b | - | - |
| qwen | Qwen3 Embedding 4B | qwen3-embedding-4b | - | - |
| qwen | Qwen3-Reranker-4B | qwen3-reranker-4b | - | - |
| qwen | Qwen3-Reranker-0.6B | qwen3-reranker-0p6b | - | - |
| DeepSeek | DeepSeek-V3.1-Terminus | deepseek-v3p1-terminus | $0.56 | $1.68 |
| Z.ai | GLM-4.5V | glm-4p5v | $1.20 | $1.20 |
| SiliconFlow | Qwen/Qwen3-30B-A3B-Thinking-2507 | qwen3-30b-a3b-thinking-2507 | $0.90 | $0.90 |
| Alibaba | Qwen3-30B-A3B-Instruct-2507 | qwen3-30b-a3b-instruct-2507 | $0.50 | $0.50 |
| Mistral | Devstral Small 2505 | devstral-small-2505 | $0.90 | $0.90 |
| DeepSeek | DeepSeek Prover V2 | deepseek-prover-v2 | $1.20 | $1.20 |
| qwen | Qwen3 1.7B | qwen3-1p7b | $0.10 | $0.10 |
| qwen | Qwen3 4B (free) | qwen3-4b | $0.20 | $0.20 |
| qwen | Qwen3 32B | qwen3-32b | $0.90 | $0.90 |
| qwen | Qwen3 0.6B | qwen3-0p6b | $0.10 | $0.10 |
| Alibaba | Qwen3 14B | qwen3-14b | $0.20 | $0.20 |
| Alibaba | Qwen2.5-VL 72B Instruct | qwen2p5-vl-72b-instruct | $0.90 | $0.90 |
| Alibaba | Qwen2.5-VL 32B Instruct | qwen2p5-vl-32b-instruct | $0.90 | $0.90 |
| Alibaba | Qwen2.5-VL 7B Instruct | qwen2p5-vl-7b-instruct | $0.20 | $0.20 |
| qwen | Qwen2.5 VL 3B Instruct | qwen2p5-vl-3b-instruct | $0.20 | $0.20 |
| Meta | Llama 3 8B (Base) | llama-v3-8b | $0.20 | $0.20 |
| qwen | Qwen2.5-Coder 32B Instruct | qwen2p5-coder-32b-instruct | $0.90 | $0.90 |
| Groq | Llama Guard 3 8B | llama-guard-3-8b | $0.20 | $0.20 |
| Black Forest Labs | FLUX.1-schnell | flux-1-schnell | - | - |
| qwen | Qwen2.5-Coder 7B Instruct | qwen2p5-coder-7b-instruct | $0.20 | $0.20 |
| Nvidia | Llama 3.2 11B Vision Instruct | llama-v3p2-11b-vision-instruct | $0.20 | $0.20 |
| Azure | Llama 3.2 90B Vision Instruct | llama-v3p2-90b-vision-instruct | $0.90 | $0.90 |
| Meta | Llama 3.2 1B Instruct | llama-v3p2-1b-instruct | $0.10 | $0.10 |
| DeepSeek | DeepSeek-V2.5 | deepseek-v2p5 | $1.20 | $1.20 |
| Mistral | Mistral NeMo Instruct | mistral-nemo-instruct-2407 | $0.20 | $0.20 |
| Google | Gemma 2 9B | gemma2-9b-it | $0.20 | $0.20 |
| Nvidia | CodeGemma 7B | codegemma-7b | $0.20 | $0.20 |
| Mistral | Mistral 7B Instruct v0.3 | mistral-7b-instruct-v3 | $0.20 | $0.20 |
| Microsoft | Phi-3.5 Vision Instruct | phi-3-vision-128k-instruct | $0.20 | $0.20 |
| Microsoft | Phi-3 Mini 128K Instruct | phi-3-mini-128k-instruct | $0.10 | $0.10 |
| Google | Gemma 2B | gemma-2b-it | $0.10 | $0.10 |
| Meta | Llama Guard 2 8B | llama-guard-2-8b | $0.20 | $0.20 |
| Meta | Llama 3 8B Instruct | llama-v3-8b-instruct | $0.20 | $0.20 |
| Meta | Llama 3 70B Instruct | llama-v3-70b-instruct | $0.90 | $0.90 |
| Mistral | Mixtral 8x22B Instruct | mixtral-8x22b-instruct | $1.20 | $1.20 |
| Mistral | Mixtral 8x22B (base) | mixtral-8x22b | $1.20 | $1.20 |
| NousResearch | Nous Hermes 2 Mixtral 8x7B DPO | nous-hermes-2-mixtral-8x7b-dpo | $0.50 | $0.50 |
| Meta | CodeLlama 70B Instruct | code-llama-70b-instruct | $0.90 | $0.90 |
| Meta | CodeLlama 34B Instruct | code-llama-34b-instruct | $0.90 | $0.90 |
| Nvidia | CodeLlama 70B | code-llama-70b | $0.90 | $0.90 |
| NousResearch | OpenHermes 2.5 Mistral 7B | openhermes-2p5-mistral-7b | $0.20 | $0.20 |
| Mistral | Mistral 7B Instruct v0.2 | mistral-7b-instruct-v0p2 | $0.20 | $0.20 |
| Alibaba | Qwen1.5 72B Chat | qwen1p5-72b-chat | $0.90 | $0.90 |
| Google | Gemma 7B | gemma-7b-it | $0.20 | $0.20 |
| Mistral | Mixtral 8x7B Instruct | mixtral-8x7b-instruct | $0.50 | $0.50 |
| Meta | Llama 2 7B Chat | llama-v2-7b-chat | $0.20 | $0.20 |
| Meta | Llama 2 13B Chat | llama-v2-13b-chat | $0.20 | $0.20 |
| WandB | NVIDIA Nemotron 3 Super 120B | nvidia-nemotron-3-super-120b-a12b-fp8 | $0.90 | $0.90 |
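Since all prices above are quoted per 1M tokens, comparing models for a given workload is a one-line calculation: tokens / 1,000,000 times the per-1M rate, summed over input and output. A quick sketch, with a few prices hard-coded from rows of the table above:

```python
# Prices in USD per 1M tokens, copied from the table above.
PRICES = {
    "gpt-oss-120b": (0.15, 0.60),
    "glm-4p6": (0.55, 2.19),
    "deepseek-r1-0528": (1.35, 5.40),
}


def cost_usd(slug: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated cost of a workload: tokens / 1M * price per 1M."""
    input_price, output_price = PRICES[slug]
    return input_tokens / 1e6 * input_price + output_tokens / 1e6 * output_price


# Example: 200k input + 50k output tokens on each model.
for slug in PRICES:
    print(f"{slug}: ${cost_usd(slug, 200_000, 50_000):.4f}")
```

For instance, 200k input plus 50k output tokens on GPT OSS 120B comes to $0.03 + $0.03 = $0.06; note that models with asymmetric pricing (e.g. DeepSeek-R1-0528 at $1.35/$5.40) can be far more sensitive to output length than the input rate alone suggests.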