Fireworks AI
fireworks
Updated 26 minutes ago
Fireworks AI is a high-performance AI inference platform that provides fast, affordable access to over 200 open-source and proprietary AI models. The platform specializes in production-grade inference with ultra-low latency, offering models including Llama, Qwen, DeepSeek, Mistral, Google Gemma, FLUX image models, and more. Fireworks features serverless deployment, custom fine-tuning capabilities, and competitive pricing per 1M tokens. The platform is known for its speed and reliability, with models available through OpenAI-compatible APIs and dedicated instances for enterprise workloads.
Browse 19 LLM models available from Fireworks AI. Compare prices and features.
Models (19)
| Organization | Model Name | Original Model | Input | Output | Free | |||
|---|---|---|---|---|---|---|---|---|
|
|
Moonshot AI | Kimi K2.5 |
kimi-k2p5
|
$0.60 | $3.00 |
|
||
|
|
Z.ai | GLM-4.7 |
glm-4p7
|
$0.60 | $2.20 |
|
||
|
|
Minimax | MiniMax M2.1 |
minimax-m2p1
|
$0.30 | $1.20 |
|
||
|
|
OpenAI | GPT OSS 120B |
gpt-oss-120b
|
$0.15 | $0.60 |
|
||
|
|
Moonshot AI | Kimi K2-Instruct-0905 |
kimi-k2-instruct-0905
|
$0.60 | $2.50 | |||
|
|
DeepSeek | DeepSeek-V3.1 |
deepseek-v3p1
|
$0.56 | $1.68 | |||
|
|
qwen | Qwen3 VL 30B A3B Thinking |
qwen3-vl-30b-a3b-thinking
|
$0.15 | $0.60 | |||
|
|
OpenAI | GPT OSS 20B |
gpt-oss-20b
|
$0.07 | $0.30 |
|
||
|
|
qwen | Qwen3 VL 30B A3B Instruct |
qwen3-vl-30b-a3b-instruct
|
$0.15 | $0.60 | |||
|
|
Meta | Llama 3.3 70B Instruct |
llama-v3p3-70b-instruct
|
$0.90 | - | |||
|
|
Black Forest Labs | Flux Kontext Pro |
flux-kontext-pro
|
$0.04 | - | |||
|
|
Black Forest Labs | Flux Kontext Max |
flux-kontext-max
|
$0.08 | - | |||
|
|
DeepSeek | deepseek-v3.2 |
deepseek-v3p2
|
$0.56 | $1.68 | |||
|
|
Black Forest Labs | flux-1-dev-fp8 |
flux-1-dev-fp8
|
$0.00 | - | |||
|
|
Moonshot AI | Kimi K2 Thinking |
kimi-k2-thinking
|
$0.60 | $2.50 | |||
|
|
Minimax | MiniMax M2.5 |
minimax-m2p5
|
$0.30 | $1.20 |
|
||
|
|
Alibaba | Qwen3 8B |
qwen3-8b
|
$0.20 | - | |||
|
|
qwen | Qwen3 Embedding 8B |
qwen3-embedding-8b
|
$0.00 | - | |||
|
|
qwen | Qwen3-Reranker-8B |
qwen3-reranker-8b
|
$0.00 | - |