DeepInfra
DeepInfra is a cloud-based AI inference platform that provides scalable, cost-effective infrastructure for deploying and running machine learning models, so users do not have to manage the underlying infrastructure themselves. The platform offers access to 100+ machine learning models across categories including text-to-image generation, object detection, automatic speech recognition (ASR), and text-to-text generation. It supports serverless, production-ready deployment and simplifies the process of deploying deep learning models. DeepInfra's stated mission is to democratize access to top AI models by providing fast, affordable ML inference. Integration is also available through platforms like OpenRouter, which offers access to 90 models.
Browse 92 models available from DeepInfra. Compare prices and features.
Models (92)
| Organization | Model Name | Original Model | Input | Output |
|---|---|---|---|---|
| DeepSeek | DeepSeek-V4-Pro-Max | deepseek-ai/DeepSeek-V4-Pro | $1.74 | $3.48 |
| Moonshot AI | Kimi K2.6 | moonshotai/Kimi-K2.6 | $0.75 | $3.50 |
| DeepSeek | DeepSeek V4 Flash | deepseek-ai/DeepSeek-V4-Flash | $0.14 | $0.28 |
| Z.ai | GLM-5.1 | zai-org/GLM-5.1 | $1.05 | $3.50 |
| Qwen | Qwen3.6-35B-A3B | Qwen/Qwen3.6-35B-A3B | $0.20 | $1.00 |
| Google | Gemma 4 31B | google/gemma-4-31B-it | $0.13 | $0.38 |
| Google | Gemma 4 26B-A4B | google/gemma-4-26B-A4B-it | $0.07 | $0.34 |
| Qwen | Qwen3.5-122B-A10B | Qwen/Qwen3.5-122B-A10B | $0.29 | $2.90 |
| Qwen | Qwen3.5-27B | Qwen/Qwen3.5-27B | $0.26 | $2.60 |
| ByteDance Seed | Seed 2.0 Pro | ByteDance/Seed-2.0-pro | $0.50 | $3.00 |
| Qwen | Qwen3.5-397B-A17B | Qwen/Qwen3.5-397B-A17B | $0.54 | $3.40 |
| Minimax | MiniMax M2.5 | MiniMaxAI/MiniMax-M2.5 | $0.15 | $1.15 |
| Qwen | Qwen3.5-35B-A3B | Qwen/Qwen3.5-35B-A3B | $0.20 | $0.95 |
| Z.ai | GLM-5 | zai-org/GLM-5 | $0.60 | $2.08 |
| StepFun | Step-3.5-Flash | stepfun-ai/Step-3.5-Flash | $0.10 | $0.30 |
| Moonshot AI | Kimi K2.5 | moonshotai/Kimi-K2.5 | $0.45 | $2.25 |
| Qwen | Qwen3.5-4B | Qwen/Qwen3.5-4B | $0.03 | $0.15 |
| Z.ai | GLM-4.7 | zai-org/GLM-4.7 | $0.40 | $1.75 |
| Z.ai | GLM-4.7-Flash | zai-org/GLM-4.7-Flash | $0.06 | $0.40 |
| Qwen | Qwen3.5-2B | Qwen/Qwen3.5-2B | $0.02 | $0.10 |
| Nvidia | Nemotron 3 Nano (30B A3B) | nvidia/Nemotron-3-Nano-30B-A3B | $0.05 | $0.20 |
| Qwen | Qwen3 Max Thinking | Qwen/Qwen3-Max | $1.20 | $6.00 |
| Qwen | Qwen3 Max Thinking | Qwen/Qwen3-Max-Thinking | $1.20 | $6.00 |
| Z.ai | GLM-4.6 | zai-org/GLM-4.6 | $0.43 | $1.74 |
| Qwen | Qwen3.5-0.8B | Qwen/Qwen3.5-0.8B | $0.01 | $0.05 |
| Qwen | Qwen3 VL 235B A22B Instruct | Qwen/Qwen3-VL-235B-A22B-Instruct | $0.20 | $0.88 |
| Qwen | Qwen3-235B-A22B-Thinking-2507 | Qwen/Qwen3-235B-A22B-Thinking-2507 | $0.23 | $2.30 |
| OpenAI | GPT OSS 120B | openai/gpt-oss-120b | $0.04 | $0.19 |
| Google | Gemini 2.5 Pro Preview 06-05 | google/gemini-2.5-pro | $1.25 | $10.00 |
| Qwen | Qwen3 VL 30B A3B Instruct | Qwen/Qwen3-VL-30B-A3B-Instruct | $0.15 | $0.60 |
| Anthropic | Claude 4 Opus | anthropic/claude-4-opus | $16.50 | $82.50 |
| Qwen | Qwen3-Next-80B-A3B-Instruct | Qwen/Qwen3-Next-80B-A3B-Instruct | $0.09 | $1.10 |
| OpenAI | GPT OSS 20B | openai/gpt-oss-20b | $0.03 | $0.14 |
| Anthropic | Claude 4 Sonnet | anthropic/claude-4-sonnet | $3.30 | $16.50 |
| DeepSeek | DeepSeek-R1-0528 | deepseek-ai/DeepSeek-R1-0528 | $0.50 | $2.15 |
| Nvidia | NVIDIA Nemotron Nano 9B V2 | nvidia/NVIDIA-Nemotron-Nano-9B-v2 | $0.04 | $0.16 |
| Qwen | Qwen3-235B-A22B-Instruct-2507 | Qwen/Qwen3-235B-A22B-Instruct-2507 | $0.07 | $0.10 |
| Google | Gemini 2.5 Flash | google/gemini-2.5-flash | $0.30 | $2.50 |
| Qwen | Qwen3 32B | Qwen/Qwen3-32B | $0.08 | $0.28 |
| Qwen | Qwen3 30B A3B | Qwen/Qwen3-30B-A3B | $0.08 | $0.28 |
| DeepSeek | DeepSeek-V3.1 | deepseek-ai/DeepSeek-V3.1 | $0.21 | $0.79 |
| Mistral | Mistral Small 3.2 24B Instruct | mistralai/Mistral-Small-3.2-24B-Instruct-2506 | $0.08 | $0.20 |
| DeepSeek | DeepSeek-V3 0324 | deepseek-ai/DeepSeek-V3-0324 | $0.20 | $0.77 |
| DeepSeek | DeepSeek R1 Distill Llama 70B | deepseek-ai/DeepSeek-R1-Distill-Llama-70B | $0.70 | $0.80 |
| DeepSeek | DeepSeek-V3 | deepseek-ai/DeepSeek-V3 | $0.32 | $0.89 |
| Google | Gemma 3 27B | google/gemma-3-27b-it | $0.08 | $0.16 |
| Google | Gemma 3 12B | google/gemma-3-12b-it | $0.04 | $0.13 |
| Mistral | Mistral Small 3 24B Instruct | mistralai/Mistral-Small-24B-Instruct-2501 | $0.05 | $0.08 |
| Microsoft | Phi 4 | microsoft/phi-4 | $0.07 | $0.14 |
| Meta | Llama 3.1 70B Instruct | meta-llama/Meta-Llama-3.1-70B-Instruct | $0.40 | $0.40 |
| Google | Gemma 3 4B | google/gemma-3-4b-it | $0.04 | $0.08 |
| Qwen | Qwen2.5 72B Instruct | Qwen/Qwen2.5-72B-Instruct | $0.36 | $0.40 |
| Google | Gemini 1.5 Flash | google/gemini-1.5-flash | $0.08 | $0.30 |
| Meta | Llama 3.1 8B Instruct | meta-llama/Meta-Llama-3.1-8B-Instruct | $0.02 | $0.05 |
| Google | Gemini 1.5 Flash 8B | google/gemini-1.5-flash-8b | $0.04 | $0.15 |
| ByteDance Seed | Seed-2.0-Mini | ByteDance/Seed-2.0-mini | $0.10 | $0.40 |
| Black Forest Labs | flux-2-klein-4b | black-forest-labs/FLUX-2-klein-4b | - | - |
| Allen Institute for AI | Olmo 3.1 32B Instruct | allenai/Olmo-3.1-32B-Instruct | $0.20 | $0.60 |
| ByteDance Seed | Seedream 4.5 | ByteDance/Seedream-4.5 | - | - |
| Black Forest Labs | Flux 2 Max | black-forest-labs/FLUX-2-max | - | - |
| Black Forest Labs | Flux 2 Pro | black-forest-labs/FLUX-2-pro | - | - |
| Qwen | Qwen3 Embedding 4B | Qwen/Qwen3-Embedding-4B | - | - |
| Qwen | Qwen3 Embedding 8B | Qwen/Qwen3-Embedding-8B | - | - |
| Nvidia | Llama 3.3 Nemotron Super 49b V1.5 | nvidia/Llama-3.3-Nemotron-Super-49B-v1.5 | $0.10 | $0.40 |
| DeepSeek | deepseek-v3.1-terminus | deepseek-ai/DeepSeek-V3.1-Terminus | $0.21 | $0.79 |
| Groq | Llama Guard 4 12B | meta-llama/Llama-Guard-4-12B | $0.18 | $0.18 |
| Alibaba | Qwen3 14B | Qwen/Qwen3-14B | $0.12 | $0.24 |
| Nvidia | Llama 3.1 Nemotron 70B Instruct | nvidia/Llama-3.1-Nemotron-70B-Instruct | $1.20 | $1.20 |
| Nvidia | Llama 3.2 11b Vision Instruct | meta-llama/Llama-3.2-11B-Vision-Instruct | $0.25 | $0.25 |
| NousResearch | Hermes 3 70B Instruct | NousResearch/Hermes-3-Llama-3.1-70B | $0.30 | $0.30 |
| NousResearch | Hermes 3 405B Instruct (free) | NousResearch/Hermes-3-Llama-3.1-405B | $1.00 | $1.00 |
| Mistral | Mistral NeMo Instruct | mistralai/Mistral-Nemo-Instruct-2407 | $0.02 | $0.04 |
| Meta | llama-3-8b-instruct | meta-llama/Meta-Llama-3-8B-Instruct | $0.03 | $0.04 |
| Anthropic | Claude Sonnet 3.7 (latest) | anthropic/claude-3-7-sonnet-latest | $3.30 | $16.50 |
| Clarifai | DeepSeek OCR | deepseek-ai/DeepSeek-OCR | $0.03 | $0.10 |
| DeepSeek | deepseek-v3.2 | deepseek-ai/DeepSeek-V3.2 | $0.26 | $0.38 |
| Black Forest Labs | Flux 1.1 Pro | black-forest-labs/FLUX-1.1-pro | - | - |
| Black Forest Labs | flux-1-kontext-dev | black-forest-labs/FLUX.1-Kontext-dev | - | - |
| Black Forest Labs | flux-2-dev | black-forest-labs/FLUX-2-dev | - | - |
| Black Forest Labs | flux-2-klein-9b | black-forest-labs/FLUX-2-klein-9b | - | - |
| Nvidia | FLUX.1-dev | black-forest-labs/FLUX-1-dev | - | - |
| Black Forest Labs | FLUX.1-schnell | black-forest-labs/FLUX-1-schnell | - | - |
| Azure | Llama 4 Maverick 17B 128E Instruct FP8 | meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8 | $0.15 | $0.60 |
| Meta | llama-4-scout-17b-16e-instruct | meta-llama/Llama-4-Scout-17B-16E-Instruct | $0.08 | $0.30 |
| Mistral | mixtral-8x7b-instruct-v0.1 | mistralai/Mixtral-8x7B-Instruct-v0.1 | $0.54 | $0.54 |
| Nvidia | NVIDIA Nemotron 3 Super | nvidia/NVIDIA-Nemotron-3-Super-120B-A12B | $0.10 | $0.50 |
| Nvidia | NVIDIA Nemotron Nano 12B v2 VL | nvidia/NVIDIA-Nemotron-Nano-12B-v2-VL | $0.20 | $0.60 |
| Qwen | Qwen Image Edit | Qwen/Qwen-Image-Edit | - | - |
| DeepInfra | Qwen3 Coder 480B A35B Instruct Turbo | Qwen/Qwen3-Coder-480B-A35B-Instruct-Turbo | $0.30 | $1.00 |
| Qwen | Qwen3 Embedding 0.6B | Qwen/Qwen3-Embedding-0.6B | - | - |
| ByteDance Seed | Seedream 4 | ByteDance/Seedream-4 | - | - |
| Alibaba | wan2.6-t2i | Wan-AI/Wan2.6-T2I | - | - |
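
To compare the Input and Output prices above for a given workload, a rough cost estimate can be sketched as follows. This is a minimal sketch, not DeepInfra's billing logic. It assumes the listed prices are in USD per 1,000,000 tokens, which this page does not state. The `PRICES` dictionary and the `estimate_cost` helper are hypothetical names introduced for illustration, and only a few rows from the table are copied in.

```python
from decimal import Decimal

# ASSUMPTION: prices are USD per 1,000,000 tokens. The listing does not
# state the unit, so adjust this constant if the real unit differs.
TOKENS_PER_PRICE_UNIT = Decimal(1_000_000)

# A few rows copied from the table: model ID -> (input price, output price).
PRICES = {
    "deepseek-ai/DeepSeek-V4-Flash": (Decimal("0.14"), Decimal("0.28")),
    "zai-org/GLM-5.1": (Decimal("1.05"), Decimal("3.50")),
    "openai/gpt-oss-120b": (Decimal("0.04"), Decimal("0.19")),
}


def estimate_cost(model_id: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of a workload (hypothetical helper)."""
    input_price, output_price = PRICES[model_id]
    total = input_price * input_tokens + output_price * output_tokens
    return total / TOKENS_PER_PRICE_UNIT


def cheapest_model(input_tokens: int, output_tokens: int) -> str:
    """Return the model ID in PRICES with the lowest estimated cost."""
    return min(PRICES, key=lambda m: estimate_cost(m, input_tokens, output_tokens))


if __name__ == "__main__":
    # Example workload: 2M input tokens and 500K output tokens.
    for model_id in PRICES:
        print(model_id, estimate_cost(model_id, 2_000_000, 500_000))
```

`Decimal` is used instead of `float` so that small per-token prices add up without binary rounding noise. Rows that list `-` for both prices have no token price here and are left out of the sketch.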