DeepInfra
DeepInfra is a cloud-based AI inference platform that provides scalable, cost-effective infrastructure for deploying and running machine learning models, so users do not have to manage the underlying infrastructure themselves. The platform offers access to 100+ machine learning models across categories including text-to-image generation, object detection, automatic speech recognition (ASR), and text-to-text generation. It supports serverless, production-ready deployment and simplifies the process of deploying deep learning models. Its stated mission is to democratize access to top AI models through fast, affordable ML inference. Integration is also available through platforms like OpenRouter, which offers access to 90 models.
Browse the 99 models available from DeepInfra and compare prices and features.
Models (99)
| Organization | Model Name | Original Model | Input | Output | Free |
|---|---|---|---|---|---|
| Anthropic | Claude Opus 4.8 | anthropic/claude-opus-4-8 | $5.00 | $25.00 | |
| qwen | Qwen3.7 Max | Qwen/Qwen3.7-Max | $2.50 | $7.50 | |
| DeepSeek | DeepSeek-V4-Pro-Max | deepseek-ai/DeepSeek-V4-Pro | $1.30 | $2.60 | |
| | Gemini 3.5 Flash | google/gemini-3.5-flash | $1.50 | $9.00 | |
| Xiaomi | MiMo-V2.5 | XiaomiMiMo/MiMo-V2.5 | $0.40 | $2.00 | |
| DeepSeek | DeepSeek V4 Flash | deepseek-ai/DeepSeek-V4-Flash | $0.10 | $0.20 | |
| StepFun | Step 3.7 Flash | stepfun-ai/Step-3.7-Flash | $0.20 | $1.15 | |
| Anthropic | Claude Opus 4.7 | anthropic/claude-opus-4-7 | $5.00 | $25.00 | |
| Moonshot AI | Kimi K2.6 | moonshotai/Kimi-K2.6 | $0.75 | $3.50 | |
| Xiaomi | MiMo-V2.5-Pro | XiaomiMiMo/MiMo-V2.5-Pro | $1.00 | $3.00 | |
| Alibaba | Qwen3.6 27B | Qwen/Qwen3.6-27B | $0.32 | $3.20 | |
| Z.ai | GLM-5.1 | zai-org/GLM-5.1 | $1.05 | $3.50 | |
| Minimax | MiniMax M2.7 | MiniMaxAI/MiniMax-M2.7 | $0.25 | $1.00 | |
| qwen | Qwen3.6 35B A3B | Qwen/Qwen3.6-35B-A3B | $0.15 | $0.95 | |
| | Gemma 4 31B | google/gemma-4-31B-it | $0.13 | $0.38 | |
| | Gemini 3.1 Pro | google/gemini-3.1-pro | $2.00 | $12.00 | |
| | Gemma 4 26B-A4B | google/gemma-4-26B-A4B-it | $0.07 | $0.34 | |
| Anthropic | Claude Sonnet 4.6 | anthropic/claude-sonnet-4-6 | $3.00 | $15.00 | |
| ByteDance Seed | Seed 2.0 Pro | ByteDance/Seed-2.0-pro | $0.50 | $3.00 | |
| | Gemini 3.1 Flash-Lite | google/gemini-3.1-flash-lite | $0.25 | $1.50 | |
| qwen | Qwen3.5-27B | Qwen/Qwen3.5-27B | $0.26 | $2.60 | |
| Minimax | MiniMax M2.5 | MiniMaxAI/MiniMax-M2.5 | $0.15 | $1.15 | |
| qwen | Qwen3.5-35B-A3B | Qwen/Qwen3.5-35B-A3B | $0.14 | $1.00 | |
| qwen | Qwen3.5-397B-A17B | Qwen/Qwen3.5-397B-A17B | $0.45 | $3.00 | |
| StepFun | Step-3.5-Flash | stepfun-ai/Step-3.5-Flash | $0.09 | $0.30 | |
| qwen | Qwen3.5 9B | Qwen/Qwen3.5-9B | $0.10 | $0.15 | |
| Moonshot AI | Kimi K2.5 | moonshotai/Kimi-K2.5 | $0.45 | $2.25 | |
| Z.ai | GLM-5 | zai-org/GLM-5 | $0.60 | $2.08 | |
| Z.ai | GLM-4.7 | zai-org/GLM-4.7 | $0.40 | $1.75 | |
| Z.ai | GLM-4.7-Flash | zai-org/GLM-4.7-Flash | $0.06 | $0.40 | |
| Nvidia | Nemotron 3 Nano (30B A3B) | nvidia/Nemotron-3-Nano-30B-A3B | $0.05 | $0.20 | |
| qwen | Qwen3 Max Thinking | Qwen/Qwen3-Max | $1.20 | $6.00 | |
| qwen | Qwen3 Max Thinking | Qwen/Qwen3-Max-Thinking | $1.20 | $6.00 | |
| Anthropic | Claude 4.5 Haiku | anthropic/claude-haiku-4-5 | $1.00 | $5.00 | |
| qwen | Qwen3 VL 235B A22B Instruct | Qwen/Qwen3-VL-235B-A22B-Instruct | $0.20 | $0.88 | |
| Z.ai | GLM-4.6 | zai-org/GLM-4.6 | $0.43 | $1.74 | |
| OpenAI | GPT OSS 120B | openai/gpt-oss-120b | $0.04 | $0.19 | |
| qwen | Qwen3-235B-A22B-Thinking-2507 | Qwen/Qwen3-235B-A22B-Thinking-2507 | $0.23 | $2.30 | |
| qwen | Qwen3 VL 30B A3B Instruct | Qwen/Qwen3-VL-30B-A3B-Instruct | $0.15 | $0.60 | |
| | Gemini 2.5 Pro Preview 06-05 | google/gemini-2.5-pro | $1.25 | $10.00 | |
| qwen | Qwen3-Next-80B-A3B-Instruct | Qwen/Qwen3-Next-80B-A3B-Instruct | $0.09 | $1.10 | |
| OpenAI | GPT OSS 20B | openai/gpt-oss-20b | $0.03 | $0.14 | |
| DeepSeek | DeepSeek-R1-0528 | deepseek-ai/DeepSeek-R1-0528 | $0.50 | $2.15 | |
| qwen | Qwen3-235B-A22B-Instruct-2507 | Qwen/Qwen3-235B-A22B-Instruct-2507 | $0.09 | $0.10 | |
| qwen | Qwen3 32B | Qwen/Qwen3-32B | $0.08 | $0.28 | |
| | Gemini 2.5 Flash | google/gemini-2.5-flash | $0.30 | $2.50 | |
| qwen | Qwen3 30B A3B | Qwen/Qwen3-30B-A3B | $0.12 | $0.50 | |
| DeepSeek | DeepSeek-V3.1 | deepseek-ai/DeepSeek-V3.1 | $0.21 | $0.79 | |
| Mistral | Mistral Small 3.2 24B Instruct | mistralai/Mistral-Small-3.2-24B-Instruct-2506 | $0.08 | $0.20 | |
| DeepSeek | DeepSeek-V3 0324 | deepseek-ai/DeepSeek-V3-0324 | $0.20 | $0.77 | |
| DeepSeek | DeepSeek-V3 | deepseek-ai/DeepSeek-V3 | $0.32 | $0.89 | |
| Meta | Llama 3.1 8B Instruct | meta-llama/Meta-Llama-3.1-8B-Instruct | $0.02 | $0.05 | |
| | Gemma 3 27B | google/gemma-3-27b-it | $0.08 | $0.16 | |
| | Gemma 3 12B | google/gemma-3-12b-it | $0.05 | $0.15 | |
| | Gemma 3 4B | google/gemma-3-4b-it | $0.05 | $0.10 | |
| Mistral | Mistral Small 3 24B Instruct | mistralai/Mistral-Small-24B-Instruct-2501 | $0.05 | $0.08 | |
| microsoft | Phi 4 | microsoft/phi-4 | $0.07 | $0.14 | |
| qwen | Qwen2.5 72B Instruct | Qwen/Qwen2.5-72B-Instruct | $0.36 | $0.40 | |
| | Gemini 1.5 Flash | google/gemini-1.5-flash | $0.08 | $0.30 | |
| | Gemini 1.5 Flash 8B | google/gemini-1.5-flash-8b | $0.04 | $0.15 | |
| Nvidia | Whisper Large v3 | openai/whisper-large-v3 | - | - | |
| Groq | Whisper Large v3 Turbo | openai/whisper-large-v3-turbo | - | - | |
| ByteDance Seed | Seed-2.0-Mini | ByteDance/Seed-2.0-mini | $0.10 | $0.40 | |
| Black Forest Labs | flux-2-klein-4b | black-forest-labs/FLUX-2-klein-4b | - | - | |
| Black Forest Labs | Flux 2 Max | black-forest-labs/FLUX-2-max | - | - | |
| Black Forest Labs | Flux 2 Pro | black-forest-labs/FLUX-2-pro | - | - | |
| | Gemini 3 Pro Image | google/gemini-3-pro-image | - | - | |
| Mistral | Voxtral Small 24B 2507 | mistralai/Voxtral-Small-24B-2507 | - | - | |
| qwen | Qwen3 Embedding 4B | Qwen/Qwen3-Embedding-4B | $0.02 | - | |
| qwen | Qwen3 Embedding 8B | Qwen/Qwen3-Embedding-8B | $0.01 | - | |
| Nvidia | Llama 3.3 Nemotron Super 49b V1.5 | nvidia/Llama-3.3-Nemotron-Super-49B-v1.5 | $0.40 | $0.40 | |
| DeepSeek | deepseek-v3.1-terminus | deepseek-ai/DeepSeek-V3.1-Terminus | $0.27 | $0.95 | |
| Groq | Llama Guard 4 12B | meta-llama/Llama-Guard-4-12B | $0.18 | $0.18 | |
| Alibaba | Qwen3 14B | Qwen/Qwen3-14B | $0.12 | $0.24 | |
| Nvidia | Llama 3.2 11b Vision Instruct | meta-llama/Llama-3.2-11B-Vision-Instruct | $0.35 | $0.35 | |
| NousResearch | Hermes 3 70B Instruct | NousResearch/Hermes-3-Llama-3.1-70B | $0.70 | $0.70 | |
| NousResearch | Hermes 3 405B Instruct (free) | NousResearch/Hermes-3-Llama-3.1-405B | $1.00 | $1.00 | |
| Mistral | Mistral NeMo Instruct | mistralai/Mistral-Nemo-Instruct-2407 | $0.02 | $0.04 | |
| DigitalOcean | All-MiniLM-L6-v2 | sentence-transformers/all-MiniLM-L6-v2 | $0.01 | - | |
| DigitalOcean | BGE M3 | BAAI/bge-m3 | $0.01 | - | |
| DeepSeek | deepseek-v3.2 | deepseek-ai/DeepSeek-V3.2 | $0.26 | $0.38 | |
| DigitalOcean | E5 Large v2 | intfloat/e5-large-v2 | $0.01 | - | |
| Black Forest Labs | Flux 1.1 Pro | black-forest-labs/FLUX-1.1-pro | - | - | |
| Black Forest Labs | flux-2-dev | black-forest-labs/FLUX-2-dev | - | - | |
| Black Forest Labs | flux-2-klein-9b | black-forest-labs/FLUX-2-klein-9b | - | - | |
| Nvidia | FLUX.1-dev | black-forest-labs/FLUX-1-dev | - | - | |
| Black Forest Labs | FLUX.1-schnell | black-forest-labs/FLUX-1-schnell | - | - | |
| Azure | Llama 4 Maverick 17B 128E Instruct FP8 | meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8 | $0.15 | $0.60 | |
| Meta | llama-4-scout-17b-16e-instruct | meta-llama/Llama-4-Scout-17B-16E-Instruct | $0.10 | $0.30 | |
| Xiaomi | mimo-v2.5-tts-voicedesign | XiaomiMiMo/MiMo-V2.5-tts-voicedesign | - | - | |
| DigitalOcean | Multi-QA-mpnet-base-dot-v1 | sentence-transformers/multi-qa-mpnet-base-dot-v1 | $0.01 | - | |
| Nvidia | Nemotron 3 Nano Omni 30B A3B Reasoning | nvidia/Nemotron-3-Nano-Omni-30B-A3B-Reasoning | $0.20 | $0.80 | |
| Nvidia | Nemotron 3 Ultra | nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B | $0.50 | $2.50 | |
| Nvidia | NVIDIA Nemotron 3 Super | nvidia/NVIDIA-Nemotron-3-Super-120B-A12B | $0.10 | $0.50 | |
| DeepInfra | Qwen3 Coder 480B A35B Instruct Turbo | Qwen/Qwen3-Coder-480B-A35B-Instruct-Turbo | $0.30 | $1.00 | |
| qwen | Qwen3 Embedding 0.6B | Qwen/Qwen3-Embedding-0.6B | $0.01 | - | |
| DigitalOcean | Qwen3 TTS VoiceDesign | Qwen/Qwen3-TTS-VoiceDesign | - | - | |
| ByteDance Seed | Seedream 4 | ByteDance/Seedream-4 | - | - | |
| Alibaba | wan2.6-t2i | Wan-AI/Wan2.6-T2I | - | - | |
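
To compare models by price, it can help to turn the Input and Output columns into a per-request estimate. The sketch below is illustrative only: the listing does not state the pricing unit, so it assumes the figures are USD per 1 million tokens, and `estimate_cost`, `TOKENS_PER_UNIT`, and `PRICES` are hypothetical names introduced here, not part of any DeepInfra API. A missing output price (shown as `-` in the table, for example on embedding models) is treated as no output charge.

```python
from typing import Optional

# Assumption: listed prices are USD per 1,000,000 tokens (the page does not say).
TOKENS_PER_UNIT = 1_000_000


def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price: float,
                  output_price: Optional[float] = None) -> float:
    """Estimate the USD cost of one request from the listed prices."""
    cost = input_tokens / TOKENS_PER_UNIT * input_price
    # Rows with "-" in the Output column are passed as None and add nothing.
    if output_price is not None:
        cost += output_tokens / TOKENS_PER_UNIT * output_price
    return cost


# A few (input, output) prices copied from the table; None stands for "-".
PRICES = {
    "deepseek-ai/DeepSeek-V4-Flash": (0.10, 0.20),
    "anthropic/claude-opus-4-8": (5.00, 25.00),
    "Qwen/Qwen3-Embedding-8B": (0.01, None),
}

if __name__ == "__main__":
    # Hypothetical request size: 2,000 input tokens and 500 output tokens.
    for model, (input_price, output_price) in PRICES.items():
        cost = estimate_cost(2_000, 500, input_price, output_price)
        print(f"{model}: ${cost:.6f}")
```

If the actual unit differs (for example per 1,000 tokens), only `TOKENS_PER_UNIT` needs to change; the comparison between models stays the same.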