DeepInfra
Updated 1 hour ago
DeepInfra is a cloud-based AI inference platform that provides scalable, cost-effective infrastructure for deploying and running machine learning models, so users do not have to manage the underlying infrastructure themselves. The platform offers access to 100+ machine learning models across categories including text-to-image generation, object detection, automatic speech recognition (ASR), and text-to-text generation. It supports serverless, production-ready deployment and simplifies the process of deploying deep learning models. DeepInfra's stated mission is to democratize access to top AI models by providing fast, affordable ML inference. Integration is also available through platforms such as OpenRouter, which offers access to 90 models.
Browse 81 LLM models available from DeepInfra. Compare prices and features.
Models (81)
| Organization | Model Name | Original Model | Input | Output | Free |
|---|---|---|---|---|---|
| Moonshot AI | Kimi K2.5 | moonshotai/Kimi-K2.5 | $0.45 | $2.25 | |
| Google | Gemini 2.5 Pro Preview 06-05 | google/gemini-2.5-pro | $1.25 | $10.00 | |
| Z.ai | GLM-4.7 | zai-org/GLM-4.7 | $0.40 | $1.75 | |
| Google | Gemini 2.5 Flash | google/gemini-2.5-flash | $0.30 | $2.50 | |
| qwen | Qwen3-235B-A22B-Thinking-2507 | Qwen/Qwen3-235B-A22B-Thinking-2507 | $0.23 | $2.30 | |
| Minimax | MiniMax M2.1 | MiniMaxAI/MiniMax-M2.1 | $0.27 | $0.95 | |
| Z.ai | GLM-4.6 | zai-org/GLM-4.6 | $0.43 | $1.74 | |
| DeepSeek | DeepSeek-R1-0528 | deepseek-ai/DeepSeek-R1-0528 | $0.50 | $2.15 | |
| OpenAI | GPT OSS 120B | openai/gpt-oss-120b | $0.04 | $0.19 | |
| Anthropic | Claude Opus 4 | anthropic/claude-4-opus | $16.50 | $82.50 | |
| qwen | Qwen3-235B-A22B-Instruct-2507 | Qwen/Qwen3-235B-A22B-Instruct-2507 | $0.07 | $0.10 | |
| Anthropic | Claude Sonnet 4 | anthropic/claude-4-sonnet | $3.30 | $16.50 | |
| Z.ai | GLM-4.7-Flash | zai-org/GLM-4.7-Flash | $0.06 | $0.40 | |
| Moonshot AI | Kimi K2-Instruct-0905 | moonshotai/Kimi-K2-Instruct-0905 | $0.40 | $2.00 | |
| Nvidia | Nemotron 3 Nano (30B A3B) | nvidia/Nemotron-3-Nano-30B-A3B | $0.05 | $0.20 | |
| DeepSeek | DeepSeek-V3.1 | deepseek-ai/DeepSeek-V3.1 | $0.21 | $0.79 | |
| qwen | Qwen3-Next-80B-A3B-Instruct | Qwen/Qwen3-Next-80B-A3B-Instruct | $0.09 | $1.10 | |
| OpenAI | GPT OSS 20B | openai/gpt-oss-20b | $0.03 | $0.14 | |
| qwen | Qwen3 VL 30B A3B Instruct | Qwen/Qwen3-VL-30B-A3B-Instruct | $0.15 | $0.60 | |
| DeepSeek | DeepSeek-V3 0324 | deepseek-ai/DeepSeek-V3-0324 | $0.20 | $0.77 | |
| qwen | Qwen3 30B A3B | Qwen/Qwen3-30B-A3B | $0.08 | $0.28 | |
| DeepSeek | DeepSeek R1 Distill Llama 70B | deepseek-ai/DeepSeek-R1-Distill-Llama-70B | $0.70 | $0.80 | |
| Nvidia | Nemotron Nano 9B v2 | nvidia/NVIDIA-Nemotron-Nano-9B-v2 | $0.04 | $0.16 | |
| qwen | Qwen3 Max | Qwen/Qwen3-Max | $1.20 | $6.00 | |
| qwen | Qwen3 Max | Qwen/Qwen3-Max-Thinking | $1.20 | $6.00 | |
| DeepSeek | DeepSeek-V3 | deepseek-ai/DeepSeek-V3 | $0.32 | $0.89 | |
| Microsoft | Phi 4 | microsoft/phi-4 | $0.07 | $0.14 | |
| Google | Gemini 1.5 Flash | google/gemini-1.5-flash | $0.08 | $0.30 | |
| qwen | Qwen2.5 72B Instruct | Qwen/Qwen2.5-72B-Instruct | $0.12 | $0.39 | |
| Mistral | Mistral Small 3.2 24B Instruct | mistralai/Mistral-Small-3.2-24B-Instruct-2506 | $0.08 | $0.20 | |
| Mistral | Mistral Small 3 24B Instruct | mistralai/Mistral-Small-24B-Instruct-2501 | $0.05 | $0.08 | |
| Google | Gemma 3 27B | google/gemma-3-27b-it | $0.08 | $0.16 | |
| Meta | Llama 3.1 70B Instruct | meta-llama/Meta-Llama-3.1-70B-Instruct | $0.40 | $0.40 | |
| Google | Gemma 3 12B | google/gemma-3-12b-it | $0.04 | $0.13 | |
| Google | Gemini 1.5 Flash 8B | google/gemini-1.5-flash-8b | $0.04 | $0.15 | |
| Google | Gemma 3 4B | google/gemma-3-4b-it | $0.04 | $0.08 | |
| Meta | Llama 3.1 8B Instruct | meta-llama/Meta-Llama-3.1-8B-Instruct | $0.02 | $0.05 | |
| Moonshot AI | Kimi K2 Thinking | moonshotai/Kimi-K2-Thinking | $0.47 | $2.00 | |
| qwen | Qwen3-Coder 480B A35B Instruct | Qwen/Qwen3-Coder-480B-A35B-Instruct | $0.40 | $1.60 | |
| DeepInfra | Qwen3 Coder 480B A35B Instruct Turbo | Qwen/Qwen3-Coder-480B-A35B-Instruct-Turbo | $0.22 | $1.00 | |
| Black Forest Labs | flux-1-kontext-dev | black-forest-labs/FLUX.1-Kontext-dev | - | - | |
| Meta | llama-3-8b-instruct | meta-llama/Meta-Llama-3-8B-Instruct | $0.03 | $0.04 | |
| Nvidia | FLUX.1-dev | black-forest-labs/FLUX-1-dev | - | - | |
| Groq | Llama Guard 4 12B | meta-llama/Llama-Guard-4-12B | $0.18 | $0.18 | |
| Alibaba | Qwen3 14B | Qwen/Qwen3-14B | $0.12 | $0.24 | |
| qwen | Qwen3 Embedding 4B | Qwen/Qwen3-Embedding-4B | - | - | |
| Meta | llama-4-scout-17b-16e-instruct | meta-llama/Llama-4-Scout-17B-16E-Instruct | $0.08 | $0.30 | |
| Nvidia | Llama 3.2 11b Vision Instruct | meta-llama/Llama-3.2-11B-Vision-Instruct | $0.05 | $0.05 | |
| Mistral | Mistral NeMo Instruct | mistralai/Mistral-Nemo-Instruct-2407 | $0.02 | $0.04 | |
| Black Forest Labs | Flux 2 Pro | black-forest-labs/FLUX-2-pro | - | - | |
| Black Forest Labs | FLUX.1-schnell | black-forest-labs/FLUX-1-schnell | - | - | |
| Black Forest Labs | Flux 2 Max | black-forest-labs/FLUX-2-max | - | - | |
| Nvidia | Llama 3.3 Nemotron Super 49b V1.5 | nvidia/Llama-3.3-Nemotron-Super-49B-v1.5 | $0.10 | $0.40 | |
| Mistral | mixtral-8x7b-instruct-v0.1 | mistralai/Mixtral-8x7B-Instruct-v0.1 | $0.54 | $0.54 | |
| DeepSeek | deepseek-v3.2 | deepseek-ai/DeepSeek-V3.2 | $0.26 | $0.38 | |
| Anthropic | Claude Sonnet 3.7 (latest) | anthropic/claude-3-7-sonnet-latest | $3.30 | $16.50 | |
| qwen | Qwen3 Embedding 0.6B | Qwen/Qwen3-Embedding-0.6B | - | - | |
| Z.ai | glm-4.6v | zai-org/GLM-4.6V | $0.30 | $0.90 | |
| qwen | Qwen3 Embedding 8B | Qwen/Qwen3-Embedding-8B | - | - | |
| Alibaba | qwen2.5-vl-32b-instruct | Qwen/Qwen2.5-VL-32B-Instruct | $0.20 | $0.60 | |
| qwen | Qwen3 VL 235B A22B Instruct | Qwen/Qwen3-VL-235B-A22B-Instruct | $0.20 | $0.88 | |
| Black Forest Labs | flux-2-dev | black-forest-labs/FLUX-2-dev | - | - | |
| qwen | Qwen Image Edit | Qwen/Qwen-Image-Edit | - | - | |
| Black Forest Labs | Flux 1.1 Pro | black-forest-labs/FLUX-1.1-pro | - | - | |
| Azure | Llama 4 Maverick 17B 128E Instruct FP8 | meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8 | $0.15 | $0.60 | |
| Allen Institute for AI | Olmo 3.1 32B Instruct | allenai/Olmo-3.1-32B-Instruct | $0.20 | $0.60 | |
| Nvidia | Llama 3.1 Nemotron 70B Instruct | nvidia/Llama-3.1-Nemotron-70B-Instruct | $1.20 | $1.20 | |
| DeepSeek | deepseek-v3.1-terminus | deepseek-ai/DeepSeek-V3.1-Terminus | $0.21 | $0.79 | |
| ByteDance Seed | Seedream 4 | ByteDance/Seedream-4 | - | - | |
| qwen | Qwen3 32B | Qwen/Qwen3-32B | $0.08 | $0.28 | |
| Black Forest Labs | flux-2-klein-4b | black-forest-labs/FLUX-2-klein-4b | - | - | |
| Black Forest Labs | flux-2-klein-9b | black-forest-labs/FLUX-2-klein-9b | - | - | |
| NousResearch | Hermes 3 70B Instruct | NousResearch/Hermes-3-Llama-3.1-70B | $0.30 | $0.30 | |
| NousResearch | Hermes 3 405B Instruct (free) | NousResearch/Hermes-3-Llama-3.1-405B | $1.00 | $1.00 | |
| Z.ai | GLM-5 | zai-org/GLM-5 | $0.80 | $2.56 | |
| ByteDance Seed | Seedream 4.5 | ByteDance/Seedream-4.5 | - | - | |
| Minimax | MiniMax M2.5 | MiniMaxAI/MiniMax-M2.5 | $0.27 | $0.95 | |
| Alibaba | wan2.6-t2i | Wan-AI/Wan2.6-T2I | - | - | |
| ByteDance Seed | Seed-2.0-Mini | ByteDance/Seed-2.0-mini | $0.10 | $0.40 | |
| Clarifai | DeepSeek OCR | deepseek-ai/DeepSeek-OCR | $0.03 | $0.10 | |
| Nvidia | nvidia-nemotron-3-super-120b-a12b | nvidia/NVIDIA-Nemotron-3-Super-120B-A12B | $0.10 | $0.50 | |
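
To compare prices for a concrete workload, the Input and Output columns can be combined into a per-request estimate. The sketch below is illustrative only. It assumes the listed prices are in USD per 1,000,000 tokens, which is a common convention but is not stated in the listing, and the names `PRICES`, `estimate_cost`, and `cheapest` are hypothetical helpers, not part of any DeepInfra API. The three rows are copied from the table above.

```python
from decimal import Decimal

# (input price, output price) for a few rows copied from the table above.
# ASSUMPTION: prices are USD per 1,000,000 tokens; the listing does not
# state the unit, so adjust TOKENS_PER_UNIT if the real unit differs.
PRICES = {
    "moonshotai/Kimi-K2.5": (Decimal("0.45"), Decimal("2.25")),
    "openai/gpt-oss-120b": (Decimal("0.04"), Decimal("0.19")),
    "meta-llama/Meta-Llama-3.1-8B-Instruct": (Decimal("0.02"), Decimal("0.05")),
}
TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request under the unit assumption above."""
    in_price, out_price = PRICES[model]
    total = in_price * input_tokens + out_price * output_tokens
    return total / TOKENS_PER_UNIT


def cheapest(input_tokens: int, output_tokens: int) -> str:
    """Return the model ID in PRICES with the lowest estimated cost."""
    return min(PRICES, key=lambda m: estimate_cost(m, input_tokens, output_tokens))


if __name__ == "__main__":
    # Example workload: 2,000 prompt tokens and 500 completion tokens.
    for model in PRICES:
        print(model, estimate_cost(model, 2_000, 500))
    print("cheapest:", cheapest(2_000, 500))
```

`Decimal` is used instead of `float` so that small per-token amounts do not pick up binary rounding noise. Rows whose prices are listed as `-` have no token price in the table and are left out of `PRICES`.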