Nvidia
nvidia
Updated 1 hour ago
NVIDIA NIM (NVIDIA Inference Microservices) is a platform that provides optimized AI model inference containers featuring industry-leading APIs for running AI models across NVIDIA's accelerated infrastructure. NIM supports models from major providers including Meta (Llama), Google (Gemma), Mistral, xAI (Grok), DeepSeek, Microsoft (Phi), Qwen, and NVIDIA's own Nemotron family. The platform offers standard APIs across multiple deployment options including cloud, on-premises, and local workstations, with microservices optimized for NVIDIA GPUs. NIM provides an OpenAI-compatible API endpoint at integrate.api.nvidia.com for easy integration, featuring over 180 models from various AI companies hosted on NVIDIA's inference infrastructure.
Browse 56 LLM models available from Nvidia. Compare prices and features.
Models (56)
| Organization | Model Name | Original Model | Input | Output | Free | |||
|---|---|---|---|---|---|---|---|---|
|
|
DeepSeek | DeepSeek V4 Pro |
deepseek-ai/deepseek-v4-pro
|
$0.44 | $0.87 |
|
||
|
|
DeepSeek | DeepSeek V4 Flash |
deepseek-ai/deepseek-v4-flash
|
$0.14 | $0.28 |
|
||
|
|
Moonshot AI | Kimi K2.6 |
moonshotai/kimi-k2.6
|
$0.00 | $0.00 | Free |
|
|
|
|
Z.ai | GLM-5.1 |
z-ai/glm-5.1
|
$0.00 | $0.00 | Free |
|
|
|
|
Gemma 4 31B |
google/gemma-4-31b-it
|
$0.00 | $0.00 | Free |
|
||
|
|
Minimax | MiniMax M2.7 |
minimaxai/minimax-m2.7
|
$0.00 | $0.00 | Free |
|
|
|
|
Nvidia | Nemotron 3 Super (120B A12B) |
nvidia/nemotron-3-super-120b-a12b
|
$0.20 | $0.80 |
|
||
|
|
Minimax | MiniMax M2.5 |
minimaxai/minimax-m2.5
|
$0.00 | $0.00 | Free | ||
|
|
qwen | Qwen3.5-122B-A10B |
qwen/qwen3.5-122b-a10b
|
$0.00 | $0.00 | Free | ||
|
|
qwen | Qwen3.5-397B-A17B |
qwen/qwen3.5-397b-a17b
|
$0.00 | $0.00 | Free | ||
|
|
StepFun | Step-3.5-Flash |
stepfun-ai/step-3.5-flash
|
$0.00 | $0.00 | Free |
|
|
|
|
Nvidia | Nemotron 3 Nano (30B A3B) |
nvidia/nemotron-3-nano-30b-a3b
|
$0.00 | $0.00 | Free | ||
|
|
qwen | Qwen3-Next-80B-A3B-Thinking |
qwen/qwen3-next-80b-a3b-thinking
|
$0.00 | $0.00 | Free | ||
|
|
qwen | Qwen3-Next-80B-A3B-Instruct |
qwen/qwen3-next-80b-a3b-instruct
|
$0.00 | $0.00 | Free | ||
|
|
OpenAI | GPT OSS 120B |
openai/gpt-oss-120b
|
$0.00 | $0.00 | Free |
|
|
|
|
Nvidia | NVIDIA Nemotron Nano 9B V2 |
nvidia/nvidia-nemotron-nano-9b-v2
|
$0.00 | $0.00 | Free | ||
|
|
OpenAI | GPT OSS 20B |
openai/gpt-oss-20b
|
$0.00 | $0.00 | Free | ||
|
|
Mistral | Magistral Small 2506 |
mistralai/magistral-small-2506
|
$0.00 | $0.00 | Free | ||
|
|
Gemma 3n E2B Instructed |
google/gemma-3n-e2b-it
|
$0.00 | $0.00 | Free | |||
|
|
Gemma 3n E4B Instructed |
google/gemma-3n-e4b-it
|
$0.00 | $0.00 | Free | |||
|
|
Nvidia | Llama-3.3 Nemotron Super 49B v1 |
nvidia/llama-3.3-nemotron-super-49b-v1
|
$0.00 | $0.00 | Free | ||
|
|
qwen | Qwen3-Coder 480B A35B Instruct |
qwen/qwen3-coder-480b-a35b-instruct
|
$0.00 | $0.00 | Free | ||
|
|
Nvidia | Llama 3.1 Nemotron Nano 8B V1 |
nvidia/llama-3.1-nemotron-nano-8b-v1
|
- | - | |||
|
|
Meta | Llama 3.3 70B Instruct |
meta/llama-3.3-70b-instruct
|
$0.00 | $0.00 | Free | ||
|
|
Meta | Llama 3.1 70B Instruct |
meta/llama-3.1-70b-instruct
|
$0.00 | $0.00 | Free | ||
|
|
Meta | Llama 3.2 3B Instruct |
meta/llama-3.2-3b-instruct
|
$0.00 | $0.00 | Free | ||
|
|
Meta | Llama 3.1 8B Instruct |
meta/llama-3.1-8b-instruct
|
$0.00 | $0.00 | Free | ||
|
|
Nvidia | Whisper Large v3 |
openai/whisper-large-v3
|
$0.00 | $0.00 | Free | ||
|
|
Black Forest Labs | flux-2-klein-4b |
black-forest-labs/flux.2-klein-4b
|
$0.00 | $0.00 | Free | ||
|
|
Nvidia | Nemotron Nano 12B 2 VL (free) |
nvidia/nemotron-nano-12b-v2-vl
|
- | - | |||
|
|
Nvidia | Phi-4-Mini |
microsoft/phi-4-mini-instruct
|
$0.00 | $0.00 | Free | ||
|
|
Nvidia | Llama 3.3 Nemotron Super 49b V1.5 |
nvidia/llama-3.3-nemotron-super-49b-v1.5
|
$0.00 | $0.00 | Free | ||
|
|
Groq | Llama Guard 4 12B |
meta/llama-guard-4-12b
|
$0.00 | $0.00 | Free | ||
|
|
Microsoft | Phi-4-multimodal-instruct |
microsoft/phi-4-multimodal-instruct
|
$0.00 | $0.00 | Free | ||
|
|
Nvidia | Llama 3.2 11b Vision Instruct |
meta/llama-3.2-11b-vision-instruct
|
$0.00 | $0.00 | Free | ||
|
|
Meta | llama-3.2-1b-instruct |
meta/llama-3.2-1b-instruct
|
$0.00 | $0.00 | Free | ||
|
|
qwen | Qwen2.5-Coder 32B Instruct |
qwen/qwen2.5-coder-32b-instruct
|
$0.00 | $0.00 | Free | ||
|
|
Mistral | Mistral 7B Instruct v0.3 |
mistralai/mistral-7b-instruct-v0.3
|
$0.00 | $0.00 | Free | ||
|
|
DigitalOcean | BGE M3 |
baai/bge-m3
|
$0.00 | $0.00 | Free | ||
|
|
Black Forest Labs | flux-1-kontext-dev |
black-forest-labs/FLUX.1-Kontext-dev
|
$0.00 | $0.00 | Free | ||
|
|
Nvidia | FLUX.1-dev |
black-forest-labs/FLUX.1-dev
|
$0.00 | $0.00 | Free | ||
|
|
Black Forest Labs | FLUX.1-schnell |
black-forest-labs/FLUX.1-schnell
|
$0.00 | $0.00 | Free | ||
|
|
gemma-2-2b-it |
google/gemma-2-2b-it
|
$0.00 | $0.00 | Free | |||
|
|
Azure | Llama-3.2-90B-Vision-Instruct |
meta/llama-3.2-90b-vision-instruct
|
$0.00 | $0.00 | Free | ||
|
|
Meta | llama-4-maverick-17b-128e-instruct |
meta/llama-4-maverick-17b-128e-instruct
|
$0.00 | $0.00 | Free | ||
|
|
Nvidia | Ministral 3 14B Instruct 2512 |
mistralai/ministral-14b-instruct-2512
|
- | - | |||
|
|
Nvidia | Mistral Large 3 675B Instruct 2512 |
mistralai/mistral-large-3-675b-instruct-2512
|
$0.00 | $0.00 | Free | ||
|
|
Mistral | mixtral-8x22b-instruct-v0.1 |
mistralai/mixtral-8x22b-instruct-v0.1
|
- | - | |||
|
|
Mistral | mixtral-8x7b-instruct-v0.1 |
mistralai/mixtral-8x7b-instruct-v0.1
|
- | - | |||
|
|
Nvidia | NeMo Retriever OCR v1 |
nvidia/nemoretriever-ocr-v1
|
- | - | |||
|
|
Nvidia | Nemotron 3 Nano Omni 30B A3B Reasoning |
nvidia/nemotron-3-nano-omni-30b-a3b-reasoning
|
$0.00 | $0.00 | Free | ||
|
|
Nvidia | Parakeet TDT 0.6B v2 |
nvidia/parakeet-tdt-0.6b-v2
|
- | - | |||
|
|
qwen | Qwen Image |
qwen/qwen-image
|
$0.00 | $0.00 | Free | ||
|
|
qwen | Qwen Image Edit |
qwen/qwen-image-edit
|
$0.00 | $0.00 | Free | ||
|
|
SiliconFlow | Seed-OSS-36B-Instruct |
bytedance/seed-oss-36b-instruct
|
$0.00 | $0.00 | Free | ||
|
|
DigitalOcean | Stable Diffusion 3.5 Large |
stabilityai/stable-diffusion-3.5-large
|
- | - |