# Nvidia

NVIDIA NIM (NVIDIA Inference Microservices) is a platform that provides optimized, containerized AI model inference for running models on NVIDIA's accelerated infrastructure. NIM supports models from major providers, including Meta (Llama), Google (Gemma), Mistral, xAI (Grok), DeepSeek, Microsoft (Phi), Qwen, and NVIDIA's own Nemotron family. The platform offers standard APIs across multiple deployment options (cloud, on-premises, and local workstations), with microservices optimized for NVIDIA GPUs. For easy integration, NIM provides an OpenAI-compatible API endpoint at integrate.api.nvidia.com, hosting over 180 models from various AI companies on NVIDIA's inference infrastructure.

## Provider Information

- **Website**:
- **Available Models**: 85

## Models

| Name | Original Name | Input Price ($ per 1M tokens) | Output Price ($ per 1M tokens) | Free | Link |
|------|---------------|-------------------------------|--------------------------------|------|------|
| kimi-k2-instruct-0905 | moonshotai/kimi-k2-instruct-0905 | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/moonshotai/kimi-k2-instruct-0905) |
| kimi-k2-thinking | moonshotai/kimi-k2-thinking | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/moonshotai/kimi-k2-thinking) |
| kimi-k2-instruct | moonshotai/kimi-k2-instruct | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/moonshotai/kimi-k2-instruct) |
| nvidia-nemotron-nano-9b-v2 | nvidia/nvidia-nemotron-nano-9b-v2 | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/nvidia/nvidia-nemotron-nano-9b-v2) |
| cosmos-nemotron-34b | nvidia/cosmos-nemotron-34b | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/nvidia/cosmos-nemotron-34b) |
| nemotron-3-nano-30b-a3b | nvidia/nemotron-3-nano-30b-a3b | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/nvidia/nemotron-3-nano-30b-a3b) |
| parakeet-tdt-0.6b-v2 | nvidia/parakeet-tdt-0.6b-v2 | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/nvidia/parakeet-tdt-0.6b-v2) |
| nemoretriever-ocr-v1 | nvidia/nemoretriever-ocr-v1 | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/nvidia/nemoretriever-ocr-v1) |
| llama-3.3-nemotron-super-49b-v1 | nvidia/llama-3.3-nemotron-super-49b-v1 | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/nvidia/llama-3.3-nemotron-super-49b-v1) |
| llama-3.1-nemotron-ultra-253b-v1 | nvidia/llama-3.1-nemotron-ultra-253b-v1 | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/nvidia/llama-3.1-nemotron-ultra-253b-v1) |
| llama-3.3-nemotron-super-49b-v1.5 | nvidia/llama-3.3-nemotron-super-49b-v1.5 | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/nvidia/llama-3.3-nemotron-super-49b-v1.5) |
| minimax-m2 | minimaxai/minimax-m2 | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/minimaxai/minimax-m2) |
| gemma-3n-e2b-it | google/gemma-3n-e2b-it | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/google/gemma-3n-e2b-it) |
| gemma-3n-e4b-it | google/gemma-3n-e4b-it | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/google/gemma-3n-e4b-it) |
| gemma-2-2b-it | google/gemma-2-2b-it | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/google/gemma-2-2b-it) |
| gemma-3-1b-it | google/gemma-3-1b-it | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/google/gemma-3-1b-it) |
| gemma-2-27b-it | google/gemma-2-27b-it | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/google/gemma-2-27b-it) |
| gemma-3-27b-it | google/gemma-3-27b-it | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/google/gemma-3-27b-it) |
| phi-3-medium-128k-instruct | microsoft/phi-3-medium-128k-instruct | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/microsoft/phi-3-medium-128k-instruct) |
| phi-3-small-128k-instruct | microsoft/phi-3-small-128k-instruct | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/microsoft/phi-3-small-128k-instruct) |
| phi-3.5-vision-instruct | microsoft/phi-3.5-vision-instruct | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/microsoft/phi-3.5-vision-instruct) |
| phi-3-small-8k-instruct | microsoft/phi-3-small-8k-instruct | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/microsoft/phi-3-small-8k-instruct) |
| phi-4-mini-instruct | microsoft/phi-4-mini-instruct | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/microsoft/phi-4-mini-instruct) |
| phi-3-medium-4k-instruct | microsoft/phi-3-medium-4k-instruct | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/microsoft/phi-3-medium-4k-instruct) |
| whisper-large-v3 | openai/whisper-large-v3 | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/openai/whisper-large-v3) |
| gpt-oss-120b | openai/gpt-oss-120b | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/openai/gpt-oss-120b) |
| qwen3-next-80b-a3b-instruct | qwen/qwen3-next-80b-a3b-instruct | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/qwen/qwen3-next-80b-a3b-instruct) |
| qwen2.5-coder-32b-instruct | qwen/qwen2.5-coder-32b-instruct | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/qwen/qwen2.5-coder-32b-instruct) |
| qwen2.5-coder-7b-instruct | qwen/qwen2.5-coder-7b-instruct | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/qwen/qwen2.5-coder-7b-instruct) |
| qwen3-235b-a22b | qwen/qwen3-235b-a22b | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/qwen/qwen3-235b-a22b) |
| qwen3-coder-480b-a35b-instruct | qwen/qwen3-coder-480b-a35b-instruct | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/qwen/qwen3-coder-480b-a35b-instruct) |
| qwq-32b | qwen/qwq-32b | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/qwen/qwq-32b) |
| qwen3-next-80b-a3b-thinking | qwen/qwen3-next-80b-a3b-thinking | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/qwen/qwen3-next-80b-a3b-thinking) |
| devstral-2-123b-instruct-2512 | mistralai/devstral-2-123b-instruct-2512 | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/mistralai/devstral-2-123b-instruct-2512) |
| mistral-large-3-675b-instruct-2512 | mistralai/mistral-large-3-675b-instruct-2512 | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/mistralai/mistral-large-3-675b-instruct-2512) |
| ministral-14b-instruct-2512 | mistralai/ministral-14b-instruct-2512 | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/mistralai/ministral-14b-instruct-2512) |
| mamba-codestral-7b-v0.1 | mistralai/mamba-codestral-7b-v0.1 | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/mistralai/mamba-codestral-7b-v0.1) |
| mistral-small-3.1-24b-instruct-2503 | mistralai/mistral-small-3.1-24b-instruct-2503 | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/mistralai/mistral-small-3.1-24b-instruct-2503) |
| llama-3.2-11b-vision-instruct | meta/llama-3.2-11b-vision-instruct | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/meta/llama-3.2-11b-vision-instruct) |
| llama3-70b-instruct | meta/llama3-70b-instruct | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/meta/llama3-70b-instruct) |
| llama-3.3-70b-instruct | meta/llama-3.3-70b-instruct | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/meta/llama-3.3-70b-instruct) |
| llama-3.2-1b-instruct | meta/llama-3.2-1b-instruct | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/meta/llama-3.2-1b-instruct) |
| llama-4-scout-17b-16e-instruct | meta/llama-4-scout-17b-16e-instruct | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/meta/llama-4-scout-17b-16e-instruct) |
| llama-4-maverick-17b-128e-instruct | meta/llama-4-maverick-17b-128e-instruct | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/meta/llama-4-maverick-17b-128e-instruct) |
| llama-3.1-405b-instruct | meta/llama-3.1-405b-instruct | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/meta/llama-3.1-405b-instruct) |
| llama3-8b-instruct | meta/llama3-8b-instruct | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/meta/llama3-8b-instruct) |
| llama-3.1-70b-instruct | meta/llama-3.1-70b-instruct | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/meta/llama-3.1-70b-instruct) |
| deepseek-v3.1-terminus | deepseek-ai/deepseek-v3.1-terminus | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/deepseek-ai/deepseek-v3.1-terminus) |
| deepseek-v3.1 | deepseek-ai/deepseek-v3.1 | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/deepseek-ai/deepseek-v3.1) |
| glm4.7 | z-ai/glm4.7 | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/z-ai/glm4.7) |
| deepseek-v3.2 | deepseek-ai/deepseek-v3.2 | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/deepseek-ai/deepseek-v3.2) |
| kimi-k2.5 | moonshotai/kimi-k2.5 | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/moonshotai/kimi-k2.5) |
| nemotron-nano-12b-v2-vl | nvidia/nemotron-nano-12b-v2-vl | | | | [View](https://build.nvidia.com/nvidia/nemotron-nano-12b-v2-vl) |
| seed-oss-36b-instruct | bytedance/seed-oss-36b-instruct | | | | [View](https://build.nvidia.com/bytedance/seed-oss-36b-instruct) |
| FLUX.1-Kontext-dev | black-forest-labs/FLUX.1-Kontext-dev | | | | [View](https://build.nvidia.com/black-forest-labs/FLUX.1-Kontext-dev) |
| gpt-oss-20b | openai/gpt-oss-20b | | | | [View](https://build.nvidia.com/openai/gpt-oss-20b) |
| mixtral-8x22b-instruct-v0.1 | mistralai/mixtral-8x22b-instruct-v0.1 | | | | [View](https://build.nvidia.com/mistralai/mixtral-8x22b-instruct-v0.1) |
| mixtral-8x7b-instruct-v0.1 | mistralai/mixtral-8x7b-instruct-v0.1 | | | | [View](https://build.nvidia.com/mistralai/mixtral-8x7b-instruct-v0.1) |
| magistral-small-2506 | mistralai/magistral-small-2506 | | | | [View](https://build.nvidia.com/mistralai/magistral-small-2506) |
| granite-3.3-8b-instruct | ibm/granite-3.3-8b-instruct | | | | [View](https://build.nvidia.com/ibm/granite-3.3-8b-instruct) |
| llama-3.1-8b-instruct | meta/llama-3.1-8b-instruct | | | | [View](https://build.nvidia.com/meta/llama-3.1-8b-instruct) |
| deepseek-r1-distill-llama-8b | deepseek-ai/deepseek-r1-distill-llama-8b | | | | [View](https://build.nvidia.com/deepseek-ai/deepseek-r1-distill-llama-8b) |
| llama-guard-4-12b | meta/llama-guard-4-12b | | | | [View](https://build.nvidia.com/meta/llama-guard-4-12b) |
| llama-3.1-nemotron-nano-8b-v1 | nvidia/llama-3.1-nemotron-nano-8b-v1 | | | | [View](https://build.nvidia.com/nvidia/llama-3.1-nemotron-nano-8b-v1) |
| FLUX.1-dev | black-forest-labs/FLUX.1-dev | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/black-forest-labs/FLUX.1-dev) |
| FLUX.1-schnell | black-forest-labs/FLUX.1-schnell | | | | [View](https://build.nvidia.com/black-forest-labs/FLUX.1-schnell) |
| mistral-7b-instruct-v0.3 | mistralai/mistral-7b-instruct-v0.3 | | | | [View](https://build.nvidia.com/mistralai/mistral-7b-instruct-v0.3) |
| llama-3.2-90b-vision-instruct | meta/llama-3.2-90b-vision-instruct | | | | [View](https://build.nvidia.com/meta/llama-3.2-90b-vision-instruct) |
| phi-4-multimodal-instruct | microsoft/phi-4-multimodal-instruct | | | | [View](https://build.nvidia.com/microsoft/phi-4-multimodal-instruct) |
| phi-3-mini-128k-instruct | microsoft/phi-3-mini-128k-instruct | | | | [View](https://build.nvidia.com/microsoft/phi-3-mini-128k-instruct) |
| gemma-2-9b-it | google/gemma-2-9b-it | | | | [View](https://build.nvidia.com/google/gemma-2-9b-it) |
| deepseek-r1-distill-qwen-14b | deepseek-ai/deepseek-r1-distill-qwen-14b | | | | [View](https://build.nvidia.com/deepseek-ai/deepseek-r1-distill-qwen-14b) |
| deepseek-r1-distill-qwen-32b | deepseek-ai/deepseek-r1-distill-qwen-32b | | | | [View](https://build.nvidia.com/deepseek-ai/deepseek-r1-distill-qwen-32b) |
| deepseek-r1-distill-qwen-7b | deepseek-ai/deepseek-r1-distill-qwen-7b | | | | [View](https://build.nvidia.com/deepseek-ai/deepseek-r1-distill-qwen-7b) |
| phi-3-mini-4k-instruct | microsoft/phi-3-mini-4k-instruct | | | | [View](https://build.nvidia.com/microsoft/phi-3-mini-4k-instruct) |
| qwen2-7b-instruct | qwen/qwen2-7b-instruct | | | | [View](https://build.nvidia.com/qwen/qwen2-7b-instruct) |
| qwen2.5-7b-instruct | qwen/qwen2.5-7b-instruct | | | | [View](https://build.nvidia.com/qwen/qwen2.5-7b-instruct) |
| llama-3.2-3b-instruct | meta/llama-3.2-3b-instruct | | | | [View](https://build.nvidia.com/meta/llama-3.2-3b-instruct) |
| mistral-7b-instruct-v0.2 | mistralai/mistral-7b-instruct-v0.2 | | | | [View](https://build.nvidia.com/mistralai/mistral-7b-instruct-v0.2) |
| phi-3.5-mini-instruct | microsoft/phi-3.5-mini-instruct | | | | [View](https://build.nvidia.com/microsoft/phi-3.5-mini-instruct) |
| step-3.5-flash | stepfun-ai/step-3.5-flash | | | | [View](https://build.nvidia.com/stepfun-ai/step-3.5-flash) |
| minimax-m2.1 | minimaxai/minimax-m2.1 | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/minimaxai/minimax-m2.1) |
| glm5 | z-ai/glm5 | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/z-ai/glm5) |
| qwen3.5-397b-a17b | qwen/qwen3.5-397b-a17b | | | | [View](https://build.nvidia.com/qwen/qwen3.5-397b-a17b) |
| minimax-m2.5 | minimaxai/minimax-m2.5 | | | | [View](https://build.nvidia.com/minimaxai/minimax-m2.5) |

---

[← Back to all providers](/llm.txt)
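The OpenAI-compatible endpoint mentioned above can be exercised with any OpenAI-style client. As a minimal stdlib-only sketch, the request below is built (but not sent) against the standard `/v1/chat/completions` path; `NVIDIA_API_KEY` is an assumed environment-variable name for a key obtained from build.nvidia.com, and the model name is taken from the table:

```python
import json
import os
import urllib.request

# Assumed endpoint path: integrate.api.nvidia.com exposes an
# OpenAI-compatible API, whose chat route is conventionally /v1/chat/completions.
API_URL = "https://integrate.api.nvidia.com/v1/chat/completions"


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a chat-completions POST request in the OpenAI JSON schema."""
    payload = {
        "model": model,  # e.g. "meta/llama-3.3-70b-instruct" from the table
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 64,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # NVIDIA_API_KEY is a hypothetical env-var name for illustration.
            "Authorization": f"Bearer {os.environ.get('NVIDIA_API_KEY', '')}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_request("meta/llama-3.3-70b-instruct", "Say hello in one word.")
print(req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send the call and return OpenAI-style chat-completion JSON; because the schema matches, official OpenAI SDKs should also work by pointing their `base_url` at `https://integrate.api.nvidia.com/v1`.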