# Nebius Token Factory

Nebius is a cloud platform that provides access to AI models through its Token Factory inference service. The platform offers a wide range of open-source and proprietary models, including DeepSeek, MiniMax, Kimi, and Qwen. Nebius focuses on fast, cost-effective AI inference, with competitive pricing per 1M tokens and various quantization options (fp4, fp8) to optimize performance and cost.

## Provider Information

- **Website**:
- **Available Models**: 42

## Models

| Name | Original Name | $ Input Price (per 1M) | $ Output Price (per 1M) | Free | Link |
|------|---------------|------------------------|-------------------------|------|------|
| gpt-oss-20b | openai/gpt-oss-20b | 0.05 | 0.20 | | [View](https://huggingface.co/openai/gpt-oss-20b) |
| gpt-oss-120b | openai/gpt-oss-120b | 0.15 | 0.60 | | [View](https://huggingface.co/openai/gpt-oss-120b) |
| MiniMax-M2.1 | MiniMaxAI/MiniMax-M2.1 | 0.30 | 1.20 | | [View](https://huggingface.co/MiniMaxAI/MiniMax-M2.1) |
| DeepSeek-V3.2 | deepseek-ai/DeepSeek-V3.2 | 0.30 | 0.45 | | [View](https://huggingface.co/deepseek-ai/DeepSeek-V3.2) |
| Kimi-K2-Thinking | moonshotai/Kimi-K2-Thinking | 0.60 | 2.50 | | [View](https://huggingface.co/moonshotai/Kimi-K2-Thinking) |
| Qwen3-Coder-480B-A35B-Instruct | Qwen/Qwen3-Coder-480B-A35B-Instruct | 0.40 | 1.80 | | [View](https://huggingface.co/Qwen/Qwen3-Coder-480B-A35B-Instruct) |
| Hermes-4-405B | NousResearch/Hermes-4-405B | 1.00 | 3.00 | | [View](https://huggingface.co/NousResearch/Hermes-4-405B) |
| Hermes-4-70B | NousResearch/Hermes-4-70B | 0.13 | 0.40 | | [View](https://huggingface.co/NousResearch/Hermes-4-70B) |
| GLM-4.5 | zai-org/GLM-4.5 | 0.60 | 2.20 | | [View](https://huggingface.co/zai-org/GLM-4.5) |
| GLM-4.5-Air | zai-org/GLM-4.5-Air | 0.20 | 1.20 | | [View](https://huggingface.co/zai-org/GLM-4.5-Air) |
| DeepSeek-R1-0528 | deepseek-ai/DeepSeek-R1-0528 | 0.80 | 2.40 | | [View](https://huggingface.co/deepseek-ai/DeepSeek-R1-0528) |
| DeepSeek-R1-0528 | deepseek-ai/DeepSeek-R1-0528-fast | 2.00 | 6.00 | | [View](https://huggingface.co/deepseek-ai/DeepSeek-R1-0528) |
| Qwen3-235B-A22B-Thinking-2507 | Qwen/Qwen3-235B-A22B-Thinking-2507 | 0.20 | 0.80 | | [View](https://huggingface.co/Qwen/Qwen3-235B-A22B-Thinking-2507) |
| Qwen3-235B-A22B-Instruct-2507 | Qwen/Qwen3-235B-A22B-Instruct-2507 | 0.20 | 0.60 | | [View](https://huggingface.co/Qwen/Qwen3-235B-A22B-Instruct-2507) |
| Qwen3-30B-A3B-Thinking-2507 | Qwen/Qwen3-30B-A3B-Thinking-2507 | 0.10 | 0.30 | | [View](https://huggingface.co/Qwen/Qwen3-30B-A3B-Thinking-2507) |
| Qwen3-30B-A3B-Instruct-2507 | Qwen/Qwen3-30B-A3B-Instruct-2507 | 0.10 | 0.30 | | [View](https://huggingface.co/Qwen/Qwen3-30B-A3B-Instruct-2507) |
| Qwen3-Coder-30B-A3B-Instruct | Qwen/Qwen3-Coder-30B-A3B-Instruct | 0.10 | 0.30 | | [View](https://huggingface.co/Qwen/Qwen3-Coder-30B-A3B-Instruct) |
| Qwen3-32B | Qwen/Qwen3-32B | 0.10 | 0.30 | | [View](https://huggingface.co/Qwen/Qwen3-32B) |
| Qwen3-32B | Qwen/Qwen3-32B-fast | 0.20 | 0.60 | | [View](https://huggingface.co/Qwen/Qwen3-32B) |
| Llama-3_1-Nemotron-Ultra-253B-v1 | nvidia/Llama-3_1-Nemotron-Ultra-253B-v1 | 0.60 | 1.80 | | [View](https://huggingface.co/nvidia/Llama-3_1-Nemotron-Ultra-253B-v1) |
| DeepSeek-V3-0324 | deepseek-ai/DeepSeek-V3-0324 | 0.50 | 1.50 | | [View](https://huggingface.co/deepseek-ai/DeepSeek-V3-0324) |
| DeepSeek-V3-0324 | deepseek-ai/DeepSeek-V3-0324-fast | 0.75 | 2.25 | | [View](https://huggingface.co/deepseek-ai/DeepSeek-V3-0324) |
| Llama-3.3-70B-Instruct | meta-llama/Llama-3.3-70B-Instruct | 0.13 | 0.40 | | [View](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct) |
| Llama-3.3-70B-Instruct | meta-llama/Llama-3.3-70B-Instruct-fast | 0.25 | 0.75 | | [View](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct) |
| Gemma-3-27b-it | google/gemma-3-27b-it-fast | 0.20 | 0.60 | | [View](https://huggingface.co/google/gemma-3-27b-it) |
| Gemma-3-27b-it | google/gemma-3-27b-it | 0.10 | 0.30 | | [View](https://huggingface.co/google/gemma-3-27b-it) |
| Meta-Llama-3.1-8B-Instruct | meta-llama/Meta-Llama-3.1-8B-Instruct-fast | 0.03 | 0.09 | | [View](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct) |
| Meta-Llama-3.1-8B-Instruct | meta-llama/Meta-Llama-3.1-8B-Instruct | 0.02 | 0.06 | | [View](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct) |
| Qwen2.5-Coder-7B | Qwen/Qwen2.5-Coder-7B-fast | 0.03 | 0.09 | | [View](https://huggingface.co/Qwen/Qwen2.5-Coder-7B) |
| Qwen2.5-VL-72B-Instruct | Qwen/Qwen2.5-VL-72B-Instruct | 0.25 | 0.75 | | [View](https://huggingface.co/Qwen/Qwen2.5-VL-72B-Instruct) |
| Gemma-2-2b-it | google/gemma-2-2b-it | 0.02 | 0.06 | | [View](https://huggingface.co/google/gemma-2-2b-it) |
| Meta-Llama-Guard-3-8B | meta-llama/Llama-Guard-3-8B | 0.02 | 0.06 | | [View](https://huggingface.co/meta-llama/Llama-Guard-3-8B) |
| Qwen3-Embedding-8B | Qwen/Qwen3-Embedding-8B | 0.01 | 0.00 | | [View](https://huggingface.co/Qwen/Qwen3-Embedding-8B) |
| FLUX.1-schnell | black-forest-labs/flux-schnell | | | | [View](https://huggingface.co/black-forest-labs/FLUX.1-schnell) |
| FLUX.1-dev | black-forest-labs/flux-dev | | | | [View](https://huggingface.co/black-forest-labs/FLUX.1-dev) |
| Nemotron-Nano-V2-12b | nvidia/Nemotron-Nano-V2-12b | 0.07 | 0.20 | | |
| Nemotron-3-Nano-30B-A3B | nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B | 0.06 | 0.24 | | [View](https://huggingface.co/nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-FP8) |
| GLM-4.7 | zai-org/GLM-4.7-FP8 | 0.40 | 2.00 | | [View](https://huggingface.co/zai-org/GLM-4.7-FP8) |
| Qwen3-Next-80B-A3B-Thinking | Qwen/Qwen3-Next-80B-A3B-Thinking | 0.15 | 1.20 | | [View](https://huggingface.co/Qwen/Qwen3-Next-80B-A3B-Thinking) |
| Gemma-2-9b-it | google/gemma-2-9b-it-fast | 0.03 | 0.09 | | [View](https://huggingface.co/google/gemma-2-9b-it) |
| Kimi-K2.5 | moonshotai/Kimi-K2.5 | 0.50 | 2.50 | | [View](https://huggingface.co/moonshotai/Kimi-K2.5) |
| Kimi-K2-Instruct | moonshotai/Kimi-K2-Instruct | 0.50 | 2.40 | | [View](https://huggingface.co/moonshotai/Kimi-K2-Instruct) |

---

[← Back to all providers](/llm.txt)
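As a worked example of the per-1M-token pricing in the table above, the following sketch estimates the cost of a single request. The function name and dictionary structure are illustrative only (not part of any Nebius SDK); the prices are copied from a few table entries.

```python
# Illustrative cost estimator for per-1M-token pricing.
# Model IDs and prices below are copied from the table on this page;
# estimate_cost() is a hypothetical helper, not a Nebius API.

PRICES_PER_1M = {  # model id -> (input price $, output price $) per 1M tokens
    "deepseek-ai/DeepSeek-V3.2": (0.30, 0.45),
    "Qwen/Qwen3-32B": (0.10, 0.30),
    "meta-llama/Meta-Llama-3.1-8B-Instruct": (0.02, 0.06),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated request cost in USD.

    Prices are quoted per 1M tokens, so each token count is scaled
    by its price and divided by 1,000,000.
    """
    in_price, out_price = PRICES_PER_1M[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# Example: a 10,000-token prompt with a 2,000-token completion on
# DeepSeek-V3.2 costs 10_000 * 0.30/1e6 + 2_000 * 0.45/1e6 = $0.0039.
```

Note that the `-fast` variants in the table use the same weights at a higher per-token price, so the same arithmetic applies with the larger figures.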