Nebius Token Factory
Nebius is a cloud platform that provides access to AI models through its Token Factory inference service. The platform offers a range of open-source and proprietary models, including DeepSeek, MiniMax, Kimi, Qwen, and others. Nebius focuses on fast, cost-effective inference: prices are quoted per 1M tokens, and quantized variants (FP4, FP8) are available to optimize performance and cost.
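Because prices are quoted per 1M tokens, the cost of a single request follows from simple arithmetic on its input and output token counts. The snippet below is a minimal sketch of that calculation, not an official Nebius tool: the function name `request_cost` and the token counts are hypothetical, and the prices are the Kimi K2.5 figures from the listing on this page.

```python
from decimal import Decimal

TOKENS_PER_PRICE_UNIT = Decimal(1_000_000)  # prices are quoted per 1M tokens


def request_cost(input_tokens, output_tokens, input_price, output_price):
    """Estimate the USD cost of one request.

    input_price and output_price are USD per 1M tokens, passed as strings
    so that Decimal keeps them exact (hypothetical helper, for illustration).
    """
    input_cost = Decimal(input_tokens) * Decimal(input_price)
    output_cost = Decimal(output_tokens) * Decimal(output_price)
    return (input_cost + output_cost) / TOKENS_PER_PRICE_UNIT


# Assumed workload: 12,000 prompt tokens and 3,000 completion tokens
# on Kimi K2.5 ($0.50 input, $2.50 output per 1M tokens).
cost = request_cost(12_000, 3_000, "0.50", "2.50")
print(f"Estimated cost: ${cost}")
```

Output tokens are usually priced higher than input tokens in this listing, so workloads that generate long completions are dominated by the output price.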
Browse 42 LLM models available from Nebius Token Factory. Compare prices and features.
Models (42)
| Organization | Model Name | Original Model | Input (per 1M tokens) | Output (per 1M tokens) | Free |
|---|---|---|---|---|---|
| Moonshot AI | Kimi K2.5 | moonshotai/Kimi-K2.5 | $0.50 | $2.50 | |
| qwen | Qwen3-235B-A22B-Thinking-2507 | Qwen/Qwen3-235B-A22B-Thinking-2507 | $0.20 | $0.80 | |
| Minimax | MiniMax M2.1 | MiniMaxAI/MiniMax-M2.1 | $0.30 | $1.20 | |
| DeepSeek | DeepSeek-R1-0528 | deepseek-ai/DeepSeek-R1-0528 | $0.80 | $2.40 | |
| OpenAI | GPT OSS 120B | openai/gpt-oss-120b | $0.15 | $0.60 | |
| Z.ai | GLM-4.5 | zai-org/GLM-4.5 | $0.60 | $2.20 | |
| qwen | Qwen3-235B-A22B-Instruct-2507 | Qwen/Qwen3-235B-A22B-Instruct-2507 | $0.20 | $0.60 | |
| qwen | Qwen3-Next-80B-A3B-Thinking | Qwen/Qwen3-Next-80B-A3B-Thinking | $0.15 | $1.20 | |
| Nvidia | Llama 3.1 Nemotron Ultra 253B v1 | nvidia/Llama-3_1-Nemotron-Ultra-253B-v1 | $0.60 | $1.80 | |
| Moonshot AI | Kimi K2 Instruct | moonshotai/Kimi-K2-Instruct | $0.50 | $2.40 | |
| Z.ai | GLM-4.5-Air | zai-org/GLM-4.5-Air | $0.20 | $1.20 | |
| OpenAI | GPT OSS 20B | openai/gpt-oss-20b | $0.05 | $0.20 | |
| DeepSeek | DeepSeek-V3 0324 | deepseek-ai/DeepSeek-V3-0324 | $0.50 | $1.50 | |
| Meta | Llama 3.3 70B Instruct | meta-llama/Llama-3.3-70B-Instruct | $0.13 | $0.40 | |
| | Gemma 3 27B | google/gemma-3-27b-it | $0.10 | $0.30 | |
| Meta | Llama 3.1 8B Instruct | meta-llama/Meta-Llama-3.1-8B-Instruct | $0.02 | $0.06 | |
| DeepSeek | deepseek-v3.2 | deepseek-ai/DeepSeek-V3.2 | $0.30 | $0.45 | |
| Moonshot AI | Kimi K2 Thinking | moonshotai/Kimi-K2-Thinking | $0.60 | $2.50 | |
| qwen | Qwen3-Coder 480B A35B Instruct | Qwen/Qwen3-Coder-480B-A35B-Instruct | $0.40 | $1.80 | |
| Nebius | Hermes-4 405B | NousResearch/Hermes-4-405B | $1.00 | $3.00 | |
| Nebius | Hermes 4 70B | NousResearch/Hermes-4-70B | $0.13 | $0.40 | |
| DeepSeek | DeepSeek-R1-0528 | deepseek-ai/DeepSeek-R1-0528-fast | $2.00 | $6.00 | |
| SiliconFlow | Qwen/Qwen3-30B-A3B-Thinking-2507 | Qwen/Qwen3-30B-A3B-Thinking-2507 | $0.10 | $0.30 | |
| Alibaba | qwen3-30b-a3b-instruct-2507 | Qwen/Qwen3-30B-A3B-Instruct-2507 | $0.10 | $0.30 | |
| Alibaba | Qwen3-Coder 30B-A3B Instruct | Qwen/Qwen3-Coder-30B-A3B-Instruct | $0.10 | $0.30 | |
| qwen | Qwen3 32B | Qwen/Qwen3-32B | $0.10 | $0.30 | |
| qwen | Qwen3-32B | Qwen/Qwen3-32B-fast | $0.20 | $0.60 | |
| DeepSeek | DeepSeek-V3-0324 | deepseek-ai/DeepSeek-V3-0324-fast | $0.75 | $2.25 | |
| Nebius | Llama-3.3-70B-Instruct (Fast) | meta-llama/Llama-3.3-70B-Instruct-fast | $0.25 | $0.75 | |
| | Gemma-3-27b-it | google/gemma-3-27b-it-fast | $0.20 | $0.60 | |
| Meta | Meta-Llama-3.1-8B-Instruct | meta-llama/Meta-Llama-3.1-8B-Instruct-fast | $0.03 | $0.09 | |
| qwen | Qwen2.5-Coder-7B | Qwen/Qwen2.5-Coder-7B-fast | $0.03 | $0.09 | |
| Alibaba | qwen2.5-vl-72b-instruct | Qwen/Qwen2.5-VL-72B-Instruct | $0.25 | $0.75 | |
| | gemma-2-2b-it | google/gemma-2-2b-it | $0.02 | $0.06 | |
| Groq | Llama Guard 3 8B | meta-llama/Llama-Guard-3-8B | $0.02 | $0.06 | |
| qwen | Qwen3 Embedding 8B | Qwen/Qwen3-Embedding-8B | $0.01 | $0.00 | |
| Black Forest Labs | FLUX.1-schnell | black-forest-labs/flux-schnell | - | - | |
| Black Forest Labs | FLUX.1-dev | black-forest-labs/flux-dev | - | - | |
| Nvidia | Nemotron-Nano-V2-12b | nvidia/Nemotron-Nano-V2-12b | $0.07 | $0.20 | |
| Nvidia | Nemotron-3-Nano-30B-A3B | nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B | $0.06 | $0.24 | |
| Z.ai | GLM-4.7 | zai-org/GLM-4.7-FP8 | $0.40 | $2.00 | |
| | Gemma-2-9b-it | google/gemma-2-9b-it-fast | $0.03 | $0.09 | |
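
Several entries in the listing appear twice, once under a base model ID and once with a `-fast` suffix at a higher price. The sketch below pairs each `-fast` ID with its base ID and computes the price ratio between them. It assumes the suffix marks a separately priced variant of the same model, which the listing implies but does not state; the `PRICES` dictionary is a hand-copied subset of the table, and `fast_premiums` is a hypothetical helper name.

```python
from decimal import Decimal

# Subset of the listing: model ID -> (input, output) USD per 1M tokens.
PRICES = {
    "deepseek-ai/DeepSeek-R1-0528": ("0.80", "2.40"),
    "deepseek-ai/DeepSeek-R1-0528-fast": ("2.00", "6.00"),
    "Qwen/Qwen3-32B": ("0.10", "0.30"),
    "Qwen/Qwen3-32B-fast": ("0.20", "0.60"),
    "deepseek-ai/DeepSeek-V3-0324": ("0.50", "1.50"),
    "deepseek-ai/DeepSeek-V3-0324-fast": ("0.75", "2.25"),
}

FAST_SUFFIX = "-fast"


def fast_premiums(prices):
    """Map each base model ID to (input ratio, output ratio) of fast vs base price."""
    premiums = {}
    for model_id, (fast_in, fast_out) in prices.items():
        if not model_id.endswith(FAST_SUFFIX):
            continue
        base_id = model_id[: -len(FAST_SUFFIX)]
        if base_id not in prices:
            continue  # no base entry to compare against
        base_in, base_out = prices[base_id]
        premiums[base_id] = (
            Decimal(fast_in) / Decimal(base_in),
            Decimal(fast_out) / Decimal(base_out),
        )
    return premiums


premiums = fast_premiums(PRICES)
for base_id, (in_ratio, out_ratio) in premiums.items():
    print(f"{base_id}: input x{in_ratio}, output x{out_ratio}")
```

For the three pairs copied here, the input and output ratios match within each pair, which suggests the higher price is applied as a uniform multiplier per model rather than per token type.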