# Llama 3.1 Nemotron Ultra 253B v1

Llama-3.1-Nemotron-Ultra-253B-v1 is a large language model (LLM) optimized for advanced reasoning, human-interactive chat, retrieval-augmented generation (RAG), and tool-calling tasks. Derived from Meta’s Llama-3.1-405B-Instruct, it has been significantly customized using Neural Architecture Search (NAS), resulting in enhanced efficiency, reduced memory usage, and improved inference latency. The model supports a context length of up to 128K tokens and can operate efficiently on an 8x NVIDIA H100 node.

Note: you must include `detailed thinking on` in the system prompt to enable reasoning. Please see [Usage Recommendations](https://huggingface.co/nvidia/Llama-3_1-Nemotron-Ultra-253B-v1#quick-start-and-usage-recommendations) for more.

## Model Information

- **Organization**: [Nvidia](/llm.txt)
- **Slug**: llama-3-1-nemotron-ultra-253b-v1
- **Available at Providers**: 10
- **Release Date**: April 7, 2025

### Benchmark Scores
- AIME 2025: 0.725
- GPQA: 0.7601

## Providers

| Provider | Name | $ Input (per 1M) | $ Output (per 1M) | Free | Link |
|----------|------|-----------------|------------------|------|------|
| [AIHubMix](/llm/aihubmix.txt) | Llama-3_1-Nemotron-Ultra-253B-v1 | 0.50 | 0.50 |  | [View](https://aihubmix.com/model/nvidia/Llama-3_1-Nemotron-Ultra-253B-v1) |
| [Nvidia](/llm/nvidia.txt) | llama-3.1-nemotron-ultra-253b-v1 | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/nvidia/llama-3.1-nemotron-ultra-253b-v1) |
| [Nano-GPT](/llm/nanogpt.txt) | Nvidia Nemotron Ultra 253B |  |  |  |  |
| [ValorGPT](/llm/valorgpt.txt) | Llama 3.1 Nemotron Ultra 253B v1 |  |  |  | [View](https://www.valorgpt.com/models/nvidia-llama-3.1-nemotron-ultra-253b-v1) |
| [Nebius Token Factory](/llm/nebius.txt) | Llama-3_1-Nemotron-Ultra-253B-v1 | 0.60 | 1.80 |  | [View](https://huggingface.co/nvidia/Llama-3_1-Nemotron-Ultra-253B-v1) |
| [LangDB](/llm/langdb.txt) | llama-3.1-nemotron-ultra-253b-v1 |  |  |  | [View](https://langdb.ai/app/models) |
| [Kilo Code](/llm/kilocode.txt) | NVIDIA: Llama 3.1 Nemotron Ultra 253B v1 | 0.60 | 1.80 |  |  |
| [Blackbox AI](/llm/blackboxai.txt) | NVIDIA: Llama 3.1 Nemotron Ultra 253B v1 | 0.60 | 1.80 |  |  |
| [WaveSpeed AI](/llm/wavespeed.txt) | llama-3.1-nemotron-ultra-253b-v1 | 0.66 | 1.98 |  |  |
| [Writingmate](/llm/writingmate.txt) | NVIDIA: Llama 3.1 Nemotron Ultra 253B v1 |  |  |  | [View](https://writingmate.ai/models/nvidia/llama-3.1-nemotron-ultra-253b-v1) |

---

[← Back to all providers](/llm.txt)