# Llama 3.1 Nemotron Ultra 253B v1 Llama-3.1-Nemotron-Ultra-253B-v1 is a large language model (LLM) optimized for advanced reasoning, human-interactive chat, retrieval-augmented generation (RAG), and tool-calling tasks. Derived from Meta’s Llama-3.1-405B-Instruct, it has been significantly customized using Neural Architecture Search (NAS), resulting in enhanced efficiency, reduced memory usage, and improved inference latency. The model supports a context length of up to 128K tokens and can operate efficiently on an 8x NVIDIA H100 node. Note: you must include `detailed thinking on` in the system prompt to enable reasoning. Please see [Usage Recommendations](https://huggingface.co/nvidia/Llama-3_1-Nemotron-Ultra-253B-v1#quick-start-and-usage-recommendations) for more. ## Model Information - **Organization**: [Nvidia](/llm.txt) - **Slug**: llama-3-1-nemotron-ultra-253b-v1 - **Available at Providers**: 10 - **Release Date**: April 7, 2025 ### Benchmark Scores - AIME 2025: 0.725 - GPQA: 0.7601 ## Providers | Provider | Name | $ Input (per 1M) | $ Output (per 1M) | Free | Link | |----------|------|-----------------|------------------|------|------| | [AIHubMix](/llm/aihubmix.txt) | Llama-3_1-Nemotron-Ultra-253B-v1 | 0.50 | 0.50 | | [View](https://aihubmix.com/model/nvidia/Llama-3_1-Nemotron-Ultra-253B-v1) | | [Nvidia](/llm/nvidia.txt) | llama-3.1-nemotron-ultra-253b-v1 | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/nvidia/llama-3.1-nemotron-ultra-253b-v1) | | [Nano-GPT](/llm/nanogpt.txt) | Nvidia Nemotron Ultra 253B | | | | | | [ValorGPT](/llm/valorgpt.txt) | Llama 3.1 Nemotron Ultra 253B v1 | | | | [View](https://www.valorgpt.com/models/nvidia-llama-3.1-nemotron-ultra-253b-v1) | | [Nebius Token Factory](/llm/nebius.txt) | Llama-3_1-Nemotron-Ultra-253B-v1 | 0.60 | 1.80 | | [View](https://huggingface.co/nvidia/Llama-3_1-Nemotron-Ultra-253B-v1) | | [LangDB](/llm/langdb.txt) | llama-3.1-nemotron-ultra-253b-v1 | | | | [View](https://langdb.ai/app/models) | | [Kilo Code](/llm/kilocode.txt) | NVIDIA: Llama 3.1 Nemotron Ultra 253B v1 | 0.60 | 1.80 | | | | [Blackbox AI](/llm/blackboxai.txt) | NVIDIA: Llama 3.1 Nemotron Ultra 253B v1 | 0.60 | 1.80 | | | | [WaveSpeed AI](/llm/wavespeed.txt) | llama-3.1-nemotron-ultra-253b-v1 | 0.66 | 1.98 | | | | [Writingmate](/llm/writingmate.txt) | NVIDIA: Llama 3.1 Nemotron Ultra 253B v1 | | | | [View](https://writingmate.ai/models/nvidia/llama-3.1-nemotron-ultra-253b-v1) | --- [← Back to all providers](/llm.txt)