# Llama 3.3 Nemotron Super 49B V1.5

Llama-3.3-Nemotron-Super-49B-v1.5 is a 49B-parameter, English-centric reasoning/chat model derived from Meta's Llama-3.3-70B-Instruct, with a 128K context window. It is post-trained for agentic workflows (RAG, tool calling) via SFT across math, code, science, and multi-turn chat, followed by multiple RL stages: Reward-aware Preference Optimization (RPO) for alignment, RL with Verifiable Rewards (RLVR) for step-wise reasoning, and iterative DPO to refine tool-use behavior.

A distillation-driven Neural Architecture Search ("Puzzle") replaces some attention blocks and varies FFN widths to shrink the memory footprint and improve throughput, enabling single-GPU (H100/H200) deployment while preserving instruction following and CoT quality.

In internal evaluations (NeMo-Skills, up to 16 runs, temp = 0.6, top_p = 0.95), the model reports strong reasoning/coding results, e.g., MATH500 pass@1 = 97.4, AIME-2024 = 87.5, AIME-2025 = 82.71, GPQA = 71.97, LiveCodeBench (24.10–25.02) = 73.58, and MMLU-Pro (CoT) = 79.53.

The model targets practical inference efficiency (high tokens/s, reduced VRAM) with Transformers/vLLM support and explicit "reasoning on/off" modes (chat-first defaults; greedy decoding recommended when reasoning is disabled). It is suited to building agents, assistants, and long-context retrieval systems where a balanced accuracy-to-cost trade-off and reliable tool use matter.
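The sampling defaults and the reasoning on/off modes described above can be sketched as request payloads for an OpenAI-compatible chat endpoint. This is a minimal sketch: the `/think` / `/no_think` system-prompt toggle is an assumption based on common Nemotron usage (check the official model card), while the sampling values (temp = 0.6, top_p = 0.95, greedy when reasoning is off) follow the evaluation setup reported on this page.

```python
# Sketch: chat payloads for Llama-3.3-Nemotron-Super-49B-v1.5 on an
# OpenAI-compatible endpoint. The "/think" / "/no_think" system-prompt
# toggle is an ASSUMPTION (verify against the model card); the sampling
# values mirror this page's evaluation settings.

def build_payload(user_msg: str, reasoning: bool = True) -> dict:
    """Build a chat-completions payload with reasoning toggled on or off."""
    system = "/think" if reasoning else "/no_think"
    payload = {
        "model": "nvidia/llama-3.3-nemotron-super-49b-v1.5",
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user_msg},
        ],
    }
    if reasoning:
        # Sampling used in the reported evaluations.
        payload["temperature"] = 0.6
        payload["top_p"] = 0.95
    else:
        # Greedy decoding is recommended when reasoning is disabled.
        payload["temperature"] = 0.0
    return payload

on = build_payload("Prove that the sum of two even numbers is even.")
off = build_payload("Say hi.", reasoning=False)
print(on["temperature"], off["temperature"])  # 0.6 0.0
```

The same payload shape works against any of the OpenAI-compatible providers in the table below (e.g., OpenRouter or build.nvidia.com), with only the base URL and API key changing.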
## Model Information

- **Organization**: [Nvidia](/llm.txt)
- **Slug**: llama-3-3-nemotron-super-49b-v1-5
- **Available at Providers**: 11
- **Release Date**: October 10, 2025

## Providers

| Provider | Name | $ Input (per 1M) | $ Output (per 1M) | Free | Link |
|----------|------|------------------|-------------------|------|------|
| [Nvidia](/llm/nvidia.txt) | llama-3.3-nemotron-super-49b-v1.5 | 0.00 | 0.00 | Yes | [View](https://build.nvidia.com/nvidia/llama-3.3-nemotron-super-49b-v1.5) |
| [Nano-GPT](/llm/nanogpt.txt) | Nvidia Nemotron Super 49B v1.5 | | | | |
| [OpenRouter](/llm/openrouter.txt) | Llama 3.3 Nemotron Super 49B V1.5 | 0.10 | 0.40 | | [View](https://openrouter.ai/nvidia/llama-3.3-nemotron-super-49b-v1.5) |
| [ValorGPT](/llm/valorgpt.txt) | Llama 3.3 Nemotron Super 49B V1.5 | | | | [View](https://www.valorgpt.com/models/nvidia-llama-3.3-nemotron-super-49b-v1.5) |
| [Yupp](/llm/yupp.txt) | Llama 3.3 Nemotron Super 49B V1.5 (OpenRouter) | | | | |
| [DeepInfra](/llm/deepinfra.txt) | Llama-3.3-Nemotron-Super-49B-v1.5 | 0.10 | 0.40 | | |
| [LangDB](/llm/langdb.txt) | llama-3.3-nemotron-super-49b-v1.5 | | | | [View](https://langdb.ai/app/models) |
| [Kilo Code](/llm/kilocode.txt) | NVIDIA: Llama 3.3 Nemotron Super 49B V1.5 | 0.10 | 0.40 | | |
| [Blackbox AI](/llm/blackboxai.txt) | blackboxai/nvidia/llama-3.3-nemotron-super-49b-v1.5 | | | | |
| [WaveSpeed AI](/llm/wavespeed.txt) | llama-3.3-nemotron-super-49b-v1.5 | 0.11 | 0.44 | | |
| [Writingmate](/llm/writingmate.txt) | NVIDIA: Llama 3.3 Nemotron Super 49B V1.5 | | | | [View](https://writingmate.ai/models/nvidia/llama-3.3-nemotron-super-49b-v1.5) |

---

[← Back to all providers](/llm.txt)