# inclusionAI: LLaDA2-flash-CAP

LLaDA2.0-flash-CAP is an enhanced version of LLaDA2.0-flash that significantly improves inference efficiency through Confidence-Aware Parallelism (CAP) training. Built on a 100B-total-parameter Mixture of Experts (MoE) diffusion architecture, the model decodes more tokens in parallel per step while maintaining excellent performance across a range of benchmarks (a generic sketch of the decoding pattern that CAP accelerates appears at the end of this page).

## Model Information

- **Organization**: [InclusionAI](/llm.txt)
- **Slug**: llada2-0-flash-cap
- **Available Providers**: 1

## Providers

| Provider | Name | Input $ (per 1M tokens) | Output $ (per 1M tokens) | Free | Link |
|----------|------|-------------------------|--------------------------|------|------|
| [ZenMUX](/llm/zenmux.txt) | inclusionAI: LLaDA2-flash-CAP | 0.28 | 2.85 | | |

---

[← Back to all providers](/llm.txt)
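
## Appendix: confidence-thresholded parallel decoding (illustrative)

CAP itself is a training technique; the inference-time behavior it speeds up is the confidence-thresholded parallel unmasking loop common to masked-diffusion language models: each denoising step predicts every masked position, and all positions whose prediction clears a confidence threshold are committed at once. The sketch below is a minimal, self-contained Python toy of that generic loop, not LLaDA2.0's actual implementation; `toy_model`, `MASK_ID`, `THRESHOLD`, and the single-token fallback rule are all illustrative assumptions.

```python
import numpy as np

MASK_ID = -1      # hypothetical mask-token id (outside the toy vocabulary)
VOCAB = 32        # toy vocabulary size
THRESHOLD = 0.7   # confidence required to commit a token in the current step

rng = np.random.default_rng(0)

def toy_model(tokens: np.ndarray) -> np.ndarray:
    """Stand-in for the diffusion LM: returns per-position logits.

    A real model would condition on the partially unmasked sequence;
    this toy just samples sharp random logits so some positions are
    confident enough to be committed in parallel."""
    return rng.normal(scale=5.0, size=(tokens.shape[0], VOCAB))

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def decode(seq_len: int, max_steps: int = 16) -> np.ndarray:
    # Start fully masked; each step commits, in parallel, every masked
    # position whose top prediction clears THRESHOLD.
    tokens = np.full(seq_len, MASK_ID, dtype=int)
    for step in range(max_steps):
        masked = tokens == MASK_ID
        if not masked.any():
            break
        probs = softmax(toy_model(tokens))
        best = probs.argmax(axis=-1)   # most likely token per position
        conf = probs.max(axis=-1)      # its probability (the "confidence")
        commit = masked & (conf >= THRESHOLD)
        if not commit.any():
            # Avoid stalling: commit the single most confident masked position.
            idx = np.flatnonzero(masked)[conf[masked].argmax()]
            commit[idx] = True
        tokens[commit] = best[commit]
        print(f"step {step}: committed {commit.sum()} token(s)")
    return tokens

print(decode(12))
```

The efficiency lever is the threshold check: a better-calibrated model (which is what CAP training targets) lets more positions clear the threshold per step, so the sequence finishes in fewer denoising steps.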