# inclusionAI: LLaDA2-flash-CAP

LLaDA2.0-flash-CAP is an enhanced version of LLaDA2.0-flash that significantly improves inference efficiency through Confidence-Aware Parallelism (CAP) training. Built on a 100B-total-parameter Mixture of Experts (MoE) diffusion architecture, the model decodes more tokens in parallel per step while maintaining excellent performance across a range of benchmarks (a generic sketch of the decoding pattern that CAP accelerates appears at the end of this page).

## Model Information

- **Organization**: [InclusionAI](/llm.txt)
- **Slug**: llada2-0-flash-cap
- **Available Providers**: 1

## Providers

| Provider | Name | Input $ (per 1M tokens) | Output $ (per 1M tokens) | Free | Link |
|----------|------|-------------------------|--------------------------|------|------|
| [ZenMUX](/llm/zenmux.txt) | inclusionAI: LLaDA2-flash-CAP | 0.28 | 2.85 | | |

---

[← Back to all providers](/llm.txt)
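
## Appendix: confidence-thresholded parallel decoding (illustrative)

CAP itself is a training technique; the inference-time behavior it speeds up is the confidence-thresholded parallel unmasking loop common to masked-diffusion language models: each denoising step predicts every masked position, and all positions whose prediction clears a confidence threshold are committed at once. The sketch below is a minimal, self-contained Python toy of that generic loop, not LLaDA2.0's actual implementation; `toy_model`, `MASK_ID`, `THRESHOLD`, and the single-token fallback rule are all illustrative assumptions.

```python
import numpy as np

MASK_ID = -1      # hypothetical mask-token id (outside the toy vocabulary)
VOCAB = 32        # toy vocabulary size
THRESHOLD = 0.7   # confidence required to commit a token in the current step

rng = np.random.default_rng(0)

def toy_model(tokens: np.ndarray) -> np.ndarray:
    """Stand-in for the diffusion LM: returns per-position logits.

    A real model would condition on the partially unmasked sequence;
    this toy just samples sharp random logits so some positions are
    confident enough to be committed in parallel."""
    return rng.normal(scale=5.0, size=(tokens.shape[0], VOCAB))

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def decode(seq_len: int, max_steps: int = 16) -> np.ndarray:
    # Start fully masked; each step commits, in parallel, every masked
    # position whose top prediction clears THRESHOLD.
    tokens = np.full(seq_len, MASK_ID, dtype=int)
    for step in range(max_steps):
        masked = tokens == MASK_ID
        if not masked.any():
            break
        probs = softmax(toy_model(tokens))
        best = probs.argmax(axis=-1)   # most likely token per position
        conf = probs.max(axis=-1)      # its probability (the "confidence")
        commit = masked & (conf >= THRESHOLD)
        if not commit.any():
            # Avoid stalling: commit the single most confident masked position.
            idx = np.flatnonzero(masked)[conf[masked].argmax()]
            commit[idx] = True
        tokens[commit] = best[commit]
        print(f"step {step}: committed {commit.sum()} token(s)")
    return tokens

print(decode(12))
```

The efficiency lever is the threshold check: a better-calibrated model (which is what CAP training targets) lets more positions clear the threshold per step, so the sequence finishes in fewer denoising steps.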