
inclusionAI: LLaDA2-flash-CAP


Model Information
Slug: llada2-0-flash-cap
Aliases: llada2-0-flash-cap, llada20flashcap

LLaDA2.0-flash-CAP is an enhanced version of LLaDA2.0-flash that significantly improves inference efficiency by incorporating Confidence-Aware Parallelism (CAP) training. Built on a 100B-total-parameter Mixture-of-Experts (MoE) diffusion architecture, the model decodes multiple tokens per step in parallel while maintaining strong performance across standard benchmarks.
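To make the idea concrete, below is a toy sketch of confidence-aware parallel decoding for a masked diffusion LM: at each step the denoiser scores every still-masked position, and all positions whose confidence clears a threshold are committed at once. The dummy scorer, threshold value, and token ids are all illustrative assumptions, not LLaDA2.0's actual implementation.

```python
# Toy sketch of confidence-aware parallel decoding (illustrative only;
# not LLaDA2.0's real denoiser or training procedure).
import random

MASK = -1  # placeholder id for a still-masked position

def dummy_denoiser(seq):
    """Stand-in for the diffusion model: for each masked position,
    propose a token id and a confidence score in [0, 1]."""
    return {i: (random.randrange(1000), random.random())
            for i, tok in enumerate(seq) if tok == MASK}

def decode(length=16, threshold=0.7):
    seq = [MASK] * length
    steps = 0
    while MASK in seq:
        steps += 1
        proposals = dummy_denoiser(seq)
        # Commit every position whose confidence clears the threshold
        # in a single step -- this is the "parallel" part.
        confident = {i: tok for i, (tok, c) in proposals.items()
                     if c >= threshold}
        if not confident:
            # Fallback: commit the single most confident token so the
            # loop always makes progress.
            i, (tok, _) = max(proposals.items(), key=lambda kv: kv[1][1])
            confident = {i: tok}
        for i, tok in confident.items():
            seq[i] = tok
    return seq, steps

if __name__ == "__main__":
    _, steps = decode()
    print(f"decoded 16 tokens in {steps} steps")
```

Because several tokens are typically committed per step, the number of denoiser calls is far below the sequence length; CAP training teaches the model to emit confidence estimates reliable enough to make this safe.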

Available at 1 Provider
Provider: ZenMUX
Model Name: inclusionAI: LLaDA2-flash-CAP
Original Model: inclusionai/llada2.0-flash-cap
Input: $0.28 per 1M tokens
Output: $2.85 per 1M tokens
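Below is a minimal sketch of calling the model through ZenMUX, assuming it exposes an OpenAI-compatible chat completions endpoint; the base URL and environment variable name are assumptions, so check ZenMUX's documentation for the real values. Only the model identifier comes from the listing above.

```python
# Minimal sketch: querying inclusionai/llada2.0-flash-cap via ZenMUX,
# assuming an OpenAI-compatible API (endpoint and env var are assumed).
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://zenmux.ai/api/v1",   # assumed endpoint
    api_key=os.environ["ZENMUX_API_KEY"],  # assumed env var name
)

resp = client.chat.completions.create(
    model="inclusionai/llada2.0-flash-cap",  # identifier from the table
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)
```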