# Cerebras

Cerebras Systems is known for its Wafer-Scale Engine (WSE), the largest chip ever built, featuring massive parallel processing capabilities. Its CS-2 systems use the WSE for both training and inference, providing high-performance AI compute with specialized LLM inference services. Cerebras offers inference through Cerebras Cloud as well as on-premises hardware, with optimized inference for models such as Meta's Llama family. The company focuses on delivering exceptional AI inference performance for large language models, particularly for enterprises that require high throughput and low latency in production AI deployments.

## Provider Information

- **Website**:
- **Available Models**: 4

## Models

| Name | Original Name | $ Input Price (per 1M) | $ Output Price (per 1M) | Free | Link |
|------|---------------|------------------------|-------------------------|------|------|
| Z.AI GLM-4.7 | zai-glm-4.7 | 2.25 | 2.75 | | |
| Qwen 3 235B Instruct | qwen-3-235b-a22b-instruct-2507 | 0.60 | 1.20 | | |
| GPT OSS 120B | gpt-oss-120b | 0.25 | 0.69 | | |
| Llama 3.1 8B | llama3.1-8b | 0.10 | 0.10 | | |

---

[← Back to all providers](/llm.txt)
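The per-1M-token prices in the table translate directly into a per-request cost estimate. A minimal sketch of that arithmetic (the `request_cost` helper is hypothetical, not part of any Cerebras API; prices are the table's USD values):

```python
def request_cost(input_tokens: int, output_tokens: int,
                 in_price: float, out_price: float) -> float:
    """Estimate request cost in USD.

    in_price / out_price are USD per 1M tokens, as listed in the table.
    """
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# Example: 2,000 input + 500 output tokens on Llama 3.1 8B ($0.10 / $0.10):
cost = request_cost(2000, 500, 0.10, 0.10)
print(f"${cost:.6f}")  # prints "$0.000250"
```

The same helper applies to any row: for GPT OSS 120B, for instance, pass `0.25` and `0.69` as the prices.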