Cerebras
Cerebras Systems is known for its Wafer-Scale Engine (WSE), the largest chip ever built, designed for massive parallel processing. Its CS-2 systems use the WSE for both training and inference, providing high-performance AI compute alongside specialized LLM inference services. Cerebras offers inference through Cerebras Cloud as well as on-premises hardware, with optimized serving for models such as Meta's Llama family. The company focuses on delivering high-throughput, low-latency inference for large language models, particularly for enterprises running production AI deployments.
Browse 4 LLM models available from Cerebras. Compare prices and features.
Models (4)
| Organization | Model Name | Original Model | Input Price | Output Price | Free |
|---|---|---|---|---|---|
| Z.ai | GLM-4.7 | `zai-glm-4.7` | $2.25 | $2.75 | |
| OpenAI | GPT OSS 120B | `gpt-oss-120b` | $0.25 | $0.69 | |
| qwen | Qwen3-235B-A22B-Instruct-2507 | `qwen-3-235b-a22b-instruct-2507` | $0.60 | $1.20 | |
| Cerebras | Llama 3.1 8B | `llama3.1-8b` | $0.10 | $0.10 | |
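
To compare these models on a concrete request, the listed prices can be turned into a per-request cost estimate. The sketch below assumes the Input/Output figures are USD per million tokens, which is the common convention for LLM pricing pages but is not stated explicitly in this listing.

```python
# Sketch: estimate the cost of one request from the price table above.
# Assumption (not stated in the listing): prices are USD per million tokens.

PRICES = {  # model id -> (input $/M tokens, output $/M tokens)
    "zai-glm-4.7": (2.25, 2.75),
    "gpt-oss-120b": (0.25, 0.69),
    "qwen-3-235b-a22b-instruct-2507": (0.60, 1.20),
    "llama3.1-8b": (0.10, 0.10),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for a single request."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# Example: a 4,000-token prompt with a 1,000-token completion on Llama 3.1 8B.
print(round(estimate_cost("llama3.1-8b", 4_000, 1_000), 6))
```

At these rates the cheapest listed model (Llama 3.1 8B) costs a small fraction of a cent per typical request, while GLM-4.7 is roughly 25x more expensive per input token.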