Groq
Groq is an AI inference company that pioneered the LPU (Language Processing Unit) in 2016, which it describes as the first chip purpose-built for AI inference. Its proprietary LPU inference engine is designed for ultra-low-latency inference: published benchmarks show Llama 2 70B running at 300 tokens per second, reportedly 10x faster than NVIDIA H100 clusters, and up to 18x faster than other providers on Anyscale's LLMPerf Leaderboard. Groq focuses on making AI inference fast and affordable at scale. It offers both cloud services and on-premises deployment, with a mission to enable real-time AI applications that latency constraints previously ruled out.
Browse 27 LLM models available from Groq. Compare prices and features.
Models (27)
| Organization | Model Name | Original Model | Input | Output | Free |
|---|---|---|---|---|---|
| Moonshot AI | Kimi K2-Instruct-0905 | moonshotai/kimi-k2-instruct-0905 | $1.00 | $3.00 | |
| OpenAI | GPT OSS 120B | openai/gpt-oss-120b | $0.15 | $0.60 | |
| OpenAI | GPT OSS 20B | openai/gpt-oss-20b | $0.08 | $0.30 | |
| Moonshot AI | Kimi K2 Instruct | moonshotai/kimi-k2-instruct | $1.00 | $3.00 | |
| qwen | Qwen3 32B | qwen/qwen3-32b | $0.29 | $0.59 | |
| DeepSeek | DeepSeek R1 Distill Llama 70B | deepseek-r1-distill-llama-70b | $0.75 | $0.99 | |
| OpenAI | gpt-oss-safeguard-20b | openai/gpt-oss-safeguard-20b | $0.08 | $0.30 | |
| Groq | Llama Guard 4 12B | meta-llama/llama-guard-4-12b | $0.20 | $0.20 | |
| Groq | Llama Guard 3 8B | llama-guard-3-8b | $0.20 | $0.20 | |
| | Gemma 2 9B | gemma2-9b-it | $0.20 | $0.20 | |
| Groq | ALLaM-2-7b | allam-2-7b | $0.00 | $0.00 | Free |
| Groq | Compound | groq/compound | $0.00 | $0.00 | Free |
| Groq | Compound Mini | groq/compound-mini | $0.00 | $0.00 | Free |
| Groq | Llama 3 70B | llama3-70b-8192 | $0.59 | $0.79 | |
| Groq | Llama 3 8B | llama3-8b-8192 | $0.05 | $0.08 | |
| Groq | Llama 3.1 8B Instant | llama-3.1-8b-instant | $0.05 | $0.08 | |
| Groq | Llama 3.3 70B Versatile | llama-3.3-70b-versatile | $0.59 | $0.79 | |
| Groq | Llama Prompt Guard 2 22M | meta-llama/llama-prompt-guard-2-22m | $0.03 | $0.03 | |
| Groq | Llama Prompt Guard 2 86M | meta-llama/llama-prompt-guard-2-86m | $0.04 | $0.04 | |
| Meta | llama-4-maverick-17b-128e-instruct | meta-llama/llama-4-maverick-17b-128e-instruct | $0.20 | $0.60 | |
| Meta | llama-4-scout-17b-16e-instruct | meta-llama/llama-4-scout-17b-16e-instruct | $0.11 | $0.34 | |
| Groq | Mistral Saba 24B | mistral-saba-24b | $0.79 | $0.79 | |
| Groq | Orpheus Arabic Saudi | canopylabs/orpheus-arabic-saudi | $40.00 | $0.00 | |
| Groq | Orpheus V1 English | canopylabs/orpheus-v1-english | $0.00 | $0.00 | Free |
| Groq | Qwen QwQ 32B | qwen-qwq-32b | $0.29 | $0.39 | |
| Nvidia | Whisper Large v3 | whisper-large-v3 | $0.00 | $0.00 | Free |
| Groq | Whisper Large v3 Turbo | whisper-large-v3-turbo | $0.00 | $0.00 | Free |
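
To compare prices across models, it can help to turn the Input and Output columns into a per-request cost estimate. The sketch below is illustrative only. It assumes the listed prices are in USD per one million tokens, which the table does not state, and the `Price` class, `PRICES` mapping, and `estimate_cost` function are hypothetical names introduced here, not part of any Groq API. The four rows are copied from the table above.

```python
from dataclasses import dataclass

# ASSUMPTION: prices are USD per one million tokens.
# The table does not state the unit, so adjust this if it differs.
TOKENS_PER_PRICE_UNIT = 1_000_000


@dataclass(frozen=True)
class Price:
    """Input and output price for one model (hypothetical helper type)."""
    input_usd: float
    output_usd: float


# A few rows copied from the table, keyed by the model identifier.
PRICES = {
    "openai/gpt-oss-120b": Price(0.15, 0.60),
    "llama-3.1-8b-instant": Price(0.05, 0.08),
    "llama-3.3-70b-versatile": Price(0.59, 0.79),
    "moonshotai/kimi-k2-instruct-0905": Price(1.00, 3.00),
}


def estimate_cost(model_id: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for the given token counts."""
    price = PRICES[model_id]
    total = input_tokens * price.input_usd + output_tokens * price.output_usd
    return total / TOKENS_PER_PRICE_UNIT


if __name__ == "__main__":
    # Compare the same workload across the sample models.
    for model_id in PRICES:
        cost = estimate_cost(model_id, input_tokens=2_000_000, output_tokens=500_000)
        print(f"{model_id}: ${cost:.2f}")
```

Keeping input and output prices separate matters because several models in the table charge noticeably more for output tokens than for input tokens, so the cheapest model depends on the shape of the workload.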