# Vercel AI Gateway

Vercel AI Gateway is an observability and routing layer for AI applications that provides analytics, cost tracking, and caching for requests to major AI providers. It offers unified access to models from OpenAI, Anthropic, Google, Meta, Mistral, and other providers through a single API endpoint, and supports rate limiting, request caching, and fallback mechanisms to improve reliability and reduce costs for AI-powered applications.

## Provider Information

- **Website**:
- **Available Models**: 165

## Models

| Name | Original Name | $ Input Price (per 1M) | $ Output Price (per 1M) | Free | Link |
|------|---------------|---------------------|----------------------|------|------|
| Grok Code Fast 1 | grok-code-fast-1 | 0.20 | 1.50 | | |
| Claude Sonnet 4.5 | claude-sonnet-4.5 | 3.00 | 15.00 | | |
| Gemini 3 Flash | gemini-3-flash | 0.50 | 3.00 | | |
| DeepSeek V3.2 | deepseek-v3.2 | 0.26 | 0.38 | | |
| Gemini 2.5 Flash Lite | gemini-2.5-flash-lite | 0.10 | 0.40 | | |
| GPT 5.2 | gpt-5.2 | 1.75 | 14.00 | | |
| Claude Haiku 4.5 | claude-haiku-4.5 | 1.00 | 5.00 | | |
| Claude Opus 4.5 | claude-opus-4.5 | 5.00 | 25.00 | | |
| Gemini 2.5 Flash | gemini-2.5-flash | 0.30 | 2.50 | | |
| GLM 4.7 | glm-4.7 | 0.43 | 1.75 | | |
| Claude 3.7 Sonnet | claude-3.7-sonnet | 3.00 | 15.00 | | |
| Grok 4.1 Fast Non-Reasoning | grok-4.1-fast-non-reasoning | 0.20 | 0.50 | | |
| GPT-5 Chat | gpt-5-chat | 1.25 | 10.00 | | |
| Claude Sonnet 4 | claude-sonnet-4 | 3.00 | 15.00 | | |
| Gemini 3 Pro Preview | gemini-3-pro-preview | 2.00 | 12.00 | | |
| GPT-5.2-Codex | gpt-5.2-codex | 1.75 | 14.00 | | |
| GPT-4.1 mini | gpt-4.1-mini | 0.40 | 1.60 | | |
| Gemini 2.5 Pro | gemini-2.5-pro | 1.25 | 10.00 | | |
| GPT-5 | gpt-5 | 1.25 | 10.00 | | |
| Grok 4.1 Fast Reasoning | grok-4.1-fast-reasoning | 0.20 | 0.50 | | |
| GPT 5.1 Thinking | gpt-5.1-thinking | 1.25 | 10.00 | | |
| GPT-5 nano | gpt-5-nano | 0.05 | 0.40 | | |
| GPT-5.1 Instant | gpt-5.1-instant | 1.25 | 10.00 | | |
| GPT-5 mini | gpt-5-mini | 0.25 | 2.00 | | |
| GPT-4o mini | gpt-4o-mini | 0.15 | 0.60 | | |
| gpt-oss-120b | gpt-oss-120b | 0.25 | 0.69 | | |
| gpt-oss-20b | gpt-oss-20b | 0.07 | 0.30 | | |
| MiniMax M2.1 | minimax-m2.1 | 0.30 | 1.20 | | |
| Grok 4 Fast Non-Reasoning | grok-4-fast-non-reasoning | 0.20 | 0.50 | | |
| Gemini 2.0 Flash | gemini-2.0-flash | 0.15 | 0.60 | | |
| gpt-oss-safeguard-20b | gpt-oss-safeguard-20b | 0.08 | 0.30 | | |
| text-embedding-3-small | text-embedding-3-small | 0.02 | | | |
| Mistral Small | mistral-small | 0.10 | 0.30 | | |
| GPT-4o | gpt-4o | 2.50 | 10.00 | | |
| GLM 4.6 | glm-4.6 | 0.45 | 1.80 | | |
| Nano Banana Pro (Gemini 3 Pro Image) | gemini-3-pro-image | 2.00 | 120.00 | | |
| Nano Banana (Gemini 2.5 Flash Image) | gemini-2.5-flash-image | 0.30 | 2.50 | | |
| Ministral 3B | ministral-3b | 0.04 | 0.04 | | |
| GPT-4.1 | gpt-4.1 | 2.00 | 8.00 | | |
| DeepSeek V3.2 Thinking | deepseek-v3.2-thinking | 0.28 | 0.42 | | |
| Mistral Embed | mistral-embed | 0.10 | | | |
| GPT-5.2 Chat | gpt-5.2-chat | 1.75 | 14.00 | | |
| Grok 4 Fast Reasoning | grok-4-fast-reasoning | 0.20 | 0.50 | | |
| Nova Lite | nova-lite | 0.06 | 0.24 | | |
| Claude 3.5 Haiku | claude-3.5-haiku | 0.80 | 4.00 | | |
| o3 | o3 | 2.00 | 8.00 | | |
| Kimi K2 0905 | kimi-k2-0905 | 1.00 | 3.00 | | |
| GPT-4.1 nano | gpt-4.1-nano | 0.10 | 0.40 | | |
| GPT-5.1 Codex mini | gpt-5.1-codex-mini | 0.25 | 2.00 | | |
| GPT-5-Codex | gpt-5-codex | 1.25 | 10.00 | | |
| GPT-5.1-Codex | gpt-5.1-codex | 1.25 | 10.00 | | |
| MiniMax M2 | minimax-m2 | 0.30 | 1.20 | | |
| Gemini 2.5 Flash Lite Preview 09-2025 | gemini-2.5-flash-lite-preview-09-2025 | 0.10 | 0.40 | | |
| DeepSeek V3 0324 | deepseek-v3 | 0.77 | 0.77 | | |
| o4-mini | o4-mini | 1.10 | 4.40 | | |
| Gemini 2.5 Flash Preview 09-2025 | gemini-2.5-flash-preview-09-2025 | 0.30 | 2.50 | | |
| text-embedding-3-large | text-embedding-3-large | 0.13 | | | |
| MiMo V2 Flash | mimo-v2-flash | 0.10 | 0.30 | | |
| DeepSeek R1 0528 | deepseek-r1 | 1.35 | 5.40 | | |
| Gemini Embedding 001 | gemini-embedding-001 | 0.15 | | | |
| Kimi K2 | kimi-k2 | 0.50 | 2.00 | | |
| Grok 4 | grok-4 | 3.00 | 15.00 | | |
| Claude Opus 4.1 | claude-opus-4.1 | 15.00 | 75.00 | | |
| Mistral Large 3 | mistral-large-3 | 0.50 | 1.50 | | |
| Sonar | sonar | 1.00 | 1.00 | | |
| Sonar Reasoning Pro | sonar-reasoning-pro | 2.00 | 8.00 | | |
| DeepSeek V3.1 | deepseek-v3.1 | 0.50 | 1.50 | | |
| Claude 3.5 Sonnet | claude-3.5-sonnet | 3.00 | 15.00 | | |
| Qwen3 Max | qwen3-max | 1.20 | 6.00 | | |
| GPT 5.1 Codex Max | gpt-5.1-codex-max | 1.25 | 10.00 | | |
| Gemini 2.0 Flash Lite | gemini-2.0-flash-lite | 0.08 | 0.30 | | |
| GLM 4.5 Air | glm-4.5-air | 0.20 | 1.10 | | |
| Morph V3 Fast | morph-v3-fast | 0.80 | 1.20 | | |
| Llama 4 Scout 17B 16E Instruct | llama-4-scout | 0.17 | 0.66 | | |
| DeepSeek V3.1 Terminus | deepseek-v3.1-terminus | 0.27 | 1.00 | | |
| Llama 4 Maverick 17B 128E Instruct | llama-4-maverick | 0.24 | 0.97 | | |
| Qwen3 Coder 480B A35B Instruct | qwen3-coder | 0.40 | 1.60 | | |
| Qwen 3 32B | qwen-3-32b | 0.10 | 0.30 | | |
| Kimi K2 Thinking | kimi-k2-thinking | 0.60 | 2.50 | | |
| Grok 3 Mini Beta | grok-3-mini | 0.30 | 0.50 | | |
| Sonar Pro | sonar-pro | 3.00 | 15.00 | | |
| o3-mini | o3-mini | 1.10 | 4.40 | | |
| GPT 5.2 Pro | gpt-5.2-pro | 21.00 | 168.00 | | |
| Pixtral 12B 2409 | pixtral-12b | 0.15 | 0.15 | | |
| Grok 2 Vision | grok-2-vision | 2.00 | 10.00 | | |
| Qwen3 Coder Plus | qwen3-coder-plus | 1.00 | 5.00 | | |
| Nvidia Nemotron Nano 12B V2 VL | nemotron-nano-12b-v2-vl | 0.20 | 0.60 | | |
| Kimi K2 Thinking Turbo | kimi-k2-thinking-turbo | 1.15 | 8.00 | | |
| Qwen3 Embedding 8B | qwen3-embedding-8b | 0.05 | | | |
| Ministral 8B | ministral-8b | 0.10 | 0.10 | | |
| GLM-4.6V-Flash | glm-4.6v-flash | | | | |
| Claude 3 Haiku | claude-3-haiku | 0.25 | 1.25 | | |
| Nova Micro | nova-micro | 0.04 | 0.14 | | |
| Claude Opus 4 | claude-opus-4 | 15.00 | 75.00 | | |
| text-embedding-ada-002 | text-embedding-ada-002 | 0.10 | | | |
| GLM 4.7 FlashX | glm-4.7-flashx | 0.06 | 0.40 | | |
| Qwen3 Next 80B A3B Instruct | qwen3-next-80b-a3b-instruct | 0.09 | 1.10 | | |
| GLM 4.7 Flash | glm-4.7-flash | | | | |
| Claude 3.5 Sonnet (2024-06-20) | claude-3.5-sonnet-20240620 | 3.00 | 15.00 | | |
| GLM 4.5 | glm-4.5 | 0.60 | 2.20 | | |
| Grok 3 Beta | grok-3 | 3.00 | 15.00 | | |
| GPT-3.5 Turbo | gpt-3.5-turbo | 0.50 | 1.50 | | |
| GPT-4 Turbo | gpt-4-turbo | 10.00 | 30.00 | | |
| Qwen3 Embedding 0.6B | qwen3-embedding-0.6b | 0.01 | | | |
| Nova 2 Lite | nova-2-lite | 0.30 | 2.50 | | |
| Mistral Medium 3.1 | mistral-medium | 0.40 | 2.00 | | |
| GPT-5 pro | gpt-5-pro | 15.00 | 120.00 | | |
| Qwen3 Next 80B A3B Thinking | qwen3-next-80b-a3b-thinking | 0.15 | 1.50 | | |
| Grok 3 Fast Beta | grok-3-fast | 5.00 | 25.00 | | |
| o1 | o1 | 15.00 | 60.00 | | |
| Morph V3 Large | morph-v3-large | 0.90 | 1.90 | | |
| Nvidia Nemotron Nano 9B V2 | nemotron-nano-9b-v2 | 0.06 | 0.23 | | |
| GLM-4.6V | glm-4.6v | 0.30 | 0.90 | | |
| Seed 1.6 | seed-1.6 | 0.25 | 2.00 | | |
| Qwen3-14B | qwen-3-14b | 0.06 | 0.24 | | |
| Grok 3 Mini Fast Beta | grok-3-mini-fast | 0.60 | 4.00 | | |
| Codex Mini | codex-mini | 1.50 | 6.00 | | |
| Command A | command-a | 2.50 | 10.00 | | |
| GLM 4.5V | glm-4.5v | 0.60 | 1.80 | | |
| Magistral Medium 2509 | magistral-medium | 2.00 | 5.00 | | |
| o3 Pro | o3-pro | 20.00 | 80.00 | | |
| Trinity Mini | trinity-mini | 0.05 | 0.15 | | |
| Nova Pro | nova-pro | 0.80 | 3.20 | | |
| Qwen3 Embedding 4B | qwen3-embedding-4b | 0.02 | | | |
| Nemotron 3 Nano 30B A3B | nemotron-3-nano-30b-a3b | 0.06 | 0.24 | | |
| Mistral Nemo | mistral-nemo | 0.15 | 0.15 | | |
| Pixtral Large | pixtral-large | 2.00 | 6.00 | | |
| Mixtral MoE 8x22B Instruct | mixtral-8x22b-instruct | 1.20 | 1.20 | | |
| Claude 3 Opus | claude-3-opus | 15.00 | 75.00 | | |
| Magistral Small 2509 | magistral-small | 0.50 | 1.50 | | |
| Qwen3 Max Preview | qwen3-max-preview | 1.20 | 6.00 | | |
| Devstral Small 1.1 | devstral-small | 0.10 | 0.30 | | |
| o3-deep-research | o3-deep-research | 10.00 | 40.00 | | |
| Imagen 4 Ultra | imagen-4.0-ultra-generate-001 | | | | |
| FLUX.1 Kontext Max | flux-kontext-max | | | | |
| FLUX.2 [klein] 9B | flux-2-klein-9b | | | | |
| FLUX.2 [max] | flux-2-max | | | | |
| Sonar Reasoning | sonar-reasoning | 1.00 | 5.00 | | |
| FLUX.1 Kontext Pro | flux-kontext-pro | | | | |
| Imagen 4 Fast | imagen-4.0-fast-generate-001 | | | | |
| Imagen 4 | imagen-4.0-generate-001 | | | | |
| GPT-3.5 Turbo Instruct | gpt-3.5-turbo-instruct | 1.50 | 2.00 | | |
| Recraft V3 | recraft-v3 | | | | |
| FLUX.2 [flex] | flux-2-flex | | | | |
| FLUX.2 [pro] | flux-2-pro | | | | |
| FLUX.2 [klein] 4B | flux-2-klein-4b | | | | |
| GPT 4o Mini Search Preview | gpt-4o-mini-search-preview | 0.15 | 0.60 | | |
| Kimi K2.5 | kimi-k2.5 | 0.50 | 2.80 | | |
| Trinity Large Preview | trinity-large-preview | 0.25 | 1.00 | | |
| Devstral 2 | devstral-2 | | | | |
| Qwen3 Coder Next | qwen3-coder-next | 0.50 | 1.20 | | |
| Claude Opus 4.6 | claude-opus-4.6 | 5.00 | 25.00 | | |
| Qwen3 235B A22B Thinking 2507 | qwen3-235b-a22b-thinking | 0.30 | 2.90 | | |
| Qwen 3 Max Thinking | qwen3-max-thinking | 1.20 | 6.00 | | |
| GLM 5 | glm-5 | 1.00 | 3.20 | | |
| Llama 3.1 8B | llama-3.1-8b | 0.10 | 0.10 | | |
| MiniMax M2.5 | minimax-m2.5 | 0.30 | 1.20 | | |
| Grok Imagine Image | grok-imagine-image | | | | |
| Grok Imagine Image Pro | grok-imagine-image-pro | | | | |
| LongCat Flash Chat | longcat-flash-chat | | | | |
| LongCat Flash Thinking 2601 | longcat-flash-thinking-2601 | | | | |
| LongCat Flash Thinking | longcat-flash-thinking | 0.15 | 1.50 | | |
| Qwen 3.5 Plus | qwen3.5-plus | 0.40 | 2.40 | | |
| Claude Sonnet 4.6 | claude-sonnet-4.6 | 3.00 | 15.00 | | |
| Gemini 3.1 Pro Preview | gemini-3.1-pro-preview | 2.00 | 12.00 | | |

---

[← Back to all providers](/llm.txt)
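The single-endpoint access described above can be sketched as follows. This is a minimal illustration, not taken from this page: it assumes the gateway exposes an OpenAI-compatible `/chat/completions` route, and the base URL, environment variable name, and `provider/model` id format shown are assumptions to verify against Vercel's own documentation.

```python
import json
import os
import urllib.request

# Assumed gateway base URL -- confirm against Vercel's AI Gateway docs.
BASE_URL = "https://ai-gateway.vercel.sh/v1"

def build_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build an OpenAI-style chat completions request aimed at the gateway.

    Routing to different providers is just a matter of changing the
    model string; the request shape stays the same.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Hypothetical usage: the env var name and model id are illustrative.
req = build_request(
    "anthropic/claude-sonnet-4.5",
    "Hello!",
    os.environ.get("AI_GATEWAY_API_KEY", "test-key"),
)
print(req.full_url)
```

Sending the request (e.g. with `urllib.request.urlopen(req)`) returns a standard chat completions JSON body, so existing OpenAI-client code typically only needs the base URL and key swapped.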