# GMI Cloud

GMI Cloud is an AI model hosting platform that provides access to leading large language models, including Kimi K2.5, Claude (Haiku 4.5, Opus 4.1, Sonnet 4, 3.7 Sonnet), GPT-5.1, Gemini 2.5, Grok 2, and DeepSeek models. The platform offers serverless deployment with transparent pricing per 1M tokens, GPU hardware options (H200), and model metadata including context lengths, quantization (int4, fp8), and provider information. GMI Cloud features an OpenAI-compatible API at api.gmi-serving.com for easy integration.

## Provider Information

- **Website**:
- **Available Models**: 49

## Models

| Name | Original Name | Input Price ($ per 1M tokens) | Output Price ($ per 1M tokens) | Free | Link |
|------|---------------|-------------------------------|--------------------------------|------|------|
| moonshotai/Kimi-K2.5 | moonshotai/Kimi-K2.5 | 0.60 | 3.00 | | |
| Claude Haiku 4.5 | anthropic/claude-haiku-4.5 | 1.00 | 5.00 | | |
| Claude Opus 4.1 | anthropic/claude-opus-4.1 | 15.00 | 75.00 | | |
| Claude Sonnet 4 | anthropic/claude-sonnet-4 | 3.00 | 15.00 | | |
| Claude Opus 4.5 | anthropic/claude-opus-4.5 | 5.00 | 25.00 | | |
| Claude Sonnet 4.5 | anthropic/claude-sonnet-4.5 | 3.00 | 15.00 | | |
| GPT-4o-mini | openai/gpt-4o-mini | 0.15 | 0.60 | | |
| GPT-4o | openai/gpt-4o | 2.50 | 10.00 | | |
| GPT-5 | openai/gpt-5 | 1.25 | 10.00 | | |
| GPT-5.1 | openai/gpt-5.1 | 1.25 | 10.00 | | |
| GPT-5.1-chat | openai/gpt-5.1-chat | 1.25 | 10.00 | | |
| GPT-5.2-chat | openai/gpt-5.2-chat | 1.75 | 14.00 | | |
| GPT-5.2 | openai/gpt-5.2 | 1.75 | 14.00 | | |
| MiniMax-M2.1 | MiniMaxAI/MiniMax-M2.1 | 0.30 | 1.20 | | |
| DeepSeek-V3.2-Speciale | deepseek-ai/DeepSeek-V3.2-Speciale | 0.28 | 0.40 | | |
| deepseek-ai/DeepSeek-V3.2 | deepseek-ai/DeepSeek-V3.2 | 0.20 | 0.32 | | |
| Kimi-K2-Thinking | moonshotai/Kimi-K2-Thinking | 0.80 | 1.20 | | |
| MiniMax-M2 | MiniMaxAI/MiniMax-M2 | 0.30 | 1.20 | | |
| DeepSeek V3.1 | deepseek-ai/DeepSeek-V3.1 | 0.27 | 1.00 | | |
| GLM-4.6 | zai-org/GLM-4.6 | 0.60 | 2.00 | | |
| DeepSeek-V3.2-Exp | deepseek-ai/DeepSeek-V3.2-Exp | 0.27 | 0.41 | | |
| DeepSeek V3 0324 | deepseek-ai/DeepSeek-V3-0324 | 0.18 | 0.60 | | |
| deepseek-ai/DeepSeek-R1-0528 | deepseek-ai/DeepSeek-R1-0528 | 0.40 | 1.80 | | |
| DeepSeek R1 | deepseek-ai/DeepSeek-R1 | 0.50 | 2.18 | | |
| DeepSeek R1 Distill Llama 70B | deepseek-ai/DeepSeek-R1-Distill-Llama-70B | 0.25 | 0.75 | | |
| DeepSeek R1 Distill Llama 8B | deepseek-ai/DeepSeek-R1-Distill-Llama-8B | 0.14 | 0.39 | | |
| DeepSeek R1 Distill Qwen 14B | deepseek-ai/DeepSeek-R1-Distill-Qwen-14B | 0.20 | 0.20 | | |
| DeepSeek R1 Distill Qwen 32B | deepseek-ai/DeepSeek-R1-Distill-Qwen-32B | 0.50 | 0.90 | | |
| DeepSeek R1 Distill Qwen 7B | deepseek-ai/DeepSeek-R1-Distill-Qwen-7B | 0.10 | 0.20 | | |
| DeepSeek-V3.1-Terminus | deepseek-ai/DeepSeek-V3.1-Terminus | 0.27 | 1.00 | | |
| Llama 3.3 70B Instruct | meta-llama/Llama-3.3-70B-Instruct | 0.25 | 0.75 | | |
| Llama-4 Maverick 17B 128E Instruct FP8 | meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8 | 0.25 | 0.80 | | |
| Llama-4 Scout 17B 16E Instruct | meta-llama/Llama-4-Scout-17B-16E-Instruct | 0.08 | 0.50 | | |
| moonshotai/Kimi-K2-Instruct-0905 | moonshotai/Kimi-K2-Instruct-0905 | 0.30 | 1.70 | | |
| Qwen3-30B-A3B | Qwen/Qwen3-30B-A3B | 0.08 | 0.25 | | |
| Proxy: zai-org/GLM-4.7 | zai-org/GLM-4.7-FP8 | 0.33 | 1.50 | | |
| ZAI: GLM-4.7-Flash | zai-org/GLM-4.7-Flash | 0.07 | 0.40 | | |
| Claude Opus 4.6 | anthropic/claude-opus-4.6 | 5.00 | 25.00 | | |
| MiniMaxAI/MiniMax-M2.5 | MiniMaxAI/MiniMax-M2.5 | 0.30 | 1.20 | | |
| Claude Sonnet 4.6 | anthropic/claude-sonnet-4.6 | 3.00 | 15.00 | | |
| Gemini-3.1-Pro | google/gemini-3.1-pro-preview | 2.00 | 12.00 | | |
| GPT-5.2-codex | openai/gpt-5.2-codex | 1.75 | 14.00 | | |
| GPT-5.3-codex | openai/gpt-5.3-codex | 1.75 | 14.00 | | |
| Qwen3.5 122B A10B | Qwen/Qwen3.5-122B-A10B | 0.40 | 3.20 | | |
| Qwen3.5 27B | Qwen/Qwen3.5-27B | 0.30 | 2.40 | | |
| Qwen3.5 35B A3B | Qwen/Qwen3.5-35B-A3B | 0.25 | 2.00 | | |
| Qwen3.5 397B A17B | Qwen/Qwen3.5-397B-A17B | 0.60 | 3.60 | | |
| Qwen3 Next 80B A3B Instruct | Qwen/Qwen3-Next-80B-A3B-Instruct | 0.15 | 1.50 | | |
| Qwen3 Next 80B A3B Thinking | Qwen/Qwen3-Next-80B-A3B-Thinking | 0.15 | 1.50 | | |

---

[← Back to all providers](/llm.txt)
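
Since the API is OpenAI-compatible, any OpenAI-style client should work against api.gmi-serving.com. Below is a minimal sketch in Python that assembles such a request; note that the `/v1/chat/completions` path follows the usual OpenAI convention rather than anything stated in this listing, and the API key is a placeholder:

```python
import json

# Base URL from this page; the /v1 path segment is an assumption
# based on the standard OpenAI-compatible convention.
API_BASE = "https://api.gmi-serving.com/v1"


def build_chat_request(model: str, prompt: str, api_key: str):
    """Assemble URL, headers, and JSON body for an OpenAI-style
    chat-completion call. Sending it (e.g. with requests or urllib)
    is left to the caller."""
    url = f"{API_BASE}/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",  # hypothetical key placeholder
        "Content-Type": "application/json",
    }
    body = json.dumps(
        {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }
    ).encode("utf-8")
    return url, headers, body


# Model name taken from the pricing table above.
url, headers, body = build_chat_request(
    "deepseek-ai/DeepSeek-V3.2", "Hello!", "YOUR_API_KEY"
)
```

The sketch only constructs the request, so it can be adapted to whichever HTTP client or OpenAI SDK you already use by pointing the client's base URL at api.gmi-serving.com.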