# GMI Cloud

GMI Cloud is an AI model hosting platform that provides access to leading large language models, including Kimi K2.5, Claude (Haiku 4.5, Opus 4.1, Sonnet 4, 3.7 Sonnet), GPT-5.1, Gemini 2.5, Grok 2, and DeepSeek models. The platform offers serverless deployment with transparent per-1M-token pricing, GPU hardware options (H200), and model metadata including context lengths, quantization (int4, fp8), and provider information. GMI Cloud exposes an OpenAI-compatible API at api.gmi-serving.com for easy integration.

## Provider Information

- **Website**:
- **Available Models**: 45

## Models

| Name | Original Name | Input Price ($ per 1M tokens) | Output Price ($ per 1M tokens) | Free | Link |
|------|---------------|-------------------------------|--------------------------------|------|------|
| moonshotai/Kimi-K2.5 | moonshotai/Kimi-K2.5 | 0.60 | 3.00 | | |
| Claude Haiku 4.5 | anthropic/claude-haiku-4.5 | 1.00 | 5.00 | | |
| Claude Opus 4.1 | anthropic/claude-opus-4.1 | 15.00 | 75.00 | | |
| Claude 3.7 Sonnet | anthropic/claude-3.7-sonnet | 3.00 | 15.00 | | |
| Claude Sonnet 4 | anthropic/claude-sonnet-4 | 3.00 | 15.00 | | |
| Claude Opus 4 | anthropic/claude-opus-4 | 15.00 | 75.00 | | |
| Claude Opus 4.5 | anthropic/claude-opus-4.5 | 5.00 | 25.00 | | |
| Claude Sonnet 4.5 | anthropic/claude-sonnet-4.5 | 3.00 | 15.00 | | |
| GPT-4o-mini | openai/gpt-4o-mini | 0.15 | 0.60 | | |
| GPT-4o | openai/gpt-4o | 2.50 | 10.00 | | |
| GPT-5 | openai/gpt-5 | 1.25 | 10.00 | | |
| GPT-5.1 | openai/gpt-5.1 | 1.25 | 10.00 | | |
| GPT-5.1-chat | openai/gpt-5.1-chat | 1.25 | 10.00 | | |
| GPT-5.2-chat | openai/gpt-5.2-chat | 1.75 | 14.00 | | |
| GPT-5.2 | openai/gpt-5.2 | 1.75 | 14.00 | | |
| MiniMax-M2.1 | MiniMaxAI/MiniMax-M2.1 | 0.30 | 1.20 | | |
| DeepSeek-V3.2-Speciale | deepseek-ai/DeepSeek-V3.2-Speciale | 0.28 | 0.40 | | |
| DeepSeek-V3.2 | deepseek-ai/DeepSeek-V3.2 | 0.28 | 0.40 | | |
| Kimi-K2-Thinking | moonshotai/Kimi-K2-Thinking | 0.80 | 1.20 | | |
| MiniMax-M2 | MiniMaxAI/MiniMax-M2 | 0.30 | 1.20 | | |
| DeepSeek V3.1 | deepseek-ai/DeepSeek-V3.1 | 0.27 | 1.00 | | |
| GLM-4.6 | zai-org/GLM-4.6 | 0.60 | 2.00 | | |
| DeepSeek-V3.2-Exp | deepseek-ai/DeepSeek-V3.2-Exp | 0.27 | 0.41 | | |
| DeepSeek V3 0324 | deepseek-ai/DeepSeek-V3-0324 | 0.28 | 0.88 | | |
| deepseek-ai/DeepSeek-R1-0528 | deepseek-ai/DeepSeek-R1-0528 | 0.39 | 1.60 | | |
| DeepSeek R1 | deepseek-ai/DeepSeek-R1 | 0.50 | 2.18 | | |
| DeepSeek R1 Distill Llama 70B | deepseek-ai/DeepSeek-R1-Distill-Llama-70B | 0.25 | 0.75 | | |
| DeepSeek R1 Distill Llama 8B | deepseek-ai/DeepSeek-R1-Distill-Llama-8B | 0.14 | 0.39 | | |
| DeepSeek R1 Distill Qwen 14B | deepseek-ai/DeepSeek-R1-Distill-Qwen-14B | 0.20 | 0.20 | | |
| DeepSeek R1 Distill Qwen 32B | deepseek-ai/DeepSeek-R1-Distill-Qwen-32B | 0.50 | 0.90 | | |
| DeepSeek R1 Distill Qwen 7B | deepseek-ai/DeepSeek-R1-Distill-Qwen-7B | 0.10 | 0.20 | | |
| DeepSeek-V3.1-Terminus | deepseek-ai/DeepSeek-V3.1-Terminus | 0.27 | 1.00 | | |
| Llama 3.3 70B Instruct | meta-llama/Llama-3.3-70B-Instruct | 0.25 | 0.75 | | |
| Llama-4 Maverick 17B 128E Instruct FP8 | meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8 | 0.25 | 0.80 | | |
| Llama-4 Scout 17B 16E Instruct | meta-llama/Llama-4-Scout-17B-16E-Instruct | 0.08 | 0.50 | | |
| moonshotai/Kimi-K2-Instruct-0905 | moonshotai/Kimi-K2-Instruct-0905 | 0.39 | 1.60 | | |
| Qwen3-30B-A3B | Qwen/Qwen3-30B-A3B | 0.08 | 0.25 | | |
| Qwen3 Next 80B A3B Instruct | Qwen/Qwen3-Next-80B-A3B-Instruct | 0.15 | 1.50 | | |
| Qwen3 Next 80B A3B Thinking | Qwen/Qwen3-Next-80B-A3B-Thinking | 0.15 | 1.50 | | |
| ZAI: GLM-4.7-FP8 | zai-org/GLM-4.7-FP8 | 0.40 | 2.00 | | |
| ZAI: GLM-4.7-Flash | zai-org/GLM-4.7-Flash | 0.07 | 0.40 | | |
| Claude Opus 4.6 | anthropic/claude-opus-4.6 | 5.00 | 25.00 | | |
| MiniMaxAI/MiniMax-M2.5 | MiniMaxAI/MiniMax-M2.5 | 0.30 | 1.20 | | |
| Claude Sonnet 4.6 | anthropic/claude-sonnet-4.6 | 3.00 | 15.00 | | |
| Gemini-3.1-Pro | google/gemini-3.1-pro-preview | 2.00 | 12.00 | | |

---

[← Back to all providers](/llm.txt)
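The OpenAI-compatible API at api.gmi-serving.com mentioned above can be exercised with any OpenAI-style client. Below is a minimal sketch using only the Python standard library; the `/v1/chat/completions` path follows the standard OpenAI chat-completions convention and is an assumption, as is the `GMI_API_KEY` placeholder. The model identifier comes from the table's "Original Name" column.

```python
# Sketch: building an OpenAI-style chat completion request for GMI Cloud.
# The /v1/chat/completions path is assumed from the OpenAI convention;
# "GMI_API_KEY" is a placeholder for a real key.
import json
import urllib.request

BASE_URL = "https://api.gmi-serving.com/v1"

def build_chat_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build an OpenAI-format chat completion request (not yet sent)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

req = build_chat_request("deepseek-ai/DeepSeek-V3.2", "Hello!", "GMI_API_KEY")
# To actually send it (requires a valid key):
#   with urllib.request.urlopen(req) as resp:
#       reply = json.load(resp)["choices"][0]["message"]["content"]
```

Because the payload follows the OpenAI format, the same request shape works with any of the model identifiers listed in the table.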
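Since prices are quoted per 1M tokens, the cost of a request is simply each token count scaled by its price. A small sketch of that arithmetic, using DeepSeek-V3.2's prices from the table ($0.28 input / $0.40 output); the function name is illustrative, not part of any GMI Cloud SDK:

```python
# Hypothetical helper: estimate request cost from per-1M-token prices.
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_m: float, output_price_per_m: float) -> float:
    """Return the USD cost for one request, given $-per-1M-token prices."""
    return (input_tokens / 1_000_000) * input_price_per_m \
         + (output_tokens / 1_000_000) * output_price_per_m

# 200k input + 50k output tokens on DeepSeek-V3.2 (0.28 / 0.40):
cost = estimate_cost(200_000, 50_000, 0.28, 0.40)
# 0.2 * 0.28 + 0.05 * 0.40 = 0.056 + 0.020 = $0.076
```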