# LLM Providers

Complete index of all LLM providers and their available models. Most of the gateways and platforms below expose OpenAI-compatible endpoints; a usage sketch follows the table.

## Providers

| Provider | Models | Description | Website |
|----------|--------|-------------|---------|
| [Yupp](/llm/yupp.txt) | 482 models | Yupp is an AI platform that provides access to over 800 AI models from OpenAI, Google, Anthropic, Meta, and other leading providers at no cost. The platform features a discovery and evaluation leaderboard for comparing AI models based on human feedback, with gamified evaluation to improve large language models. Yupp aggregates models from multiple providers and offers detailed model information including descriptions, ratings, aliases, and active status for each model. | [Yupp](https://yupp.ai/) |
| [AIHubMix](/llm/aihubmix.txt) | 397 models | AIHubMix is a unified LLM API router and proxy service that provides a single, consistent interface to access multiple major AI providers including OpenAI, Anthropic's Claude, Google Gemini, DeepSeek, Alibaba's Qwen, and ByteDance's Doubao. The service uses an OpenAI-compatible API standard, making it easy to integrate with existing applications. AIHubMix offers unlimited concurrency support and a more economical way to access premium LLM APIs by eliminating the need to directly apply for and manage individual API keys from each provider, simplifying AI model access for developers. | [AIHubMix](https://aihubmix.com/) |
| [Requesty](/llm/requesty.txt) | 368 models | Requesty is a unified API router and proxy service that provides access to multiple AI model providers through a single OpenAI-compatible interface. The platform aggregates models from providers including Moonshot (Kimi), Alibaba (Qwen), Together AI (Llama, DeepSeek), and others, offering streamlined access to diverse LLM capabilities. Requesty simplifies AI model integration by eliminating the need to manage multiple API keys and endpoints, while providing consistent pricing and standardized model metadata across providers. | [Requesty](https://requesty.ai/) |
| [Blackbox AI](/llm/blackboxai.txt) | 353 models | Blackbox AI is an AI platform that provides access to a wide range of large language models from leading providers including Anthropic (Claude Sonnet 4.5, Claude Opus 4.5, Claude 3.5/3.7 series), OpenAI (GPT-4.1, GPT-4o, GPT-5.1/5.2, o1/o3/o4 series), Google (Gemini 2.5/3 Pro/Flash, Gemma 3), xAI (Grok 3/3 Mini/2 Vision), Meta (Llama 3/3.1/3.2/3.3/4), Mistral, DeepSeek (R1, V3, distill variants), Qwen3, Microsoft Phi 4, and many others. Blackbox AI offers competitive pricing per 1M tokens with transparent input/output pricing, context length information for each model, and supports both paid and free tiers for popular models. The platform provides an OpenAI-compatible API for easy integration. | [Blackbox AI](https://www.blackbox.ai/) |
| [Nano-GPT](/llm/nanogpt.txt) | 326 models | Nano-GPT is an AI API aggregation service that provides access to multiple large language models through a unified OpenAI-compatible interface. The platform offers models from various providers including OpenAI, Anthropic, and others, allowing developers to access diverse AI capabilities through a single API endpoint. Nano-GPT simplifies AI model integration by providing standardized access to multiple models without the need to manage separate API keys and endpoints for each provider. | [Nano-GPT](https://nano-gpt.com/) |
| [Kilo Code](/llm/kilocode.txt) | 316 models | Kilo Code provides access to a wide range of AI models through their unified API, including free models from providers like MiniMax, Z.AI (GLM), MoonshotAI, and more. The platform offers models optimized for coding, reasoning, and agentic workflows with features like image support, prompt caching, and tools integration. | [Kilo Code](https://kilo.ai/) |
| [OpenRouter](/llm/openrouter.txt) | 300 models | OpenRouter is a unified API gateway that provides access to hundreds of AI models from dozens of providers through a single OpenAI-compatible interface. The platform aggregates models from Anthropic (Claude), OpenAI (GPT), Google (Gemini), Meta (Llama), xAI (Grok), Mistral, DeepSeek, Qwen, and many others. OpenRouter features transparent pricing per 1M tokens, provider fallbacks for reliability, and supports both chat and code models. The platform is widely used by AI applications including Cline, Roo Code, and other tools that need access to multiple models through a unified API. | [OpenRouter](https://openrouter.ai/models) |
| [ValorGPT](/llm/valorgpt.txt) | 290 models | ValorGPT is an AI model aggregation platform that provides access to over 300 AI models from leading providers including OpenAI, Anthropic, Google, Meta, Mistral, xAI, DeepSeek, Qwen, AI21, Cohere, and many more. The platform offers a unified interface to access diverse LLM capabilities including reasoning models, vision models, and coding models. ValorGPT features model comparison, filtering by provider and capabilities, and provides direct links to each model's page for detailed information. | [ValorGPT](https://www.valorgpt.com/) |
| [LangDB](/llm/langdb.txt) | 288 models | LangDB is an AI gateway platform that provides access to multiple large language models including GPT-5.1, GPT-5.2, Claude Opus 4.5, Claude Sonnet 4.5, Claude Haiku 4.5, DeepSeek Chat, DeepSeek Reasoner, Gemini models, Mistral models, and others. The platform offers a unified API interface for accessing diverse LLM capabilities with model-specific endpoints for each model. LangDB focuses on simplifying AI model integration by providing consistent access to multiple providers through a single platform. | [LangDB](https://langdb.ai/) |
| [302.AI](/llm/302ai.txt) | 212 models | N/A | [302.AI](https://doc.302.ai) |
| [Arena AI](/llm/arenaai.txt) | 200 models | Arena AI is an AI model comparison and benchmarking platform that allows users to directly compare and chat with multiple state-of-the-art language models side-by-side. The platform features models from major providers including OpenAI (GPT series), Anthropic (Claude Opus, Sonnet, Haiku), Google (Gemini), xAI (Grok), Meta (Llama), and many others. Arena AI provides direct chat access to these models through an intuitive interface, making it easy to compare their responses, capabilities, and performance on various tasks in real-time. | [Arena AI](https://arena.ai/?mode=direct) |
| [CometAPI](/llm/cometapi.txt) | 166 models | CometAPI is a unified AI model aggregation platform that provides access to hundreds of large language models from leading providers including OpenAI, Anthropic, Google, Meta, xAI, DeepSeek, Qwen, Mistral, and many others. The platform offers a simple OpenAI-compatible API interface for accessing diverse LLM capabilities with transparent pricing per 1M tokens. CometAPI aggregates models from multiple providers, making it easy to integrate AI capabilities into applications without managing multiple provider integrations. | [CometAPI](https://apidoc.cometapi.com/) |
| [Vercel AI Gateway](/llm/vercel.txt) | 162 models | Vercel AI Gateway is a powerful observability and routing layer for AI applications that provides analytics, cost tracking, and caching for requests to major AI providers. The platform supports unified access to models from OpenAI, Anthropic, Google, Meta, Mistral, and other providers through a single API endpoint. Vercel's AI Gateway enables rate limiting, request caching, and fallback mechanisms to improve reliability and reduce costs for AI-powered applications. | [Vercel AI Gateway](https://vercel.com/ai-gateway/models) |
| [AIMLAPI](/llm/aimlapi.txt) | 161 models | AIMLAPI is a unified API gateway that provides access to over 1000 AI models from multiple providers including OpenAI, Anthropic, Google, Meta, Mistral, and many others. The platform offers a single API endpoint for accessing diverse LLM capabilities, with transparent pricing per 1M tokens. AIMLAPI supports text generation, code completion, image generation, and audio models, making it easy to integrate AI capabilities into applications without managing multiple provider integrations. | [AIMLAPI](https://aimlapi.com/) |
| [LLM Stats](/llm/llmstats.txt) | 128 models | LLM Stats is an AI model aggregator and benchmarking platform that provides comprehensive data on language models across multiple modalities including text chat, image generation, and video generation. The platform tracks models from major organizations like Anthropic, OpenAI, Google, Meta, and many others, offering standardized model names, display names, and metadata. LLM Stats serves as a centralized resource for discovering and comparing AI models across different providers and use cases. | [LLM Stats](https://llm-stats.com/) |
| [Poe](/llm/poe.txt) | 119 models | N/A | [Poe](https://creator.poe.com/docs/external-applications/openai-compatible-api) |
| [Together AI](/llm/togetherai.txt) | 117 models | Together AI is an AI Acceleration Cloud platform that provides API access to over 200 open-source large language models including Meta's Llama family, Google's Gemma, Mistral, Qwen, and many more. The platform eliminates the need for infrastructure management while offering fine-tuning capabilities to customize models with your own data. Together AI delivers fast inference at low cost, making professional-grade AI accessible to developers and enterprises who need scalable, cost-effective AI model deployment without the complexity of managing their own infrastructure. | [Together AI](https://www.together.ai/) |
| [FastRouter](/llm/fastrouter.txt) | 116 models | FastRouter.ai is a unified API gateway designed to simplify access to the world's leading large language models and multimodal models through a single interface. The service provides access to models like GPT-5, Claude, Grok, and various other LLMs while supporting multiple capabilities including chat, image generation, video processing, embeddings, and audio models. FastRouter helps solve the challenge of choosing between different AI models by providing intelligent routing that considers factors like which model is smartest for the task, cost-effectiveness, and speed requirements. The platform enables richer and more versatile AI applications with superior speed and high uptime. | [FastRouter](https://fastrouter.ai/) |
| [Helicone](/llm/helicone.txt) | 100 models | Helicone is an observability and monitoring platform for LLM applications that provides detailed analytics, cost tracking, and performance metrics for AI model usage. Their public model registry serves as a comprehensive catalog of available LLMs with pricing information, context lengths, and provider details. Helicone helps developers understand their AI application costs, track usage patterns, and optimize model selection. The platform integrates with OpenAI-compatible APIs to provide real-time monitoring without code changes, offering insights into latency, token usage, and costs across different models and providers. | [Helicone](https://www.helicone.ai/) |
| [ZenMUX](/llm/zenmux.txt) | 96 models | ZenMUX is an enterprise-grade AI model aggregation platform that serves as a unified API router for multiple AI providers. The platform supports 75+ providers with auto failover, intelligent routing capabilities, error compensation mechanisms, and transparent pricing across providers. ZenMUX focuses on eliminating hallucination risks through intelligent routing, offers context-aware routing for different scenarios and long context requirements, and provides provider insurance as a safety net for AI model reliability. The platform is built for enterprise use with focus on stability and quick model deployment, supporting both local models and cloud-based providers. | [ZenMUX](https://zenmux.ai/) |
| [Azure AI Services](/llm/azurecognitiveservices.txt) | 95 models | Azure AI Services (formerly Cognitive Services) is Microsoft's comprehensive suite of cloud-based APIs and services that enable developers to integrate artificial intelligence capabilities into applications without requiring extensive AI expertise. The services provide a wide range of AI features including Computer Vision for image analysis and processing, Natural Language Processing for text understanding and analysis, Speech Recognition for voice-to-text conversion, Machine Translation for language translation, and Decision services. A single Azure AI services resource allows access to multiple services with one set of credentials, supporting both prebuilt and customizable models with a layered security model including virtual network configuration. | [Azure AI Services](https://azure.microsoft.com/en-us/services/ai-services) |
| [SiliconFlow](/llm/siliconflow.txt) | 90 models | SiliconFlow is a lightning-fast AI platform for developers that provides access to 200+ optimized LLMs and multimodal models through simple APIs. The platform enables deployment, fine-tuning, and running of models including DeepSeek, Qwen, Llama, GLM, FLUX, and many others. SiliconFlow focuses on providing fast, cost-effective AI infrastructure with support for text generation, image generation, video generation, audio models, embeddings, and rerankers. | [SiliconFlow](https://cloud.siliconflow.com/models) |
| [Routeway](/llm/routeway.txt) | 86 models | Routeway is a unified AI API platform providing access to 30+ AI models from OpenAI, Anthropic, Google, and other leading providers through a single API endpoint. The platform offers transparent, competitive pricing per 1M tokens and includes both paid and free model tiers. Routeway simplifies AI model integration by eliminating the need to manage multiple API keys and endpoints, while providing consistent pricing and standardized model metadata across providers. | [Routeway](https://routeway.ai/) |
| [Glama](/llm/glama.txt) | 84 models | Glama is a privacy-first AI aggregation platform that provides access to leading LLM models through a unified gateway. The platform offers enterprise-grade security, features like agents, MCP (Model Context Protocol) support, prompt templates, and more. Glama aggregates models from major providers including OpenAI (GPT-5, GPT-4.1, O3/O4), Anthropic (Claude Opus 4.5, Sonnet 4), Google (Gemini 2.5/3), xAI (Grok 3/4), Mistral, DeepSeek, Moonshot (Kimi), Meta (Llama), and many others. The platform provides transparent pricing per 1M tokens with options for both per-token and per-million billing. | [Glama](https://glama.ai/) |
| [Nvidia](/llm/nvidia.txt) | 82 models | NVIDIA NIM (NVIDIA Inference Microservices) is a platform that provides optimized AI model inference containers featuring industry-leading APIs for running AI models across NVIDIA's accelerated infrastructure. NIM supports models from major providers including Meta (Llama), Google (Gemma), Mistral, xAI (Grok), DeepSeek, Microsoft (Phi), Qwen, and NVIDIA's own Nemotron family. The platform offers standard APIs across multiple deployment options including cloud, on-premises, and local workstations, with microservices optimized for NVIDIA GPUs. NIM provides an OpenAI-compatible API endpoint at integrate.api.nvidia.com for easy integration, featuring over 180 models from various AI companies hosted on NVIDIA's inference infrastructure. | [Nvidia](https://docs.api.nvidia.com/nim/) |
| [DeepInfra](/llm/deepinfra.txt) | 76 models | DeepInfra is a cloud-based AI inference platform that provides scalable, cost-effective infrastructure for deploying and running machine learning models without requiring users to manage underlying infrastructure. The platform offers access to 100+ machine learning models across multiple categories including text-to-image generation, object detection, automatic speech recognition (ASR), and text-to-text generation. DeepInfra enables serverless deployment, is production-ready, and simplifies the process of deploying deep learning models. Their mission is to democratize access to top AI models by providing fast, affordable ML inference capabilities, with integration available through platforms like OpenRouter offering access to 90 models. | [DeepInfra](https://deepinfra.com/) |
| [Novita AI](/llm/novita.txt) | 73 models | Novita AI is a comprehensive AI platform providing access to over 200 AI models including LLMs, image generation, video creation, and audio models from leading providers like DeepSeek, Meta (Llama), Google (Gemini), Qwen, Anthropic, Mistral, OpenAI, and many others. The platform features an open model library with custom deployment options and GPU Instances for scalable inference. Novita specializes in state-of-the-art video generation models including Kling V1.6, V2.1, V2.5 Turbo, MiniMax Video, Hailuo 2.3, Vidu Q1, Vidu 2.0, PixVerse V4.5, Seedance V1, Wan 2.1/2.2/2.5, Hunyuan, and SynthesSeed, with transparent pricing per video/image generation. The platform also offers dedicated GPU cloud services for custom model deployment and training. | [Novita AI](https://novita.ai/) |
| [Replicate](/llm/replicate.txt) | 68 models | Replicate is a platform for running machine learning models with a simple API, providing access to thousands of public models from the AI community. The platform specializes in diffusion models for image generation, along with text-to-speech, speech recognition, super-resolution, and more. Replicate handles infrastructure, GPU management, and scaling automatically, allowing developers to focus on building applications rather than managing servers. The platform features official collections of curated models from providers like OpenAI, Black Forest Labs (FLUX), Google, and others, with transparent per-second pricing for inference. | [Replicate](https://replicate.com/) |
| [SiliconFlow (China)](/llm/siliconflowcn.txt) | 60 models | N/A | [SiliconFlow (China)](https://cloud.siliconflow.com/models) |
| [Jiekou.AI](/llm/jiekou.txt) | 57 models | N/A | [Jiekou.AI](https://docs.jiekou.ai/docs/support/quickstart) |
| [Alibaba (China)](/llm/alibabacn.txt) | 53 models | N/A | [Alibaba (China)](https://www.alibabacloud.com/help/en/model-studio/models) |
| [GitHub Models](/llm/githubmodels.txt) | 52 models | N/A | [GitHub Models](https://docs.github.com/en/github-models) |
| [Abacus](/llm/abacus.txt) | 51 models | N/A | [Abacus](https://abacus.ai/help/api) |
| [Google Vertex AI](/llm/googlevertex.txt) | 49 models | Google Vertex AI is Google Cloud's integrated machine learning platform that combines data engineering, data science, and ML workflows for training, tuning, and deploying AI models and applications. The platform provides access to Google's Gemini models as well as Anthropic Claude models through GCP Vertex AI, enabling developers to easily build and deploy enterprise-ready generative AI experiences. Vertex AI supports model customization, batch predictions, online predictions, and features like Vertex AI Model Garden for discovering models. Requirements include a GCP account with billing enabled and proper API configuration, with the platform designed for enterprise-grade AI development. | [Google Vertex AI](https://cloud.google.com/vertex-ai) |
| [NetMind](/llm/netmind.txt) | 49 models | NetMind is an AI inference platform that provides access to leading large language models including MiniMax M2.5, Moonshot AI (Kimi K2.5), Z.ai (GLM-5), DeepSeek, Qwen, and many others. The platform offers transparent pricing per 1M tokens with context lengths up to 262K tokens. NetMind supports both text chat models and image generation models through their unified API, with detailed model specifications including capabilities like function calling, reasoning, and multimodal support. | [NetMind](https://www.netmind.ai/) |
| [RedPill](/llm/redpill.txt) | 49 models | RedPill is an AI model aggregation platform that provides access to a wide range of large language models from major providers including Anthropic (Claude), OpenAI (GPT), Google (Gemini), Meta (Llama), xAI (Grok), DeepSeek, and many others. The platform offers an OpenAI-compatible API interface with transparent pricing per 1M tokens, context length information, and detailed model specifications including input/output modalities, supported sampling parameters, and feature support. | [RedPill](https://redpill.ai/) |
| [Monica AI](/llm/monica.txt) | 46 models | Monica AI is an AI aggregation platform that provides access to leading large language models and image generation models through a unified OpenAI-compatible API. The platform offers chat models including GPT-5, GPT-4.1, GPT-4o, O4-mini, O3, Claude Opus 4/Sonnet 4/Haiku 3.5, Gemini 2.5 Pro/Flash/Lite, Grok 3 Beta, DeepSeek Reasoner/Chat, Meta Llama 3/3.1/3.3, and NVIDIA Nemotron. For image generation, Monica provides Stable Diffusion XL/3/3.5, Flux Schnell/Dev/Pro, DALL·E 3, Playground V2.5, and Ideogram V2. The platform features transparent per-token pricing for chat models and per-image pricing for image models, with upscaling and object removal APIs also available. | [Monica AI](https://platform.monica.im/) |
| [GMI Cloud](/llm/gmi.txt) | 43 models | GMI Cloud is an AI model hosting platform that provides access to leading large language models including Kimi K2.5, Claude (Haiku 4.5, Opus 4.1, Sonnet 4, 3.7 Sonnet), GPT-5.1, Gemini 2.5, Grok 2, and DeepSeek models. The platform offers serverless deployment with transparent pricing per 1M tokens, GPU hardware options (H200), and model metadata including context lengths, quantization (int4, fp8), and provider information. GMI Cloud features an OpenAI-compatible API at api.gmi-serving.com for easy integration. | [GMI Cloud](https://console.gmicloud.ai/) |
| [Cloudflare AI Gateway](/llm/cloudflareaigateway.txt) | 42 models | N/A | [Cloudflare AI Gateway](https://developers.cloudflare.com/ai-gateway/) |
| [Nebius Token Factory](/llm/nebius.txt) | 42 models | Nebius is a cloud platform that provides access to AI models through their TokenFactory inference service. The platform offers a wide range of open-source and proprietary models including DeepSeek, MiniMax, Kimi, Qwen, and others. Nebius focuses on providing fast, cost-effective AI inference with competitive pricing per 1M tokens and various quantization options (fp4, fp8) to optimize performance and cost. | [Nebius Token Factory](https://docs.tokenfactory.nebius.com/) |
| [OpenAI](/llm/openai.txt) | 42 models | OpenAI is a leading AI research company focused on developing safe artificial general intelligence (AGI). They offer GPT-5, GPT-5 mini, and GPT-5 nano models through their API platform, with GPT-5.2 being their most capable model for coding and agentic tasks. OpenAI provides comprehensive documentation, API access, and has expanded into specialized sectors including healthcare with OpenAI for Healthcare. Their models are renowned for exceptional coding capabilities, complex front-end generation, and effective debugging, making them the go-to choice for developers building production AI applications. | [OpenAI](https://platform.openai.com/docs/) |
| [RouterLink](/llm/routerlink.txt) | 42 models | RouterLink is an AI model aggregation platform by WORLD3 that provides access to leading large language models including Anthropic (Claude Opus 4.6, Sonnet 4.5, Haiku 4.5), OpenAI (GPT-5, GPT-5.2), Google (Gemini 3 Pro Preview), Z.AI (GLM-5), and many others. The platform offers models from various providers with transparent model metadata including descriptions, tags, and provider information. RouterLink features regional routing options (North America, Asia Pacific) for optimized latency. | [RouterLink](https://routerlink.world3.ai/) |
| [MegaNova](/llm/meganova.txt) | 40 models | MegaNova bills itself as the first AI Character Cloud, built to power living AI characters for chat, games, and virtual worlds. It combines role-play-tuned LLMs with real-time voice, expressive images, and a video-ready pipeline so characters can think, speak, and react like living entities. Under the hood, MegaNova runs an accelerated inference layer that makes character interactions faster and up to 10x cheaper than generic AI clouds. The platform features the Manta model family (Mini, Flash, Pro) optimized for different latency and complexity requirements, plus a wide library of character-tuned models from providers like DeepSeek, Qwen, and others. | [MegaNova](https://meganova.ai/) |
| [Runware](/llm/runware.txt) | 40 models | Runware is a comprehensive AI model hosting platform that provides access to 300K+ AI models for image and video generation. The platform supports model types including text-to-image, image-to-image, image editing, video generation, image-to-video, and audio-to-video. Runware aggregates models from major providers including Black Forest Labs (FLUX), Google (Imagen, Nano Banana), Sourceful (Riverflow), ByteDance (Seedream, SeedEdit), Ideogram, OpenAI (GPT Image), Alibaba (Qwen), Kling AI, MiniMax, Vidu, Pixverse, and many others. The platform features a simple API for instant model testing and deployment with global high-performance hosting for low-latency inference. | [Runware](https://runware.ai/) |
| [Alibaba](/llm/alibaba.txt) | 39 models | N/A | [Alibaba](https://www.alibabacloud.com/help/en/model-studio/models) |
| [Chutes.ai](/llm/chutes.txt) | 34 models | Chutes.ai is an AI inference platform that provides access to a wide variety of open-source large language models through an OpenAI-compatible API. The platform offers models from leading providers including Qwen, DeepSeek, Mistral, Google, NousResearch, Meta, and others. Chutes.ai features transparent pricing per 1M tokens, context length information, and model capabilities including JSON mode, tools/function calling, structured outputs, and reasoning. The platform supports both standard and confidential compute variants of models. | [Chutes.ai](https://llm.chutes.ai/) |
| [JetBrains AI](/llm/jetbrains.txt) | 33 models | JetBrains AI Assistant provides access to a variety of cloud-based LLMs through the JetBrains AI service, directly integrated into JetBrains IDEs. The platform supports models from major providers including Anthropic (Claude 4.5/4.1/3.7 series), Google (Gemini 3/2.5/2.0), OpenAI (GPT-5.2, GPT-5.1-Codex, GPT-5, GPT-4.1, GPT-4o, o1/o3/o4), and xAI (Grok 4). JetBrains AI simplifies AI-powered development by eliminating the need for separate API keys while providing enterprise-grade security with no data retention or model training on user code. | [JetBrains AI](https://www.jetbrains.com/ai-assistant/) |
| [Mammouth AI](/llm/mammouth.txt) | 32 models | Mammouth AI is a French AI aggregation platform that provides access to the best generative AI models through a single subscription starting from €10/month. The platform offers industry-leading models including GPT-5.1, Claude Sonnet 4.5, Gemini 2.5 Flash/Pro, Mistral Large 24.11, Grok 4, DeepSeek V3.1/V3.2 Exp, Llama 3 70B, and image models like FLUX, GPT Image, Stable Diffusion, and Nano Banana. Mammouth features one-click reprompting across models, custom Mammouth Projects for tailored context, image and file uploads, multi-device support (Android, iPhone, desktop), AI web search with well-sourced answers, and voice chat. The platform emphasizes GDPR compliance, zero data retention, and no model training on user content. | [Mammouth AI](https://mammouth.ai/) |
| [CommonStack](/llm/commonstack.txt) | 30 models | CommonStack (by Gradient) is an AI model aggregation platform that provides access to leading large language models through a unified API. The platform offers models from major providers including OpenAI (GPT-4.1), Z.ai (GLM-5), Moonshot AI (Kimi K2.5), and others. CommonStack features transparent pricing per 1M tokens with detailed model specifications including context lengths, capabilities, and provider information. | [CommonStack](https://commonstack.ai/) |
| [Cursor](/llm/cursor.txt) | 30 models | Cursor is an AI-powered code editor built on VS Code that provides integrated access to frontier coding models from all major providers. The platform supports Claude 4.5/4.1/3.7 series (Opus, Sonnet, Haiku), GPT-5.2/5.1/5/4.1 series, Gemini 3/2.5 series (Pro, Flash), Grok 3/4, DeepSeek V3/R1, and other leading models. Cursor features intelligent model selection, Auto mode for optimal model routing, and Max Mode for extended context windows up to 1M tokens on select models. The editor provides AI-assisted development with higher rate limits than direct API access, seamless integration into the development workflow, and enterprise-grade security with no data retention or model training on user code. | [Cursor](https://cursor.com/) |
| [Okara](/llm/okara.txt) | 30 models | Okara is an AI model directory platform that aggregates and indexes AI/LLM providers and their models. The platform scrapes data from various provider APIs, normalizes it, and serves it through a web interface. Okara provides a comprehensive model directory featuring models from OpenAI, Anthropic, Google, Meta, DeepSeek, Qwen, Mistral, and many others. Users can compare features and capabilities, then start chatting with the best fit for their needs. The directory includes model details like purpose, company/organization, and capabilities for each listed model. | [Okara](https://okara.ai/ai-models) |
| [OpenCode Zen](/llm/opencode.txt) | 29 models | OpenCode Zen is an AI-powered code editor and development platform that provides access to leading large language models optimized for software development tasks. The platform offers frontier models from major providers including Anthropic (Claude), OpenAI (GPT), Google (Gemini), xAI (Grok), and others, with features designed for agentic coding workflows. OpenCode Zen focuses on providing AI-assisted development with models featuring reasoning, tool use, vision capabilities, and extended context windows for complex programming tasks. | [OpenCode Zen](https://opencode.ai/docs/zen) |
| [Baidu AI Studio](/llm/baidu.txt) | 28 models | Baidu AI Studio (Qianfan) is Baidu's large language model platform providing access to the ERNIE (Enhanced Representation through Knowledge Integration) series of models. The platform offers an OpenAI-compatible API interface for accessing Baidu's language models including ERNIE-Bot, ERNIE-Speed, and other specialized models. Baidu's LLMs are optimized for Chinese language understanding and generation, with support for various natural language processing tasks including conversational AI, content creation, and knowledge-based question answering. | [Baidu AI Studio](https://aistudio.baidu.com/) |
| [Chats-LLM](/llm/chatsllm.txt) | 28 models | Chats-LLM is an AI model aggregation platform that provides access to leading large language models through an OpenAI-compatible API interface. The platform aggregates models from multiple providers including OpenRouter (Aurora, Free Models Router), StepFun, and many others. Chats-LLM features transparent pricing per 1M tokens, context length information, and model capabilities including reasoning, tools/function calling, and structured outputs. | [Chats-LLM](https://chats-llm.com/) |
| [Google Gemini](/llm/gemini.txt) | 26 models | Google Gemini is Google's state-of-the-art multimodal large language model that handles text, audio, images, and more simultaneously. Available through both the Gemini Developer API and Google Cloud's Vertex AI, Gemini powers Google's AI assistant at gemini.google.com and offers enterprise solutions through Gemini Enterprise for business AI agents. The latest Gemini 3 series features exceptional zero-shot generation for software development, while Gemini Enterprise enables discovering, creating, sharing, and running AI agents. The model integrates deeply with Google's ecosystem including Workspace products. | [Google Gemini](https://ai.google.dev/gemini-api/docs) |
| [Mistral AI](/llm/mistral.txt) | 26 models | Mistral AI is a pioneering French AI startup founded in April 2023, specializing in high-performance large language models with a fundamental commitment to openness and transparency. They provide an AI platform for customizing, fine-tuning, and deploying AI assistants, autonomous agents, and multimodal AI using open models. Mistral offers flexible deployment options including cloud providers (Google Cloud, AWS, Azure, IBM, Snowflake, NVIDIA, Outscale), private VPC, and on-premises. Known for competitive pricing and European 'sovereign AI' positioning as an alternative to US providers, Mistral delivers models like mistral-large-latest through their chat API. | [Mistral AI](https://mistral.ai/) |
| [Anthropic](/llm/anthropic.txt) | 22 models | Anthropic is an AI safety company that has developed Claude, a family of next-generation AI assistants trained using Constitutional AI to be helpful, honest, and harmless. Claude models are renowned for their long context windows (200K+ tokens), advanced reasoning capabilities, and strong tool use features. Anthropic offers direct API access through their platform at console.claude.com, with Claude also available through major cloud platforms including Google Vertex AI and Amazon Bedrock. The company focuses on AI safety research and responsible AI practices, making Claude a trusted choice for enterprises. | [Anthropic](https://docs.anthropic.com/) |
| [Z.AI](/llm/zai.txt) | 22 models | N/A | [Z.AI](https://docs.z.ai/guides/overview/pricing) |
| [xAI](/llm/xai.txt) | 22 models | xAI is an AI company founded with the mission of advancing scientific discovery and gaining deeper understanding of our universe. Their flagship model Grok features both text and vision support, with full function calling capabilities that make it excellent for agentic applications. The xAI provider offers an OpenAI-compatible API interface for easier integration, and is particularly noted for customer support applications, deep research tasks, and real-world agentic use cases. Grok models are also available through multiple platforms including OpenRouter and support voice, video, and physical AI agents. | [xAI](https://docs.x.ai/) |
| [Parasail](/llm/parasail.txt) | 21 models | Parasail is an AI inference platform that provides access to optimized versions of leading open-source models including DeepSeek, Llama, Gemma, GLM, Kimi, Qwen, Olmo, Mistral, and others. The platform offers competitive pricing per million tokens (MTok) with models ranging from small 7B parameter models to large 235B parameter models. Parasail features both standard and specialized model variants (like DeepSeek V32 Speciale for math/reasoning), vision-language models (Qwen VL), and even offers free access to some models like Molmo2 8B. | [Parasail](https://www.saas.parasail.io/pricing) |
| [Venice](/llm/venice.txt) | 21 models | Venice is an AI model hosting platform that provides access to a wide range of large language models including frontier models like Claude Opus 4.5, Gemini 3 Pro/Flash, GPT-5.2, and Grok 4.1, as well as open-source models like Llama, Qwen, Mistral, and DeepSeek. The platform offers uncensored models, optimized code models, and thinking/reasoning models with transparent pricing per 1M tokens. Venice focuses on privacy and provides detailed model specifications including context lengths, capabilities, and pricing information for each model. | [Venice](https://venice.ai/) |
| [Firmware](/llm/firmware.txt) | 20 models | N/A | [Firmware](https://docs.firmware.ai) |
| [GitHub Copilot](/llm/githubcopilot.txt) | 20 models | N/A | [GitHub Copilot](https://docs.github.com/en/copilot) |
| [Inference](/llm/inference.txt) | 20 models | Inference.net is a high-performance AI inference platform that provides access to leading large language models including Meta's Llama family, DeepSeek, Mistral, Google's Gemma, Qwen, and OpenAI models. The platform focuses on delivering fast, efficient inference with various precision options (fp-16, fp-8, bf-16) to optimize performance and cost. Inference.net offers an OpenAI-compatible API interface, making it easy to integrate with existing applications while providing access to state-of-the-art models through their unified endpoint. | [Inference](https://inference.net/) |
| [IO.NET](/llm/ionet.txt) | 18 models | N/A | [IO.NET](https://io.net/docs/guides/intelligence/io-intelligence) |
| [KIE.AI](/llm/kieai.txt) | 18 models | KIE.AI is an AI model aggregation platform that provides access to a wide variety of AI models from leading providers including Google, OpenAI, ByteDance, Kling, Hailuo, Wan, Qwen, Grok, and more. The platform specializes in video generation models with offerings from multiple providers like text-to-video, image-to-video, and video-to-video capabilities. KIE.AI offers a unified interface to access diverse AI video generation models through a single platform, with transparent pricing per video generation rather than per token. | [KIE.AI](https://kie.ai/) |
| [Roo Code](/llm/roocode.txt) | 18 models | Roo Code is an AI-powered coding assistant platform that provides access to leading large language models optimized for software development tasks. The platform offers frontier models including Claude Opus 4.5/4.1/3.7, Sonnet 4.5, GPT-5.2/5.1/5/5-mini, Gemini 3 Pro/Flash and 2.5 Pro, Grok models, DeepSeek V3.1, GLM 4.6/4.7, MiniMax M2.1, Kimi K2 Turbo, and MoonshotAI models. Roo Code focuses on agentic coding workflows with models featuring reasoning, tool use, vision capabilities, and extended context windows up to 1M tokens. The platform provides transparent pricing per 1M tokens and includes both free and paid model tiers. | [Roo Code](https://roocode.com/provider) |
| [Groq](/llm/groq.txt) | 17 models | Groq is an AI inference company that pioneered the LPU (Language Processing Unit) in 2016, the first chip purpose-built for AI inference. Their proprietary LPU inference engine delivers ultra-low latency AI inference, with benchmarks showing Llama 2 70B running at 300 tokens per second, reportedly 10x faster than NVIDIA H100 clusters and 18x faster on Anyscale's LLMPerf Leaderboard. Groq focuses on making AI inference fast and affordable at scale, offering both cloud services and on-premises deployment options, with a mission to enable real-time AI applications that were previously impossible due to latency constraints. | [Groq](https://groq.com/) |
| [Hugging Face](/llm/huggingface.txt) | 17 models | N/A | [Hugging Face](https://huggingface.co/docs/inference-providers) |
| [Cloudflare Workers AI](/llm/cloudflareworkersai.txt) | 16 models | Cloudflare Workers AI allows you to run AI models in a serverless way on Cloudflare's global edge network, eliminating the need to worry about scaling, maintaining, or paying for unused infrastructure. The service supports various AI models for text generation, embeddings, and other tasks through a simple API or environment binding. It integrates seamlessly with Cloudflare's AI Gateway for analytics and optimization. Key benefits include pay-per-use pricing, easy API and SDK integration through their community provider for Vercel AI SDK, and the ability to execute AI inference at the edge for ultra-low latency applications worldwide. | [Cloudflare Workers AI](https://developers.cloudflare.com/workers-ai/) |
| [Ollama Cloud](/llm/ollama.txt) | 16 models | Ollama is a platform that enables users to run large language models locally on their own hardware, providing complete data privacy and security without requiring cloud services or API keys. It offers an OpenAI-compatible RESTful API for easy integration and supports a wide variety of open-source models including Llama, Olmo, and many others. Ollama enables multimodal support for text chat, PDF integration with RAG (Retrieval Augmented Generation), voice chat, and image-based interactions. The platform eliminates API costs, works offline after initial model download, and is ideal for privacy-sensitive work and local prototyping. | [Ollama Cloud](https://docs.ollama.com/cloud) |
| [StreamLake](/llm/streamlake.txt) | 16 models | StreamLake is an AI platform that provides access to leading large language models through their WanQing console. The platform offers models from major providers including Moonshot AI (Kimi series), DeepSeek, Qwen, and other Chinese AI companies. StreamLake features transparent pricing per 1M tokens with detailed model specifications including context lengths, capabilities, and provider information. | [StreamLake](https://console.streamlake.ai/) |
| [Vivgrid](/llm/vivgrid.txt) | 16 models | Vivgrid provides access to a range of powerful AI models for building enterprise-grade AI agents including GPT-5, GPT-5-mini, GPT-4.1, GPT-4o, Gemini 2.5 Pro, Gemini 2.5 Flash, DeepSeek R1, DeepSeek V3.1, and DeepSeek V3.2 Exp. The platform intelligently accelerates model inference by automatically routing API requests to the nearest available compute region, minimizing latency and maximizing throughput while maintaining data-residency compliance. Vivgrid's geo-distributed architecture continuously synchronizes AI Tools and model states across multiple data zones, with sub-50ms latency worldwide for geo-distributed models. Models are managed on the backend through the Vivgrid Console, eliminating the need for code changes when switching models. | [Vivgrid](https://vivgrid.com/) |
| [Warp](/llm/warp.txt) | 16 models | Warp is an AI-powered terminal that provides an Agentic Development Environment with access to curated Large Language Models for coding and terminal workflows. Warp supports OpenAI models including GPT-5.2, GPT-5.2 Codex, GPT-5.1 Codex Max, GPT-5.1 Codex, GPT-5.1, and GPT-5 (each with low, medium, high, and extra high reasoning options), as well as Anthropic's Claude Opus 4.6, Claude Opus 4.5, and Claude Sonnet 4.5 with thinking mode. The platform features three Auto modes: Cost-efficient for lower credit consumption, Responsive for highest-quality results, and Genius for maximum reasoning quality on complex tasks like deep debugging and architecture decisions. Warp includes model fallback for reliability and Zero Data Retention agreements with all providers. | [Warp](https://www.warp.dev/) |
| [Cortecs](/llm/cortecs.txt) | 15 models | N/A | [Cortecs](https://api.cortecs.ai/v1/models) |
| [Fireworks AI](/llm/fireworks.txt) | 14 models | Fireworks AI is a high-performance AI inference platform that provides fast, affordable access to over 200 open-source and proprietary AI models. The platform specializes in production-grade inference with ultra-low latency, offering models including Llama, Qwen, DeepSeek, Mistral, Google Gemma, FLUX image models, and more. Fireworks features serverless deployment, custom fine-tuning capabilities, and competitive pricing per 1M tokens. The platform is known for its speed and reliability, with models available through OpenAI-compatible APIs and dedicated instances for enterprise workloads. | [Fireworks AI](https://fireworks.ai/) |
| [Perplexity AI](/llm/perplexity.txt) | 14 models | Perplexity AI is an American AI company that has developed an innovative AI-powered conversational search engine combining large language models with real-time web search capabilities. Unlike traditional search engines that return links, Perplexity processes user queries by understanding natural language, searching live web sources, synthesizing information with AI, and providing direct, cited answers with proper source references. The platform maintains context across multiple queries and is valued at $18 billion with approximately 1,600 employees, positioning itself as 'a direct line to the world's knowledge — compressed, cited, and made clear.' | [Perplexity AI](https://www.perplexity.ai/docs/pricing) |
| [Factory.ai](/llm/factoryai.txt) | 13 models | Factory.ai is an AI-powered autonomous software development platform that leverages multiple Large Language Models (LLMs) to automate complex coding tasks. Their flagship product 'Droid' is a terminal-based autonomous agent capable of solving the majority of solvable development tasks. The platform is provider-agnostic, integrating with various LLM providers, and offers enterprise capabilities including usage analytics, cost control with OpenTelemetry, and optional monitoring dashboards. | [Factory.ai](https://factory.ai/) |
| [OVHcloud AI Endpoints](/llm/ovhcloud.txt) | 13 models | N/A | [OVHcloud AI Endpoints](https://www.ovhcloud.com/en/public-cloud/ai-endpoints/catalog/) |
| [Scaleway](/llm/scaleway.txt) | 13 models | N/A | [Scaleway](https://www.scaleway.com/en/docs/generative-apis/) |
| [Synthetic.new](/llm/synthetic.txt) | 13 models | Synthetic.new is an AI model hosting platform that provides access to various large language models including always-on models and LoRA-adaptable models. The platform offers a straightforward pricing page listing available models with their capabilities. Synthetic.new focuses on providing reliable access to popular LLMs through a simple interface, supporting both standard pre-trained models and custom fine-tuned models using LoRA (Low-Rank Adaptation) technology for specialized use cases. | [Synthetic.new](https://synthetic.new/) |
| [Zed](/llm/zed.txt) | 13 models | Zed is a high-performance, collaborative code editor built for speed and designed for multiplayer programming. Zed offers hosted versions of major LLMs including Claude (Opus 4.5, Opus 4.1, Sonnet 4.5, Sonnet 4, Sonnet 3.7, Haiku 4.5), GPT-5 series (GPT-5, GPT-5 mini, GPT-5 nano), Google Gemini (3.0 Pro, 2.5 Pro, 2.5 Flash), and xAI Grok 4 models. These hosted models generally feature higher rate limits than using your own API keys, with context windows up to 400k tokens for GPT-5 models and 200k for Claude and Gemini models. Zed's AI integration is built directly into the editor for seamless AI-assisted development. | [Zed](https://zed.dev/) |
| [Cohere](/llm/cohere.txt) | 12 models | N/A | [Cohere](https://docs.cohere.com/) |
| [SambaNova AI](/llm/sambanova.txt) | 12 models | SambaNova AI is a full-stack AI systems company that builds advanced hardware and software solutions for generative AI. Their cloud platform provides access to leading open-source models including DeepSeek (R1, V3, V3.1, V3.2), Meta Llama (3.1, 3.3, 4-Maverick), Qwen3 (32B, 235B), and other specialized models. SambaNova offers an OpenAI-compatible API with competitive pricing, featuring models with context windows up to 131K tokens. The company is known for their DataScale systems and SN40L AI chip designed specifically for large language model training and inference. | [SambaNova AI](https://sambanova.ai/) |
| [Weights & Biases](/llm/wandb.txt) | 12 models | N/A | [Weights & Biases](https://weave-docs.wandb.ai/guides/integrations/inference/) |
| [iFlow](/llm/iflowcn.txt) | 12 models | N/A | [iFlow](https://platform.iflow.cn/en/docs) |
| [Zhipu AI](/llm/zhipuai.txt) | 10 models | N/A | [Zhipu AI](https://docs.z.ai/guides/overview/pricing) |
| [Ampcode](/llm/ampcode.txt) | 9 models | Ampcode is an AI-powered development platform that uses multiple specialized models for different tasks. The platform features Agent Modes like Smart (using Claude Opus 4.5) and Rush (using Claude Haiku 4.5), Feature Models like Amp-Tab-4 for next-action predictions and Review for bug identification, and Specialized Subagents like Search, Oracle, and Librarian. Ampcode intelligently routes tasks to the most appropriate model based on complexity and requirements. | [Ampcode](https://ampcode.com/) |
| [Junie](/llm/junie.txt) | 9 models | Junie is JetBrains' AI platform that provides access to leading large language models through a command-line interface. The platform supports models from major providers including OpenAI (GPT-5, GPT-5.2 Codex), Anthropic (Claude Sonnet 4.5, Claude Opus 4.6), Google (Gemini 3 Pro, Gemini 3 Flash), xAI (Grok), and others. Junie is available as a downloadable binary for Linux, macOS, and Windows. | [Junie](https://www.jetbrains.com/de-de/junie/) |
| [Near AI](/llm/near.txt) | 9 models | Near AI is a decentralized AI platform built on the NEAR blockchain that provides access to leading large language models through an OpenAI-compatible API. The platform aggregates models from major providers including Anthropic (Claude), OpenAI (GPT), Google (Gemini), DeepSeek, Qwen, Black Forest Labs, and others. Near AI focuses on providing transparent pricing per 1M tokens with context length information for each model. The platform offers both proprietary models and imports from other providers, making it easy to access diverse AI capabilities through a unified endpoint. | [Near AI](https://near.ai/) |
| [Friendli](/llm/friendli.txt) | 8 models | N/A | [Friendli](https://friendli.ai/docs/guides/serverless_endpoints/introduction) |
| [Baseten](/llm/baseten.txt) | 7 models | N/A | [Baseten](https://docs.baseten.co/development/model-apis/overview) |
| [Llama](/llm/llama.txt) | 7 models | N/A | [Llama](https://llama.developer.meta.com/docs/models) |
| [ModelScope](/llm/modelscope.txt) | 7 models | N/A | [ModelScope](https://modelscope.cn/docs/model-service/API-Inference/intro) |
| [submodel](/llm/submodel.txt) | 7 models | N/A | [submodel](https://submodel.gitbook.io) |
| [Moonshot AI](/llm/moonshot.txt) | 6 models | Moonshot AI is a Chinese AI company developing the Kimi series of large language models, with Kimi K2 being their flagship state-of-the-art mixture-of-experts (MoE) model featuring 1 trillion total parameters with 32 billion activated parameters. Kimi K2 supports an impressive 256K token context window and includes advanced capabilities like tool calling, online search integration, deep thinking/reasoning, multimodal reasoning, and long-form conversations. The model's performance is said to match or surpass Western rivals, marking another significant milestone for Chinese AI development according to Nature. | [Moonshot AI](https://platform.moonshot.ai/docs/api/chat) |
| [Moonshot AI (China)](/llm/moonshotaicn.txt) | 6 models | N/A | [Moonshot AI (China)](https://platform.moonshot.cn/docs/api/chat) |
| [SAP AI Core](/llm/sapaicore.txt) | 5 models | SAP AI Core is a service within the SAP Business Technology Platform (SAP BTP) designed to manage the execution and operations of AI assets at scale. It provides an orchestration platform running AI workflows using container-based infrastructure, with access to leading AI models from various providers including OpenAI, Anthropic, Google (Gemini), Amazon, NVIDIA, Mistral, and SAP's own LLMs. The platform offers enterprise-grade scalability, unified interface for multiple AI model providers, seamless integration with internal and external tools, and lifecycle management through SAP AI Launchpad. Active SAP BTP contract and proper configuration are required to use the service. | [SAP AI Core](https://help.sap.com/docs/sap-ai-core/) |
| [Vultr](/llm/vultr.txt) | 5 models | N/A | [Vultr](https://api.vultrinference.com/) |
| [Berget.AI](/llm/berget.txt) | 4 models | N/A | [Berget.AI](https://api.berget.ai) |
| [Cerebras](/llm/cerebras.txt) | 4 models | Cerebras Systems is known for their Wafer-Scale Engine (WSE), the largest chip ever built featuring massive parallel processing capabilities. Their CS-2 systems use the WSE for both training and inference, providing high-performance AI compute with specialized LLM inference services. Cerebras offers inference through Cerebras Cloud as well as on-premises hardware solutions, with optimized inference for models like Meta's Llama family. The company focuses on delivering exceptional AI inference performance for large language models, particularly for enterprises requiring high throughput and low latency for production AI deployments. | [Cerebras](https://www.cerebras.ai/) |
| [MiniMax](/llm/minimax.txt) | 3 models | N/A | [MiniMax](https://platform.minimax.io/docs/guides/quickstart) |
| [MiniMax (China)](/llm/minimaxcn.txt) | 3 models | N/A | [MiniMax (China)](https://platform.minimaxi.com/docs/guides/quickstart) |
| [Morph](/llm/morph.txt) | 3 models | N/A | [Morph](https://docs.morphllm.com/api-reference/introduction) |
| [Privatemode AI](/llm/privatemodeai.txt) | 3 models | N/A | [Privatemode AI](https://docs.privatemode.ai/api/overview) |
| [Upstage](/llm/upstage.txt) | 3 models | N/A | [Upstage](https://developers.upstage.ai/docs/apis/chat) |
| [Bailing](/llm/bailing.txt) | 2 models | N/A | [Bailing](https://alipaytbox.yuque.com/sxs0ba/ling/intro) |
| [DeepSeek](/llm/deepseek.txt) | 2 models | N/A | [DeepSeek](https://platform.deepseek.com/api-docs/pricing) |
| [Inception](/llm/inception.txt) | 2 models | N/A | [Inception](https://platform.inceptionlabs.ai/docs) |
| [Moark](/llm/moark.txt) | 2 models | N/A | [Moark](https://moark.com/docs/openapi/v1#tag/%E6%96%87%E6%9C%AC%E7%94%9F%E6%88%90) |
| [STACKIT](/llm/stackit.txt) | 2 models | N/A | [STACKIT](https://docs.stackit.cloud/products/data-and-ai/ai-model-serving/basics/available-shared-models) |
| [Amazon Bedrock](/llm/amazonbedrock.txt) | 1 model | Amazon Bedrock is a fully managed AWS service that provides access to high-performing foundation models from leading AI companies including AI21 Labs, Anthropic, Cohere, Meta, Stability AI, and Amazon's own Titan models. The service offers a serverless experience with no infrastructure to manage, featuring unified API access, model customization through fine-tuning, model evaluation tools, knowledge base integration for RAG applications, and AI agents that can complete tasks using reasoning, APIs, and data. Bedrock Guardrails can block up to 88% of harmful content, with the service designed to help organizations accelerate generative AI development while maintaining security, compliance, and responsible AI practices. | [Amazon Bedrock](https://docs.aws.amazon.com/bedrock/) |
| [Kimi For Coding](/llm/kimiforcoding.txt) | 1 model | N/A | [Kimi For Coding](https://www.kimi.com/coding/docs/en/third-party-agents.html) |
| [LMStudio](/llm/lmstudio.txt) | 1 model | N/A | [LMStudio](https://lmstudio.ai/models) |
| [Modal](/llm/modal.txt) | 1 model | Modal is a cloud platform for running AI and machine learning workloads at scale. The platform provides serverless GPU computing infrastructure optimized for AI inference and training, with a focus on simplicity and performance. Modal offers an OpenAI-compatible API interface for accessing hosted models, including specialized endpoints for models like GLM-5. The platform handles infrastructure management, auto-scaling, and cold-start optimization, allowing developers to deploy AI models without managing servers or containers. | [Modal](https://modal.com/) |
| [Nova](/llm/nova.txt) | 1 model | N/A | [Nova](https://nova.amazon.com/dev/documentation) |
| [StepFun](/llm/stepfun.txt) | 1 model | StepFun is a Chinese AI company developing large language models and multimodal AI systems. Their StepFun Open Platform provides OpenAI-compatible API access to their AI models including the Step series of language models designed for Chinese and multilingual applications. | [StepFun](https://platform.stepfun.ai) |
| [Xiaomi](/llm/xiaomi.txt) | 1 model | Xiaomi Mimo is Xiaomi's AI platform providing access to large language models through an OpenAI-compatible API interface. | [Xiaomi](https://platform.xiaomimimo.com/#/docs) |
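
Most of the services above advertise OpenAI-compatible chat endpoints, so switching providers generally means changing only the base URL, the API key, and the model ID. Below is a minimal sketch with the official `openai` Python SDK against the `integrate.api.nvidia.com` endpoint named in the Nvidia row; the `/v1` suffix, the `NVIDIA_API_KEY` variable name, and the model ID are illustrative assumptions rather than details documented in this index.

```python
import os

from openai import OpenAI  # pip install openai

# Any OpenAI-compatible gateway in the table works here; only base_url,
# key, and model ID change. integrate.api.nvidia.com is the endpoint
# named in the Nvidia row; the /v1 suffix follows the usual convention
# for OpenAI-compatible services (an assumption, not from this index).
client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",
    api_key=os.environ["NVIDIA_API_KEY"],  # hypothetical env var name
)

# The model ID is an illustrative assumption; query the provider's
# /v1/models catalog (sketched at the end of this page) for real IDs.
response = client.chat.completions.create(
    model="meta/llama-3.1-8b-instruct",
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.choices[0].message.content)
```

The same call should work against OpenRouter, AIHubMix, Requesty, and the other gateways above once the base URL and key are swapped, which is the practical payoff of the OpenAI-compatible convention this index keeps noting.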
---

## All Provider Pages

- [Yupp](/llm/yupp.txt)
- [AIHubMix](/llm/aihubmix.txt)
- [Requesty](/llm/requesty.txt)
- [Blackbox AI](/llm/blackboxai.txt)
- [Nano-GPT](/llm/nanogpt.txt)
- [Kilo Code](/llm/kilocode.txt)
- [OpenRouter](/llm/openrouter.txt)
- [ValorGPT](/llm/valorgpt.txt)
- [LangDB](/llm/langdb.txt)
- [302.AI](/llm/302ai.txt)
- [Arena AI](/llm/arenaai.txt)
- [CometAPI](/llm/cometapi.txt)
- [Vercel AI Gateway](/llm/vercel.txt)
- [AIMLAPI](/llm/aimlapi.txt)
- [LLM Stats](/llm/llmstats.txt)
- [Poe](/llm/poe.txt)
- [Together AI](/llm/togetherai.txt)
- [FastRouter](/llm/fastrouter.txt)
- [Helicone](/llm/helicone.txt)
- [ZenMUX](/llm/zenmux.txt)
- [Azure AI Services](/llm/azurecognitiveservices.txt)
- [SiliconFlow](/llm/siliconflow.txt)
- [Routeway](/llm/routeway.txt)
- [Glama](/llm/glama.txt)
- [Nvidia](/llm/nvidia.txt)
- [DeepInfra](/llm/deepinfra.txt)
- [Novita AI](/llm/novita.txt)
- [Replicate](/llm/replicate.txt)
- [SiliconFlow (China)](/llm/siliconflowcn.txt)
- [Jiekou.AI](/llm/jiekou.txt)
- [Alibaba (China)](/llm/alibabacn.txt)
- [GitHub Models](/llm/githubmodels.txt)
- [Abacus](/llm/abacus.txt)
- [Google Vertex AI](/llm/googlevertex.txt)
- [NetMind](/llm/netmind.txt)
- [RedPill](/llm/redpill.txt)
- [Monica AI](/llm/monica.txt)
- [GMI Cloud](/llm/gmi.txt)
- [Cloudflare AI Gateway](/llm/cloudflareaigateway.txt)
- [Nebius Token Factory](/llm/nebius.txt)
- [OpenAI](/llm/openai.txt)
- [RouterLink](/llm/routerlink.txt)
- [MegaNova](/llm/meganova.txt)
- [Runware](/llm/runware.txt)
- [Alibaba](/llm/alibaba.txt)
- [Chutes.ai](/llm/chutes.txt)
- [JetBrains AI](/llm/jetbrains.txt)
- [Mammouth AI](/llm/mammouth.txt)
- [CommonStack](/llm/commonstack.txt)
- [Cursor](/llm/cursor.txt)
- [Okara](/llm/okara.txt)
- [OpenCode Zen](/llm/opencode.txt)
- [Baidu AI Studio](/llm/baidu.txt)
- [Chats-LLM](/llm/chatsllm.txt)
- [Google Gemini](/llm/gemini.txt)
- [Mistral AI](/llm/mistral.txt)
- [Anthropic](/llm/anthropic.txt)
- [Z.AI](/llm/zai.txt)
- [xAI](/llm/xai.txt)
- [Parasail](/llm/parasail.txt)
- [Venice](/llm/venice.txt)
- [Firmware](/llm/firmware.txt)
- [GitHub Copilot](/llm/githubcopilot.txt)
- [Inference](/llm/inference.txt)
- [IO.NET](/llm/ionet.txt)
- [KIE.AI](/llm/kieai.txt)
- [Roo Code](/llm/roocode.txt)
- [Groq](/llm/groq.txt)
- [Hugging Face](/llm/huggingface.txt)
- [Cloudflare Workers AI](/llm/cloudflareworkersai.txt)
- [Ollama Cloud](/llm/ollama.txt)
- [StreamLake](/llm/streamlake.txt)
- [Vivgrid](/llm/vivgrid.txt)
- [Warp](/llm/warp.txt)
- [Cortecs](/llm/cortecs.txt)
- [Fireworks AI](/llm/fireworks.txt)
- [Perplexity AI](/llm/perplexity.txt)
- [Factory.ai](/llm/factoryai.txt)
- [OVHcloud AI Endpoints](/llm/ovhcloud.txt)
- [Scaleway](/llm/scaleway.txt)
- [Synthetic.new](/llm/synthetic.txt)
- [Zed](/llm/zed.txt)
- [Cohere](/llm/cohere.txt)
- [SambaNova AI](/llm/sambanova.txt)
- [Weights & Biases](/llm/wandb.txt)
- [iFlow](/llm/iflowcn.txt)
- [Zhipu AI](/llm/zhipuai.txt)
- [Ampcode](/llm/ampcode.txt)
- [Junie](/llm/junie.txt)
- [Near AI](/llm/near.txt)
- [Friendli](/llm/friendli.txt)
- [Baseten](/llm/baseten.txt)
- [Llama](/llm/llama.txt)
- [ModelScope](/llm/modelscope.txt)
- [submodel](/llm/submodel.txt)
- [Moonshot AI](/llm/moonshot.txt)
- [Moonshot AI (China)](/llm/moonshotaicn.txt)
- [SAP AI Core](/llm/sapaicore.txt)
- [Vultr](/llm/vultr.txt)
- [Berget.AI](/llm/berget.txt)
- [Cerebras](/llm/cerebras.txt)
- [MiniMax](/llm/minimax.txt)
- [MiniMax (China)](/llm/minimaxcn.txt)
- [Morph](/llm/morph.txt)
- [Privatemode AI](/llm/privatemodeai.txt)
- [Upstage](/llm/upstage.txt)
- [Bailing](/llm/bailing.txt)
- [DeepSeek](/llm/deepseek.txt)
- [Inception](/llm/inception.txt)
- [Moark](/llm/moark.txt)
- [STACKIT](/llm/stackit.txt)
- [Amazon Bedrock](/llm/amazonbedrock.txt)
- [Kimi For Coding](/llm/kimiforcoding.txt)
- [LMStudio](/llm/lmstudio.txt)
- [Modal](/llm/modal.txt)
- [Nova](/llm/nova.txt)
- [StepFun](/llm/stepfun.txt)
- [Xiaomi](/llm/xiaomi.txt)
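
Most OpenAI-compatible providers in this index also expose a `GET /v1/models` catalog route; the Cortecs entry in the table links such an endpoint directly. A hedged sketch for enumerating model IDs, assuming the standard OpenAI-style list payload (whether a bearer token is required varies by provider):

```python
import os

import requests  # pip install requests

# The Cortecs entry links this /v1/models route directly; most other
# OpenAI-compatible providers serve the same route under their own
# base URL. Auth requirements vary, so the key here is optional.
url = "https://api.cortecs.ai/v1/models"
headers = {}
api_key = os.environ.get("API_KEY")  # hypothetical env var name
if api_key:
    headers["Authorization"] = f"Bearer {api_key}"

payload = requests.get(url, headers=headers, timeout=30).json()
for model in payload.get("data", []):  # OpenAI-style list shape assumed
    print(model["id"])
```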