Inference

Inference.net is a high-performance AI inference platform providing access to leading large language models, including Meta's Llama family, DeepSeek, Mistral, Google's Gemma, Qwen, and OpenAI's open-weight models. The platform focuses on fast, efficient inference, with several precision options (fp-16, fp-8, bf-16) for balancing performance against cost. Inference.net exposes an OpenAI-compatible API, so it is easy to integrate with existing applications while accessing state-of-the-art models through a single unified endpoint.
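Because the API is OpenAI-compatible, a standard `/chat/completions` request works against it. The sketch below builds such a request with only the Python standard library; the base URL, the `INFERENCE_API_KEY` environment variable, and the choice of model ID are assumptions for illustration — check the provider's documentation for the exact endpoint and authentication details.

```python
import os
import json
from urllib import request

# Assumed base URL for Inference.net's OpenAI-compatible API --
# verify against the provider's docs before use.
BASE_URL = "https://api.inference.net/v1"


def build_chat_request(model: str, prompt: str, base_url: str = BASE_URL):
    """Build the URL, headers, and JSON body for an OpenAI-style
    /chat/completions call. Kept as a pure function so the request
    shape can be inspected without sending anything."""
    url = f"{base_url}/chat/completions"
    headers = {
        "Authorization": f"Bearer {os.environ.get('INFERENCE_API_KEY', '')}",
        "Content-Type": "application/json",
    }
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return url, headers, body


if __name__ == "__main__" and os.environ.get("INFERENCE_API_KEY"):
    # Model ID taken from the table below; the trailing /fp-16
    # segment selects the precision variant.
    url, headers, body = build_chat_request(
        "meta-llama/llama-3.1-8b-instruct/fp-16",
        "Say hello in one word.",
    )
    req = request.Request(
        url, data=json.dumps(body).encode(), headers=headers, method="POST"
    )
    with request.urlopen(req) as resp:
        reply = json.loads(resp.read())
        print(reply["choices"][0]["message"]["content"])
```

Swapping the model string for any other ID in the table below is the only change needed to target a different model, since all of them sit behind the same endpoint.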

Browse 20 LLM models available from Inference. Compare prices and features.

Models (20)

| Organization | Model Name | Original Model | Input | Output | Free |
|---|---|---|---|---|---|
| DeepSeek | DeepSeek-R1-0528 | deepseek/deepseek-r1-0528/fp-8 | - | - | |
| OpenAI | GPT OSS 120B | openai/gpt-oss-120b | - | - | |
| OpenAI | GPT OSS 20B | openai/gpt-oss-20b | - | - | |
| DeepSeek | DeepSeek-V3 0324 | deepseek/deepseek-v3-0324/fp-8 | - | - | |
| qwen | Qwen3 30B A3B | qwen/qwen3-30b-a3b/fp8 | - | - | |
| qwen | QwQ-32B | qwen/qwq-32b/fp-8 | - | - | |
| DeepSeek | DeepSeek-V3 | deepseek/deepseek-v3/fp-8 | - | - | |
| Meta | Llama 3.3 70B Instruct | meta-llama/llama-3.3-70b-instruct/fp-8 | - | - | |
| Meta | Llama 3.1 70B Instruct | meta-llama/llama-3.1-70b-instruct/fp-16 | - | - | |
| qwen | Qwen2.5 7B Instruct | qwen/qwen2.5-7b-instruct/bf-16 | - | - | |
| Meta | Llama 3.2 3B Instruct | meta/llama-3.2-3b-instruct | $0.02 | $0.02 | |
| Meta | Llama 3.2 3B Instruct | meta-llama/llama-3.2-3b-instruct/fp-16 | - | - | |
| Meta | Llama 3.2 11B Instruct | meta-llama/llama-3.2-11b-instruct/fp-16 | - | - | |
| Meta | Llama 3.1 8B Instruct | meta/llama-3.1-8b-instruct | $0.03 | $0.03 | |
| Meta | Llama 3.1 8B Instruct | meta-llama/llama-3.1-8b-instruct/fp-16 | - | - | |
| qwen | Qwen3 Embedding 4B | qwen/qwen3-embedding-4b | $0.01 | $0.00 | |
| Nvidia | Llama 3.2 11b Vision Instruct | meta/llama-3.2-11b-vision-instruct | $0.06 | $0.06 | |
| Meta | llama-3.2-1b-instruct | meta/llama-3.2-1b-instruct | $0.01 | $0.01 | |
| Meta | llama-3.2-1b-instruct | meta-llama/llama-3.2-1b-instruct/fp-16 | - | - | |
| DeepSeek | DeepSeek-R1 | deepseek/deepseek-r1/fp-8 | - | - | |