Qwen2.5 VL 3B Instruct


Model Information

Slug	qwen2-5-vl-3b-instruct

Organization

Name	qwen
Website	https://qwen.ai/

Model Description

Qwen2.5 VL 3B is a multimodal LLM from the Qwen Team with the following key enhancements:

- State-of-the-art understanding of images at various resolutions and aspect ratios: Qwen2.5-VL achieves state-of-the-art performance on visual understanding benchmarks, including MathVista, DocVQA, RealWorldQA, and MTVQA.

- Agent that can operate mobile phones, robots, and other devices: with its complex reasoning and decision-making abilities, Qwen2.5-VL can be integrated with such devices to operate them automatically based on the visual environment and text instructions.

- Multilingual support: besides English and Chinese, Qwen2.5-VL understands text in other languages inside images, including most European languages, Japanese, Korean, Arabic, and Vietnamese.

For more details, see this [blog post](https://qwenlm.github.io/blog/qwen2-vl/) and [GitHub repo](https://github.com/QwenLM/Qwen2-VL).

Use of this model is subject to the [Tongyi Qianwen LICENSE AGREEMENT](https://huggingface.co/Qwen/Qwen1.5-110B-Chat/blob/main/LICENSE).

Available at 5 Providers

Provider	Type	Model Name	Original Model	Input ($/1M tokens)	Output ($/1M tokens)
Fireworks AI		Qwen2.5-VL 3B Instruct	`qwen2p5-vl-3b-instruct`	$0.20	$0.20
Yupp	Chat	Qwen 2.5 VL 3B Instruct	`qwen2.5-vl-3b-instruct`	-	-
Runware		Qwen2.5-VL-3B-Instruct	`alibaba-qwen2-5-vl-3b-instruct`	-	-
Writingmate	Chat Code	Qwen: Qwen2.5 VL 3B Instruct	`qwen/qwen2.5-vl-3b-instruct`	-	-
Featherless		Qwen/Qwen2.5-VL-3B-Instruct	`Qwen/Qwen2.5-VL-3B-Instruct`	-	-
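
The per-token prices in the table can be turned into a rough per-request estimate. The sketch below uses the only priced row (Fireworks AI, $0.20 per 1M tokens for both input and output). The function name `estimate_cost_usd` is hypothetical, and the sketch assumes you already know the token counts; how a provider counts image inputs toward input tokens is not stated on this page and is not modeled here.

```python
# Prices from the Fireworks AI row of the provider table (USD per 1M tokens).
INPUT_PRICE_PER_M = 0.20
OUTPUT_PRICE_PER_M = 0.20


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in USD of one request from its token counts.

    Hypothetical helper: it applies the listed per-1M-token prices linearly
    and ignores any provider-specific minimums, discounts, or image pricing.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 1,500-token prompt with a 500-token reply.
    print(f"${estimate_cost_usd(1_500, 500):.6f}")
```

For providers whose prices are shown as `-`, no estimate is possible from this page alone.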
