Leaderboard | Text

Last updated about 1 month ago

Rank	Overall	Name	VIBE Score	Confidence Interval	Votes	Downvote %	Abort %	Speed	Latency	Context	Cost (Input)	Cost (Output)
1	288	Qwen 2.5 VL 3B Instruct	373	±46	955	7.7%	3.0%	44 tps	2.5s	128K	$0.21	$0.63
2	179	Inception Mercury	613	±25	500	6.5%	0.4%	257 tps	1.1s	32K	$0.25	$1.00
3	265	Qwen 2.5 VL 72B Instruct	635	±34	505	5.6%	5.3%	25 tps	3.7s	128K	$1.01	$2.79
4	274	Pixtral 12B	674	±33	720	5.9%	2.2%	101 tps	1.2s	131K	$0.08	$0.08
5	194	Llama 3.3 70B	719	±18	550	3.5%	0.3%	500 tps	0.5s	8K	$0.48	$0.66
6	186	Grok 3 Mini Fast	724	±15	1.3K	4.3%	1.6%	44 tps	0.5s	131K	$0.60	$4.00
7	175	OpenAI o3-mini-low	752	±17	1.4K	4.3%	0.7%	139 tps	1.5s	200K	$1.10	$4.40
8	229	Magistral Medium 2509	765	±18	610	3.9%	4.0%	58 tps	0.9s	131K	$2.00	$5.00
9	177	OpenAI o3-mini	777	±12	2K	3.6%	0.8%	143 tps	3.3s	200K	$1.10	$4.40
10	214	OpenAI o3-mini-high	778	±17	630	3.1%	2.4%	231 tps	10.5s	200K	$1.10	$4.40
11	160	Llama 4 Scout	782	±15	1.6K	4.1%	0.6%	88 tps	5.1s	131K	$0.18	$0.46
12	148	Qwen3 30B A3B Thinking 2507	787	±14	655	4.4%	0.5%	124 tps	1.2s	131K	$0.16	$1.70
13	170	Mistral Small 3.2 24B	793	±28	475	5.9%	2.8%	141 tps	0.7s	33K	$0.02	$0.08
14	165	Pixtral Large	796	±26	610	7.6%	2.5%	57 tps	1.3s	128K	$1.50	$4.50
15	186	Grok 3 Mini	799	±23	1.9K	2.6%	1.2%	43 tps	0.5s	131K	$0.30	$0.50
16	161	Qwen3 8B	813	±23	610	5.4%	2.4%	61 tps	1.4s	41K	$0.02	$0.07
17	165	Qwen3 4B	818	±20	870	4.9%	1.9%	94 tps	1.5s	128K	$0.01	$0.01
18	121	QwQ 32B	825	±16	1.3K	5.5%	5.4%	41 tps	2.1s	16K	$0.43	$0.56
19	126	Qwen3 30B A3B	832	±20	950	4.0%	5.1%	163 tps	1.0s	41K	$0.06	$0.21
20	139	GLM 4.6V	837	±23	640	5.2%	6.4%	21 tps	1.8s	128K	$0.38	$0.90
21	161	Llama 4 Maverick	838	±10	2.4K	4.3%	1.2%	88 tps	2.4s	1M	$0.23	$0.83
22	133	DeepSeek V3.2 Speciale	842	±38	485	4.9%	6.0%	43 tps	1.4s	131K	$0.84	$1.52
23	121	Qwen3 32B Fast	863	±13	2K	4.5%	11.6%	30 tps	3.1s	41K	$0.10	$0.25
24	119	ERNIE 4.5 300B A47B	873	±12	1.1K	3.4%	4.7%	23 tps	2.3s	123K	$0.28	$1.10
25	143	Gemini 2.0 Flash	885	±16	870	5.9%	<0.1%	76 tps	0.5s	1M	$0.14	$0.56
26	157	Qwen3 Next 80B A3B Thinking	890	±16	1.1K	3.4%	0.6%	175 tps	1.3s	256K	$0.21	$2.26
27	133	GPT-4.1 nano	896	±11	1.8K	3.2%	0.6%	175 tps	0.5s	1M	$0.10	$0.40
28	157	GPT-5 Nano	901	±9	1.2K	5.0%	3.2%	113 tps	20.9s	400K	$0.05	$0.40
29	143	Gemini 2.0 Flash Lite	902	±14	990	7.9%	<0.1%	42 tps	0.5s	1M	$0.08	$0.30
30	133	DeepSeek-R1 0528	904	±19	640	4.5%	1.3%	93 tps	0.5s	64K	$1.60	$3.67
31	124	Qwen3 235B A22B Thinking 2507	905	±16	695	2.1%	2.5%	53 tps	1.6s	131K	$0.59	$5.70
32	65	DeepSeek V3.2 Exp Chat	909	±11	1.3K	2.6%	2.6%	29 tps	1.5s	131K	$0.27	$0.39
33	101	gpt-oss-20b	912	±12	1.5K	4.1%	0.5%	216 tps	0.5s	131K	$0.06	$0.26
34	153	OpenAI o1	926	±15	915	2.1%	4.2%	92 tps	5.5s	200K	$15.00	$60.00
35	148	DeepSeek-R1	936	±21	705	4.1%	0.8%	133 tps	0.6s	64K	$0.91	$3.07
36	126	Qwen3 VL 235B A22B Thinking	939	±15	965	3.5%	4.3%	47 tps	3.0s	127K	$0.47	$3.31
37	133	Qwen3 14B	943	±16	825	4.1%	1.7%	109 tps	0.8s	41K	$0.04	$0.15
38	139	OpenAI o4-mini	947	±11	1.2K	2.5%	1.4%	97 tps	7.0s	128K	$1.10	$4.40
39	129	Command A	948	±12	1.9K	3.1%	2.2%	42 tps	0.8s	256K	$2.00	$7.33
40	71	Seed 1.8 251228	949	±18	1.2K	2.7%	3.7%	41 tps	2.1s	256K	$0.25	$2.00
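
When two models have close VIBE Scores, it helps to check whether their confidence intervals overlap before treating one as ahead of the other. The sketch below parses rows in the tab-separated layout of the table above and compares two intervals. It assumes the "±" value is a symmetric half-width around the VIBE Score; the page does not state how the interval is computed or at what confidence level, so overlap here is only a rough indicator. The names `Row`, `parse_votes`, `parse_row`, and `intervals_overlap` are hypothetical helpers, not part of any Yupp API.

```python
from dataclasses import dataclass


@dataclass
class Row:
    name: str
    score: float
    ci: float    # half-width taken from the "±" column (assumed symmetric)
    votes: int


def parse_votes(text: str) -> int:
    """Convert a vote count such as '955' or '1.3K' to an integer."""
    text = text.strip()
    if text.upper().endswith("K"):
        return round(float(text[:-1]) * 1000)
    return int(text)


def parse_row(line: str) -> Row:
    """Parse one tab-separated row with the column order of the table header."""
    cells = line.rstrip("\n").split("\t")
    return Row(
        name=cells[2],
        score=float(cells[3]),
        ci=float(cells[4].lstrip("±")),
        votes=parse_votes(cells[5]),
    )


def intervals_overlap(a: Row, b: Row) -> bool:
    """True when [score - ci, score + ci] of the two rows intersect."""
    return abs(a.score - b.score) <= a.ci + b.ci


# Two rows copied from the table above.
o3_mini = parse_row(
    "9\t177\tOpenAI o3-mini\t777\t±12\t2K\t3.6%\t0.8%\t143 tps\t3.3s\t200K\t$1.10\t$4.40"
)
o3_mini_high = parse_row(
    "10\t214\tOpenAI o3-mini-high\t778\t±17\t630\t3.1%\t2.4%\t231 tps\t10.5s\t200K\t$1.10\t$4.40"
)
print(intervals_overlap(o3_mini, o3_mini_high))
```

Under this reading, the 777 ±12 and 778 ±17 intervals overlap, so the one-point gap between those two rows is well within the stated uncertainty.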

Page 1 of 4 (133 models in total)