Leaderboard | Text

Last updated about 1 month ago

Rank	Overall	Name	VIBE Score	Confidence Interval	Votes	Downvote %	Abort %	Speed	Latency	Context	Cost (Input)	Cost (Output)
121	246	Amazon Nova Micro 1.0	1000	±23	630	2.3%	4.1%	193 tps	0.6s	128K	$0.04	$0.07
122	170	Devstral Small 2507	1002	±8	980	2.0%	2.2%	186 tps	0.5s	131K	$0.10	$0.30
123	143	Gemini 2.0 Flash	1004	±3	23.8K	0.9%	<0.1%	76 tps	0.5s	1M	$0.14	$0.56
124	161	Mistral Small 3.1	1005	±3	9.4K	1.0%	7.4%	13 tps	2.6s	32K	$0.17	$0.28
125	194	GLM 4.5 Flash	1005	±11	1.1K	2.3%	12.2%	15 tps	2.2s	131K	$0	$0
126	153	Qwen 2.5 32B Instruct	1005	±3	15.7K	1.0%	2.5%	48 tps	1.0s	131K	$0.21	$0.25
127	139	GLM 4.6V	1005	±3	7.4K	1.4%	6.4%	21 tps	1.8s	128K	$0.38	$0.90
128	186	Mistral Small 3.2 24B Instruct	1006	±7	1.6K	3.4%	1.9%	113 tps	1.1s	131K	$0.02	$0.08
129	165	ERNIE 4.5 21B A3B	1006	±7	1.4K	1.8%	2.3%	78 tps	1.5s	120K	$0.05	$0.19
130	139	OpenAI o4-mini	1007	±3	13.6K	2.2%	1.4%	97 tps	7.0s	128K	$1.10	$4.40
131	129	Qwen3 Max Thinking	1007	±5	6.9K	1.4%	13.5%	32 tps	2.3s	256K	$1.20	$6.00
132	139	Qwen3 VL 30B A3B Instruct	1009	±9	1.2K	4.5%	1.8%	80 tps	2.6s	129K	$0.18	$0.67
133	175	MiMo V2 Flash	1009	±10	645	3.7%	7.2%	24 tps	1.9s	262K	$0.07	$0.23
134	148	OpenAI o4-mini-high	1009	±2	19.5K	2.1%	1.9%	117 tps	15.9s	200K	$1.10	$4.40
135	139	Seed 2.0 Mini (Medium)	1010	±10	1.3K	2.6%	11.9%	33 tps	1.7s	256K	$0.15	$0.60
136	179	Baichuan-M2-32B	1011	±8	1.4K	2.7%	<0.1%	32 tps	3.3s	131K	$0.07	$0.07
137	101	Qwen3.5 35B A3B	1011	±15	1.1K	1.4%	2.1%	116 tps	2.1s	256K	$0.63	$1.13
138	133	GPT-4.1 nano	1014	±2	52.1K	1.3%	0.6%	175 tps	0.5s	1M	$0.10	$0.40
139	157	Qwen3 Next 80B A3B Thinking	1015	±3	12.3K	2.3%	0.6%	175 tps	1.3s	256K	$0.21	$2.26
140	133	Kimi K2 0905	1016	±4	9.2K	2.1%	4.0%	30 tps	1.4s	262K	$0.63	$2.39
141	148	OpenAI o3	1018	±6	4.2K	1.8%	0.9%	85 tps	6.8s	128K	$7.33	$29.33
142	143	Seed 1.6 250615	1018	±4	3.6K	1.6%	3.1%	46 tps	2.2s	256K	$0.25	$2.00
143	161	Qwen3 8B	1020	±5	6.1K	2.6%	2.4%	61 tps	1.4s	41K	$0.02	$0.07
144	143	Mistral Medium 3	1023	±9	1.2K	1.7%	2.4%	47 tps	0.8s	33K	$0.40	$2.00
145	86	Claude Sonnet 4	1026	±2	88.9K	1.5%	1.8%	49 tps	1.3s	200K	$3.00	$15.00
146	165	Qwen3 VL 30B A3B Thinking	1027	±7	2.3K	4.6%	4.5%	84 tps	2.9s	127K	$0.20	$1.47
147	126	DeepSeek V3	1027	±2	41.8K	1.0%	0.9%	69 tps	1.1s	64K	$0.59	$1.49
148	165	Qwen3 4B	1027	±4	9.4K	3.3%	1.9%	94 tps	1.5s	128K	$0.01	$0.01
149	133	DeepSeek V3.2 Speciale	1027	±5	5.9K	2.2%	6.0%	43 tps	1.4s	131K	$0.84	$1.52
150	126	Qwen3 VL 235B A22B Thinking	1027	±4	7.3K	2.9%	4.3%	47 tps	3.0s	127K	$0.47	$3.31
151	111	Grok 3 Fast	1030	±3	12K	1.1%	1.7%	52 tps	2.4s	131K	$5.00	$25.00
152	129	Command A	1030	±2	67.4K	1.3%	2.2%	42 tps	0.8s	256K	$2.00	$7.33
153	153	Ministral 14B 3.0	1031	±6	2.3K	3.1%	2.0%	119 tps	0.5s	128K	$0.20	$0.20
154	118	GPT-4.1 mini	1031	±2	57.7K	1.3%	1.1%	67 tps	0.9s	1M	$0.34	$1.60
155	148	Qwen3 30B A3B Thinking 2507	1035	±4	4K	1.4%	0.5%	124 tps	1.2s	131K	$0.16	$1.70
156	106	Claude Sonnet 3.5 v2	1035	±4	16.6K	1.0%	<0.1%	46 tps	1.4s	200K	$3.00	$15.00
157	124	Kimi K2 0905 Turbo	1037	±3	18.8K	2.4%	0.7%	373 tps	0.5s	262K	$1.70	$6.50
158	84	MiniMax M2.5	1038	±11	1.5K	2.0%	1.4%	70 tps	1.9s	205K	$0.28	$1.20
159	86	Seed 2.0 Lite (Medium)	1043	±9	1.2K	2.0%	6.6%	33 tps	1.6s	256K	$0.25	$2.00
160	113	Mistral Medium	1044	±2	33.2K	1.3%	1.8%	48 tps	0.6s	33K	$1.48	$4.55

Page 4 of 8 (283 models in total)