Leaderboard | Text

Last updated about 1 month ago

Rank	Overall	Name	VIBE Score	Confidence Interval	Votes	Downvote %	Abort %	Speed	Latency	Context	Cost (Input)	Cost (Output)
241	48	gpt-oss-120b	1144	±2	40.7K	3.7%	0.7%	213 tps	0.5s	131K	$0.11	$0.50
242	40	DeepSeek V3.2	1144	±3	20.7K	1.9%	1.4%	83 tps	5.1s	131K	$0.43	$1.09
243	40	Qwen3 235B A22B Instruct 2507	1147	±2	32.2K	4.7%	6.8%	13 tps	1.9s	262K	$0.13	$0.52
244	42	Qwen3 Max Instruct Preview	1148	±2	36.6K	3.5%	1.1%	31 tps	1.7s	256K	$1.43	$6.61
245	44	DeepSeek V3.1 Terminus Chat	1150	±3	17.8K	4.2%	3.4%	27 tps	1.5s	131K	$0.86	$1.80
246	33	Kimi K2.5	1151	±3	32.5K	1.8%	6.5%	33 tps	1.7s	262K	$0.34	$2.57
247	52	Qwen3.5 122B A17B	1151	±4	4.7K	1.6%	1.5%	82 tps	1.4s	256K	$0.40	$3.20
248	33	Qwen3 30B A3B Instruct 2507	1158	±2	31.6K	4.1%	1.2%	55 tps	1.3s	131K	$0.13	$0.72
249	17	Gemini 3 Flash Preview	1159	±3	17.8K	2.1%	1.3%	138 tps	1.4s	1M	$0.50	$3.00
250	32	Gemini 2.5 Pro High	1159	±2	42.7K	4.5%	1.5%	48 tps	2.3s	1M	$1.25	$10.00
251	33	Qwen3 Next 80B A3B Instruct	1161	±2	24.9K	3.8%	0.6%	84 tps	1.1s	256K	$0.20	$1.42
252	48	Step 3.5 Flash	1164	±5	4K	1.5%	2.2%	109 tps	0.6s	256K	$0.05	$0.15
253	56	MiniMax M2.1 Lightning	1165	±5	4.9K	1.2%	1.7%	52 tps	2.1s	205K	$0.30	$2.40
254	29	MiniMax M2.7	1167	±8	1.1K	1.8%	3.0%	34 tps	2.5s	205K	$0.30	$1.20
255	62	Qwen3 Omni 30B A3B Instruct	1168	±5	3K	2.3%	3.9%	65 tps	1.2s	66K	$0.35	$0.97
256	37	Kimi K2.5 Instant	1171	±4	6.2K	1.8%	2.9%	32 tps	3.0s	262K	$0.50	$3.00
257	26	Claude Haiku 4.5 (Extended Thinking)	1173	±2	24.3K	3.1%	1.4%	115 tps	0.7s	200K	$1.00	$5.00
258	17	Claude Opus 4.5	1173	±2	22.5K	2.2%	1.5%	45 tps	1.5s	200K	$5.00	$25.00
259	17	Grok 4.20 Beta Reasoning	1175	±7	3.3K	1.8%	1.1%	77 tps	4.5s	2M	$2.00	$5.50
260	16	GPT-5.2	1176	±2	22.6K	1.8%	4.1%	18 tps	2.7s	400K	$1.75	$14.00
261	26	GPT-5 (High)	1177	±2	22.1K	3.1%	4.5%	81 tps	35.9s	400K	$1.25	$10.00
262	106	GPT-5.4 nano	1177	±10	650	2.3%	0.7%	149 tps	0.5s	400K	$0.20	$1.25
263	26	Grok 4.1 Fast Non-Reasoning	1177	±2	25.7K	3.0%	0.9%	101 tps	0.5s	2M	$0.20	$0.50
264	17	GPT-5.2 (High)	1180	±2	54.6K	1.9%	6.7%	18 tps	16.3s	400K	$1.75	$14.00
265	14	Gemini 3 Pro (Low)	1180	±3	28.9K	2.2%	2.4%	51 tps	3.5s	1M	$2.00	$12.00
266	22	GLM 5	1182	±4	17.3K	2.1%	3.4%	36 tps	2.7s	200K	$0.72	$2.55
267	33	Grok 4.20 Multi Agent Beta	1183	±9	2.6K	2.0%	1.2%	56 tps	8.8s	2M	$2.00	$6.00
268	37	Qwen3 Omni 30B A3B Thinking	1186	±3	7.5K	2.1%	3.7%	67 tps	1.2s	66K	$0.97	$1.79
269	29	Nova Experimental Chat 12-10	1188	±3	9.8K	1.9%	2.4%	84 tps	12.9s	98K	$0	$0
270	29	Qwen3 VL 235B A22B Instruct	1188	±3	13.5K	5.2%	3.1%	75 tps	1.9s	129K	$0.37	$1.81
271	14	Gemini 3 Flash Preview Thinking	1190	±2	47K	2.3%	1.6%	3 tps	6.2s	1M	$0.50	$3.00
272	22	Grok 4.20 Beta Non-reasoning	1192	±11	1.3K	3.1%	1.1%	151 tps	0.6s	2M	$2.00	$6.00
273	22	GPT-5 Chat	1196	±1	75.1K	3.4%	1.3%	95 tps	0.9s	400K	$1.25	$10.00
274	13	GPT-5.3 Instant	1199	±6	9.3K	1.7%	0.9%	63 tps	0.8s	400K	$1.75	$14.00
275	17	GPT-5.4 mini	1203	±10	885	2.7%	0.8%	148 tps	0.5s	400K	$0.75	$4.50
276	10	Gemini 3 Pro	1207	±1	78K	2.2%	2.1%	50 tps	3.6s	1M	$2.00	$12.00
277	22	MiniMax M2.7-highspeed	1207	±10	1.1K	2.1%	2.3%	50 tps	2.1s	205K	$0.60	$2.40
278	10	Claude Sonnet 4.5 (Thinking)	1228	±1	66.2K	3.4%	1.9%	44 tps	1.1s	200K	$3.00	$15.00
279	10	GPT-5.2 Instant	1232	±2	39.3K	1.6%	1.7%	52 tps	2.0s	400K	$1.75	$14.00
280	6	Gemini 3.1 Pro	1245	±3	26K	2.0%	3.5%	35 tps	4.1s	1M	$2.00	$12.00

View All (288 models)