Leaderboard | Text

Last updated about 1 month ago

Rank	Overall	Name	VIBE Score	Confidence Interval	Votes	Downvote %	Abort %	Speed	Latency	Context	Cost (Input)	Cost (Output)
241	40	Qwen3 235B A22B Instruct 2507	1147	±2	24.5K	1.4%	6.8%	13 tps	1.9s	262K	$0.13	$0.52
242	7	Claude Opus 4.5 (Thinking)	1147	±4	21.9K	1.6%	1.8%	49 tps	1.4s	200K	$5.00	$25.00
243	33	Kimi K2.5	1147	±3	16.1K	1.2%	6.5%	33 tps	1.7s	262K	$0.34	$2.57
244	10	Claude Sonnet 4.5 (Thinking)	1148	±2	27.2K	2.5%	1.9%	44 tps	1.1s	200K	$3.00	$15.00
245	22	GLM 5	1149	±4	6.4K	1.2%	3.4%	36 tps	2.7s	200K	$0.72	$2.55
246	48	Step 3.5 Flash	1150	±6	2.9K	1.7%	2.2%	109 tps	0.6s	256K	$0.05	$0.15
247	42	Qwen3 Max Instruct Preview	1151	±2	26.4K	2.0%	1.1%	31 tps	1.7s	256K	$1.43	$6.61
248	40	DeepSeek V3.2	1152	±3	16.5K	1.1%	1.4%	83 tps	5.1s	131K	$0.43	$1.09
249	44	DeepSeek V3.1 Terminus Chat	1152	±2	14.5K	1.8%	3.4%	27 tps	1.5s	131K	$0.86	$1.80
250	44	Gemini 2.5 Pro	1153	±2	38.9K	1.8%	2.3%	45 tps	2.6s	1M	$1.25	$10.00
251	52	Grok 4 Fast Non-Reasoning	1156	±2	16.7K	2.2%	1.5%	93 tps	0.6s	2M	$0.27	$0.67
252	26	GPT-5 (High)	1156	±3	12.3K	2.3%	4.5%	81 tps	35.9s	400K	$1.25	$10.00
253	62	Qwen3 Omni 30B A3B Instruct	1157	±6	2.3K	1.9%	3.9%	65 tps	1.2s	66K	$0.35	$0.97
254	37	Kimi K2.5 Instant	1158	±6	4K	1.5%	2.9%	32 tps	3.0s	262K	$0.50	$3.00
255	44	Grok 4.1 Fast Reasoning	1160	±2	23.1K	1.8%	1.5%	58 tps	7.3s	2M	$0.20	$0.50
256	33	Qwen3 30B A3B Instruct 2507	1167	±2	24.4K	1.7%	1.2%	55 tps	1.3s	131K	$0.13	$0.72
257	17	GPT-5.2 (High)	1168	±2	31.4K	1.1%	6.7%	18 tps	16.3s	400K	$1.75	$14.00
258	17	Gemini 3 Flash Preview	1173	±3	12.8K	0.7%	1.3%	138 tps	1.4s	1M	$0.50	$3.00
259	32	Gemini 2.5 Pro High	1175	±1	27.6K	2.0%	1.5%	48 tps	2.3s	1M	$1.25	$10.00
260	48	gpt-oss-120b	1182	±2	26.3K	1.3%	0.7%	213 tps	0.5s	131K	$0.11	$0.50
261	33	Qwen3 Next 80B A3B Instruct	1185	±2	19.8K	1.9%	0.6%	84 tps	1.1s	256K	$0.20	$1.42
262	37	Qwen3 Omni 30B A3B Thinking	1188	±5	5.3K	1.2%	3.7%	67 tps	1.2s	66K	$0.97	$1.79
263	16	GPT-5.2	1193	±2	16.3K	0.9%	4.1%	18 tps	2.7s	400K	$1.75	$14.00
264	14	Gemini 3 Pro (Low)	1195	±3	20.3K	1.1%	2.4%	51 tps	3.5s	1M	$2.00	$12.00
265	14	Gemini 3 Flash Preview Thinking	1195	±3	19.7K	1.0%	1.6%	3 tps	6.2s	1M	$0.50	$3.00
266	33	Grok 4.20 Multi Agent Beta	1197	±7	1.7K	1.8%	1.2%	56 tps	8.8s	2M	$2.00	$6.00
267	29	Nova Experimental Chat 12-10	1206	±3	9K	1.2%	2.4%	84 tps	12.9s	98K	$0	$0
268	13	GPT-5.3 Instant	1206	±4	5.5K	1.0%	0.9%	63 tps	0.8s	400K	$1.75	$14.00
269	26	Grok 4.1 Fast Non-Reasoning	1207	±3	20.1K	1.7%	0.9%	101 tps	0.5s	2M	$0.20	$0.50
270	22	GPT-5 Chat	1208	±2	58K	1.4%	1.3%	95 tps	0.9s	400K	$1.25	$10.00
271	29	Qwen3 VL 235B A22B Instruct	1211	±3	10.2K	2.5%	3.1%	75 tps	1.9s	129K	$0.37	$1.81
272	4	Claude Sonnet 4.6	1212	±5	6K	1.1%	1.6%	47 tps	1.2s	200K	$3.00	$15.00
273	17	Grok 4.20 Beta Reasoning	1216	±7	2.1K	1.7%	1.1%	77 tps	4.5s	2M	$2.00	$5.50
274	10	Gemini 3 Pro	1222	±3	44.5K	1.1%	2.1%	50 tps	3.6s	1M	$2.00	$12.00
275	5	Claude Sonnet 4.6 (Thinking)	1256	±5	5.8K	1.4%	4.7%	57 tps	1.1s	200K	$3.00	$15.00
276	10	GPT-5.2 Instant	1260	±3	27.1K	0.8%	1.7%	52 tps	2.0s	400K	$1.75	$14.00
277	6	Gemini 3.1 Pro	1270	±5	12.3K	1.1%	3.5%	35 tps	4.1s	1M	$2.00	$12.00
278	2	Claude Opus 4.6	1274	±4	7.1K	1.4%	2.1%	48 tps	1.7s	200K	$5.00	$25.00
279	8	GPT-5.1	1288	±2	19.7K	1.3%	2.3%	71 tps	1.4s	400K	$1.42	$11.33
280	8	GPT-5.1 (High)	1289	±2	23K	1.4%	3.2%	76 tps	6.9s	400K	$1.25	$10.00
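
When reading the rows above, one rough way to judge whether two neighbouring models are meaningfully separated is to compare their VIBE Scores together with the Confidence Interval column. The sketch below assumes the "±" value is a symmetric half-width around the score; the page does not say how the interval is computed, so treat this as an illustration only. The names `Row`, `parse_row`, and `intervals_overlap` are hypothetical helpers, and the sample holds only the first five columns of three rows copied from the table.

```python
from typing import NamedTuple


class Row(NamedTuple):
    name: str
    score: int
    ci: int  # assumed half-width of the confidence interval


def parse_row(line: str) -> Row:
    """Parse one tab-separated table row (Rank, Overall, Name, VIBE Score, CI, ...)."""
    cols = line.split("\t")
    return Row(name=cols[2], score=int(cols[3]), ci=int(cols[4].lstrip("±")))


def intervals_overlap(a: Row, b: Row) -> bool:
    """True when [score - ci, score + ci] of the two rows intersect."""
    return abs(a.score - b.score) <= a.ci + b.ci


# First five columns of three rows from the table above.
SAMPLE = (
    "278\t2\tClaude Opus 4.6\t1274\t±4\n"
    "279\t8\tGPT-5.1\t1288\t±2\n"
    "280\t8\tGPT-5.1 (High)\t1289\t±2"
)

rows = [parse_row(line) for line in SAMPLE.splitlines()]

for a, b in zip(rows, rows[1:]):
    verdict = "overlap" if intervals_overlap(a, b) else "do not overlap"
    print(f"{a.name} ({a.score}±{a.ci}) and {b.name} ({b.score}±{b.ci}) {verdict}")
```

Under that assumption, GPT-5.1 and GPT-5.1 (High) are within each other's intervals, while Claude Opus 4.6 and GPT-5.1 are not.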

View All (283 models)