Leaderboard | Text

Last updated about 1 month ago

Rank	Overall	Name	VIBE Score	Confidence Interval	Votes	Downvote %	Abort %	Speed	Latency	Context	Cost (Input)	Cost (Output)
1	1	Claude Opus 4.6 (Thinking)	1388	±10	1.8K	0.8%	2.5%	56 tps	1.6s	200K	$5.00	$25.00
2	2	GPT-5.4	1339	±15	530	0.9%	2.6%	55 tps	0.8s	1M	$2.50	$15.00
3	2	Claude Opus 4.6	1328	±8	2.3K	0.9%	2.1%	48 tps	1.7s	200K	$5.00	$25.00
4	5	Claude Sonnet 4.6 (Thinking)	1325	±7	1.6K	1.2%	4.7%	57 tps	1.1s	200K	$3.00	$15.00
5	6	Gemini 3.1 Pro	1268	±8	4.3K	0.7%	3.5%	35 tps	4.1s	1M	$2.00	$12.00
6	10	GPT-5.2 Instant	1266	±4	6.2K	0.7%	1.7%	52 tps	2.0s	400K	$1.75	$14.00
7	8	GPT-5.1 (High)	1258	±6	5.3K	1.3%	3.2%	76 tps	6.9s	400K	$1.25	$10.00
8	26	GPT-5 (High)	1243	±7	3K	2.3%	4.5%	81 tps	35.9s	400K	$1.25	$10.00
9	8	GPT-5.1	1233	±8	3.3K	1.4%	2.3%	71 tps	1.4s	400K	$1.42	$11.33
10	4	Claude Sonnet 4.6	1232	±11	1.6K	0.9%	1.6%	47 tps	1.2s	200K	$3.00	$15.00
11	33	Qwen3 30B A3B Instruct 2507	1226	±8	5.6K	2.2%	1.2%	55 tps	1.3s	131K	$0.13	$0.72
12	10	Gemini 3 Pro	1217	±5	11.7K	0.9%	2.1%	50 tps	3.6s	1M	$2.00	$12.00
13	22	GPT-5 Chat	1208	±5	10.4K	2.2%	1.3%	95 tps	0.9s	400K	$1.25	$10.00
14	29	Qwen3 VL 235B A22B Instruct	1202	±10	1.8K	4.5%	3.1%	75 tps	1.9s	129K	$0.37	$1.81
15	13	GPT-5.3 Instant	1191	±11	1.6K	1.2%	0.9%	63 tps	0.8s	400K	$1.75	$14.00
16	17	Grok 4.20 Beta Reasoning	1190	±17	540	0.9%	1.1%	77 tps	4.5s	2M	$2.00	$5.50
17	14	Gemini 3 Pro (Low)	1189	±6	4.8K	0.9%	2.4%	51 tps	3.5s	1M	$2.00	$12.00
18	32	Gemini 2.5 Pro High	1182	±3	6.7K	2.2%	1.5%	48 tps	2.3s	1M	$1.25	$10.00
19	40	Qwen3 235B A22B Instruct 2507	1178	±6	5.1K	1.9%	6.8%	13 tps	1.9s	262K	$0.13	$0.52
20	106	Claude Sonnet 3.5 v2	1177	±8	1.6K	1.2%	<0.1%	46 tps	1.4s	200K	$3.00	$15.00
21	17	GPT-5.2 (High)	1177	±7	7.4K	0.8%	6.7%	18 tps	16.3s	400K	$1.75	$14.00
22	14	Gemini 3 Flash Preview Thinking	1173	±5	4.4K	0.6%	1.6%	3 tps	6.2s	1M	$0.50	$3.00
23	81	GPT-4o	1170	±9	2.3K	2.8%	1.0%	49 tps	2.4s	128K	$3.71	$12.57
24	16	GPT-5.2	1168	±6	3K	1.2%	4.1%	18 tps	2.7s	400K	$1.75	$14.00
25	17	Gemini 3 Flash Preview	1166	±7	2.4K	0.6%	1.3%	138 tps	1.4s	1M	$0.50	$3.00
26	60	Gemini 2.5 Flash Preview 0925	1163	±7	2.7K	2.9%	1.2%	5 tps	0.9s	1M	$0.13	$0.97
27	7	Claude Opus 4.5 (Thinking)	1155	±5	5.3K	1.6%	1.8%	49 tps	1.4s	200K	$5.00	$25.00
28	26	Grok 4.1 Fast Non-Reasoning	1151	±6	3.2K	1.8%	0.9%	101 tps	0.5s	2M	$0.20	$0.50
29	68	Qwen Plus (Aug'24)	1150	±5	7.5K	1.4%	1.4%	53 tps	1.3s	30K	$0.40	$1.20
30	17	Claude Opus 4.5	1144	±8	2.4K	2.1%	1.5%	45 tps	1.5s	200K	$5.00	$25.00
31	37	Qwen3 Omni 30B A3B Thinking	1139	±7	1.6K	1.2%	3.7%	67 tps	1.2s	66K	$0.97	$1.79
32	29	Nova Experimental Chat 12-10	1138	±9	1.9K	0.5%	2.4%	84 tps	12.9s	98K	$0	$0
33	10	Claude Sonnet 4.5 (Thinking)	1136	±5	6.8K	2.7%	1.9%	44 tps	1.1s	200K	$3.00	$15.00
34	42	GPT-5.2 (Extra High)	1131	±5	3.7K	0.9%	13.2%	17 tps	20.5s	400K	$1.75	$14.00
35	26	Claude Haiku 4.5 (Extended Thinking)	1129	±5	3.6K	1.6%	1.4%	115 tps	0.7s	200K	$1.00	$5.00
36	42	Qwen3 Max Instruct Preview	1126	±4	4.3K	2.8%	1.1%	31 tps	1.7s	256K	$1.43	$6.61
37	44	Gemini 2.5 Pro	1126	±4	16.2K	1.5%	2.3%	45 tps	2.6s	1M	$1.25	$10.00
38	44	Grok 4.1 Fast Reasoning	1119	±6	5.4K	1.5%	1.5%	58 tps	7.3s	2M	$0.20	$0.50
39	37	Claude Sonnet 4.5	1116	±6	5K	3.1%	1.4%	41 tps	1.3s	200K	$1.80	$9.00
40	56	DeepSeek V3.1 Turbo	1114	±6	4K	2.1%	0.9%	173 tps	1.3s	164K	$2.00	$3.75

Page 1 of 4 (142 models in total)