Leaderboard | Text

Last updated about 1 month ago

Rank	Overall	Name	VIBE Score	Confidence Interval	Votes	Downvote %	Abort %	Speed	Latency	Context	Cost (Input)	Cost (Output)
1	1	Claude Opus 4.6 (Thinking)	1398	±4	16.9K	1.4%	2.5%	56 tps	1.6s	200K	$5.00	$25.00
2	2	Claude Opus 4.6	1381	±3	21.8K	1.1%	2.1%	48 tps	1.7s	200K	$5.00	$25.00
3	2	GPT-5.4	1343	±5	5.8K	1.3%	2.6%	55 tps	0.8s	1M	$2.50	$15.00
4	4	Claude Sonnet 4.6	1314	±4	16.6K	1.2%	1.6%	47 tps	1.2s	200K	$3.00	$15.00
5	5	Claude Sonnet 4.6 (Thinking)	1305	±3	17.7K	2.7%	4.7%	57 tps	1.1s	200K	$3.00	$15.00
6	8	GPT-5.1	1264	±2	27.6K	2.3%	2.3%	71 tps	1.4s	400K	$1.42	$11.33
7	7	Claude Opus 4.5 (Thinking)	1260	±2	61K	1.8%	1.8%	49 tps	1.4s	200K	$5.00	$25.00
8	8	GPT-5.1 (High)	1250	±3	37.8K	2.4%	3.2%	76 tps	6.9s	400K	$1.25	$10.00
9	6	Gemini 3.1 Pro	1245	±3	26K	2.0%	3.5%	35 tps	4.1s	1M	$2.00	$12.00
10	10	GPT-5.2 Instant	1232	±2	39.3K	1.6%	1.7%	52 tps	2.0s	400K	$1.75	$14.00
11	10	Claude Sonnet 4.5 (Thinking)	1228	±1	66.2K	3.4%	1.9%	44 tps	1.1s	200K	$3.00	$15.00
12	10	Gemini 3 Pro	1207	±1	78K	2.2%	2.1%	50 tps	3.6s	1M	$2.00	$12.00
13	17	GPT-5.4 mini	1203	±10	885	2.7%	0.8%	148 tps	0.5s	400K	$0.75	$4.50
14	13	GPT-5.3 Instant	1199	±6	9.3K	1.7%	0.9%	63 tps	0.8s	400K	$1.75	$14.00
15	22	GPT-5 Chat	1196	±1	75.1K	3.4%	1.3%	95 tps	0.9s	400K	$1.25	$10.00
16	22	Grok 4.20 Beta Non-reasoning	1192	±11	1.3K	3.1%	1.1%	151 tps	0.6s	2M	$2.00	$6.00
17	14	Gemini 3 Flash Preview Thinking	1190	±2	47K	2.3%	1.6%	3 tps	6.2s	1M	$0.50	$3.00
18	29	Qwen3 VL 235B A22B Instruct	1188	±3	13.5K	5.2%	3.1%	75 tps	1.9s	129K	$0.37	$1.81
19	29	Nova Experimental Chat 12-10	1188	±3	9.8K	1.9%	2.4%	84 tps	12.9s	98K	$0	$0
20	37	Qwen3 Omni 30B A3B Thinking	1186	±3	7.5K	2.1%	3.7%	67 tps	1.2s	66K	$0.97	$1.79
21	33	Grok 4.20 Multi Agent Beta	1183	±9	2.6K	2.0%	1.2%	56 tps	8.8s	2M	$2.00	$6.00
22	22	GLM 5	1182	±4	17.3K	2.1%	3.4%	36 tps	2.7s	200K	$0.72	$2.55
23	14	Gemini 3 Pro (Low)	1180	±3	28.9K	2.2%	2.4%	51 tps	3.5s	1M	$2.00	$12.00
24	17	GPT-5.2 (High)	1180	±2	54.6K	1.9%	6.7%	18 tps	16.3s	400K	$1.75	$14.00
25	26	Grok 4.1 Fast Non-Reasoning	1177	±2	25.7K	3.0%	0.9%	101 tps	0.5s	2M	$0.20	$0.50
26	106	GPT-5.4 nano	1177	±10	650	2.3%	0.7%	149 tps	0.5s	400K	$0.20	$1.25
27	26	GPT-5 (High)	1177	±2	22.1K	3.1%	4.5%	81 tps	35.9s	400K	$1.25	$10.00
28	16	GPT-5.2	1176	±2	22.6K	1.8%	4.1%	18 tps	2.7s	400K	$1.75	$14.00
29	17	Grok 4.20 Beta Reasoning	1175	±7	3.3K	1.8%	1.1%	77 tps	4.5s	2M	$2.00	$5.50
30	17	Claude Opus 4.5	1173	±2	22.5K	2.2%	1.5%	45 tps	1.5s	200K	$5.00	$25.00
31	26	Claude Haiku 4.5 (Extended Thinking)	1173	±2	24.3K	3.1%	1.4%	115 tps	0.7s	200K	$1.00	$5.00
32	62	Qwen3 Omni 30B A3B Instruct	1168	±5	3K	2.3%	3.9%	65 tps	1.2s	66K	$0.35	$0.97
33	29	MiniMax M2.7	1167	±8	1.1K	1.8%	3.0%	34 tps	2.5s	205K	$0.30	$1.20
34	56	MiniMax M2.1 Lightning	1165	±5	4.9K	1.2%	1.7%	52 tps	2.1s	205K	$0.30	$2.40
35	32	Gemini 2.5 Pro High	1159	±2	42.7K	4.5%	1.5%	48 tps	2.3s	1M	$1.25	$10.00
36	17	Gemini 3 Flash Preview	1159	±3	17.8K	2.1%	1.3%	138 tps	1.4s	1M	$0.50	$3.00
37	33	Qwen3 30B A3B Instruct 2507	1158	±2	31.6K	4.1%	1.2%	55 tps	1.3s	131K	$0.13	$0.72
38	44	DeepSeek V3.1 Terminus Chat	1150	±3	17.8K	4.2%	3.4%	27 tps	1.5s	131K	$0.86	$1.80
39	42	Qwen3 Max Instruct Preview	1148	±2	36.6K	3.5%	1.1%	31 tps	1.7s	256K	$1.43	$6.61
40	40	Qwen3 235B A22B Instruct 2507	1147	±2	32.2K	4.7%	6.8%	13 tps	1.9s	262K	$0.13	$0.52

Page 1 of 6 (208 models in total)