Leaderboard | Text

Last updated about 1 month ago

Rank	Overall	Name	VIBE Score	Confidence Interval	Votes	Downvote %	Abort %	Speed	Latency	Context	Cost (Input)	Cost (Output)
1	2	GPT-5.4	1364	±7	2.3K	1.1%	2.6%	55 tps	0.8s	1M	$2.50	$15.00
2	1	Claude Opus 4.6 (Thinking)	1342	±5	5.7K	0.9%	2.5%	56 tps	1.6s	200K	$5.00	$25.00
3	22	Grok 4.20 Beta Non-reasoning	1316	±13	630	3.8%	1.1%	151 tps	0.6s	2M	$2.00	$6.00
4	8	GPT-5.1 (High)	1289	±2	23K	1.4%	3.2%	76 tps	6.9s	400K	$1.25	$10.00
5	8	GPT-5.1	1288	±2	19.7K	1.3%	2.3%	71 tps	1.4s	400K	$1.42	$11.33
6	2	Claude Opus 4.6	1274	±4	7.1K	1.4%	2.1%	48 tps	1.7s	200K	$5.00	$25.00
7	6	Gemini 3.1 Pro	1270	±5	12.3K	1.1%	3.5%	35 tps	4.1s	1M	$2.00	$12.00
8	10	GPT-5.2 Instant	1260	±3	27.1K	0.8%	1.7%	52 tps	2.0s	400K	$1.75	$14.00
9	5	Claude Sonnet 4.6 (Thinking)	1256	±5	5.8K	1.4%	4.7%	57 tps	1.1s	200K	$3.00	$15.00
10	10	Gemini 3 Pro	1222	±3	44.5K	1.1%	2.1%	50 tps	3.6s	1M	$2.00	$12.00
11	17	Grok 4.20 Beta Reasoning	1216	±7	2.1K	1.7%	1.1%	77 tps	4.5s	2M	$2.00	$5.50
12	4	Claude Sonnet 4.6	1212	±5	6K	1.1%	1.6%	47 tps	1.2s	200K	$3.00	$15.00
13	29	Qwen3 VL 235B A22B Instruct	1211	±3	10.2K	2.5%	3.1%	75 tps	1.9s	129K	$0.37	$1.81
14	22	GPT-5 Chat	1208	±2	58K	1.4%	1.3%	95 tps	0.9s	400K	$1.25	$10.00
15	26	Grok 4.1 Fast Non-Reasoning	1207	±3	20.1K	1.7%	0.9%	101 tps	0.5s	2M	$0.20	$0.50
16	13	GPT-5.3 Instant	1206	±4	5.5K	1.0%	0.9%	63 tps	0.8s	400K	$1.75	$14.00
17	29	Nova Experimental Chat 12-10	1206	±3	9K	1.2%	2.4%	84 tps	12.9s	98K	$0	$0
18	33	Grok 4.20 Multi Agent Beta	1197	±7	1.7K	1.8%	1.2%	56 tps	8.8s	2M	$2.00	$6.00
19	14	Gemini 3 Flash Preview Thinking	1195	±3	19.7K	1.0%	1.6%	3 tps	6.2s	1M	$0.50	$3.00
20	14	Gemini 3 Pro (Low)	1195	±3	20.3K	1.1%	2.4%	51 tps	3.5s	1M	$2.00	$12.00
21	16	GPT-5.2	1193	±2	16.3K	0.9%	4.1%	18 tps	2.7s	400K	$1.75	$14.00
22	37	Qwen3 Omni 30B A3B Thinking	1188	±5	5.3K	1.2%	3.7%	67 tps	1.2s	66K	$0.97	$1.79
23	32	Gemini 2.5 Pro High	1175	±1	27.6K	2.0%	1.5%	48 tps	2.3s	1M	$1.25	$10.00
24	17	Gemini 3 Flash Preview	1173	±3	12.8K	0.7%	1.3%	138 tps	1.4s	1M	$0.50	$3.00
25	17	GPT-5.2 (High)	1168	±2	31.4K	1.1%	6.7%	18 tps	16.3s	400K	$1.75	$14.00
26	33	Qwen3 30B A3B Instruct 2507	1167	±2	24.4K	1.7%	1.2%	55 tps	1.3s	131K	$0.13	$0.72
27	44	Grok 4.1 Fast Reasoning	1160	±2	23.1K	1.8%	1.5%	58 tps	7.3s	2M	$0.20	$0.50
28	62	Qwen3 Omni 30B A3B Instruct	1157	±6	2.3K	1.9%	3.9%	65 tps	1.2s	66K	$0.35	$0.97
29	26	GPT-5 (High)	1156	±3	12.3K	2.3%	4.5%	81 tps	35.9s	400K	$1.25	$10.00
30	52	Grok 4 Fast Non-Reasoning	1156	±2	16.7K	2.2%	1.5%	93 tps	0.6s	2M	$0.27	$0.67
31	44	Gemini 2.5 Pro	1153	±2	38.9K	1.8%	2.3%	45 tps	2.6s	1M	$1.25	$10.00
32	44	DeepSeek V3.1 Terminus Chat	1152	±2	14.5K	1.8%	3.4%	27 tps	1.5s	131K	$0.86	$1.80
33	40	DeepSeek V3.2	1152	±3	16.5K	1.1%	1.4%	83 tps	5.1s	131K	$0.43	$1.09
34	42	Qwen3 Max Instruct Preview	1151	±2	26.4K	2.0%	1.1%	31 tps	1.7s	256K	$1.43	$6.61
35	22	GLM 5	1149	±4	6.4K	1.2%	3.4%	36 tps	2.7s	200K	$0.72	$2.55
36	10	Claude Sonnet 4.5 (Thinking)	1148	±2	27.2K	2.5%	1.9%	44 tps	1.1s	200K	$3.00	$15.00
37	7	Claude Opus 4.5 (Thinking)	1147	±4	21.9K	1.6%	1.8%	49 tps	1.4s	200K	$5.00	$25.00
38	40	Qwen3 235B A22B Instruct 2507	1147	±2	24.5K	1.4%	6.8%	13 tps	1.9s	262K	$0.13	$0.52
39	42	GPT-5.2 (Extra High)	1147	±2	15.6K	1.4%	13.2%	17 tps	20.5s	400K	$1.75	$14.00
40	29	MiniMax M2.7	1142	±13	700	1.4%	3.0%	34 tps	2.5s	205K	$0.30	$1.20

Page 1 of 6 (203 models in total)