Leaderboard | Text

Last updated about 1 month ago

Rank	Overall	Name	VIBE Score	Confidence Interval	Votes	Downvote %	Abort %	Speed	Latency	Context	Cost (Input)	Cost (Output)
41	62	Qwen3 Omni 30B A3B Instruct	1157	±6	2.3K	1.9%	3.9%	65 tps	1.2s	66K	$0.35	$0.97
42	26	GPT-5 (High)	1156	±3	12.3K	2.3%	4.5%	81 tps	35.9s	400K	$1.25	$10.00
43	52	Grok 4 Fast Non-Reasoning	1156	±2	16.7K	2.2%	1.5%	93 tps	0.6s	2M	$0.27	$0.67
44	44	Gemini 2.5 Pro	1153	±2	38.9K	1.8%	2.3%	45 tps	2.6s	1M	$1.25	$10.00
45	44	DeepSeek V3.1 Terminus Chat	1152	±2	14.5K	1.8%	3.4%	27 tps	1.5s	131K	$0.86	$1.80
46	40	DeepSeek V3.2	1152	±3	16.5K	1.1%	1.4%	83 tps	5.1s	131K	$0.43	$1.09
47	84	Nova Experimental Chat 10-09	1151	±4	5.3K	4.0%	<0.1%	59 tps	6.1s	98K	$0	$0
48	43	Gemini 2.5 Flash Thinking Preview 0925	1151	±3	13.9K	2.1%	<0.1%	111 tps	4.7s	1M	$0.30	$2.50
49	42	Qwen3 Max Instruct Preview	1151	±2	26.4K	2.0%	1.1%	31 tps	1.7s	256K	$1.43	$6.61
50	48	Step 3.5 Flash	1150	±6	2.9K	1.7%	2.2%	109 tps	0.6s	256K	$0.05	$0.15
51	22	GLM 5	1149	±4	6.4K	1.2%	3.4%	36 tps	2.7s	200K	$0.72	$2.55
52	10	Claude Sonnet 4.5 (Thinking)	1148	±2	27.2K	2.5%	1.9%	44 tps	1.1s	200K	$3.00	$15.00
53	182	MAI-DS-R1 FP8	1148	±10	605	2.4%	<0.1%	79 tps	2.8s	164K	$0.25	$1.00
54	33	Kimi K2.5	1147	±3	16.1K	1.2%	6.5%	33 tps	1.7s	262K	$0.34	$2.57
55	7	Claude Opus 4.5 (Thinking)	1147	±4	21.9K	1.6%	1.8%	49 tps	1.4s	200K	$5.00	$25.00
56	40	Qwen3 235B A22B Instruct 2507	1147	±2	24.5K	1.4%	6.8%	13 tps	1.9s	262K	$0.13	$0.52
57	42	GPT-5.2 (Extra High)	1147	±2	15.6K	1.4%	13.2%	17 tps	20.5s	400K	$1.75	$14.00
58	48	Polaris Alpha	1146	±5	1.6K	1.9%	<0.1%	48 tps	1.1s	256K	$0	$0
59	44	Kimi K2 Thinking Turbo	1145	±2	10.9K	1.6%	2.0%	75 tps	1.4s	262K	$1.15	$8.00
60	56	DeepSeek V3.2 Thinking	1144	±4	16.9K	1.3%	9.0%	30 tps	2.6s	131K	$0.28	$0.42
61	29	MiniMax M2.7	1142	±13	700	1.4%	3.0%	34 tps	2.5s	205K	$0.30	$1.20
62	48	Grok 4 Fast Reasoning	1142	±3	14.5K	2.0%	2.1%	102 tps	3.1s	2M	$0.30	$0.75
63	17	GPT-5.4 mini	1141	±14	545	1.8%	0.8%	148 tps	0.5s	400K	$0.75	$4.50
64	56	DeepSeek V3.1 Turbo	1134	±3	9.5K	1.2%	0.9%	173 tps	1.3s	164K	$2.00	$3.75
65	65	Mistral Large 3	1133	±4	10.8K	2.6%	2.1%	51 tps	1.0s	256K	$0.50	$1.50
66	80	GPT-5 (Minimal)	1132	±3	12.9K	2.2%	<0.1%	67 tps	1.4s	400K	$1.25	$10.00
67	84	Claude Sonnet 3.7 (Thinking)	1130	±4	2.3K	2.7%	<0.1%	41 tps	2.6s	200K	$3.00	$15.00
68	56	MiniMax M2.1 Lightning	1129	±5	3.6K	1.4%	1.7%	52 tps	2.1s	205K	$0.30	$2.40
69	86	Qwen3 235B A22B	1129	±3	7.8K	2.1%	5.3%	71 tps	0.9s	41K	$0.23	$0.63
70	71	DeepSeek V3.1	1125	±4	4.4K	1.1%	0.8%	197 tps	0.4s	164K	$0.55	$1.60
71	65	DeepSeek V3.2 Exp Chat	1125	±3	11.5K	1.9%	2.6%	29 tps	1.5s	131K	$0.27	$0.39
72	60	MiniMax M2.1	1124	±3	24.4K	1.0%	2.1%	66 tps	2.6s	205K	$0.30	$1.20
73	86	Nemotron 3 Nano (Thinking)	1123	±3	5.9K	1.5%	2.0%	200 tps	0.5s	256K	$0	$0
74	52	Qwen3.5 122B A17B	1123	±5	2.6K	1.3%	1.5%	82 tps	1.4s	256K	$0.40	$3.20
75	100	Qwen Plus 0728 (Thinking)	1123	±5	3K	2.1%	<0.1%	56 tps	1.1s	1M	$0.40	$4.00
76	26	Claude Haiku 4.5 (Extended Thinking)	1121	±3	14.1K	1.8%	1.4%	115 tps	0.7s	200K	$1.00	$5.00
77	60	Gemini 2.5 Flash Preview 0925	1118	±3	14.4K	2.2%	1.2%	5 tps	0.9s	1M	$0.13	$0.97
78	52	GPT-5	1117	±2	31.1K	1.7%	3.1%	78 tps	23.1s	400K	$1.25	$9.67
79	81	OpenAI o3-pro	1116	±5	3.2K	2.8%	5.2%	22 tps	70.8s	200K	$20.00	$80.00
80	68	Grok 4	1110	±1	98.8K	0.9%	3.9%	29 tps	11.1s	256K	$3.00	$15.00

Page 2 of 11 (410 models in total)