Leaderboard | Text

Last updated about 1 month ago

Rank	Overall	Name	VIBE Score	Confidence Interval	Votes	Downvote %	Abort %	Speed	Latency	Context	Cost (Input)	Cost (Output)
1	19	Kimi K2.5	1291	±11	16.5K	3.4%	6.5%	33 tps	1.7s	262K	$0.34	$2.57
2	31	Qwen3 Next 80B A3B Instruct	1231	±5	8.8K	5.8%	0.6%	84 tps	1.1s	256K	$0.20	$1.42
3	31	MiniMax M2.5 Lightning	1228	±14	1.7K	3.2%	1.5%	51 tps	2.0s	205K	$0.60	$2.40
4	36	Qwen3.5 122B A17B	1216	±15	1.9K	3.1%	1.5%	82 tps	1.4s	256K	$0.40	$3.20
5	36	Qwen3.5 27B	1211	±16	910	4.7%	3.7%	55 tps	2.6s	256K	$0.30	$2.40
6	36	Kimi K2.5 Instant	1210	±8	1.8K	3.2%	2.9%	32 tps	3.0s	262K	$0.50	$3.00
7	49	Kimi K2 Thinking Turbo	1192	±6	20.3K	3.4%	2.0%	75 tps	1.4s	262K	$1.15	$8.00
8	60	DeepSeek V3.2 Thinking	1178	±9	23.3K	4.0%	9.0%	30 tps	2.6s	131K	$0.28	$0.42
9	69	gpt-oss-120b	1165	±5	19.2K	5.0%	0.7%	213 tps	0.5s	131K	$0.11	$0.50
10	97	Grok 3 Beta	1134	±9	2K	0.8%	<0.1%	58 tps	0.8s	131K	$3.00	$15.00
11	77	Mistral Large 3	1131	±8	5.4K	5.8%	2.1%	51 tps	1.0s	256K	$0.50	$1.50
12	90	DeepSeek V3.2 Exp Chat	1107	±4	5.5K	6.1%	2.6%	29 tps	1.5s	131K	$0.27	$0.39
13	90	Step 3.5 Flash	1102	±24	810	3.6%	2.2%	109 tps	0.6s	256K	$0.05	$0.15
14	98	Qwen3 235B A22B	1093	±6	4.5K	8.0%	5.3%	71 tps	0.9s	41K	$0.23	$0.63
15	98	DeepSeek V3.2 Exp Thinking	1089	±7	5.9K	3.5%	7.2%	26 tps	3.0s	131K	$0.28	$0.42
16	112	Kimi K2 Fast	1073	±5	35K	6.4%	0.8%	365 tps	0.5s	131K	$1.00	$3.00
17	112	gpt-oss-20b	1066	±6	7.7K	7.1%	0.5%	216 tps	0.5s	131K	$0.06	$0.26
18	119	DeepSeek V3.1 Terminus Thinking	1061	±9	2.9K	9.4%	5.9%	27 tps	1.8s	131K	$0.56	$1.68
19	151	Llama 3 8B Turbo	1059	±24	600	1.6%	<0.1%	97 tps	0.1s	8K	$0.12	$0.13
20	119	Qwen3 32B Fast	1052	±6	11.4K	5.2%	11.6%	30 tps	3.1s	41K	$0.10	$0.25
21	164	Llama 3 70B Turbo	1037	±6	4.3K	1.0%	<0.1%	31 tps	0.0s	8K	$0.73	$0.83
22	135	QwQ 32B	1035	±4	11.6K	6.4%	5.4%	41 tps	2.1s	16K	$0.43	$0.56
23	174	Qwen 2.5 72B Turbo	1035	±22	670	5.0%	<0.1%	84 tps	0.8s	131K	$0.60	$0.60
24	135	Qwen3 VL 30B A3B Instruct	1034	±15	1K	6.7%	1.8%	80 tps	2.6s	129K	$0.18	$0.67
25	135	DeepSeek V3	1032	±5	17.6K	3.7%	0.9%	69 tps	1.1s	64K	$0.59	$1.49
26	144	Command A	1024	±4	22.4K	4.8%	2.2%	42 tps	0.8s	256K	$2.00	$7.33
27	148	Nemotron 3 Nano (Thinking)	1012	±13	2K	6.7%	2.0%	200 tps	0.5s	256K	$0	$0
28	189	K2 Think	1005	±16	1.4K	5.6%	<0.1%	418 tps	2.8s	N/A	$0	$0
29	148	DeepSeek-R1 Turbo	1003	±9	2.5K	5.6%	2.6%	29 tps	1.8s	64K	$2.85	$4.75
30	148	Qwen3 30B A3B	994	±8	6.3K	6.9%	5.1%	163 tps	1.0s	41K	$0.06	$0.21
31	159	Mistral Small 3.1 24B Instruct	980	±11	2.9K	4.3%	7.5%	15 tps	2.4s	131K	$0.06	$0.18
32	159	DeepSeek-R1 0528	979	±6	5.5K	3.5%	1.3%	93 tps	0.5s	64K	$1.60	$3.67
33	167	Qwen 2.5 32B Instruct	972	±7	4.1K	6.5%	2.5%	48 tps	1.0s	131K	$0.21	$0.25
34	167	Llama 4 Maverick	971	±5	21K	5.0%	1.2%	88 tps	2.4s	1M	$0.23	$0.83
35	167	Pixtral Large	969	±14	3.5K	3.9%	2.5%	57 tps	1.3s	128K	$1.50	$4.50
36	167	Qwen3 VL 30B A3B Thinking	967	±11	1.9K	8.9%	4.5%	84 tps	2.9s	127K	$0.20	$1.47
37	167	Qwen3 14B	962	±8	5.3K	8.4%	1.7%	109 tps	0.8s	41K	$0.04	$0.15
38	167	Llama 3.1 8B Turbo	958	±14	2.2K	2.0%	2.1%	650 tps	0.5s	128K	$0.13	$0.14
39	230	NVIDIA Llama 3.3 Nemotron Super 49B v1	942	±9	3.6K	2.3%	<0.1%	13 tps	N/A	131K	$0.07	$0.20
40	179	DeepSeek-R1	939	±6	6.4K	4.3%	0.8%	133 tps	0.6s	64K	$0.91	$3.07
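
The rows above are tab-separated, and most cells carry units or suffixes (`16.5K`, `±11`, `33 tps`, `1.7s`, `$0.34`, `<0.1%`, `N/A`). The following is a minimal sketch of one way to load them into numeric form. It is not part of the leaderboard itself: the helper names `parse_cell` and `load_rows` are hypothetical, and the sketch assumes that `K` means 1,000, that `M` means 1,000,000, and that a bound such as `<0.1%` can be approximated by the bound itself.

```python
import csv
import io


def parse_cell(text):
    """Convert one leaderboard cell to a float, or None if it is not numeric."""
    cleaned = text.strip()
    if cleaned in ("", "N/A"):
        return None
    # Assumption: "<0.1%" is approximated by its bound, 0.1.
    for token in ("<", "±", "$", "%", " tps"):
        cleaned = cleaned.replace(token, "")
    if cleaned.endswith("s"):  # latency such as "1.7s"
        cleaned = cleaned[:-1]
    multiplier = 1.0
    if cleaned.endswith("K"):  # assumption: K = 1,000
        multiplier, cleaned = 1e3, cleaned[:-1]
    elif cleaned.endswith("M"):  # assumption: M = 1,000,000
        multiplier, cleaned = 1e6, cleaned[:-1]
    try:
        return float(cleaned) * multiplier
    except ValueError:
        return None


def load_rows(tsv_text):
    """Parse tab-separated leaderboard text; Name stays a string."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    rows = []
    for record in reader:
        row = {"Name": record["Name"]}
        for key, value in record.items():
            if key != "Name":
                row[key] = parse_cell(value)
        rows.append(row)
    return rows


HEADER = ["Rank", "Overall", "Name", "VIBE Score", "Confidence Interval",
          "Votes", "Downvote %", "Abort %", "Speed", "Latency", "Context",
          "Cost (Input)", "Cost (Output)"]
# Two rows copied from the table above.
SAMPLE = "\n".join([
    "\t".join(HEADER),
    "1\t19\tKimi K2.5\t1291\t±11\t16.5K\t3.4%\t6.5%\t33 tps\t1.7s\t262K\t$0.34\t$2.57",
    "28\t189\tK2 Think\t1005\t±16\t1.4K\t5.6%\t<0.1%\t418 tps\t2.8s\tN/A\t$0\t$0",
])

rows = load_rows(SAMPLE)
for row in rows:
    print(row["Name"], row["VIBE Score"], row["Votes"], row["Context"])
```

Keeping `Name` out of `parse_cell` matters because several model names end in `K`, `B`, or `s` and would otherwise be mangled by the suffix handling.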
