Leaderboard | Text

Last updated about 1 month ago

Rank	Overall Rank	Name	VIBE Score	Confidence Interval	Votes	Downvote %	Abort %	Speed	Latency	Context	Cost (Input)	Cost (Output)
1	33	Qwen3 Next 80B A3B Instruct	1270	±10	2.3K	2.8%	0.6%	84 tps	1.1s	256K	$0.20	$1.42
2	48	gpt-oss-120b	1269	±7	4.6K	1.4%	0.7%	213 tps	0.5s	131K	$0.11	$0.50
3	33	Kimi K2.5	1175	±12	3.1K	1.6%	6.5%	33 tps	1.7s	262K	$0.34	$2.57
4	44	Kimi K2 Thinking Turbo	1171	±11	1.8K	4.2%	2.0%	75 tps	1.4s	262K	$1.15	$8.00
5	48	Step 3.5 Flash	1163	±23	640	0.8%	2.2%	109 tps	0.6s	256K	$0.05	$0.15
6	56	DeepSeek V3.2 Thinking	1158	±19	2.4K	2.3%	9.0%	30 tps	2.6s	131K	$0.28	$0.42
7	79	MiniMax M2.5 Lightning	1119	±16	650	0.8%	1.5%	51 tps	2.0s	205K	$0.60	$2.40
8	101	gpt-oss-20b	1085	±12	2.1K	2.6%	0.5%	216 tps	0.5s	131K	$0.06	$0.26
9	95	DeepSeek-R1 Turbo	1073	±16	780	3.7%	2.6%	29 tps	1.8s	64K	$2.85	$4.75
10	113	Kimi K2 Fast	1069	±7	8.5K	1.0%	0.8%	365 tps	0.5s	131K	$1.00	$3.00
11	121	Qwen3 32B Fast	1061	±9	3K	2.4%	11.6%	30 tps	3.1s	41K	$0.10	$0.25
12	86	Nemotron 3 Nano (Thinking)	1059	±18	825	1.8%	2.0%	200 tps	0.5s	256K	$0	$0
13	65	DeepSeek V3.2 Exp Chat	1054	±14	1.2K	3.7%	2.6%	29 tps	1.5s	131K	$0.27	$0.39
14	52	Qwen3.5 122B A17B	1053	±25	580	3.3%	1.5%	82 tps	1.4s	256K	$0.40	$3.20
15	133	Qwen3 14B	1053	±13	1.7K	2.9%	1.7%	109 tps	0.8s	41K	$0.04	$0.15
16	37	Kimi K2.5 Instant	1046	±16	620	2.4%	2.9%	32 tps	3.0s	262K	$0.50	$3.00
17	95	DeepSeek V3.2 Exp Thinking	1046	±18	735	4.5%	7.2%	26 tps	3.0s	131K	$0.28	$0.42
18	133	DeepSeek-R1 0528	1038	±13	1.7K	2.0%	1.3%	93 tps	0.5s	64K	$1.60	$3.67
19	121	QwQ 32B	1015	±7	4.6K	1.4%	5.4%	41 tps	2.1s	16K	$0.43	$0.56
20	153	Qwen 2.5 32B Instruct	1004	±14	1.2K	1.7%	2.5%	48 tps	1.0s	131K	$0.21	$0.25
21	86	Qwen3 235B A22B	998	±18	1.4K	3.2%	5.3%	71 tps	0.9s	41K	$0.23	$0.63
22	126	Qwen3 30B A3B	986	±16	1.9K	3.4%	5.1%	163 tps	1.0s	41K	$0.06	$0.21
23	126	DeepSeek V3	975	±10	5.5K	0.5%	0.9%	69 tps	1.1s	64K	$0.59	$1.49
24	65	Mistral Large 3	974	±20	1.2K	5.5%	2.1%	51 tps	1.0s	256K	$0.50	$1.50
25	129	Command A	959	±8	6.2K	1.2%	2.2%	42 tps	0.8s	256K	$2.00	$7.33
26	165	Pixtral Large	941	±18	940	3.6%	2.5%	57 tps	1.3s	128K	$1.50	$4.50
27	106	DeepSeek V3.1 Terminus Thinking	927	±17	840	4.5%	5.9%	27 tps	1.8s	131K	$0.56	$1.68
28	161	Mistral Small 3.1	924	±36	615	2.4%	7.4%	13 tps	2.6s	32K	$0.17	$0.28
29	148	DeepSeek-R1	904	±16	1.7K	2.8%	0.8%	133 tps	0.6s	64K	$0.91	$3.07
30	161	Llama 4 Maverick	900	±11	5.1K	1.9%	1.2%	88 tps	2.4s	1M	$0.23	$0.83
31	219	NVIDIA Llama 3.3 Nemotron Super 49B v1	883	±17	915	1.1%	<0.1%	13 tps	N/A	131K	$0.07	$0.20
32	201	Gemma 3 27B IT	879	±21	560	1.8%	2.0%	60 tps	0.8s	128K	$0.17	$0.29
33	186	GLM 4.6V Flash	858	±23	750	2.6%	3.7%	64 tps	2.1s	128K	$0.04	$0.40
34	177	Mistral Small 3.1 24B Instruct	839	±22	695	3.5%	7.5%	15 tps	2.4s	131K	$0.06	$0.18
35	222	Sky T1 32B Preview	821	±18	625	1.6%	7.8%	73 tps	0.6s	16K	$0.12	$0.18
36	292	Arcee AI Spotlight	788	±15	1.1K	1.3%	<0.1%	121 tps	0.4s	131K	$0.18	$0.18
37	225	Command R 7B	787	±26	660	2.9%	1.1%	76 tps	0.4s	128K	$0.04	$0.15
38	200	NVIDIA Llama 3.1 Nemotron 70B	783	±17	1.2K	2.4%	<0.1%	9 tps	0.1s	128K	$0.33	$0.39
39	200	K2 Think	763	±24	605	0.8%	<0.1%	418 tps	2.8s	N/A	$0	$0
40	246	DeepSeek-R1 Distill Llama 70B	755	±19	960	2.5%	3.6%	27 tps	1.6s	32K	$0.73	$0.95

Showing 40 of 48 models (page 1 of 2).