Leaderboard | Text

Last updated about 1 month ago

Rank	Overall	Name	VIBE Score	Confidence Interval	Votes	Downvote %	Abort %	Speed	Latency	Context	Cost (Input)	Cost (Output)
1	291	Phi 4 Mini Reasoning	258	±22	935	7.0%	9.7%	30 tps	0.9s	128K	$0.07	$0.30
2	287	Phi 4 Reasoning	486	±32	530	2.8%	21.0%	29 tps	1.0s	33K	$0.06	$0.25
3	260	Hermes 4 405B Reasoning FP8	681	±31	725	5.2%	3.6%	32 tps	0.8s	131K	$1.00	$3.00
4	214	C4AI Aya Expanse 32B	709	±22	925	1.6%	1.5%	43 tps	0.5s	128K	$0.50	$1.50
5	235	Gemma 3 4B	724	±24	660	3.6%	1.3%	138 tps	0.7s	131K	$0.02	$0.04
6	225	Command R	730	±23	605	2.4%	5.8%	54 tps	0.6s	128K	$0.30	$0.99
7	246	DeepSeek-R1 Distill Llama 70B	755	±19	960	2.5%	3.6%	27 tps	1.6s	32K	$0.73	$0.95
8	225	Command R 7B	787	±26	660	2.9%	1.1%	76 tps	0.4s	128K	$0.04	$0.15
9	222	Sky T1 32B Preview	821	±18	625	1.6%	7.8%	73 tps	0.6s	16K	$0.12	$0.18
10	177	Mistral Small 3.1 24B Instruct	839	±22	695	3.5%	7.5%	15 tps	2.4s	131K	$0.06	$0.18
11	186	GLM 4.6V Flash	858	±23	750	2.6%	3.7%	64 tps	2.1s	128K	$0.04	$0.40
12	201	Gemma 3 27B IT	879	±21	560	1.8%	2.0%	60 tps	0.8s	128K	$0.17	$0.29
13	161	Llama 4 Maverick	900	±11	5.1K	1.9%	1.2%	88 tps	2.4s	1M	$0.23	$0.83
14	148	DeepSeek-R1	904	±16	1.7K	2.8%	0.8%	133 tps	0.6s	64K	$0.91	$3.07
15	161	Mistral Small 3.1	924	±36	615	2.4%	7.4%	13 tps	2.6s	32K	$0.17	$0.28
16	106	DeepSeek V3.1 Terminus Thinking	927	±17	840	4.5%	5.9%	27 tps	1.8s	131K	$0.56	$1.68
17	165	Pixtral Large	941	±18	940	3.6%	2.5%	57 tps	1.3s	128K	$1.50	$4.50
18	129	Command A	959	±8	6.2K	1.2%	2.2%	42 tps	0.8s	256K	$2.00	$7.33
19	65	Mistral Large 3	974	±20	1.2K	5.5%	2.1%	51 tps	1.0s	256K	$0.50	$1.50
20	126	DeepSeek V3	975	±10	5.5K	0.5%	0.9%	69 tps	1.1s	64K	$0.59	$1.49
21	126	Qwen3 30B A3B	986	±16	1.9K	3.4%	5.1%	163 tps	1.0s	41K	$0.06	$0.21
22	86	Qwen3 235B A22B	998	±18	1.4K	3.2%	5.3%	71 tps	0.9s	41K	$0.23	$0.63
23	153	Qwen 2.5 32B Instruct	1004	±14	1.2K	1.7%	2.5%	48 tps	1.0s	131K	$0.21	$0.25
24	121	QwQ 32B	1015	±7	4.6K	1.4%	5.4%	41 tps	2.1s	16K	$0.43	$0.56
25	133	DeepSeek-R1 0528	1038	±13	1.7K	2.0%	1.3%	93 tps	0.5s	64K	$1.60	$3.67
26	95	DeepSeek V3.2 Exp Thinking	1046	±18	735	4.5%	7.2%	26 tps	3.0s	131K	$0.28	$0.42
27	37	Kimi K2.5 Instant	1046	±16	620	2.4%	2.9%	32 tps	3.0s	262K	$0.50	$3.00
28	133	Qwen3 14B	1053	±13	1.7K	2.9%	1.7%	109 tps	0.8s	41K	$0.04	$0.15
29	52	Qwen3.5 122B A17B	1053	±25	580	3.3%	1.5%	82 tps	1.4s	256K	$0.40	$3.20
30	65	DeepSeek V3.2 Exp Chat	1054	±14	1.2K	3.7%	2.6%	29 tps	1.5s	131K	$0.27	$0.39
31	86	Nemotron 3 Nano (Thinking)	1059	±18	825	1.8%	2.0%	200 tps	0.5s	256K	$0	$0
32	121	Qwen3 32B Fast	1061	±9	3K	2.4%	11.6%	30 tps	3.1s	41K	$0.10	$0.25
33	113	Kimi K2 Fast	1069	±7	8.5K	1.0%	0.8%	365 tps	0.5s	131K	$1.00	$3.00
34	95	DeepSeek-R1 Turbo	1073	±16	780	3.7%	2.6%	29 tps	1.8s	64K	$2.85	$4.75
35	101	gpt-oss-20b	1085	±12	2.1K	2.6%	0.5%	216 tps	0.5s	131K	$0.06	$0.26
36	79	MiniMax M2.5 Lightning	1119	±16	650	0.8%	1.5%	51 tps	2.0s	205K	$0.60	$2.40
37	56	DeepSeek V3.2 Thinking	1158	±19	2.4K	2.3%	9.0%	30 tps	2.6s	131K	$0.28	$0.42
38	48	Step 3.5 Flash	1163	±23	640	0.8%	2.2%	109 tps	0.6s	256K	$0.05	$0.15
39	44	Kimi K2 Thinking Turbo	1171	±11	1.8K	4.2%	2.0%	75 tps	1.4s	262K	$1.15	$8.00
40	33	Kimi K2.5	1175	±12	3.1K	1.6%	6.5%	33 tps	1.7s	262K	$0.34	$2.57

Showing 40 of 42 models (page 1 of 2).