Leaderboard | Text

Last updated about 1 month ago

Rank	Overall	Name	VIBE Score	Confidence Interval	Votes	Downvote %	Abort %	Speed	Latency	Context	Cost (Input)	Cost (Output)
81	129	Command A	1005	±5	8.6K	1.7%	2.2%	42 tps	0.8s	256K	$2.00	$7.33
82	129	Qwen3 Max Thinking	1012	±6	2.1K	0.2%	13.5%	32 tps	2.3s	256K	$1.20	$6.00
83	101	DeepSeek V3 (Turbo)	1013	±12	705	1.4%	1.5%	32 tps	1.5s	64K	$0.40	$1.30
84	126	DeepSeek V3	1013	±6	8.8K	1.3%	0.9%	69 tps	1.1s	64K	$0.59	$1.49
85	133	Kimi K2 0905	1013	±11	2.1K	3.7%	4.0%	30 tps	1.4s	262K	$0.63	$2.39
86	148	Qwen3 30B A3B Thinking 2507	1017	±9	2.2K	1.8%	0.5%	124 tps	1.2s	131K	$0.16	$1.70
87	139	GLM 4.6V	1018	±12	1.6K	1.2%	6.4%	21 tps	1.8s	128K	$0.38	$0.90
88	71	Seed 1.8 251228	1018	±6	4.4K	1.0%	3.7%	41 tps	2.1s	256K	$0.25	$2.00
89	113	GLM 4.5	1019	±6	2.5K	1.6%	3.7%	46 tps	1.4s	131K	$0.43	$1.63
90	153	Qwen 2.5 32B Instruct	1019	±8	1.4K	1.8%	2.5%	48 tps	1.0s	131K	$0.21	$0.25
91	143	Gemini 2.0 Flash	1022	±7	2.5K	2.5%	<0.1%	76 tps	0.5s	1M	$0.14	$0.56
92	126	Qwen3 VL 235B A22B Thinking	1024	±11	1.6K	4.2%	4.3%	47 tps	3.0s	127K	$0.47	$3.31
93	71	GPT-5 Mini	1025	±6	3.2K	2.0%	2.6%	66 tps	14.2s	400K	$0.25	$2.00
94	68	GLM 4.7	1026	±6	4.5K	0.8%	5.8%	40 tps	1.5s	200K	$0.77	$1.73
95	95	DeepSeek V3.2 Exp Thinking	1029	±11	1.4K	0.7%	7.2%	26 tps	3.0s	131K	$0.28	$0.42
96	86	Qwen3 235B A22B	1030	±9	3.1K	1.6%	5.3%	71 tps	0.9s	41K	$0.23	$0.63
97	65	GLM 4.6	1030	±8	2.6K	2.8%	5.4%	39 tps	1.5s	200K	$0.42	$1.66
98	113	Mistral Medium	1035	±5	3.6K	1.8%	1.8%	48 tps	0.6s	33K	$1.48	$4.55
99	106	DeepSeek V3.1 Terminus Thinking	1035	±11	1.4K	2.8%	5.9%	27 tps	1.8s	131K	$0.56	$1.68
100	119	GLM 4.7 FP8	1039	±9	515	1.0%	6.9%	40 tps	1.3s	200K	$0.30	$1.20
101	71	Qwen3.5 397B A17B	1040	±10	1.4K	1.4%	4.3%	57 tps	1.4s	256K	$0.52	$3.00
102	71	Gemini 2.5 Flash Thinking	1042	±5	6.5K	1.5%	2.2%	88 tps	6.4s	1M	$0.30	$2.50
103	48	Claude Sonnet 4 (Thinking)	1044	±5	8.4K	2.3%	1.5%	52 tps	1.5s	200K	$3.00	$13.67
104	133	GPT-4.1 nano	1046	±8	5.1K	2.0%	0.6%	175 tps	0.5s	1M	$0.10	$0.40
105	119	ERNIE 4.5 300B A47B	1046	±6	5.3K	1.3%	4.7%	23 tps	2.3s	123K	$0.28	$1.10
106	62	MiniMax M2	1046	±6	3.8K	1.9%	2.2%	39 tps	2.3s	205K	$0.21	$0.85
107	65	DeepSeek V3.2 Exp Chat	1047	±9	2.2K	3.1%	2.6%	29 tps	1.5s	131K	$0.27	$0.39
108	126	Qwen3 30B A3B	1051	±7	3.9K	1.3%	5.1%	163 tps	1.0s	41K	$0.06	$0.21
109	44	DeepSeek V3.1 Terminus Chat	1053	±6	2.6K	2.6%	3.4%	27 tps	1.5s	131K	$0.86	$1.80
110	71	DeepSeek V3.1	1053	±13	1.8K	1.6%	0.8%	197 tps	0.4s	164K	$0.55	$1.60
111	95	Kimi K2 Thinking	1054	±9	1.9K	3.8%	4.2%	61 tps	5.9s	262K	$0.24	$1.03
112	106	Grok 3	1054	±6	7.1K	1.7%	1.5%	53 tps	0.6s	1M	$3.67	$18.33
113	81	OpenAI o3-pro	1061	±14	1.3K	2.7%	5.2%	22 tps	70.8s	200K	$20.00	$80.00
114	118	GPT-4.1 mini	1062	±5	5.5K	1.8%	1.1%	67 tps	0.9s	1M	$0.34	$1.60
115	65	Mistral Large 3	1064	±7	1.8K	2.2%	2.1%	51 tps	1.0s	256K	$0.50	$1.50
116	44	Kimi K2 Thinking Turbo	1065	±6	3K	1.9%	2.0%	75 tps	1.4s	262K	$1.15	$8.00
117	124	Qwen3 235B A22B Thinking 2507	1065	±7	1.8K	1.9%	2.5%	53 tps	1.6s	131K	$0.59	$5.70
118	71	Gemini 2.5 Flash Lite Preview 0925	1066	±6	3.3K	2.8%	1.2%	209 tps	0.7s	1M	$0.25	$0.35
119	79	Qwen3 Max Thinking Preview	1067	±6	3.1K	1.4%	3.1%	40 tps	2.1s	256K	$1.20	$6.00
120	56	MiniMax M2.1 Lightning	1067	±12	855	0.6%	1.7%	52 tps	2.1s	205K	$0.30	$2.40

Page 3 of 5 (193 models in total)