Leaderboard | Text

Last updated about 1 month ago

Rank	Overall	Name	VIBE Score	Confidence Interval	Votes	Downvote %	Abort %	Speed	Latency	Context	Cost (Input)	Cost (Output)
121	119	GLM 4.7 FP8	1039	±9	515	1.0%	6.9%	40 tps	1.3s	200K	$0.30	$1.20
122	106	DeepSeek V3.1 Terminus Thinking	1035	±11	1.4K	2.8%	5.9%	27 tps	1.8s	131K	$0.56	$1.68
123	113	Mistral Medium	1035	±5	3.6K	1.8%	1.8%	48 tps	0.6s	33K	$1.48	$4.55
124	147	GLM 4.5 Air	1035	±7	3.2K	2.3%	<0.1%	22 tps	1.4s	131K	$0.10	$0.38
125	65	GLM 4.6	1030	±8	2.6K	2.8%	5.4%	39 tps	1.5s	200K	$0.42	$1.66
126	86	Qwen3 235B A22B	1030	±9	3.1K	1.6%	5.3%	71 tps	0.9s	41K	$0.23	$0.63
127	95	DeepSeek V3.2 Exp Thinking	1029	±11	1.4K	0.7%	7.2%	26 tps	3.0s	131K	$0.28	$0.42
128	68	GLM 4.7	1026	±6	4.5K	0.8%	5.8%	40 tps	1.5s	200K	$0.77	$1.73
129	71	GPT-5 Mini	1025	±6	3.2K	2.0%	2.6%	66 tps	14.2s	400K	$0.25	$2.00
130	314	Weather	1025	±19	580	4.1%	<0.1%	36 tps	1.1s	32K	$0	$0
131	126	Qwen3 VL 235B A22B Thinking	1024	±11	1.6K	4.2%	4.3%	47 tps	3.0s	127K	$0.47	$3.31
132	159	Gemini 2.5 Pro Preview 0325	1023	±18	735	2.6%	<0.1%	3 tps	16.6s	1M	$1.25	$10.00
133	143	Gemini 2.0 Flash	1022	±7	2.5K	2.5%	<0.1%	76 tps	0.5s	1M	$0.14	$0.56
134	153	Qwen 2.5 32B Instruct	1019	±8	1.4K	1.8%	2.5%	48 tps	1.0s	131K	$0.21	$0.25
135	113	GLM 4.5	1019	±6	2.5K	1.6%	3.7%	46 tps	1.4s	131K	$0.43	$1.63
136	71	Seed 1.8 251228	1018	±6	4.4K	1.0%	3.7%	41 tps	2.1s	256K	$0.25	$2.00
137	139	GLM 4.6V	1018	±12	1.6K	1.2%	6.4%	21 tps	1.8s	128K	$0.38	$0.90
138	148	Qwen3 30B A3B Thinking 2507	1017	±9	2.2K	1.8%	0.5%	124 tps	1.2s	131K	$0.16	$1.70
139	56	Claude Opus 4.1 (Thinking)	1017	±8	1.5K	1.3%	<0.1%	20 tps	3.9s	200K	$15.00	$75.00
140	133	Kimi K2 0905	1013	±11	2.1K	3.7%	4.0%	30 tps	1.4s	262K	$0.63	$2.39
141	126	DeepSeek V3	1013	±6	8.8K	1.3%	0.9%	69 tps	1.1s	64K	$0.59	$1.49
142	101	DeepSeek V3 (Turbo)	1013	±12	705	1.4%	1.5%	32 tps	1.5s	64K	$0.40	$1.30
143	129	Qwen3 Max Thinking	1012	±6	2.1K	0.2%	13.5%	32 tps	2.3s	256K	$1.20	$6.00
144	129	Command A	1005	±5	8.6K	1.7%	2.2%	42 tps	0.8s	256K	$2.00	$7.33
145	143	Seed 1.6 250615	1005	±20	880	2.2%	3.1%	46 tps	2.2s	256K	$0.25	$2.00
146	213	Claude Haiku 3.5	1005	±12	1.5K	3.0%	0.8%	40 tps	2.8s	200K	$0.80	$4.00
147	133	DeepSeek V3.2 Speciale	1003	±10	1.3K	2.2%	6.0%	43 tps	1.4s	131K	$0.84	$1.52
148	113	Kimi K2 Fast	1003	±4	10K	1.8%	0.8%	365 tps	0.5s	131K	$1.00	$3.00
149	113	Gemini 2.5 Flash Lite Thinking	1003	±8	3.7K	2.4%	1.0%	118 tps	4.4s	1M	$0.03	$0.13
150	133	Qwen3 14B	1002	±6	3.6K	1.6%	1.7%	109 tps	0.8s	41K	$0.04	$0.15
151	148	DeepSeek-R1	1001	±6	5K	1.7%	0.8%	133 tps	0.6s	64K	$0.91	$3.07
152	157	Qwen3 Next 80B A3B Thinking	1000	±7	3.2K	3.0%	0.6%	175 tps	1.3s	256K	$0.21	$2.26
153	133	DeepSeek-R1 0528	998	±4	4.9K	1.5%	1.3%	93 tps	0.5s	64K	$1.60	$3.67
154	292	GPT-5 Nano Minimal	992	±16	515	4.6%	<0.1%	88 tps	0.8s	400K	$0.05	$0.40
155	161	Qwen3 8B	992	±8	3.1K	1.6%	2.4%	61 tps	1.4s	41K	$0.02	$0.07
156	71	MiniMax M2.5 FP8	988	±19	525	1.9%	3.6%	33 tps	1.7s	205K	$0.45	$1.75
157	143	Gemini 2.0 Flash Lite	988	±6	4.1K	2.6%	<0.1%	42 tps	0.5s	1M	$0.08	$0.30
158	84	Claude Sonnet 3.7 (Thinking)	983	±6	4.8K	2.1%	<0.1%	41 tps	2.6s	200K	$3.00	$15.00
159	200	K2 Think	982	±11	895	0.6%	<0.1%	418 tps	2.8s	N/A	$0	$0
160	153	OpenAI o1	981	±5	9.1K	1.7%	4.2%	92 tps	5.5s	200K	$15.00	$60.00

Page 4 of 7 (260 models)