Leaderboard | Text

Last updated about 1 month ago

Rank	Overall	Name	VIBE Score	Confidence Interval	Votes	Downvote %	Abort %	Speed	Latency	Context	Cost (Input)	Cost (Output)
121	133	GPT-4.1 nano	1020	±4	8.8K	9.7%	0.6%	175 tps	0.5s	1M	$0.10	$0.40
122	113	GLM 4.5 AirX	1020	±10	805	9.0%	3.3%	75 tps	1.2s	131K	$1.10	$4.50
123	86	Amazon Nova 2 Lite	1027	±10	2.6K	7.9%	1.0%	137 tps	0.6s	300K	$0.35	$2.95
124	126	DeepSeek V3	1028	±7	5.9K	5.7%	0.9%	69 tps	1.1s	64K	$0.59	$1.49
125	129	Command A	1029	±5	11K	8.4%	2.2%	42 tps	0.8s	256K	$2.00	$7.33
126	71	DeepSeek V3.1	1029	±10	1.1K	4.5%	0.8%	197 tps	0.4s	164K	$0.55	$1.60
127	95	DeepSeek-R1 Turbo	1032	±10	1.4K	5.5%	2.6%	29 tps	1.8s	64K	$2.85	$4.75
128	119	ERNIE 4.5 300B A47B	1032	±6	6.1K	8.7%	4.7%	23 tps	2.3s	123K	$0.28	$1.10
129	101	DeepSeek V3 (Turbo)	1034	±11	1K	5.9%	1.5%	32 tps	1.5s	64K	$0.40	$1.30
130	157	GPT-5 Nano	1035	±6	3.8K	10.6%	3.2%	113 tps	20.9s	400K	$0.05	$0.40
131	153	OpenAI o1	1037	±15	1.2K	4.8%	4.2%	92 tps	5.5s	200K	$15.00	$60.00
132	101	Gemini 2.5 Flash Lite	1038	±5	12.8K	12.6%	1.3%	210 tps	0.7s	1M	$0.10	$0.40
133	143	Gemini 2.0 Flash	1038	±6	3.7K	8.9%	<0.1%	76 tps	0.5s	1M	$0.14	$0.56
134	124	Kimi K2 0905 Turbo	1041	±4	6.8K	12.4%	0.7%	373 tps	0.5s	262K	$1.70	$6.50
135	165	Pixtral Large	1042	±8	2.5K	5.1%	2.5%	57 tps	1.3s	128K	$1.50	$4.50
136	143	Seed 1.6 250615	1042	±13	1.2K	4.8%	3.1%	46 tps	2.2s	256K	$0.25	$2.00
137	79	Qwen3 Max Thinking Preview	1044	±5	5.1K	7.7%	3.1%	40 tps	2.1s	256K	$1.20	$6.00
138	119	GLM 4.7 FP8	1046	±19	490	3.0%	6.9%	40 tps	1.3s	200K	$0.30	$1.20
139	106	DeepSeek V3.1 Terminus Thinking	1047	±7	2.5K	11.6%	5.9%	27 tps	1.8s	131K	$0.56	$1.68
140	37	Qwen3 Omni 30B A3B Thinking	1047	±11	1.3K	5.9%	3.7%	67 tps	1.2s	66K	$0.97	$1.79
141	81	OpenAI o3-pro	1048	±8	2.2K	3.5%	5.2%	22 tps	70.8s	200K	$20.00	$80.00
142	111	Grok 3 Fast	1051	±17	1.1K	2.6%	1.7%	52 tps	2.4s	131K	$5.00	$25.00
143	139	Seed 2.0 Mini (Medium)	1053	±21	605	4.0%	11.9%	33 tps	1.7s	256K	$0.15	$0.60
144	71	Qwen3.5 397B A17B	1055	±11	1.6K	2.1%	4.3%	57 tps	1.4s	256K	$0.52	$3.00
145	93	DeepSeek V3 0324 Turbo	1055	±5	9.3K	10.3%	6.3%	12 tps	2.4s	164K	$0.73	$1.79
146	113	Gemini 2.5 Flash Lite Thinking	1059	±5	6.6K	9.5%	1.0%	118 tps	4.4s	1M	$0.03	$0.13
147	118	GPT-4.1 mini	1060	±4	11.7K	6.8%	1.1%	67 tps	0.9s	1M	$0.34	$1.60
148	95	DeepSeek V3.2 Exp Thinking	1063	±8	5K	3.4%	7.2%	26 tps	3.0s	131K	$0.28	$0.42
149	95	Kimi K2 Thinking	1064	±12	1.6K	6.8%	4.2%	61 tps	5.9s	262K	$0.24	$1.03
150	95	Qwen3 32B	1070	±18	620	7.5%	3.9%	30 tps	3.1s	41K	$0.12	$0.42
151	71	Gemini 2.5 Flash Lite Preview 0925	1070	±5	6.7K	8.6%	1.2%	209 tps	0.7s	1M	$0.25	$0.35
152	86	Qwen3 235B A22B	1074	±10	2.8K	14.4%	5.3%	71 tps	0.9s	41K	$0.23	$0.63
153	93	Qwen Max	1077	±5	8.8K	9.1%	1.5%	49 tps	1.5s	33K	$1.60	$6.40
154	65	DeepSeek V3.2 Exp Chat	1079	±4	4.3K	8.8%	2.6%	29 tps	1.5s	131K	$0.27	$0.39
155	48	Step 3.5 Flash	1079	±23	645	2.3%	2.2%	109 tps	0.6s	256K	$0.05	$0.15
156	81	Qwen3.5 27B	1082	±26	550	2.7%	3.7%	55 tps	2.6s	256K	$0.30	$2.40
157	86	DeepSeek V3.1 Chat	1084	±6	3.7K	10.1%	2.8%	21 tps	1.6s	131K	$0.38	$1.00
158	62	Qwen3 Omni 30B A3B Instruct	1085	±14	570	6.6%	3.9%	65 tps	1.2s	66K	$0.35	$0.97
159	48	gpt-oss-120b	1086	±4	15.1K	7.5%	0.7%	213 tps	0.5s	131K	$0.11	$0.50
160	62	MiniMax M2	1087	±6	16.5K	5.2%	2.2%	39 tps	2.3s	205K	$0.21	$0.85
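
The rows above are tab-separated, so they can be read programmatically. The following is a minimal sketch, not part of the leaderboard itself, that parses a few rows copied from the table and checks whether two models' score ranges overlap. It assumes the "Confidence Interval" column is a symmetric margin around the VIBE Score; the helper names (`parse_count`, `parse_row`, `intervals_overlap`) are hypothetical, and the overlap check is a rough heuristic rather than a formal significance test.

```python
# Sketch: parse leaderboard rows and compare score ranges.
# Assumption: "±N" is a symmetric margin around the VIBE Score.

def parse_count(text: str) -> int:
    """Convert a count such as '8.8K', '1M', or '805' to an integer."""
    multipliers = {"K": 1_000, "M": 1_000_000}
    suffix = text[-1].upper()
    if suffix in multipliers:
        return int(round(float(text[:-1]) * multipliers[suffix]))
    return int(round(float(text)))

def parse_row(line: str) -> dict:
    """Split one tab-separated row into the fields used below."""
    cells = line.rstrip("\n").split("\t")
    return {
        "rank": int(cells[0]),
        "name": cells[2],
        "score": int(cells[3]),
        "margin": int(cells[4].lstrip("±")),
        "votes": parse_count(cells[5]),
    }

def intervals_overlap(a: dict, b: dict) -> bool:
    """True if [score - margin, score + margin] of a and b intersect."""
    return abs(a["score"] - b["score"]) <= a["margin"] + b["margin"]

# Three rows copied verbatim from the table above.
rows = [
    "121\t133\tGPT-4.1 nano\t1020\t±4\t8.8K\t9.7%\t0.6%\t175 tps\t0.5s\t1M\t$0.10\t$0.40",
    "122\t113\tGLM 4.5 AirX\t1020\t±10\t805\t9.0%\t3.3%\t75 tps\t1.2s\t131K\t$1.10\t$4.50",
    "160\t62\tMiniMax M2\t1087\t±6\t16.5K\t5.2%\t2.2%\t39 tps\t2.3s\t205K\t$0.21\t$0.85",
]
models = [parse_row(r) for r in rows]

if __name__ == "__main__":
    nano, airx, minimax = models
    print(nano["name"], "vs", airx["name"], intervals_overlap(nano, airx))
    print(nano["name"], "vs", minimax["name"], intervals_overlap(nano, minimax))
```

Under that assumption, models whose ranges overlap (such as the two rows scoring 1020) are not clearly separated by score alone, whereas a gap larger than the combined margins suggests a more reliable difference.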

Page 4 of 6 (237 models total)