Leaderboard | Text

Last updated about 1 month ago

Rank	Overall	Name	VIBE Score	Confidence Interval	Votes	Downvote %	Abort %	Speed	Latency	Context	Cost (Input)	Cost (Output)
121	101	Gemini 2.5 Flash Lite	1113	±6	4.4K	1.8%	1.3%	210 tps	0.7s	1M	$0.10	$0.40
122	71	Gemini 2.5 Flash Lite Preview 0925	1116	±10	2K	2.4%	1.2%	209 tps	0.7s	1M	$0.25	$0.35
123	79	MiniMax M2.5 Lightning	1119	±16	650	0.8%	1.5%	51 tps	2.0s	205K	$0.60	$2.40
124	111	LongCat Flash Chat	1121	±17	725	4.6%	0.8%	85 tps	0.9s	131K	$0.14	$0.68
125	40	DeepSeek V3.2	1125	±10	2.1K	1.6%	1.4%	83 tps	5.1s	131K	$0.43	$1.09
126	26	GPT-5 (High)	1131	±9	2.7K	3.2%	4.5%	81 tps	35.9s	400K	$1.25	$10.00
127	133	Kimi K2 0905	1145	±10	1.5K	2.0%	4.0%	30 tps	1.4s	262K	$0.63	$2.39
128	106	Claude Sonnet 3.5 v2	1147	±14	1K	2.0%	<0.1%	46 tps	1.4s	200K	$3.00	$15.00
129	37	Claude Sonnet 4.5	1148	±9	3.3K	2.7%	1.4%	41 tps	1.3s	200K	$1.80	$9.00
130	37	Qwen3 Omni 30B A3B Thinking	1150	±16	845	3.4%	3.7%	67 tps	1.2s	66K	$0.97	$1.79
131	60	Gemini 2.5 Flash Preview 0925	1151	±9	1.8K	3.3%	1.2%	5 tps	0.9s	1M	$0.13	$0.97
132	68	Qwen Plus (Aug'24)	1152	±9	4.8K	1.1%	1.4%	53 tps	1.3s	30K	$0.40	$1.20
133	48	Grok 4 Fast Reasoning	1154	±10	2.2K	3.0%	2.1%	102 tps	3.1s	2M	$0.30	$0.75
134	26	Claude Haiku 4.5 (Extended Thinking)	1154	±8	2.4K	2.8%	1.4%	115 tps	0.7s	200K	$1.00	$5.00
135	56	DeepSeek V3.2 Thinking	1158	±19	2.4K	2.3%	9.0%	30 tps	2.6s	131K	$0.28	$0.42
136	44	Grok 4.1 Fast Reasoning	1159	±11	4.3K	2.9%	1.5%	58 tps	7.3s	2M	$0.20	$0.50
137	44	Gemini 2.5 Pro	1161	±6	12.1K	1.4%	2.3%	45 tps	2.6s	1M	$1.25	$10.00
138	56	Gemini 3.1 Flash Lite Preview Thinking	1162	±19	730	2.0%	1.7%	75 tps	4.7s	1M	$0.25	$1.50
139	48	Step 3.5 Flash	1163	±23	640	0.8%	2.2%	109 tps	0.6s	256K	$0.05	$0.15
140	44	Kimi K2 Thinking Turbo	1171	±11	1.8K	4.2%	2.0%	75 tps	1.4s	262K	$1.15	$8.00
141	42	GPT-5.2 (Extra High)	1172	±13	3K	1.6%	13.2%	17 tps	20.5s	400K	$1.75	$14.00
142	33	Kimi K2.5	1175	±12	3.1K	1.6%	6.5%	33 tps	1.7s	262K	$0.34	$2.57
143	17	Claude Opus 4.5	1177	±15	1.9K	4.7%	1.5%	45 tps	1.5s	200K	$5.00	$25.00
144	29	Nova Experimental Chat 12-10	1192	±11	1.4K	0.7%	2.4%	84 tps	12.9s	98K	$0	$0
145	42	Qwen3 Max Instruct Preview	1192	±9	2.8K	3.0%	1.1%	31 tps	1.7s	256K	$1.43	$6.61
146	13	GPT-5.3 Instant	1225	±13	2.3K	1.3%	0.9%	63 tps	0.8s	400K	$1.75	$14.00
147	81	GPT-4o	1228	±11	2.9K	1.7%	1.0%	49 tps	2.4s	128K	$3.71	$12.57
148	62	GPT-5.1 Instant	1233	±12	2.6K	2.7%	1.3%	50 tps	1.9s	400K	$1.25	$10.00
149	32	Gemini 2.5 Pro High	1234	±6	4.6K	2.4%	1.5%	48 tps	2.3s	1M	$1.25	$10.00
150	14	Gemini 3 Flash Preview Thinking	1248	±10	3.7K	1.3%	1.6%	3 tps	6.2s	1M	$0.50	$3.00
151	26	Grok 4.1 Fast Non-Reasoning	1260	±16	2.5K	3.7%	0.9%	101 tps	0.5s	2M	$0.20	$0.50
152	10	Claude Sonnet 4.5 (Thinking)	1261	±7	5.5K	1.9%	1.9%	44 tps	1.1s	200K	$3.00	$15.00
153	40	Qwen3 235B A22B Instruct 2507	1261	±8	3.1K	1.4%	6.8%	13 tps	1.9s	262K	$0.13	$0.52
154	48	gpt-oss-120b	1269	±7	4.6K	1.4%	0.7%	213 tps	0.5s	131K	$0.11	$0.50
155	22	GPT-5 Chat	1269	±5	7.9K	1.6%	1.3%	95 tps	0.9s	400K	$1.25	$10.00
156	33	Qwen3 Next 80B A3B Instruct	1270	±10	2.3K	2.8%	0.6%	84 tps	1.1s	256K	$0.20	$1.42
157	29	Qwen3 VL 235B A22B Instruct	1273	±14	1.1K	2.3%	3.1%	75 tps	1.9s	129K	$0.37	$1.81
158	17	GPT-5.2 (High)	1275	±12	6.5K	1.4%	6.7%	18 tps	16.3s	400K	$1.75	$14.00
159	5	Claude Sonnet 4.6 (Thinking)	1280	±14	1.4K	2.2%	4.7%	57 tps	1.1s	200K	$3.00	$15.00
160	17	Gemini 3 Flash Preview	1281	±11	2K	1.2%	1.3%	138 tps	1.4s	1M	$0.50	$3.00
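
Many adjacent rows in the table have scores that differ by less than their confidence intervals. The sketch below is one rough way to check that for any two rows. It assumes the rows are tab-separated as shown and that the "±" value is the half-width of an interval centered on the VIBE Score; the page does not say how the intervals are computed, so overlap here is only an informal signal and not a statistical test. The helper names `parse_row` and `intervals_overlap` are hypothetical.

```python
def parse_row(line):
    """Split one tab-separated leaderboard row into the fields used here."""
    cells = line.split("\t")
    return {
        "name": cells[2],
        "score": int(cells[3]),
        # Assumption: "±14" means the interval is score - 14 .. score + 14.
        "ci": int(cells[4].lstrip("±")),
    }


def intervals_overlap(a, b):
    """True when the two score intervals share at least one point."""
    return abs(a["score"] - b["score"]) <= a["ci"] + b["ci"]


# Three rows copied from the table above.
rows = [
    "121\t101\tGemini 2.5 Flash Lite\t1113\t±6\t4.4K\t1.8%\t1.3%\t210 tps\t0.7s\t1M\t$0.10\t$0.40",
    "159\t5\tClaude Sonnet 4.6 (Thinking)\t1280\t±14\t1.4K\t2.2%\t4.7%\t57 tps\t1.1s\t200K\t$3.00\t$15.00",
    "160\t17\tGemini 3 Flash Preview\t1281\t±11\t2K\t1.2%\t1.3%\t138 tps\t1.4s\t1M\t$0.50\t$3.00",
]
flash_lite, sonnet, flash = [parse_row(r) for r in rows]

print(intervals_overlap(sonnet, flash))       # scores 1280 and 1281
print(intervals_overlap(flash_lite, sonnet))  # scores 1113 and 1280
```

Under these assumptions the two top rows on this page (1280 ±14 and 1281 ±11) overlap, while the first and last rows do not.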

Page 4 of 5 (173 models in total)