Leaderboard | Text

Last updated about 1 month ago

Rank	Overall	Name	VIBE Score	Confidence Interval	Votes	Downvote %	Abort %	Speed	Latency	Context	Cost (Input)	Cost (Output)
161	200	K2 Think	999	±14	1.1K	6.2%	<0.1%	418 tps	2.8s	N/A	$0	$0
162	170	Llama 3.1 8B Turbo	998	±12	1.1K	2.8%	2.1%	650 tps	0.5s	128K	$0.13	$0.14
163	111	LongCat Flash Chat	996	±14	930	7.0%	0.8%	85 tps	0.9s	131K	$0.14	$0.68
164	219	Arcee AI Virtuoso-Large	992	±10	1.4K	15.2%	<0.1%	64 tps	0.5s	131K	$0.75	$1.20
165	133	Kimi K2 0905	991	±6	7.5K	5.6%	4.0%	30 tps	1.4s	262K	$0.63	$2.39
166	129	Qwen3 Max Thinking	990	±13	1.7K	2.3%	13.5%	32 tps	2.3s	256K	$1.20	$6.00
167	159	Qwen Turbo	987	±4	4.8K	12.9%	<0.1%	53 tps	1.1s	1M	$0.05	$0.20
168	157	Cogito v2.1 671B	984	±17	715	5.9%	0.8%	85 tps	0.5s	128K	$1.25	$1.25
169	133	DeepSeek-R1 0528	983	±12	1.3K	4.6%	1.3%	93 tps	0.5s	64K	$1.60	$3.67
170	201	Gemma 3 27B IT	983	±15	905	10.4%	2.0%	60 tps	0.8s	128K	$0.17	$0.29
171	147	GLM 4.5 Air	981	±6	4.6K	15.0%	<0.1%	22 tps	1.4s	131K	$0.10	$0.38
172	147	Arcee AI Maestro Reasoning	980	±9	1.8K	12.0%	<0.1%	85 tps	0.3s	131K	$0.90	$3.30
173	241	Arcee AI Blitz	979	±13	610	5.4%	<0.1%	6 tps	N/A	33K	$0.45	$0.75
174	126	Qwen3 VL 235B A22B Thinking	979	±6	3.5K	11.5%	4.3%	47 tps	3.0s	127K	$0.47	$3.31
175	165	DeepSeek R1T2 Chimera	978	±10	1.1K	11.0%	3.0%	28 tps	1.8s	164K	$0.13	$0.45
176	214	Qwen 2.5 VL 32B Instruct	977	±20	850	7.6%	6.3%	43 tps	3.2s	128K	$0.35	$0.62
177	209	Seed 1.6 Flash 250715	974	±16	980	6.2%	2.5%	108 tps	1.6s	256K	$0.07	$0.30
178	265	Llama 3.1 405B Instruct Turbo	973	±18	625	10.1%	<0.1%	26 tps	0.8s	131K	$3.50	$3.50
179	222	Sky T1 32B Preview	972	±16	805	10.6%	7.8%	73 tps	0.6s	16K	$0.12	$0.18
180	302	YouTube	968	±12	3.7K	3.4%	<0.1%	34 tps	2.7s	32K	$0.99	$0.99
181	177	Mistral Small 3.1 24B Instruct	966	±12	1K	10.6%	7.5%	15 tps	2.4s	131K	$0.06	$0.18
182	161	Mistral Small 3.1	960	±16	915	11.2%	7.4%	13 tps	2.6s	32K	$0.17	$0.28
183	302	OLMo 2 0425 1B Instruct	956	±19	570	1.7%	<0.1%	68 tps	0.0s	4K	$0	$0
184	161	Llama 4 Maverick	956	±4	11.2K	8.2%	1.2%	88 tps	2.4s	1M	$0.23	$0.83
185	121	QwQ 32B	955	±7	5K	15.3%	5.4%	41 tps	2.1s	16K	$0.43	$0.56
186	253	Magistral Medium	955	±14	795	18.0%	<0.1%	95 tps	0.5s	41K	$2.00	$5.00
187	101	gpt-oss-20b	954	±5	6.1K	10.8%	0.5%	216 tps	0.5s	131K	$0.06	$0.26
188	233	Llama 3.1 70B Instruct Turbo	951	±10	1.9K	10.3%	<0.1%	110 tps	0.8s	128K	$0.88	$0.88
189	165	Qwen3 VL 30B A3B Thinking	949	±8	1.5K	11.2%	4.5%	84 tps	2.9s	127K	$0.20	$1.47
190	213	Claude Haiku 3.5	949	±8	3.4K	9.7%	0.8%	40 tps	2.8s	200K	$0.80	$4.00
191	177	OpenAI o3-mini	946	±6	6.7K	12.3%	0.8%	143 tps	3.3s	200K	$1.10	$4.40
192	153	Ministral 14B 3.0	945	±28	490	11.7%	2.0%	119 tps	0.5s	128K	$0.20	$0.20
193	170	Devstral Medium	945	±11	1.6K	14.7%	1.5%	77 tps	0.6s	131K	$0.40	$2.00
194	292	GPT-5 Nano Minimal	945	±11	1.3K	12.9%	<0.1%	88 tps	0.8s	400K	$0.05	$0.40
195	241	Claude Haiku 3	944	±11	880	10.7%	0.4%	62 tps	0.5s	200K	$0.25	$1.25
196	194	Llama 3.2 11B Instruct	943	±14	745	14.4%	1.5%	152 tps	0.5s	8K	$0.16	$0.16
197	219	EXAONE Deep 32B	941	±17	525	4.5%	<0.1%	24 tps	N/A	33K	$0	$0
198	126	Qwen3 30B A3B	939	±5	3.7K	12.1%	5.1%	163 tps	1.0s	41K	$0.06	$0.21
199	201	GPT-4o mini	939	±9	1.4K	9.2%	2.1%	71 tps	1.7s	128K	$0.15	$0.60
200	148	DeepSeek-R1	939	±12	1.6K	5.5%	0.8%	133 tps	0.6s	64K	$0.91	$3.07

Page 5 of 8 (312 models in total)