Leaderboard | Text

Last updated about 1 month ago

Rank	Overall	Name	VIBE Score	Confidence Interval	Votes	Downvote %	Abort %	Speed	Latency	Context	Cost (Input)	Cost (Output)
121	139	Qwen3 VL 30B A3B Instruct	1012	±17	1K	6.5%	1.8%	80 tps	2.6s	129K	$0.18	$0.67
122	143	Gemini 2.0 Flash Lite	1011	±6	5.7K	6.9%	<0.1%	42 tps	0.5s	1M	$0.08	$0.30
123	113	Kimi K2 Fast	1006	±4	26.2K	13.8%	0.8%	365 tps	0.5s	131K	$1.00	$3.00
124	113	GLM 4.5	1002	±5	3.7K	14.3%	3.7%	46 tps	1.4s	131K	$0.43	$1.63
125	121	NVIDIA Llama 3.3 Nemotron Super 49B v1.5	1000	±16	1K	9.9%	2.0%	50 tps	0.6s	131K	$0.09	$0.33
126	139	OpenAI o4-mini	1000	±5	4.8K	10.2%	1.4%	97 tps	7.0s	128K	$1.10	$4.40
127	170	Llama 3.1 8B Turbo	998	±12	1.1K	2.8%	2.1%	650 tps	0.5s	128K	$0.13	$0.14
128	111	LongCat Flash Chat	996	±14	930	7.0%	0.8%	85 tps	0.9s	131K	$0.14	$0.68
129	133	Kimi K2 0905	991	±6	7.5K	5.6%	4.0%	30 tps	1.4s	262K	$0.63	$2.39
130	129	Qwen3 Max Thinking	990	±13	1.7K	2.3%	13.5%	32 tps	2.3s	256K	$1.20	$6.00
131	157	Cogito v2.1 671B	984	±17	715	5.9%	0.8%	85 tps	0.5s	128K	$1.25	$1.25
132	133	DeepSeek-R1 0528	983	±12	1.3K	4.6%	1.3%	93 tps	0.5s	64K	$1.60	$3.67
133	201	Gemma 3 27B IT	983	±15	905	10.4%	2.0%	60 tps	0.8s	128K	$0.17	$0.29
134	126	Qwen3 VL 235B A22B Thinking	979	±6	3.5K	11.5%	4.3%	47 tps	3.0s	127K	$0.47	$3.31
135	165	DeepSeek R1T2 Chimera	978	±10	1.1K	11.0%	3.0%	28 tps	1.8s	164K	$0.13	$0.45
136	214	Qwen 2.5 VL 32B Instruct	977	±20	850	7.6%	6.3%	43 tps	3.2s	128K	$0.35	$0.62
137	209	Seed 1.6 Flash 250715	974	±16	980	6.2%	2.5%	108 tps	1.6s	256K	$0.07	$0.30
138	222	Sky T1 32B Preview	972	±16	805	10.6%	7.8%	73 tps	0.6s	16K	$0.12	$0.18
139	177	Mistral Small 3.1 24B Instruct	966	±12	1K	10.6%	7.5%	15 tps	2.4s	131K	$0.06	$0.18
140	161	Mistral Small 3.1	960	±16	915	11.2%	7.4%	13 tps	2.6s	32K	$0.17	$0.28
141	161	Llama 4 Maverick	956	±4	11.2K	8.2%	1.2%	88 tps	2.4s	1M	$0.23	$0.83
142	121	QwQ 32B	955	±7	5K	15.3%	5.4%	41 tps	2.1s	16K	$0.43	$0.56
143	101	gpt-oss-20b	954	±5	6.1K	10.8%	0.5%	216 tps	0.5s	131K	$0.06	$0.26
144	165	Qwen3 VL 30B A3B Thinking	949	±8	1.5K	11.2%	4.5%	84 tps	2.9s	127K	$0.20	$1.47
145	177	OpenAI o3-mini	946	±6	6.7K	12.3%	0.8%	143 tps	3.3s	200K	$1.10	$4.40
146	153	Ministral 14B 3.0	945	±28	490	11.7%	2.0%	119 tps	0.5s	128K	$0.20	$0.20
147	170	Devstral Medium	945	±11	1.6K	14.7%	1.5%	77 tps	0.6s	131K	$0.40	$2.00
148	194	Llama 3.2 11B Instruct	943	±14	745	14.4%	1.5%	152 tps	0.5s	8K	$0.16	$0.16
149	126	Qwen3 30B A3B	939	±5	3.7K	12.1%	5.1%	163 tps	1.0s	41K	$0.06	$0.21
150	201	GPT-4o mini	939	±9	1.4K	9.2%	2.1%	71 tps	1.7s	128K	$0.15	$0.60
151	148	DeepSeek-R1	939	±12	1.6K	5.5%	0.8%	133 tps	0.6s	64K	$0.91	$3.07
152	86	Nemotron 3 Nano (Thinking)	938	±14	1.3K	7.6%	2.0%	200 tps	0.5s	256K	$0	$0
153	139	GLM 4.6V	938	±11	2.5K	6.1%	6.4%	21 tps	1.8s	128K	$0.38	$0.90
154	148	OpenAI o4-mini-high	937	±6	6K	14.9%	1.9%	117 tps	15.9s	200K	$1.10	$4.40
155	175	OpenAI o3-mini-low	937	±4	6.1K	13.6%	0.7%	139 tps	1.5s	200K	$1.10	$4.40
156	133	Qwen3 14B	933	±11	2.7K	17.1%	1.7%	109 tps	0.8s	41K	$0.04	$0.15
157	121	Qwen3 32B Fast	932	±5	4.5K	12.9%	11.6%	30 tps	3.1s	41K	$0.10	$0.25
158	265	Qwen 2.5 VL 72B Instruct	929	±12	1.2K	7.9%	5.3%	25 tps	3.7s	128K	$1.01	$2.79
159	209	Qwen 2.5 14B Instruct	928	±13	910	11.7%	2.4%	40 tps	1.6s	1M	$0.40	$1.61
160	133	DeepSeek V3.2 Speciale	924	±12	1.6K	6.3%	6.0%	43 tps	1.4s	131K	$0.84	$1.52
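
Adjacent ranks in the table above often differ by only a few points, so it can help to check whether two models' score ranges overlap before reading much into their order. The following is a minimal sketch, assuming the Confidence Interval column is a symmetric half-width around the VIBE Score (the page does not state this, nor the confidence level). The names `Row`, `parse_row`, and `intervals_overlap` are hypothetical helpers, not part of any Yupp API.

```python
from typing import NamedTuple


class Row(NamedTuple):
    name: str
    score: int  # VIBE Score column
    ci: int     # Confidence Interval column, assumed to be a +/- half-width


def parse_row(line: str) -> Row:
    """Parse one tab-separated leaderboard row, keeping name, score, and interval."""
    cells = line.split("\t")
    # Column order from the header: Rank, Overall, Name, VIBE Score, Confidence Interval, ...
    return Row(name=cells[2], score=int(cells[3]), ci=int(cells[4].lstrip("±")))


def intervals_overlap(a: Row, b: Row) -> bool:
    """Return True if [score - ci, score + ci] of both rows share at least one point."""
    return a.score - a.ci <= b.score + b.ci and b.score - b.ci <= a.score + a.ci


# Three rows copied from the table (trailing columns shortened for brevity).
qwen = parse_row("121\t139\tQwen3 VL 30B A3B Instruct\t1012\t±17\t1K")
kimi = parse_row("123\t113\tKimi K2 Fast\t1006\t±4\t26.2K")
speciale = parse_row("160\t133\tDeepSeek V3.2 Speciale\t924\t±12\t1.6K")

print(intervals_overlap(qwen, kimi))      # ranges 995..1029 and 1002..1010
print(intervals_overlap(kimi, speciale))  # ranges 1002..1010 and 912..936
```

Under that assumption, the first pair overlaps, so their two-rank gap may not be meaningful, while the second pair does not overlap.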

View All (237 models)