Leaderboard | Text

Last updated about 1 month ago

Rank	Overall	Name	VIBE Score	Confidence Interval	Votes	Downvote %	Abort %	Speed	Latency	Context	Cost (Input)	Cost (Output)
121	139	Seed 2.0 Mini (Medium)	900	±30	515	3.7%	11.9%	33 tps	1.7s	256K	$0.15	$0.60
122	86	Qwen3 235B A22B	898	±23	725	3.3%	5.3%	71 tps	0.9s	41K	$0.23	$0.63
123	148	DeepSeek-R1	894	±12	1.1K	3.8%	0.8%	133 tps	0.6s	64K	$0.91	$3.07
124	161	Llama 4 Maverick	883	±12	3.6K	4.4%	1.2%	88 tps	2.4s	1M	$0.23	$0.83
125	165	Qwen3 4B	878	±23	735	3.9%	1.9%	94 tps	1.5s	128K	$0.01	$0.01
126	160	Llama 4 Scout	875	±15	2.3K	2.9%	0.6%	88 tps	5.1s	131K	$0.18	$0.46
127	179	GLM 4.7 Flash	874	±24	855	2.8%	5.8%	61 tps	2.8s	128K	$0.07	$0.39
128	246	DeepSeek-R1 Distill Llama 70B	869	±28	590	4.8%	3.6%	27 tps	1.6s	32K	$0.73	$0.95
129	214	OpenAI o3-mini-high	868	±13	1.4K	3.8%	2.4%	231 tps	10.5s	200K	$1.10	$4.40
130	129	Qwen3 Max Thinking	866	±14	1.5K	1.7%	13.5%	32 tps	2.3s	256K	$1.20	$6.00
131	139	GLM 4.6V	865	±24	890	2.7%	6.4%	21 tps	1.8s	128K	$0.38	$0.90
132	170	Kimi K2 0711	858	±24	890	4.3%	1.6%	29 tps	1.3s	131K	$0.72	$2.60
133	133	Qwen3 14B	856	±24	745	2.6%	1.7%	109 tps	0.8s	41K	$0.04	$0.15
134	121	Qwen3 32B Fast	853	±13	1.8K	4.2%	11.6%	30 tps	3.1s	41K	$0.10	$0.25
135	157	Qwen3 Next 80B A3B Thinking	846	±15	1.3K	3.9%	0.6%	175 tps	1.3s	256K	$0.21	$2.26
136	157	GPT-5 Nano	843	±14	2K	6.0%	3.2%	113 tps	20.9s	400K	$0.05	$0.40
137	175	OpenAI o3-mini-low	838	±21	1.7K	2.6%	0.7%	139 tps	1.5s	200K	$1.10	$4.40
138	84	GPT-5 Mini Minimal	835	±13	1.1K	6.6%	1.2%	63 tps	1.4s	400K	$0.25	$2.00
139	186	Grok 3 Mini Fast	832	±23	1K	3.3%	1.6%	44 tps	0.5s	131K	$0.60	$4.00
140	133	DeepSeek V3.2 Speciale	830	±28	540	3.6%	6.0%	43 tps	1.4s	131K	$0.84	$1.52
141	161	Qwen3 8B	827	±36	600	4.0%	2.4%	61 tps	1.4s	41K	$0.02	$0.07
142	201	GPT-4o mini	826	±18	645	6.5%	2.1%	71 tps	1.7s	128K	$0.15	$0.60
143	86	Nemotron 3 Nano (Thinking)	821	±23	540	4.4%	2.0%	200 tps	0.5s	256K	$0	$0
144	148	Qwen3 30B A3B Thinking 2507	818	±18	795	3.0%	0.5%	124 tps	1.2s	131K	$0.16	$1.70
145	265	Qwen 2.5 VL 72B Instruct	804	±29	715	6.5%	5.3%	25 tps	3.7s	128K	$1.01	$2.79
146	229	Magistral Medium 2509	797	±17	570	5.0%	4.0%	58 tps	0.9s	131K	$2.00	$5.00
147	265	Magistral Small 2509	790	±29	530	6.2%	2.7%	116 tps	0.6s	131K	$0.50	$1.50
148	186	Gemma 3n E4B	781	±27	535	4.5%	2.0%	30 tps	0.5s	8K	$0.01	$0.02
149	165	Pixtral Large	778	±18	1.1K	7.2%	2.5%	57 tps	1.3s	128K	$1.50	$4.50
150	194	Llama 3.3 70B	745	±30	525	4.5%	0.3%	500 tps	0.5s	8K	$0.48	$0.66
151	274	Pixtral 12B	744	±33	940	9.6%	2.2%	101 tps	1.2s	131K	$0.08	$0.08
152	186	Grok 3 Mini	739	±20	1.4K	2.5%	1.2%	43 tps	0.5s	131K	$0.30	$0.50
153	186	GLM 4.6V Flash	726	±29	575	2.5%	3.7%	64 tps	2.1s	128K	$0.04	$0.40
154	288	Qwen 2.5 VL 3B Instruct	546	±31	1K	9.1%	3.0%	44 tps	2.5s	128K	$0.21	$0.63
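
When reading the VIBE Score and Confidence Interval columns together, it can help to check whether two models' score ranges overlap before treating a rank difference as meaningful. The sketch below is a minimal illustration, assuming the ± value is a symmetric margin around the VIBE Score (the page does not define how the interval is computed). The names `Entry`, `parse_row`, and `intervals_overlap` are hypothetical helpers, not part of any Yupp API; the three rows are copied from the table above.

```python
from dataclasses import dataclass

# Three rows copied verbatim from the table above (tab-separated).
ROWS = """\
123\t148\tDeepSeek-R1\t894\t±12\t1.1K\t3.8%\t0.8%\t133 tps\t0.6s\t64K\t$0.91\t$3.07
124\t161\tLlama 4 Maverick\t883\t±12\t3.6K\t4.4%\t1.2%\t88 tps\t2.4s\t1M\t$0.23\t$0.83
154\t288\tQwen 2.5 VL 3B Instruct\t546\t±31\t1K\t9.1%\t3.0%\t44 tps\t2.5s\t128K\t$0.21\t$0.63
"""


@dataclass
class Entry:
    rank: int
    name: str
    score: int
    margin: int  # assumed to be a symmetric +/- margin around the score

    @property
    def low(self) -> int:
        return self.score - self.margin

    @property
    def high(self) -> int:
        return self.score + self.margin


def parse_row(line: str) -> Entry:
    """Parse one tab-separated row; only the columns used here are kept."""
    cols = line.split("\t")
    return Entry(
        rank=int(cols[0]),
        name=cols[2],
        score=int(cols[3]),
        margin=int(cols[4].lstrip("±")),
    )


def intervals_overlap(a: Entry, b: Entry) -> bool:
    """Return True when the two score ranges share at least one point."""
    return a.low <= b.high and b.low <= a.high


entries = [parse_row(line) for line in ROWS.splitlines() if line]

for first, second in zip(entries, entries[1:]):
    verdict = "overlap" if intervals_overlap(first, second) else "do not overlap"
    print(f"{first.name} [{first.low}, {first.high}] and "
          f"{second.name} [{second.low}, {second.high}] {verdict}")
```

Under this assumption, DeepSeek-R1 (882 to 906) and Llama 4 Maverick (871 to 895) overlap, so their relative order is less certain than the gap between Llama 4 Maverick and Qwen 2.5 VL 3B Instruct (515 to 577), whose ranges are far apart.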

Page 4 of 4

View All (154 models)