Leaderboard | Text

Last updated about 1 month ago

Rank	Overall	Name	VIBE Score	Confidence Interval	Votes	Downvote %	Abort %	Speed	Latency	Context	Cost (Input)	Cost (Output)
201	148	Seed 1.6 250615	995	±21	1.6K	6.0%	3.1%	46 tps	2.2s	256K	$0.25	$2.00
202	195	Arcee AI Virtuoso-Large	994	±8	3K	5.7%	<0.1%	64 tps	0.5s	131K	$0.75	$1.20
203	148	Qwen3 30B A3B	994	±8	6.3K	6.9%	5.1%	163 tps	1.0s	41K	$0.06	$0.21
204	195	Claude Haiku 3	992	±11	2.8K	3.0%	0.4%	62 tps	0.5s	200K	$0.25	$1.25
205	159	GPT-5 Nano	989	±6	4.6K	8.0%	3.2%	113 tps	20.9s	400K	$0.05	$0.40
206	159	OpenAI o3-mini-low	988	±6	12.2K	6.4%	0.7%	139 tps	1.5s	200K	$1.10	$4.40
207	159	Grok Code Fast 1	987	±9	2.5K	6.0%	5.9%	294 tps	0.5s	256K	$0.20	$1.50
208	159	GLM 4.6V	986	±8	3K	5.5%	6.4%	21 tps	1.8s	128K	$0.38	$0.90
209	195	Cypher Alpha	985	±20	735	8.7%	<0.1%	4 tps	N/A	1M	$0	$0
210	195	GLM 4.6 FP8	982	±17	1.2K	11.7%	<0.1%	56 tps	1.8s	200K	$0.40	$1.75
211	159	Kimi K2 0711	981	±6	7K	4.5%	1.6%	29 tps	1.3s	131K	$0.72	$2.60
212	159	Seed 2.0 Mini (Medium)	981	±35	570	5.8%	11.9%	33 tps	1.7s	256K	$0.15	$0.60
213	159	Mistral Small 3.1 24B Instruct	980	±11	2.9K	4.3%	7.5%	15 tps	2.4s	131K	$0.06	$0.18
214	159	DeepSeek-R1 0528	979	±6	5.5K	3.5%	1.3%	93 tps	0.5s	64K	$1.60	$3.67
215	167	DeepSeek V3.1 Thinking	976	±9	5.2K	9.5%	7.1%	18 tps	1.8s	131K	$0.23	$0.75
216	211	Grok 4 (Low Reasoning)	975	±21	520	2.8%	<0.1%	18 tps	9.5s	256K	$0	$0
217	167	Nemotron 3 Nano	974	±46	580	6.5%	1.3%	216 tps	0.8s	256K	$0.05	$4.94
218	211	Arcee AI Coder-Large	972	±15	985	4.4%	<0.1%	60 tps	1.6s	33K	$0.50	$0.80
219	167	Qwen 2.5 32B Instruct	972	±7	4.1K	6.5%	2.5%	48 tps	1.0s	131K	$0.21	$0.25
220	219	Arcee Coder Large	971	±7	3.6K	2.6%	<0.1%	54 tps	1.3s	33K	$0.50	$0.80
221	167	Llama 4 Maverick	971	±5	21K	5.0%	1.2%	88 tps	2.4s	1M	$0.23	$0.83
222	167	Mistral Small 3.2 24B	970	±13	4.6K	4.9%	2.8%	141 tps	0.7s	33K	$0.02	$0.08
223	167	Pixtral Large	969	±14	3.5K	3.9%	2.5%	57 tps	1.3s	128K	$1.50	$4.50
224	167	Qwen3 VL 30B A3B Thinking	967	±11	1.9K	8.9%	4.5%	84 tps	2.9s	127K	$0.20	$1.47
225	167	Llama 4 Scout	965	±5	17.5K	5.3%	0.6%	88 tps	5.1s	131K	$0.18	$0.46
226	167	Devstral Medium	962	±11	3.5K	5.2%	1.5%	77 tps	0.6s	131K	$0.40	$2.00
227	167	Qwen3 14B	962	±8	5.3K	8.4%	1.7%	109 tps	0.8s	41K	$0.04	$0.15
228	167	Qwen 2.5 72B	960	±15	1.4K	4.5%	1.2%	96 tps	1.2s	131K	$0.14	$0.26
229	167	Llama 3.1 8B Turbo	958	±14	2.2K	2.0%	2.1%	650 tps	0.5s	128K	$0.13	$0.14
230	179	Qwen3 30B A3B Thinking 2507	953	±10	3.5K	4.7%	0.5%	124 tps	1.2s	131K	$0.16	$1.70
231	230	Magistral Medium	952	±16	1.3K	10.8%	<0.1%	95 tps	0.5s	41K	$2.00	$5.00
232	179	Switchpoint Router	949	±10	2.7K	3.6%	1.7%	71 tps	4.9s	131K	$0.85	$3.40
233	179	Qwen3 8B	948	±9	4.2K	8.2%	2.4%	61 tps	1.4s	41K	$0.02	$0.07
234	179	Ministral 14B 3.0	948	±16	805	8.5%	2.0%	119 tps	0.5s	128K	$0.20	$0.20
235	179	Grok 3 Mini Fast	943	±7	9K	7.0%	1.6%	44 tps	0.5s	131K	$0.60	$4.00
236	179	ERNIE 4.5 21B A3B	943	±28	540	6.9%	2.3%	78 tps	1.5s	120K	$0.05	$0.19
237	230	NVIDIA Llama 3.3 Nemotron Super 49B v1	942	±9	3.6K	2.3%	<0.1%	13 tps	N/A	131K	$0.07	$0.20
238	179	ERNIE 4.5 VL 424B A47B	942	±18	725	6.5%	4.9%	36 tps	3.5s	123K	$0.42	$1.25
239	179	NVIDIA Llama 3.3 Nemotron Super 49B v1.5	941	±19	1.4K	6.9%	2.0%	50 tps	0.6s	131K	$0.09	$0.33
240	179	DeepSeek-R1	939	±6	6.4K	4.3%	0.8%	133 tps	0.6s	64K	$0.91	$3.07

Page 6 of 11 (404 models total)