Leaderboard | Text

Last updated about 1 month ago

Rank	Overall	Name	VIBE Score	Confidence Interval	Votes	Downvote %	Abort %	Speed	Latency	Context	Cost (Input)	Cost (Output)
161	147	Arcee AI Maestro Reasoning	1033	±4	11.8K	1.6%	<0.1%	85 tps	0.3s	131K	$0.90	$3.30
162	159	Llama 3.1 405B Instruct	1032	±8	1.3K	1.5%	<0.1%	52 tps	0.5s	128K	$2.60	$4.27
163	177	Llama 3 70B Turbo	1031	±3	15.7K	1.3%	<0.1%	31 tps	0.0s	8K	$0.73	$0.83
164	118	GPT-4.1 mini	1031	±2	57.7K	1.3%	1.1%	67 tps	0.9s	1M	$0.34	$1.60
165	153	Ministral 14B 3.0	1031	±6	2.3K	3.1%	2.0%	119 tps	0.5s	128K	$0.20	$0.20
166	129	Command A	1030	±2	67.4K	1.3%	2.2%	42 tps	0.8s	256K	$2.00	$7.33
167	21	Claude Opus 4 (Thinking)	1030	±6	1.3K	3.0%	<0.1%	28 tps	1.3s	200K	$15.00	$75.00
168	111	Grok 3 Fast	1030	±3	12K	1.1%	1.7%	52 tps	2.4s	131K	$5.00	$25.00
169	126	Qwen3 VL 235B A22B Thinking	1027	±4	7.3K	2.9%	4.3%	47 tps	3.0s	127K	$0.47	$3.31
170	133	DeepSeek V3.2 Speciale	1027	±5	5.9K	2.2%	6.0%	43 tps	1.4s	131K	$0.84	$1.52
171	165	Qwen3 4B	1027	±4	9.4K	3.3%	1.9%	94 tps	1.5s	128K	$0.01	$0.01
172	126	DeepSeek V3	1027	±2	41.8K	1.0%	0.9%	69 tps	1.1s	64K	$0.59	$1.49
173	165	Qwen3 VL 30B A3B Thinking	1027	±7	2.3K	4.6%	4.5%	84 tps	2.9s	127K	$0.20	$1.47
174	86	Claude Sonnet 4	1026	±2	88.9K	1.5%	1.8%	49 tps	1.3s	200K	$3.00	$15.00
175	182	GLM 4.5 Turbo	1025	±11	1K	2.9%	<0.1%	46 tps	1.6s	131K	$1.00	$3.00
176	159	Qwen Turbo	1025	±3	32.8K	1.3%	<0.1%	53 tps	1.1s	1M	$0.05	$0.20
177	143	Mistral Medium 3	1023	±9	1.2K	1.7%	2.4%	47 tps	0.8s	33K	$0.40	$2.00
178	182	GLM 4.6 FP8	1022	±5	2.2K	3.7%	<0.1%	56 tps	1.8s	200K	$0.40	$1.75
179	77	Claude Opus 4.1	1022	±3	6.5K	2.5%	3.0%	17 tps	3.7s	200K	$15.00	$75.00
180	161	Qwen3 8B	1020	±5	6.1K	2.6%	2.4%	61 tps	1.4s	41K	$0.02	$0.07
181	143	Seed 1.6 250615	1018	±4	3.6K	1.6%	3.1%	46 tps	2.2s	256K	$0.25	$2.00
182	182	Fauna Fox	1018	±4	10.7K	2.4%	<0.1%	194 tps	0.3s	128K	$0.04	$0.15
183	148	OpenAI o3	1018	±6	4.2K	1.8%	0.9%	85 tps	6.8s	128K	$7.33	$29.33
184	133	Kimi K2 0905	1016	±4	9.2K	2.1%	4.0%	30 tps	1.4s	262K	$0.63	$2.39
185	157	Qwen3 Next 80B A3B Thinking	1015	±3	12.3K	2.3%	0.6%	175 tps	1.3s	256K	$0.21	$2.26
186	133	GPT-4.1 nano	1014	±2	52.1K	1.3%	0.6%	175 tps	0.5s	1M	$0.10	$0.40
187	101	Qwen3.5 35B A3B	1011	±15	1.1K	1.4%	2.1%	116 tps	2.1s	256K	$0.63	$1.13
188	179	Baichuan-M2-32B	1011	±8	1.4K	2.7%	<0.1%	32 tps	3.3s	131K	$0.07	$0.07
189	139	Seed 2.0 Mini (Medium)	1010	±10	1.3K	2.6%	11.9%	33 tps	1.7s	256K	$0.15	$0.60
190	148	OpenAI o4-mini-high	1009	±2	19.5K	2.1%	1.9%	117 tps	15.9s	200K	$1.10	$4.40
191	175	MiMo V2 Flash	1009	±10	645	3.7%	7.2%	24 tps	1.9s	262K	$0.07	$0.23
192	139	Qwen3 VL 30B A3B Instruct	1009	±9	1.2K	4.5%	1.8%	80 tps	2.6s	129K	$0.18	$0.67
193	193	GPT-5 Nano High	1008	±10	600	1.6%	<0.1%	23 tps	25.7s	400K	$0.05	$0.40
194	129	Qwen3 Max Thinking	1007	±5	6.9K	1.4%	13.5%	32 tps	2.3s	256K	$1.20	$6.00
195	139	OpenAI o4-mini	1007	±3	13.6K	2.2%	1.4%	97 tps	7.0s	128K	$1.10	$4.40
196	165	ERNIE 4.5 21B A3B	1006	±7	1.4K	1.8%	2.3%	78 tps	1.5s	120K	$0.05	$0.19
197	186	Mistral Small 3.2 24B Instruct	1006	±7	1.6K	3.4%	1.9%	113 tps	1.1s	131K	$0.02	$0.08
198	233	TNG Tech DeepSeek R1T Chimera	1006	±12	600	0.8%	<0.1%	78 tps	1.5s	164K	$0.11	$0.44
199	139	GLM 4.6V	1005	±3	7.4K	1.4%	6.4%	21 tps	1.8s	128K	$0.38	$0.90
200	153	Qwen 2.5 32B Instruct	1005	±3	15.7K	1.0%	2.5%	48 tps	1.0s	131K	$0.21	$0.25
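
The rows above are tab-separated, so they can be loaded with a standard CSV reader. The following is a minimal sketch, not the leaderboard's own tooling: it assumes the "Confidence Interval" column is a symmetric margin around the VIBE Score, and it treats overlapping intervals as a rough sign that two models are not clearly separated. The helper names (`parse_count`, `load_rows`, `intervals_overlap`) are hypothetical, and the sample uses three rows copied from the table.

```python
import csv
import io

# Header plus three rows copied from the table above (tab-separated).
SAMPLE = (
    "Rank\tOverall\tName\tVIBE Score\tConfidence Interval\tVotes\tDownvote %\t"
    "Abort %\tSpeed\tLatency\tContext\tCost (Input)\tCost (Output)\n"
    "161\t147\tArcee AI Maestro Reasoning\t1033\t±4\t11.8K\t1.6%\t<0.1%\t"
    "85 tps\t0.3s\t131K\t$0.90\t$3.30\n"
    "175\t182\tGLM 4.5 Turbo\t1025\t±11\t1K\t2.9%\t<0.1%\t"
    "46 tps\t1.6s\t131K\t$1.00\t$3.00\n"
    "200\t153\tQwen 2.5 32B Instruct\t1005\t±3\t15.7K\t1.0%\t2.5%\t"
    "48 tps\t1.0s\t131K\t$0.21\t$0.25\n"
)


def parse_count(text):
    """Convert an abbreviated count such as '11.8K', '1M' or '645' to an int."""
    multipliers = {"K": 1_000, "M": 1_000_000}
    suffix = text[-1]
    if suffix in multipliers:
        return round(float(text[:-1]) * multipliers[suffix])
    return int(text)


def load_rows(tsv_text):
    """Parse leaderboard rows into dicts with numeric score bounds and votes."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    rows = []
    for rec in reader:
        score = int(rec["VIBE Score"])
        # Assumption: the interval column is a symmetric margin, e.g. '±4'.
        margin = int(rec["Confidence Interval"].lstrip("±"))
        rows.append(
            {
                "name": rec["Name"],
                "score": score,
                "low": score - margin,
                "high": score + margin,
                "votes": parse_count(rec["Votes"]),
            }
        )
    return rows


def intervals_overlap(a, b):
    """Return True when the two score intervals share at least one point."""
    return a["low"] <= b["high"] and b["low"] <= a["high"]


rows = load_rows(SAMPLE)
for row in rows:
    print(row["name"], row["low"], row["high"], row["votes"])
```

Under these assumptions, a model with few votes and a wide margin (for example ±11 on 1K votes) can overlap with models many ranks above it, which is one reason to read the rank column together with the interval and vote count.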

Page 5 of 11 (410 models in total)