Leaderboard | Text

Last updated about 1 month ago

Rank	Overall	Name	VIBE Score	Confidence Interval	Votes	Downvote %	Abort %	Speed	Latency	Context	Cost (Input)	Cost (Output)
81	121	Qwen3 32B Fast	932	±5	4.5K	12.9%	11.6%	30 tps	3.1s	41K	$0.10	$0.25
82	133	Qwen3 14B	933	±11	2.7K	17.1%	1.7%	109 tps	0.8s	41K	$0.04	$0.15
83	175	OpenAI o3-mini-low	937	±4	6.1K	13.6%	0.7%	139 tps	1.5s	200K	$1.10	$4.40
84	148	OpenAI o4-mini-high	937	±6	6K	14.9%	1.9%	117 tps	15.9s	200K	$1.10	$4.40
85	139	GLM 4.6V	938	±11	2.5K	6.1%	6.4%	21 tps	1.8s	128K	$0.38	$0.90
86	86	Nemotron 3 Nano (Thinking)	938	±14	1.3K	7.6%	2.0%	200 tps	0.5s	256K	$0	$0
87	148	DeepSeek-R1	939	±12	1.6K	5.5%	0.8%	133 tps	0.6s	64K	$0.91	$3.07
88	201	GPT-4o mini	939	±9	1.4K	9.2%	2.1%	71 tps	1.7s	128K	$0.15	$0.60
89	126	Qwen3 30B A3B	939	±5	3.7K	12.1%	5.1%	163 tps	1.0s	41K	$0.06	$0.21
90	194	Llama 3.2 11B Instruct	943	±14	745	14.4%	1.5%	152 tps	0.5s	8K	$0.16	$0.16
91	170	Devstral Medium	945	±11	1.6K	14.7%	1.5%	77 tps	0.6s	131K	$0.40	$2.00
92	153	Ministral 14B 3.0	945	±28	490	11.7%	2.0%	119 tps	0.5s	128K	$0.20	$0.20
93	177	OpenAI o3-mini	946	±6	6.7K	12.3%	0.8%	143 tps	3.3s	200K	$1.10	$4.40
94	165	Qwen3 VL 30B A3B Thinking	949	±8	1.5K	11.2%	4.5%	84 tps	2.9s	127K	$0.20	$1.47
95	101	gpt-oss-20b	954	±5	6.1K	10.8%	0.5%	216 tps	0.5s	131K	$0.06	$0.26
96	121	QwQ 32B	955	±7	5K	15.3%	5.4%	41 tps	2.1s	16K	$0.43	$0.56
97	161	Llama 4 Maverick	956	±4	11.2K	8.2%	1.2%	88 tps	2.4s	1M	$0.23	$0.83
98	161	Mistral Small 3.1	960	±16	915	11.2%	7.4%	13 tps	2.6s	32K	$0.17	$0.28
99	177	Mistral Small 3.1 24B Instruct	966	±12	1K	10.6%	7.5%	15 tps	2.4s	131K	$0.06	$0.18
100	222	Sky T1 32B Preview	972	±16	805	10.6%	7.8%	73 tps	0.6s	16K	$0.12	$0.18
101	209	Seed 1.6 Flash 250715	974	±16	980	6.2%	2.5%	108 tps	1.6s	256K	$0.07	$0.30
102	214	Qwen 2.5 VL 32B Instruct	977	±20	850	7.6%	6.3%	43 tps	3.2s	128K	$0.35	$0.62
103	165	DeepSeek R1T2 Chimera	978	±10	1.1K	11.0%	3.0%	28 tps	1.8s	164K	$0.13	$0.45
104	126	Qwen3 VL 235B A22B Thinking	979	±6	3.5K	11.5%	4.3%	47 tps	3.0s	127K	$0.47	$3.31
105	201	Gemma 3 27B IT	983	±15	905	10.4%	2.0%	60 tps	0.8s	128K	$0.17	$0.29
106	133	DeepSeek-R1 0528	983	±12	1.3K	4.6%	1.3%	93 tps	0.5s	64K	$1.60	$3.67
107	157	Cogito v2.1 671B	984	±17	715	5.9%	0.8%	85 tps	0.5s	128K	$1.25	$1.25
108	129	Qwen3 Max Thinking	990	±13	1.7K	2.3%	13.5%	32 tps	2.3s	256K	$1.20	$6.00
109	133	Kimi K2 0905	991	±6	7.5K	5.6%	4.0%	30 tps	1.4s	262K	$0.63	$2.39
110	111	LongCat Flash Chat	996	±14	930	7.0%	0.8%	85 tps	0.9s	131K	$0.14	$0.68
111	170	Llama 3.1 8B Turbo	998	±12	1.1K	2.8%	2.1%	650 tps	0.5s	128K	$0.13	$0.14
112	139	OpenAI o4-mini	1000	±5	4.8K	10.2%	1.4%	97 tps	7.0s	128K	$1.10	$4.40
113	121	NVIDIA Llama 3.3 Nemotron Super 49B v1.5	1000	±16	1K	9.9%	2.0%	50 tps	0.6s	131K	$0.09	$0.33
114	113	GLM 4.5	1002	±5	3.7K	14.3%	3.7%	46 tps	1.4s	131K	$0.43	$1.63
115	113	Kimi K2 Fast	1006	±4	26.2K	13.8%	0.8%	365 tps	0.5s	131K	$1.00	$3.00
116	143	Gemini 2.0 Flash Lite	1011	±6	5.7K	6.9%	<0.1%	42 tps	0.5s	1M	$0.08	$0.30
117	139	Qwen3 VL 30B A3B Instruct	1012	±17	1K	6.5%	1.8%	80 tps	2.6s	129K	$0.18	$0.67
118	129	DeepSeek V3.1 Thinking	1014	±7	3.9K	14.0%	7.1%	18 tps	1.8s	131K	$0.23	$0.75
119	148	OpenAI o3	1016	±11	1.3K	4.6%	0.9%	85 tps	6.8s	128K	$7.33	$29.33
120	124	Qwen3 235B A22B Thinking 2507	1018	±11	1.1K	4.2%	2.5%	53 tps	1.6s	131K	$0.59	$5.70
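
One way to read the VIBE Score and Confidence Interval columns together is to check whether two models' score ranges overlap. The sketch below is only illustrative: it assumes the `±` value is a symmetric half-width around the score, and it treats overlapping ranges as "not clearly separated", which is an interpretation rather than something the leaderboard states. The names `Row`, `parse_row`, and `intervals_overlap` are hypothetical, and the sample rows are copied from the table above and truncated to their first five columns.

```python
from dataclasses import dataclass


@dataclass
class Row:
    name: str
    score: int
    ci: int  # assumed: half-width of a symmetric interval around the score


def parse_row(line: str) -> Row:
    # Column order follows the table header: Rank, Overall, Name,
    # VIBE Score, Confidence Interval, ...
    cols = line.split("\t")
    return Row(name=cols[2], score=int(cols[3]), ci=int(cols[4].lstrip("±")))


def intervals_overlap(a: Row, b: Row) -> bool:
    # [score - ci, score + ci] ranges intersect when the score gap
    # is no larger than the sum of the two half-widths.
    return abs(a.score - b.score) <= a.ci + b.ci


# Three rows from the table, truncated to the first five columns.
SAMPLE = (
    "95\t101\tgpt-oss-20b\t954\t±5\n"
    "96\t121\tQwQ 32B\t955\t±7\n"
    "120\t124\tQwen3 235B A22B Thinking 2507\t1018\t±11"
)

rows = [parse_row(line) for line in SAMPLE.splitlines()]
gpt_oss, qwq, qwen_235b = rows

print(intervals_overlap(gpt_oss, qwq))        # 954±5 vs 955±7
print(intervals_overlap(gpt_oss, qwen_235b))  # 954±5 vs 1018±11
```

Under these assumptions, adjacent rows with nearly identical scores (such as 954 and 955) overlap, while rows at opposite ends of this page do not.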

View All (237 models)