Leaderboard | Coding

Last updated about 1 month ago

Rank	Name	VIBE Score	Confidence Interval	Votes	Downvote %	Abort %	Speed	Latency	Context	Cost (Input)	Cost (Output)
121	OpenAI o1-pro	1061	±20	680	7.5%	5.2%	33 tps	72.8s	200K	$150.00	$600.00
122	Gemini 2.5 Flash Lite Thinking	1061	±4	9.8K	6.2%	1.0%	118 tps	4.4s	1M	$0.03	$0.13
123	Seed 2.0 Lite (Medium)	1058	±20	525	3.7%	6.6%	33 tps	1.6s	256K	$0.25	$2.00
124	LongCat Flash Chat	1058	±12	2.7K	5.9%	0.8%	85 tps	0.9s	131K	$0.14	$0.68
125	GPT-5.1 Codex Mini (Medium)	1057	±15	1.9K	4.9%	4.6%	69 tps	4.1s	400K	$0.25	$2.00
126	GPT-5.1 Codex Mini (High)	1054	±15	2.2K	3.9%	5.9%	70 tps	4.6s	400K	$0.25	$2.00
127	Qwen3 32B Fast	1052	±6	11.4K	5.2%	11.6%	30 tps	3.1s	41K	$0.10	$0.25
128	ERNIE 4.5 300B A47B	1049	±4	13.5K	3.9%	4.7%	23 tps	2.3s	123K	$0.28	$1.10
129	Cogito v2.1 671B	1044	±19	1.2K	4.6%	0.8%	85 tps	0.5s	128K	$1.25	$1.25
130	Qwen3 32B	1044	±19	850	6.6%	3.9%	30 tps	3.1s	41K	$0.12	$0.42
131	GLM 4.5 AirX	1042	±15	1.1K	6.9%	3.3%	75 tps	1.2s	131K	$1.10	$4.50
132	Kimi K2 Thinking	1042	±10	3.3K	5.1%	4.2%	61 tps	5.9s	262K	$0.24	$1.03
133	OpenAI o4-mini	1042	±5	8.5K	6.4%	1.4%	97 tps	7.0s	128K	$1.10	$4.40
134	Gemini 3.1 Flash Lite Preview Thinking	1039	±16	1.4K	4.2%	1.7%	75 tps	4.7s	1M	$0.25	$1.50
135	QwQ 32B	1035	±4	11.6K	6.4%	5.4%	41 tps	2.1s	16K	$0.43	$0.56
136	Qwen3 Next 80B A3B Thinking	1035	±5	6.2K	7.4%	0.6%	175 tps	1.3s	256K	$0.21	$2.26
137	Gemini 2.5 Flash Lite Thinking Preview 0925	1035	±7	5.8K	6.8%	1.5%	152 tps	3.0s	1M	$0.10	$0.40
138	Gemini 3.1 Flash Lite Preview	1034	±21	980	4.4%	1.0%	8 tps	1.2s	1M	$0.25	$1.50
139	Qwen3 VL 30B A3B Instruct	1034	±15	1K	6.7%	1.8%	80 tps	2.6s	129K	$0.18	$0.67
140	DeepSeek V3	1032	±5	17.6K	3.7%	0.9%	69 tps	1.1s	64K	$0.59	$1.49
141	DeepSeek V3.2 Speciale	1030	±10	2.3K	6.1%	6.0%	43 tps	1.4s	131K	$0.84	$1.52
142	Gemini 2.0 Flash Lite	1029	±5	14.7K	9.5%	<0.1%	42 tps	0.5s	1M	$0.08	$0.30
143	Amazon Nova 2 Lite	1026	±10	3.6K	6.0%	1.0%	137 tps	0.6s	300K	$0.35	$2.95
144	Command A	1024	±4	22.4K	4.8%	2.2%	42 tps	0.8s	256K	$2.00	$7.33
145	DeepSeek V3.1 Nex N1	1021	±19	565	5.0%	3.4%	24 tps	7.2s	131K	$0.14	$0.50
146	OpenAI o3	1020	±7	5.9K	4.0%	0.9%	85 tps	6.8s	128K	$7.33	$29.33
147	Gemini 2.0 Flash	1018	±7	8.2K	3.8%	<0.1%	76 tps	0.5s	1M	$0.14	$0.56
148	Nemotron 3 Nano (Thinking)	1012	±13	2K	6.7%	2.0%	200 tps	0.5s	256K	$0	$0
149	Qwen3 VL 235B A22B Thinking	1009	±6	4.6K	8.3%	4.3%	47 tps	3.0s	127K	$0.47	$3.31
150	Qwen3 Coder Plus	1007	±22	610	4.7%	5.1%	56 tps	2.3s	128K	$1.80	$9.80
151	DeepSeek-R1 Turbo	1003	±9	2.5K	5.6%	2.6%	29 tps	1.8s	64K	$2.85	$4.75
152	Qwen 2.5 VL 32B Instruct	1001	±21	865	4.9%	6.3%	43 tps	3.2s	128K	$0.35	$0.62
153	Qwen3 235B A22B Thinking 2507	1000	±10	2.8K	4.4%	2.5%	53 tps	1.6s	131K	$0.59	$5.70
154	OpenAI o3-mini-high	999	±5	8.3K	4.1%	2.4%	231 tps	10.5s	200K	$1.10	$4.40
155	OpenAI o3-mini	999	±4	15K	5.5%	0.8%	143 tps	3.3s	200K	$1.10	$4.40
156	OpenAI o4-mini-high	995	±7	13.6K	6.2%	1.9%	117 tps	15.9s	200K	$1.10	$4.40
157	Seed 1.6 250615	995	±21	1.6K	6.0%	3.1%	46 tps	2.2s	256K	$0.25	$2.00
158	Qwen3 30B A3B	994	±8	6.3K	6.9%	5.1%	163 tps	1.0s	41K	$0.06	$0.21
159	GPT-5 Nano	989	±6	4.6K	8.0%	3.2%	113 tps	20.9s	400K	$0.05	$0.40
160	OpenAI o3-mini-low	988	±6	12.2K	6.4%	0.7%	139 tps	1.5s	200K	$1.10	$4.40
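
The rows above are tab-separated, so they can be loaded into records for simple comparisons. The following is a minimal sketch, not anything the leaderboard itself provides: the names `Row`, `parse_row`, and `intervals_overlap` are hypothetical, and it assumes the "Confidence Interval" column is a symmetric half-width around the VIBE Score. Treating overlapping intervals as "not clearly separated" is only a rough heuristic, not a significance test.

```python
from dataclasses import dataclass


@dataclass
class Row:
    rank: int
    name: str
    score: int  # VIBE Score column
    ci: int     # half-width taken from the "±N" column (assumed symmetric)


def parse_row(line: str) -> Row:
    """Parse the first four tab-separated fields of one table row."""
    fields = line.split("\t")
    return Row(
        rank=int(fields[0]),
        name=fields[1],
        score=int(fields[2]),
        ci=int(fields[3].lstrip("±")),
    )


def intervals_overlap(a: Row, b: Row) -> bool:
    """True when [score - ci, score + ci] of the two rows intersect."""
    return abs(a.score - b.score) <= a.ci + b.ci


# Three rows copied verbatim from the table above.
SAMPLE = [
    "121\tOpenAI o1-pro\t1061\t±20\t680\t7.5%\t5.2%\t33 tps\t72.8s\t200K\t$150.00\t$600.00",
    "122\tGemini 2.5 Flash Lite Thinking\t1061\t±4\t9.8K\t6.2%\t1.0%\t118 tps\t4.4s\t1M\t$0.03\t$0.13",
    "135\tQwQ 32B\t1035\t±4\t11.6K\t6.4%\t5.4%\t41 tps\t2.1s\t16K\t$0.43\t$0.56",
]

rows = [parse_row(line) for line in SAMPLE]

if __name__ == "__main__":
    first = rows[0]
    for other in rows[1:]:
        verdict = "overlap" if intervals_overlap(first, other) else "do not overlap"
        print(f"{first.name} vs {other.name}: intervals {verdict}")
```

Under these assumptions, ranks 121 and 122 share a score and so their intervals overlap, while rank 121 (1061 ±20) and rank 135 (1035 ±4) differ by 26 points against a combined half-width of 24, so their intervals do not.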

Page 4 of 8 (286 models in total)