Leaderboard | Coding

Last updated about 1 month ago

Rank	Overall	Name	VIBE Score	Confidence Interval	Votes	Downvote %	Abort %	Speed	Latency	Context	Cost (Input)	Cost (Output)
41	44	GPT-4.5 Preview	1223	±7	2.5K	1.8%	<0.1%	36 tps	3.0s	200K	$75.00	$150.00
42	44	Nova Experimental Chat 10-20	1221	±5	4.4K	8.1%	<0.1%	30 tps	0.5s	98K	$0	$0
43	36	GPT-5.2 (Extra High)	1221	±9	8K	3.5%	13.2%	17 tps	20.5s	400K	$1.75	$14.00
44	36	Qwen3 VL 235B A22B Instruct	1220	±7	5.6K	6.7%	3.1%	75 tps	1.9s	129K	$0.37	$1.81
45	36	GPT-5 Codex (Medium)	1214	±6	8.8K	3.9%	4.1%	122 tps	5.2s	400K	$1.25	$10.00
46	36	GPT-5.2 Codex (Medium)	1211	±12	2.4K	3.0%	5.7%	37 tps	6.3s	400K	$1.75	$14.00
47	53	Claude Sonnet 3.7 (Thinking)	1210	±3	13.6K	3.1%	<0.1%	41 tps	2.6s	200K	$3.00	$15.00
48	53	Mistral Medium 3.1	1206	±5	16.4K	5.1%	<0.1%	77 tps	0.7s	128K	$0.40	$2.00
49	43	Claude Sonnet 4	1205	±3	43.2K	3.7%	1.8%	49 tps	1.3s	200K	$3.00	$15.00
50	43	Gemini 3 Flash Preview	1205	±11	7.2K	3.7%	1.3%	138 tps	1.4s	1M	$0.50	$3.00
51	43	Gemini 2.5 Pro High	1204	±3	21.1K	5.7%	1.5%	48 tps	2.3s	1M	$1.25	$10.00
52	43	Qwen3 Max Instruct Preview	1203	±6	16.1K	4.6%	1.1%	31 tps	1.7s	256K	$1.43	$6.61
53	58	Claude Sonnet 3.7	1201	±4	12.1K	3.2%	<0.1%	39 tps	1.6s	200K	$3.00	$15.00
54	43	GPT-5.1 Codex Max	1200	±12	6.4K	3.9%	3.0%	118 tps	4.1s	400K	$1.25	$10.00
55	43	MiniMax M2.1 Lightning	1197	±23	875	3.3%	1.7%	52 tps	2.1s	205K	$0.30	$2.40
56	49	Qwen3 30B A3B Instruct 2507	1194	±5	12.7K	5.7%	1.2%	55 tps	1.3s	131K	$0.13	$0.72
57	62	OpenAI o1-mini	1192	±4	15K	4.6%	<0.1%	118 tps	N/A	128K	$1.13	$4.51
58	49	MiniMax M2.1	1192	±8	19.4K	3.6%	2.1%	66 tps	2.6s	205K	$0.30	$1.20
59	49	DeepSeek V3.2	1189	±8	5.1K	4.7%	1.4%	83 tps	5.1s	131K	$0.43	$1.09
60	62	Qwen Plus 0728	1189	±8	2.1K	7.5%	<0.1%	55 tps	0.9s	1M	$0.40	$1.20
61	49	MiniMax M2.5 FP8	1185	±17	610	3.2%	3.6%	33 tps	1.7s	205K	$0.45	$1.75
62	49	GPT-5	1185	±4	21.3K	5.3%	3.1%	78 tps	23.1s	400K	$1.25	$9.67
63	49	Grok 4 Fast Non-Reasoning	1185	±5	8.1K	7.1%	1.5%	93 tps	0.6s	2M	$0.27	$0.67
64	49	MiniMax M2	1183	±5	19.7K	4.2%	2.2%	39 tps	2.3s	205K	$0.21	$0.85
65	49	Nova Experimental Chat 12-10	1182	±9	2.9K	3.8%	2.4%	84 tps	12.9s	98K	$0	$0
66	49	GLM 4.6	1182	±7	17.2K	4.4%	5.4%	39 tps	1.5s	200K	$0.42	$1.66
67	49	GPT-5.3 Codex (Low)	1178	±28	510	1.0%	1.8%	61 tps	4.3s	400K	$1.75	$14.00
68	60	Grok 4.1 Fast Reasoning	1178	±7	39.5K	4.4%	1.5%	58 tps	7.3s	2M	$0.20	$0.50
69	60	Grok 4 Fast Reasoning	1177	±3	14.5K	5.0%	2.1%	102 tps	3.1s	2M	$0.30	$0.75
70	60	Gemini 2.5 Pro	1176	±3	37.9K	4.8%	2.3%	45 tps	2.6s	1M	$1.25	$10.00
71	75	Gemini 2.5 Flash Thinking Preview 0925	1173	±7	9.2K	6.8%	<0.1%	111 tps	4.7s	1M	$0.30	$2.50
72	60	Qwen3 235B A22B Instruct 2507	1172	±4	12.6K	6.4%	6.8%	13 tps	1.9s	262K	$0.13	$0.52
73	60	Claude Sonnet 3.5 v2	1171	±6	5.5K	3.4%	<0.1%	46 tps	1.4s	200K	$3.00	$15.00
74	60	GPT-5.1 Codex (Medium)	1171	±14	3K	3.2%	4.6%	71 tps	3.7s	400K	$1.25	$10.00
75	60	GPT-5.1 Instant	1171	±8	8.3K	4.1%	1.3%	50 tps	1.9s	400K	$1.25	$10.00
76	75	Gemini 2.5 Pro Low	1170	±4	9.6K	8.1%	<0.1%	89 tps	2.4s	1M	$1.25	$10.00
77	60	Grok 4.20 Beta Reasoning	1167	±22	1.2K	4.1%	1.1%	77 tps	4.5s	2M	$2.00	$5.50
78	69	Qwen3.5 35B A3B	1164	±25	865	3.9%	2.1%	116 tps	2.1s	256K	$0.63	$1.13
79	69	GPT-5 Codex (Low)	1163	±10	5K	4.1%	2.7%	112 tps	3.5s	400K	$1.25	$10.00
80	69	GLM 4.7	1161	±7	16.8K	3.7%	5.8%	40 tps	1.5s	200K	$0.77	$1.73

Page 2 of 8 (305 models in total)