Leaderboard | Coding

Last updated about 1 month ago

Rank	Overall	Name	VIBE Score	Confidence Interval	Votes	Downvote %	Abort %	Speed	Latency	Context	Cost (Input)	Cost (Output)
41	31	Grok 4.1 Fast Non-Reasoning	1226	±13	5.8K	6.3%	0.9%	101 tps	0.5s	2M	$0.20	$0.50
42	43	Qwen3 Max Instruct Preview	1226	±9	4.7K	7.7%	1.1%	31 tps	1.7s	256K	$1.43	$6.61
43	36	GPT-5 Codex (Medium)	1224	±10	6.2K	3.9%	4.1%	122 tps	5.2s	400K	$1.25	$10.00
44	49	Qwen3 30B A3B Instruct 2507	1220	±7	5.7K	7.3%	1.2%	55 tps	1.3s	131K	$0.13	$0.72
45	36	GPT-5.2 (Extra High)	1218	±13	5.4K	3.6%	13.2%	17 tps	20.5s	400K	$1.75	$14.00
46	60	GPT-5.1 Instant	1211	±8	5.8K	4.2%	1.3%	50 tps	1.9s	400K	$1.25	$10.00
47	49	GPT-5	1209	±6	12.4K	6.7%	3.1%	78 tps	23.1s	400K	$1.25	$9.67
48	49	MiniMax M2.1	1204	±10	6.9K	5.1%	2.1%	66 tps	2.6s	205K	$0.30	$1.20
49	60	GPT-5.1 Codex (Medium)	1202	±20	2.5K	2.9%	4.6%	71 tps	3.7s	400K	$1.25	$10.00
50	36	Qwen3 VL 235B A22B Instruct	1197	±9	2.9K	7.7%	3.1%	75 tps	1.9s	129K	$0.37	$1.81
51	27	GPT-5 (High)	1194	±8	8.9K	4.0%	4.5%	81 tps	35.9s	400K	$1.25	$10.00
52	60	Grok 4 Fast Reasoning	1194	±8	7.7K	6.5%	2.1%	102 tps	3.1s	2M	$0.30	$0.75
53	36	Kimi K2.5 Instant	1191	±14	1.4K	3.4%	2.9%	32 tps	3.0s	262K	$0.50	$3.00
54	69	GPT-5 Codex (Low)	1188	±10	3.3K	4.2%	2.7%	112 tps	3.5s	400K	$1.25	$10.00
55	43	MiniMax M2.1 Lightning	1187	±31	655	3.0%	1.7%	52 tps	2.1s	205K	$0.30	$2.40
56	60	Claude Sonnet 3.5 v2	1186	±8	4.9K	3.4%	<0.1%	46 tps	1.4s	200K	$3.00	$15.00
57	60	Qwen3 235B A22B Instruct 2507	1181	±7	6.8K	7.8%	6.8%	13 tps	1.9s	262K	$0.13	$0.52
58	49	GLM 4.6	1179	±10	4.4K	8.3%	5.4%	39 tps	1.5s	200K	$0.42	$1.66
59	90	Grok 3 Fast	1179	±22	1.1K	1.7%	1.7%	52 tps	2.4s	131K	$5.00	$25.00
60	49	Kimi K2 Thinking Turbo	1177	±13	5.3K	4.5%	2.0%	75 tps	1.4s	262K	$1.15	$8.00
61	49	DeepSeek V3.2	1173	±13	3K	5.9%	1.4%	83 tps	5.1s	131K	$0.43	$1.09
62	69	DeepSeek V3.1 Terminus Chat	1171	±9	2.6K	10.5%	3.4%	27 tps	1.5s	131K	$0.86	$1.80
63	31	MiniMax M2.5 Lightning	1171	±27	1.1K	2.3%	1.5%	51 tps	2.0s	205K	$0.60	$2.40
64	60	Gemini 2.5 Pro	1167	±5	23K	5.7%	2.3%	45 tps	2.6s	1M	$1.25	$10.00
65	49	MiniMax M2	1166	±8	5.4K	7.5%	2.2%	39 tps	2.3s	205K	$0.21	$0.85
66	43	Gemini 2.5 Pro High	1164	±6	10.4K	7.1%	1.5%	48 tps	2.3s	1M	$1.25	$10.00
67	60	Grok 4.20 Beta Reasoning	1157	±20	930	4.1%	1.1%	77 tps	4.5s	2M	$2.00	$5.50
68	60	DeepSeek V3.2 Thinking	1153	±9	6.7K	4.6%	9.0%	30 tps	2.6s	131K	$0.28	$0.42
69	69	gpt-oss-120b	1152	±7	8.5K	8.0%	0.7%	213 tps	0.5s	131K	$0.11	$0.50
70	60	Grok 4.1 Fast Reasoning	1151	±6	12.8K	5.4%	1.5%	58 tps	7.3s	2M	$0.20	$0.50
71	74	Qwen Plus (Aug'24)	1134	±7	8.3K	4.9%	1.4%	53 tps	1.3s	30K	$0.40	$1.20
72	69	GLM 4.7	1134	±9	5.8K	5.5%	5.8%	40 tps	1.5s	200K	$0.77	$1.73
73	77	GPT-5 Mini	1133	±6	6.5K	6.0%	2.6%	66 tps	14.2s	400K	$0.25	$2.00
74	85	GPT-5.2 Codex (Low)	1131	±27	1K	3.3%	4.5%	41 tps	5.0s	400K	$1.75	$14.00
75	77	Grok 4	1129	±5	21.3K	5.4%	3.9%	29 tps	11.1s	256K	$3.00	$15.00
76	77	GPT-4.1	1127	±5	16.2K	1.8%	3.7%	112 tps	1.3s	1M	$2.00	$8.00
77	85	Gemini 2.5 Flash Thinking	1123	±7	10.7K	3.7%	2.2%	88 tps	6.4s	1M	$0.30	$2.50
78	77	Qwen3 Max Thinking Preview	1122	±14	3K	7.8%	3.1%	40 tps	2.1s	256K	$1.20	$6.00
79	98	Grok 3	1120	±8	9.2K	4.8%	1.5%	53 tps	0.6s	1M	$3.67	$18.33
80	90	Qwen Max	1119	±6	9K	4.7%	1.5%	49 tps	1.5s	33K	$1.60	$6.40
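
One way to read the VIBE Score and Confidence Interval columns together is to treat each row as a symmetric range, score minus interval to score plus interval, and check whether two models' ranges intersect. The sketch below does that for three rows copied from the table. The page does not say how the interval is computed, so the symmetric-range reading is an assumption, and `Entry`, `parse_row`, and `intervals_overlap` are hypothetical helper names, not part of any Yupp API.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    name: str
    score: int       # "VIBE Score" column
    half_width: int  # "Confidence Interval" column, the number after "±"


def parse_row(line: str) -> Entry:
    """Parse one tab-separated leaderboard row.

    Column order is taken from the table header:
    Rank, Overall, Name, VIBE Score, Confidence Interval, ...
    """
    cells = line.split("\t")
    return Entry(
        name=cells[2],
        score=int(cells[3]),
        half_width=int(cells[4].lstrip("±")),
    )


def intervals_overlap(a: Entry, b: Entry) -> bool:
    """True when the ranges [score - half_width, score + half_width] intersect.

    Assumption: the interval is symmetric around the score.
    """
    return abs(a.score - b.score) <= a.half_width + b.half_width


# Three rows copied verbatim from the table above.
rows = [
    "41\t31\tGrok 4.1 Fast Non-Reasoning\t1226\t±13\t5.8K\t6.3%\t0.9%\t101 tps\t0.5s\t2M\t$0.20\t$0.50",
    "43\t36\tGPT-5 Codex (Medium)\t1224\t±10\t6.2K\t3.9%\t4.1%\t122 tps\t5.2s\t400K\t$1.25\t$10.00",
    "80\t90\tQwen Max\t1119\t±6\t9K\t4.7%\t1.5%\t49 tps\t1.5s\t33K\t$1.60\t$6.40",
]
grok, codex, qwen_max = (parse_row(r) for r in rows)

print(intervals_overlap(grok, codex))     # ranges 1213..1239 and 1214..1234
print(intervals_overlap(grok, qwen_max))  # ranges 1213..1239 and 1113..1125
```

Under this reading, a two-point gap such as the one between ranks 41 and 43 sits well inside both ranges, while the gap between ranks 41 and 80 does not.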

Page 2 of 7 (273 models in total)