Leaderboard | UI Coding

Last updated about 1 month ago

Rank	Overall	Name	VIBE Score	Confidence Interval	Votes	Downvote %	Abort %	Speed	Latency	Context	Cost (Input)	Cost (Output)
1	84	Qwen3 Max Thinking Preview	870	±24	1.6K	2.2%	3.1%	40 tps	2.1s	256K	$1.20	$6.00
2	51	gpt-oss-120b	873	±17	5K	1.9%	0.7%	213 tps	0.5s	131K	$0.11	$0.50
3	103	DeepSeek V3.2 Exp Thinking	877	±19	3.7K	2.1%	7.2%	26 tps	3.0s	131K	$0.28	$0.42
4	19	Mistral Medium 3.1	909	±20	5.1K	1.8%	<0.1%	77 tps	0.7s	128K	$0.40	$2.00
5	146	Kimi K2 0905	936	±26	4.8K	2.2%	4.0%	30 tps	1.4s	262K	$0.63	$2.39
6	51	Grok 4 Fast Reasoning	1001	±21	5.2K	2.2%	2.1%	102 tps	3.1s	2M	$0.30	$0.75
7	45	Qwen3 Max Instruct Preview	1007	±20	5.5K	1.4%	1.1%	31 tps	1.7s	256K	$1.43	$6.61
8	27	GPT-5 (High)	1035	±19	3.4K	1.9%	4.5%	81 tps	35.9s	400K	$1.25	$10.00
9	34	GPT-5 Codex (High)	1069	±18	4.9K	1.9%	3.2%	122 tps	7.1s	400K	$1.25	$10.00
10	33	Gemini 2.5 Pro High	1090	±21	4.6K	1.4%	1.5%	48 tps	2.3s	1M	$1.25	$10.00
11	59	DeepSeek V3.2 Thinking	1092	±14	16.2K	3.8%	9.0%	30 tps	2.6s	131K	$0.28	$0.42
12	47	Grok 4.1 Fast Reasoning	1098	±13	27.4K	4.1%	1.5%	58 tps	7.3s	2M	$0.20	$0.50
13	59	GPT-5.1 Codex (High)	1103	±13	24.3K	3.4%	3.2%	96 tps	3.9s	400K	$1.25	$10.00
14	47	Kimi K2 Thinking Turbo	1113	±16	14.3K	2.6%	2.0%	75 tps	1.4s	262K	$1.15	$8.00
15	64	MiniMax M2.1	1124	±14	11.4K	2.7%	2.1%	66 tps	2.6s	205K	$0.30	$1.20
16	66	MiniMax M2	1133	±16	12.4K	2.4%	2.2%	39 tps	2.3s	205K	$0.21	$0.85
17	8	GPT-5.1 (High)	1133	±17	6.4K	2.2%	3.2%	76 tps	6.9s	400K	$1.25	$10.00
18	73	GLM 4.7	1133	±15	9.7K	2.6%	5.8%	40 tps	1.5s	200K	$0.77	$1.73
19	69	GLM 4.6	1147	±13	11.1K	1.9%	5.4%	39 tps	1.5s	200K	$0.42	$1.66
20	17	GPT-5.2 (High)	1159	±13	12.3K	2.8%	6.7%	18 tps	16.3s	400K	$1.75	$14.00
21	5	Claude Sonnet 4.6 (Thinking)	1170	±23	5.6K	6.1%	4.7%	57 tps	1.1s	200K	$3.00	$15.00
22	14	Gemini 3 Flash Preview Thinking	1178	±13	20.9K	3.4%	1.6%	3 tps	6.2s	1M	$0.50	$3.00
23	22	GLM 5	1200	±17	8.3K	3.5%	3.4%	36 tps	2.7s	200K	$0.72	$2.55
24	10	Claude Sonnet 4.5 (Thinking)	1239	±10	13.8K	2.1%	1.9%	44 tps	1.1s	200K	$3.00	$15.00
25	10	Gemini 3 Pro	1269	±11	23.4K	2.3%	2.1%	50 tps	3.6s	1M	$2.00	$12.00
26	4	GPT-5.4 (High)	1284	±17	2.5K	5.3%	4.6%	68 tps	7.9s	1M	$2.50	$15.00
27	7	Claude Opus 4.5 (Thinking)	1293	±13	16.9K	1.9%	1.8%	49 tps	1.4s	200K	$5.00	$25.00
28	6	Gemini 3.1 Pro	1312	±14	7.6K	3.9%	3.5%	35 tps	4.1s	1M	$2.00	$12.00
29	1	Claude Opus 4.6 (Thinking)	1468	±15	4.4K	3.1%	2.5%	56 tps	1.6s	200K	$5.00	$25.00