
Last updated about 1 month ago

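| Rank | Name | VIBE Score | Confidence Interval | Votes | Downvote % | Abort % | Speed | Latency | Context | Cost (Input) | Cost (Output) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Claude Opus 4.6 (Thinking) | 1453 | ±3 | 34K | 1.3% | 2.5% | 56 tps | 1.6s | 200K | $5.00 | $25.00 |
| 2 | Claude Opus 4.6 | 1426 | ±3 | 45K | 1.1% | 2.1% | 48 tps | 1.7s | 200K | $5.00 | $25.00 |
| 3 | GPT-5.4 | 1423 | ±6 | 11.1K | 1.4% | 2.6% | 55 tps | 0.8s | 1M | $2.50 | $15.00 |
| 4 | Claude Sonnet 4.6 | 1376 | ±3 | 33.5K | 1.2% | 1.6% | 47 tps | 1.2s | 200K | $3.00 | $15.00 |
| 5 | Claude Sonnet 4.6 (Thinking) | 1370 | ±2 | 33.4K | 2.4% | 4.7% | 57 tps | 1.1s | 200K | $3.00 | $15.00 |
| 6 | Gemini 3.1 Pro | 1320 | ±3 | 55.8K | 1.8% | 3.5% | 35 tps | 4.1s | 1M | $2.00 | $12.00 |
| 7 | Claude Opus 4.5 (Thinking) | 1299 | ±2 | 116.9K | 1.8% | 1.8% | 49 tps | 1.4s | 200K | $5.00 | $25.00 |
| 8 | GPT-5.1 | 1291 | ±2 | 48.3K | 2.3% | 2.3% | 71 tps | 1.4s | 400K | $1.42 | $11.33 |
| 9 | GPT-5.1 (High) | 1290 | ±2 | 65.6K | 2.3% | 3.2% | 76 tps | 6.9s | 400K | $1.25 | $10.00 |
| 10 | GPT-5.2 Instant | 1277 | ±2 | 67.2K | 1.7% | 1.7% | 52 tps | 2.0s | 400K | $1.75 | $14.00 |
| 11 | Gemini 3 Pro | 1275 | ±2 | 163.7K | 1.9% | 2.1% | 50 tps | 3.6s | 1M | $2.00 | $12.00 |
| 12 | Claude Sonnet 4.5 (Thinking) | 1275 | ±2 | 115.4K | 3.0% | 1.9% | 44 tps | 1.1s | 200K | $3.00 | $15.00 |
| 13 | GPT-5.3 Instant | 1261 | ±3 | 20.4K | 1.7% | 0.9% | 63 tps | 0.8s | 400K | $1.75 | $14.00 |
| 14 | Gemini 3 Flash Preview Thinking | 1249 | ±2 | 75.5K | 2.3% | 1.6% | 3 tps | 6.2s | 1M | $0.50 | $3.00 |
| 15 | Gemini 3 Pro (Low) | 1247 | ±2 | 53.4K | 2.2% | 2.4% | 51 tps | 3.5s | 1M | $2.00 | $12.00 |
| 16 | GPT-5.2 | 1240 | ±3 | 42.3K | 1.8% | 4.1% | 18 tps | 2.7s | 400K | $1.75 | $14.00 |
| 17 | Grok 4.20 Beta Reasoning | 1229 | ±6 | 6.3K | 2.2% | 1.1% | 77 tps | 4.5s | 2M | $2.00 | $5.50 |
| 18 | GPT-5.4 mini | 1229 | ±9 | 1.8K | 2.2% | 0.8% | 148 tps | 0.5s | 400K | $0.75 | $4.50 |
| 19 | Claude Opus 4.5 | 1227 | ±2 | 41.8K | 2.4% | 1.5% | 45 tps | 1.5s | 200K | $5.00 | $25.00 |
| 20 | Gemini 3 Flash Preview | 1227 | ±2 | 32.8K | 1.9% | 1.3% | 138 tps | 1.4s | 1M | $0.50 | $3.00 |
| 21 | GPT-5.2 (High) | 1224 | ±2 | 97.5K | 1.9% | 6.7% | 18 tps | 16.3s | 400K | $1.75 | $14.00 |
| 22 | GPT-5 Chat | 1222 | ±2 | 129.9K | 3.0% | 1.3% | 95 tps | 0.9s | 400K | $1.25 | $10.00 |
| 23 | GLM 5 | 1220 | ±3 | 25.3K | 2.3% | 3.4% | 36 tps | 2.7s | 200K | $0.72 | $2.55 |
| 24 | MiniMax M2.7-highspeed | 1219 | ±8 | 1.8K | 2.4% | 2.3% | 50 tps | 2.1s | 205K | $0.60 | $2.40 |
| 25 | Grok 4.20 Beta Non-reasoning | 1218 | ±8 | 2.3K | 3.7% | 1.1% | 151 tps | 0.6s | 2M | $2.00 | $6.00 |
| 26 | Claude Haiku 4.5 (Extended Thinking) | 1217 | ±3 | 40.7K | 3.0% | 1.4% | 115 tps | 0.7s | 200K | $1.00 | $5.00 |
| 27 | Grok 4.1 Fast Non-Reasoning | 1214 | ±2 | 41.7K | 3.2% | 0.9% | 101 tps | 0.5s | 2M | $0.20 | $0.50 |
| 28 | GPT-5 (High) | 1214 | ±2 | 42.2K | 3.2% | 4.5% | 81 tps | 35.9s | 400K | $1.25 | $10.00 |
| 29 | Nova Experimental Chat 12-10 | 1209 | ±4 | 16.6K | 1.8% | 2.4% | 84 tps | 12.9s | 98K | $0 | $0 |
| 30 | MiniMax M2.7 | 1206 | ±7 | 1.7K | 2.6% | 3.0% | 34 tps | 2.5s | 205K | $0.30 | $1.20 |
| 31 | Qwen3 VL 235B A22B Instruct | 1205 | ±2 | 22.3K | 4.5% | 3.1% | 75 tps | 1.9s | 129K | $0.37 | $1.81 |
| 32 | Gemini 2.5 Pro High | 1204 | ±2 | 71.1K | 3.9% | 1.5% | 48 tps | 2.3s | 1M | $1.25 | $10.00 |
| 33 | Qwen3 30B A3B Instruct 2507 | 1198 | ±2 | 48.5K | 3.7% | 1.2% | 55 tps | 1.3s | 131K | $0.13 | $0.72 |
| 34 | Qwen3 Next 80B A3B Instruct | 1197 | ±2 | 38.2K | 3.5% | 0.6% | 84 tps | 1.1s | 256K | $0.20 | $1.42 |
| 35 | Grok 4.20 Multi Agent Beta | 1196 | ±5 | 5K | 2.0% | 1.2% | 56 tps | 8.8s | 2M | $2.00 | $6.00 |
| 36 | Kimi K2.5 | 1194 | ±3 | 49.3K | 2.1% | 6.5% | 33 tps | 1.7s | 262K | $0.34 | $2.57 |
| 37 | Kimi K2.5 Instant | 1188 | ±3 | 11.1K | 2.3% | 2.9% | 32 tps | 3.0s | 262K | $0.50 | $3.00 |
| 38 | Qwen3 Omni 30B A3B Thinking | 1188 | ±3 | 11.5K | 2.6% | 3.7% | 67 tps | 1.2s | 66K | $0.97 | $1.79 |
| 39 | Claude Sonnet 4.5 | 1187 | ±2 | 65.6K | 3.7% | 1.4% | 41 tps | 1.3s | 200K | $1.80 | $9.00 |
| 40 | DeepSeek V3.2 | 1183 | ±2 | 31.2K | 1.9% | 1.4% | 83 tps | 5.1s | 131K | $0.43 | $1.09 |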
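
Adjacent ranks in the leaderboard often sit within each other's confidence intervals, so a difference in rank does not always mean a clear difference in score. The sketch below is one rough way to check this, using the score and interval values from ranks 17, 18, 30, and 35 above. It assumes the "±" value is the half-width of a symmetric interval around the score, which the page does not state. The `Entry` class and the `intervals_overlap` helper are hypothetical names introduced for illustration, and interval overlap is only a heuristic, not a formal significance test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One leaderboard row, reduced to the fields needed here."""
    name: str
    score: int
    half_width: int  # the "±" value; assumed to be a symmetric half-width

    @property
    def interval(self) -> tuple[int, int]:
        # Lower and upper bound of the assumed symmetric interval.
        return (self.score - self.half_width, self.score + self.half_width)


def intervals_overlap(a: Entry, b: Entry) -> bool:
    """Return True if the two assumed intervals share at least one point."""
    a_lo, a_hi = a.interval
    b_lo, b_hi = b.interval
    return a_lo <= b_hi and b_lo <= a_hi


# Values copied from ranks 17, 18, 30, and 35 of the table above.
entries = [
    Entry("Grok 4.20 Beta Reasoning", 1229, 6),
    Entry("GPT-5.4 mini", 1229, 9),
    Entry("MiniMax M2.7", 1206, 7),
    Entry("Grok 4.20 Multi Agent Beta", 1196, 5),
]

# Compare every pair once and report whether their intervals overlap.
for i, first in enumerate(entries):
    for second in entries[i + 1:]:
        verdict = "overlap" if intervals_overlap(first, second) else "are separated"
        print(f"{first.name} {first.interval} and {second.name} {second.interval} {verdict}")
```

Under these assumptions, the two models tied at 1229 overlap, while either of them is separated from MiniMax M2.7 at 1206.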