Last updated about 1 month ago

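| Rank | Overall | Name | VIBE Score | Confidence Interval | Votes | Downvote % | Abort % | Speed | Latency | Context | Cost (Input) | Cost (Output) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | Claude Opus 4.6 (Thinking) | 1398 | ±4 | 16.9K | 1.4% | 2.5% | 56 tps | 1.6s | 200K | $5.00 | $25.00 |
| 2 | 2 | Claude Opus 4.6 | 1381 | ±3 | 21.8K | 1.1% | 2.1% | 48 tps | 1.7s | 200K | $5.00 | $25.00 |
| 3 | 2 | GPT-5.4 | 1343 | ±5 | 5.8K | 1.3% | 2.6% | 55 tps | 0.8s | 1M | $2.50 | $15.00 |
| 4 | 4 | Claude Sonnet 4.6 | 1314 | ±4 | 16.6K | 1.2% | 1.6% | 47 tps | 1.2s | 200K | $3.00 | $15.00 |
| 5 | 5 | Claude Sonnet 4.6 (Thinking) | 1305 | ±3 | 17.7K | 2.7% | 4.7% | 57 tps | 1.1s | 200K | $3.00 | $15.00 |
| 6 | 8 | GPT-5.1 | 1264 | ±2 | 27.6K | 2.3% | 2.3% | 71 tps | 1.4s | 400K | $1.42 | $11.33 |
| 7 | 7 | Claude Opus 4.5 (Thinking) | 1260 | ±2 | 61K | 1.8% | 1.8% | 49 tps | 1.4s | 200K | $5.00 | $25.00 |
| 8 | 8 | GPT-5.1 (High) | 1250 | ±3 | 37.8K | 2.4% | 3.2% | 76 tps | 6.9s | 400K | $1.25 | $10.00 |
| 9 | 6 | Gemini 3.1 Pro | 1245 | ±3 | 26K | 2.0% | 3.5% | 35 tps | 4.1s | 1M | $2.00 | $12.00 |
| 10 | 10 | GPT-5.2 Instant | 1232 | ±2 | 39.3K | 1.6% | 1.7% | 52 tps | 2.0s | 400K | $1.75 | $14.00 |
| 11 | 10 | Claude Sonnet 4.5 (Thinking) | 1228 | ±1 | 66.2K | 3.4% | 1.9% | 44 tps | 1.1s | 200K | $3.00 | $15.00 |
| 12 | 10 | Gemini 3 Pro | 1207 | ±1 | 78K | 2.2% | 2.1% | 50 tps | 3.6s | 1M | $2.00 | $12.00 |
| 13 | 17 | GPT-5.4 mini | 1203 | ±10 | 885 | 2.7% | 0.8% | 148 tps | 0.5s | 400K | $0.75 | $4.50 |
| 14 | 13 | GPT-5.3 Instant | 1199 | ±6 | 9.3K | 1.7% | 0.9% | 63 tps | 0.8s | 400K | $1.75 | $14.00 |
| 15 | 22 | GPT-5 Chat | 1196 | ±1 | 75.1K | 3.4% | 1.3% | 95 tps | 0.9s | 400K | $1.25 | $10.00 |
| 16 | 22 | Grok 4.20 Beta Non-reasoning | 1192 | ±11 | 1.3K | 3.1% | 1.1% | 151 tps | 0.6s | 2M | $2.00 | $6.00 |
| 17 | 14 | Gemini 3 Flash Preview Thinking | 1190 | ±2 | 47K | 2.3% | 1.6% | 3 tps | 6.2s | 1M | $0.50 | $3.00 |
| 18 | 29 | Qwen3 VL 235B A22B Instruct | 1188 | ±3 | 13.5K | 5.2% | 3.1% | 75 tps | 1.9s | 129K | $0.37 | $1.81 |
| 19 | 29 | Nova Experimental Chat 12-10 | 1188 | ±3 | 9.8K | 1.9% | 2.4% | 84 tps | 12.9s | 98K | $0 | $0 |
| 20 | 37 | Qwen3 Omni 30B A3B Thinking | 1186 | ±3 | 7.5K | 2.1% | 3.7% | 67 tps | 1.2s | 66K | $0.97 | $1.79 |
| 21 | 33 | Grok 4.20 Multi Agent Beta | 1183 | ±9 | 2.6K | 2.0% | 1.2% | 56 tps | 8.8s | 2M | $2.00 | $6.00 |
| 22 | 22 | GLM 5 | 1182 | ±4 | 17.3K | 2.1% | 3.4% | 36 tps | 2.7s | 200K | $0.72 | $2.55 |
| 23 | 14 | Gemini 3 Pro (Low) | 1180 | ±3 | 28.9K | 2.2% | 2.4% | 51 tps | 3.5s | 1M | $2.00 | $12.00 |
| 24 | 17 | GPT-5.2 (High) | 1180 | ±2 | 54.6K | 1.9% | 6.7% | 18 tps | 16.3s | 400K | $1.75 | $14.00 |
| 25 | 26 | Grok 4.1 Fast Non-Reasoning | 1177 | ±2 | 25.7K | 3.0% | 0.9% | 101 tps | 0.5s | 2M | $0.20 | $0.50 |
| 26 | 106 | GPT-5.4 nano | 1177 | ±10 | 650 | 2.3% | 0.7% | 149 tps | 0.5s | 400K | $0.20 | $1.25 |
| 27 | 26 | GPT-5 (High) | 1177 | ±2 | 22.1K | 3.1% | 4.5% | 81 tps | 35.9s | 400K | $1.25 | $10.00 |
| 28 | 16 | GPT-5.2 | 1176 | ±2 | 22.6K | 1.8% | 4.1% | 18 tps | 2.7s | 400K | $1.75 | $14.00 |
| 29 | 17 | Grok 4.20 Beta Reasoning | 1175 | ±7 | 3.3K | 1.8% | 1.1% | 77 tps | 4.5s | 2M | $2.00 | $5.50 |
| 30 | 17 | Claude Opus 4.5 | 1173 | ±2 | 22.5K | 2.2% | 1.5% | 45 tps | 1.5s | 200K | $5.00 | $25.00 |
| 31 | 26 | Claude Haiku 4.5 (Extended Thinking) | 1173 | ±2 | 24.3K | 3.1% | 1.4% | 115 tps | 0.7s | 200K | $1.00 | $5.00 |
| 32 | 62 | Qwen3 Omni 30B A3B Instruct | 1168 | ±5 | 3K | 2.3% | 3.9% | 65 tps | 1.2s | 66K | $0.35 | $0.97 |
| 33 | 29 | MiniMax M2.7 | 1167 | ±8 | 1.1K | 1.8% | 3.0% | 34 tps | 2.5s | 205K | $0.30 | $1.20 |
| 34 | 56 | MiniMax M2.1 Lightning | 1165 | ±5 | 4.9K | 1.2% | 1.7% | 52 tps | 2.1s | 205K | $0.30 | $2.40 |
| 35 | 32 | Gemini 2.5 Pro High | 1159 | ±2 | 42.7K | 4.5% | 1.5% | 48 tps | 2.3s | 1M | $1.25 | $10.00 |
| 36 | 17 | Gemini 3 Flash Preview | 1159 | ±3 | 17.8K | 2.1% | 1.3% | 138 tps | 1.4s | 1M | $0.50 | $3.00 |
| 37 | 33 | Qwen3 30B A3B Instruct 2507 | 1158 | ±2 | 31.6K | 4.1% | 1.2% | 55 tps | 1.3s | 131K | $0.13 | $0.72 |
| 38 | 44 | DeepSeek V3.1 Terminus Chat | 1150 | ±3 | 17.8K | 4.2% | 3.4% | 27 tps | 1.5s | 131K | $0.86 | $1.80 |
| 39 | 42 | Qwen3 Max Instruct Preview | 1148 | ±2 | 36.6K | 3.5% | 1.1% | 31 tps | 1.7s | 256K | $1.43 | $6.61 |
| 40 | 40 | Qwen3 235B A22B Instruct 2507 | 1147 | ±2 | 32.2K | 4.7% | 6.8% | 13 tps | 1.9s | 262K | $0.13 | $0.52 |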
Showing the top 40 of 208 models.
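
One rough way to read the VIBE Score and Confidence Interval columns together is to check whether two models' score ranges overlap. The sketch below is illustrative only: the page does not say how the interval is computed or at what confidence level, so treating it as a symmetric score ± value range is an assumption, and `Entry`, `interval`, and `intervals_overlap` are hypothetical helper names. The numbers are copied from the leaderboard rows for the four models shown.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    name: str
    score: int       # VIBE Score column
    half_width: int  # the ± value from the Confidence Interval column


def interval(e: Entry) -> tuple[int, int]:
    """Return the (low, high) bounds implied by score ± half_width (assumed symmetric)."""
    return e.score - e.half_width, e.score + e.half_width


def intervals_overlap(a: Entry, b: Entry) -> bool:
    """True when the two score ranges share at least one point."""
    a_lo, a_hi = interval(a)
    b_lo, b_hi = interval(b)
    return a_lo <= b_hi and b_lo <= a_hi


# Values taken from the leaderboard rows above.
opus_thinking = Entry("Claude Opus 4.6 (Thinking)", 1398, 4)
opus = Entry("Claude Opus 4.6", 1381, 3)
gpt54_mini = Entry("GPT-5.4 mini", 1203, 10)
gpt53_instant = Entry("GPT-5.3 Instant", 1199, 6)

print(intervals_overlap(opus_thinking, opus))         # False: 1394..1402 vs 1378..1384
print(intervals_overlap(gpt54_mini, gpt53_instant))   # True: 1193..1213 vs 1193..1205
```

Overlap here is only a quick heuristic, not a significance test. It does suggest that the gap between ranks 1 and 2 is larger than the stated uncertainty, while ranks 13 and 14 are not clearly separated, partly because GPT-5.4 mini has only 885 votes.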