


Last updated about 1 month ago

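| Rank | Overall | Name | VIBE Score | Confidence Interval | Votes | Downvote % | Abort % | Speed | Latency | Context | Cost (Input) | Cost (Output) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 41 | 17 | Grok 4.20 Beta Reasoning | 1175 | ±7 | 3.3K | 1.8% | 1.1% | 77 tps | 4.5s | 2M | $2.00 | $5.50 |
| 42 | 17 | Claude Opus 4.5 | 1173 | ±2 | 22.5K | 2.2% | 1.5% | 45 tps | 1.5s | 200K | $5.00 | $25.00 |
| 43 | 26 | Claude Haiku 4.5 (Extended Thinking) | 1173 | ±2 | 24.3K | 3.1% | 1.4% | 115 tps | 0.7s | 200K | $1.00 | $5.00 |
| 44 | 37 | Kimi K2.5 Instant | 1171 | ±4 | 6.2K | 1.8% | 2.9% | 32 tps | 3.0s | 262K | $0.50 | $3.00 |
| 45 | 62 | Qwen3 Omni 30B A3B Instruct | 1168 | ±5 | 3K | 2.3% | 3.9% | 65 tps | 1.2s | 66K | $0.35 | $0.97 |
| 46 | 29 | MiniMax M2.7 | 1167 | ±8 | 1.1K | 1.8% | 3.0% | 34 tps | 2.5s | 205K | $0.30 | $1.20 |
| 47 | 56 | MiniMax M2.1 Lightning | 1165 | ±5 | 4.9K | 1.2% | 1.7% | 52 tps | 2.1s | 205K | $0.30 | $2.40 |
| 48 | 48 | Step 3.5 Flash | 1164 | ±5 | 4K | 1.5% | 2.2% | 109 tps | 0.6s | 256K | $0.05 | $0.15 |
| 49 | 33 | Qwen3 Next 80B A3B Instruct | 1161 | ±2 | 24.9K | 3.8% | 0.6% | 84 tps | 1.1s | 256K | $0.20 | $1.42 |
| 50 | 32 | Gemini 2.5 Pro High | 1159 | ±2 | 42.7K | 4.5% | 1.5% | 48 tps | 2.3s | 1M | $1.25 | $10.00 |
| 51 | 17 | Gemini 3 Flash Preview | 1159 | ±3 | 17.8K | 2.1% | 1.3% | 138 tps | 1.4s | 1M | $0.50 | $3.00 |
| 52 | 33 | Qwen3 30B A3B Instruct 2507 | 1158 | ±2 | 31.6K | 4.1% | 1.2% | 55 tps | 1.3s | 131K | $0.13 | $0.72 |
| 53 | 52 | Qwen3.5 122B A17B | 1151 | ±4 | 4.7K | 1.6% | 1.5% | 82 tps | 1.4s | 256K | $0.40 | $3.20 |
| 54 | 33 | Kimi K2.5 | 1151 | ±3 | 32.5K | 1.8% | 6.5% | 33 tps | 1.7s | 262K | $0.34 | $2.57 |
| 55 | 44 | DeepSeek V3.1 Terminus Chat | 1150 | ±3 | 17.8K | 4.2% | 3.4% | 27 tps | 1.5s | 131K | $0.86 | $1.80 |
| 56 | 42 | Qwen3 Max Instruct Preview | 1148 | ±2 | 36.6K | 3.5% | 1.1% | 31 tps | 1.7s | 256K | $1.43 | $6.61 |
| 57 | 40 | Qwen3 235B A22B Instruct 2507 | 1147 | ±2 | 32.2K | 4.7% | 6.8% | 13 tps | 1.9s | 262K | $0.13 | $0.52 |
| 58 | 40 | DeepSeek V3.2 | 1144 | ±3 | 20.7K | 1.9% | 1.4% | 83 tps | 5.1s | 131K | $0.43 | $1.09 |
| 59 | 48 | gpt-oss-120b | 1144 | ±2 | 40.7K | 3.7% | 0.7% | 213 tps | 0.5s | 131K | $0.11 | $0.50 |
| 60 | 71 | MiniMax M2.5 FP8 | 1141 | ±4 | 2.9K | 1.7% | 3.6% | 33 tps | 1.7s | 205K | $0.45 | $1.75 |
| 61 | 56 | DeepSeek V3.1 Turbo | 1140 | ±2 | 14.5K | 2.3% | 0.9% | 173 tps | 1.3s | 164K | $2.00 | $3.75 |
| 62 | 37 | Claude Sonnet 4.5 | 1139 | ±2 | 37.7K | 4.3% | 1.4% | 41 tps | 1.3s | 200K | $1.80 | $9.00 |
| 63 | 48 | Claude Sonnet 4 (Thinking) | 1138 | ±2 | 30.7K | 2.6% | 1.5% | 52 tps | 1.5s | 200K | $3.00 | $13.67 |
| 64 | 44 | Kimi K2 Thinking Turbo | 1137 | ±3 | 29.8K | 2.5% | 2.0% | 75 tps | 1.4s | 262K | $1.15 | $8.00 |
| 65 | 51 | GPT-5.2 (Medium) | 1136 | ±8 | 815 | 2.4% | <0.1% | 39 tps | 2.5s | 400K | $1.75 | $14.00 |
| 66 | 44 | Gemini 2.5 Pro | 1136 | ±1 | 68.8K | 3.9% | 2.3% | 45 tps | 2.6s | 1M | $1.25 | $10.00 |
| 67 | 56 | Claude Opus 4.1 (Thinking) | 1134 | ±2 | 12K | 3.6% | <0.1% | 20 tps | 3.9s | 200K | $15.00 | $75.00 |
| 68 | 43 | Gemini 2.5 Flash Thinking Preview 0925 | 1133 | ±2 | 20.1K | 5.0% | <0.1% | 111 tps | 4.7s | 1M | $0.30 | $2.50 |
| 69 | 42 | GPT-5.2 (Extra High) | 1133 | ±3 | 20.9K | 1.9% | 13.2% | 17 tps | 20.5s | 400K | $1.75 | $14.00 |
| 70 | 48 | Polaris Alpha | 1131 | ±6 | 1.7K | 3.6% | <0.1% | 48 tps | 1.1s | 256K | $0 | $0 |
| 71 | 60 | MiniMax M2.1 | 1129 | ±2 | 41.8K | 2.0% | 2.1% | 66 tps | 2.6s | 205K | $0.30 | $1.20 |
| 72 | 48 | Grok 4 Fast Reasoning | 1128 | ±2 | 25.9K | 3.9% | 2.1% | 102 tps | 3.1s | 2M | $0.30 | $0.75 |
| 73 | 52 | Claude Haiku 4.5 | 1128 | ±2 | 31.4K | 3.7% | 1.1% | 100 tps | 0.9s | 200K | $1.00 | $5.00 |
| 74 | 44 | Grok 4.1 Fast Reasoning | 1128 | ±2 | 57K | 3.1% | 1.5% | 58 tps | 7.3s | 2M | $0.20 | $0.50 |
| 75 | 52 | Grok 4 Fast Non-Reasoning | 1128 | ±3 | 21.3K | 4.7% | 1.5% | 93 tps | 0.6s | 2M | $0.27 | $0.67 |
| 76 | 56 | DeepSeek V3.2 Thinking | 1127 | ±3 | 37.6K | 2.6% | 9.0% | 30 tps | 2.6s | 131K | $0.28 | $0.42 |
| 77 | 86 | Nemotron 3 Nano (Thinking) | 1127 | ±3 | 7.5K | 2.4% | 2.0% | 200 tps | 0.5s | 256K | $0 | $0 |
| 78 | 62 | MiniMax M2 | 1125 | ±2 | 33.6K | 3.5% | 2.2% | 39 tps | 2.3s | 205K | $0.21 | $0.85 |
| 79 | 65 | DeepSeek V3.2 Exp Chat | 1124 | ±3 | 14.3K | 4.0% | 2.6% | 29 tps | 1.5s | 131K | $0.27 | $0.39 |
| 80 | 71 | DeepSeek V3.1 | 1124 | ±3 | 6.8K | 2.0% | 0.8% | 197 tps | 0.4s | 164K | $0.55 | $1.60 |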
Showing ranks 41 to 80 of 432 models.