Last updated about 1 month ago

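| Rank | Overall | Name | VIBE Score | Confidence Interval | Votes | Downvote % | Abort % | Speed | Latency | Context | Cost (Input) | Cost (Output) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 41 | 36 | Qwen3.5 27B | 1211 | ±16 | 910 | 4.7% | 3.7% | 55 tps | 2.6s | 256K | $0.30 | $2.40 |
| 42 | 36 | Kimi K2.5 Instant | 1210 | ±8 | 1.8K | 3.2% | 2.9% | 32 tps | 3.0s | 262K | $0.50 | $3.00 |
| 43 | 43 | Claude Sonnet 4 | 1205 | ±3 | 43.2K | 3.7% | 1.8% | 49 tps | 1.3s | 200K | $3.00 | $15.00 |
| 44 | 43 | Gemini 3 Flash Preview | 1205 | ±11 | 7.2K | 3.7% | 1.3% | 138 tps | 1.4s | 1M | $0.50 | $3.00 |
| 45 | 43 | Gemini 2.5 Pro High | 1204 | ±3 | 21.1K | 5.7% | 1.5% | 48 tps | 2.3s | 1M | $1.25 | $10.00 |
| 46 | 43 | Qwen3 Max Instruct Preview | 1203 | ±6 | 16.1K | 4.6% | 1.1% | 31 tps | 1.7s | 256K | $1.43 | $6.61 |
| 47 | 43 | GPT-5.1 Codex Max | 1200 | ±12 | 6.4K | 3.9% | 3.0% | 118 tps | 4.1s | 400K | $1.25 | $10.00 |
| 48 | 43 | MiniMax M2.1 Lightning | 1197 | ±23 | 875 | 3.3% | 1.7% | 52 tps | 2.1s | 205K | $0.30 | $2.40 |
| 49 | 49 | Qwen3 30B A3B Instruct 2507 | 1194 | ±5 | 12.7K | 5.7% | 1.2% | 55 tps | 1.3s | 131K | $0.13 | $0.72 |
| 50 | 49 | Kimi K2 Thinking Turbo | 1192 | ±6 | 20.3K | 3.4% | 2.0% | 75 tps | 1.4s | 262K | $1.15 | $8.00 |
| 51 | 49 | MiniMax M2.1 | 1192 | ±8 | 19.4K | 3.6% | 2.1% | 66 tps | 2.6s | 205K | $0.30 | $1.20 |
| 52 | 49 | DeepSeek V3.2 | 1189 | ±8 | 5.1K | 4.7% | 1.4% | 83 tps | 5.1s | 131K | $0.43 | $1.09 |
| 53 | 49 | MiniMax M2.5 FP8 | 1185 | ±17 | 610 | 3.2% | 3.6% | 33 tps | 1.7s | 205K | $0.45 | $1.75 |
| 54 | 49 | GPT-5 | 1185 | ±4 | 21.3K | 5.3% | 3.1% | 78 tps | 23.1s | 400K | $1.25 | $9.67 |
| 55 | 49 | Grok 4 Fast Non-Reasoning | 1185 | ±5 | 8.1K | 7.1% | 1.5% | 93 tps | 0.6s | 2M | $0.27 | $0.67 |
| 56 | 49 | MiniMax M2 | 1183 | ±5 | 19.7K | 4.2% | 2.2% | 39 tps | 2.3s | 205K | $0.21 | $0.85 |
| 57 | 49 | Nova Experimental Chat 12-10 | 1182 | ±9 | 2.9K | 3.8% | 2.4% | 84 tps | 12.9s | 98K | $0 | $0 |
| 58 | 49 | GLM 4.6 | 1182 | ±7 | 17.2K | 4.4% | 5.4% | 39 tps | 1.5s | 200K | $0.42 | $1.66 |
| 59 | 49 | GPT-5.3 Codex (Low) | 1178 | ±28 | 510 | 1.0% | 1.8% | 61 tps | 4.3s | 400K | $1.75 | $14.00 |
| 60 | 60 | Grok 4.1 Fast Reasoning | 1178 | ±7 | 39.5K | 4.4% | 1.5% | 58 tps | 7.3s | 2M | $0.20 | $0.50 |
| 61 | 60 | DeepSeek V3.2 Thinking | 1178 | ±9 | 23.3K | 4.0% | 9.0% | 30 tps | 2.6s | 131K | $0.28 | $0.42 |
| 62 | 60 | Grok 4 Fast Reasoning | 1177 | ±3 | 14.5K | 5.0% | 2.1% | 102 tps | 3.1s | 2M | $0.30 | $0.75 |
| 63 | 60 | Gemini 2.5 Pro | 1176 | ±3 | 37.9K | 4.8% | 2.3% | 45 tps | 2.6s | 1M | $1.25 | $10.00 |
| 64 | 60 | Qwen3 235B A22B Instruct 2507 | 1172 | ±4 | 12.6K | 6.4% | 6.8% | 13 tps | 1.9s | 262K | $0.13 | $0.52 |
| 65 | 60 | Claude Sonnet 3.5 v2 | 1171 | ±6 | 5.5K | 3.4% | <0.1% | 46 tps | 1.4s | 200K | $3.00 | $15.00 |
| 66 | 60 | GPT-5.1 Codex (Medium) | 1171 | ±14 | 3K | 3.2% | 4.6% | 71 tps | 3.7s | 400K | $1.25 | $10.00 |
| 67 | 60 | GPT-5.1 Instant | 1171 | ±8 | 8.3K | 4.1% | 1.3% | 50 tps | 1.9s | 400K | $1.25 | $10.00 |
| 68 | 60 | Grok 4.20 Beta Reasoning | 1167 | ±22 | 1.2K | 4.1% | 1.1% | 77 tps | 4.5s | 2M | $2.00 | $5.50 |
| 69 | 69 | gpt-oss-120b | 1165 | ±5 | 19.2K | 5.0% | 0.7% | 213 tps | 0.5s | 131K | $0.11 | $0.50 |
| 70 | 69 | Qwen3.5 35B A3B | 1164 | ±25 | 865 | 3.9% | 2.1% | 116 tps | 2.1s | 256K | $0.63 | $1.13 |
| 71 | 69 | GPT-5 Codex (Low) | 1163 | ±10 | 5K | 4.1% | 2.7% | 112 tps | 3.5s | 400K | $1.25 | $10.00 |
| 72 | 69 | GLM 4.7 | 1161 | ±7 | 16.8K | 3.7% | 5.8% | 40 tps | 1.5s | 200K | $0.77 | $1.73 |
| 73 | 69 | DeepSeek V3.1 Terminus Chat | 1158 | ±5 | 6.5K | 6.9% | 3.4% | 27 tps | 1.5s | 131K | $0.86 | $1.80 |
| 74 | 74 | Qwen Plus (Aug'24) | 1146 | ±5 | 17.2K | 4.7% | 1.4% | 53 tps | 1.3s | 30K | $0.40 | $1.20 |
| 75 | 74 | Qwen3.5 397B A17B | 1142 | ±14 | 2.5K | 2.9% | 4.3% | 57 tps | 1.4s | 256K | $0.52 | $3.00 |
| 76 | 74 | Gemini 2.5 Flash Preview 0925 | 1140 | ±6 | 7.6K | 6.0% | 1.2% | 5 tps | 0.9s | 1M | $0.13 | $0.97 |
| 77 | 77 | Mistral Large 3 | 1131 | ±8 | 5.4K | 5.8% | 2.1% | 51 tps | 1.0s | 256K | $0.50 | $1.50 |
| 78 | 77 | GPT-5 Mini | 1131 | ±5 | 8.6K | 5.4% | 2.6% | 66 tps | 14.2s | 400K | $0.25 | $2.00 |
| 79 | 77 | DeepSeek V3.1 Turbo | 1130 | ±7 | 4.8K | 5.3% | 0.9% | 173 tps | 1.3s | 164K | $2.00 | $3.75 |
| 80 | 77 | Grok 4.20 Multi Agent Beta | 1129 | ±19 | 945 | 3.6% | 1.2% | 56 tps | 8.8s | 2M | $2.00 | $6.00 |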
286 models in total.