Models

Last updated about 1 month ago

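| Rank | Overall | Name | VIBE Score | Confidence Interval | Votes | Downvote % | Abort % | Speed | Latency | Context | Cost (Input) | Cost (Output) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 41 | 43 | GPT-5.1 Codex Max | 1200 | ±12 | 6.4K | 3.9% | 3.0% | 118 tps | 4.1s | 400K | $1.25 | $10.00 |
| 42 | 43 | MiniMax M2.1 Lightning | 1197 | ±23 | 875 | 3.3% | 1.7% | 52 tps | 2.1s | 205K | $0.30 | $2.40 |
| 43 | 49 | Qwen3 30B A3B Instruct 2507 | 1194 | ±5 | 12.7K | 5.7% | 1.2% | 55 tps | 1.3s | 131K | $0.13 | $0.72 |
| 44 | 49 | MiniMax M2.1 | 1192 | ±8 | 19.4K | 3.6% | 2.1% | 66 tps | 2.6s | 205K | $0.30 | $1.20 |
| 45 | 49 | DeepSeek V3.2 | 1189 | ±8 | 5.1K | 4.7% | 1.4% | 83 tps | 5.1s | 131K | $0.43 | $1.09 |
| 46 | 49 | MiniMax M2.5 FP8 | 1185 | ±17 | 610 | 3.2% | 3.6% | 33 tps | 1.7s | 205K | $0.45 | $1.75 |
| 47 | 49 | GPT-5 | 1185 | ±4 | 21.3K | 5.3% | 3.1% | 78 tps | 23.1s | 400K | $1.25 | $9.67 |
| 48 | 49 | Grok 4 Fast Non-Reasoning | 1185 | ±5 | 8.1K | 7.1% | 1.5% | 93 tps | 0.6s | 2M | $0.27 | $0.67 |
| 49 | 49 | MiniMax M2 | 1183 | ±5 | 19.7K | 4.2% | 2.2% | 39 tps | 2.3s | 205K | $0.21 | $0.85 |
| 50 | 49 | Nova Experimental Chat 12-10 | 1182 | ±9 | 2.9K | 3.8% | 2.4% | 84 tps | 12.9s | 98K | $0 | $0 |
| 51 | 49 | GLM 4.6 | 1182 | ±7 | 17.2K | 4.4% | 5.4% | 39 tps | 1.5s | 200K | $0.42 | $1.66 |
| 52 | 49 | GPT-5.3 Codex (Low) | 1178 | ±28 | 510 | 1.0% | 1.8% | 61 tps | 4.3s | 400K | $1.75 | $14.00 |
| 53 | 60 | Grok 4.1 Fast Reasoning | 1178 | ±7 | 39.5K | 4.4% | 1.5% | 58 tps | 7.3s | 2M | $0.20 | $0.50 |
| 54 | 60 | Grok 4 Fast Reasoning | 1177 | ±3 | 14.5K | 5.0% | 2.1% | 102 tps | 3.1s | 2M | $0.30 | $0.75 |
| 55 | 60 | Gemini 2.5 Pro | 1176 | ±3 | 37.9K | 4.8% | 2.3% | 45 tps | 2.6s | 1M | $1.25 | $10.00 |
| 56 | 60 | Qwen3 235B A22B Instruct 2507 | 1172 | ±4 | 12.6K | 6.4% | 6.8% | 13 tps | 1.9s | 262K | $0.13 | $0.52 |
| 57 | 60 | Claude Sonnet 3.5 v2 | 1171 | ±6 | 5.5K | 3.4% | <0.1% | 46 tps | 1.4s | 200K | $3.00 | $15.00 |
| 58 | 60 | GPT-5.1 Codex (Medium) | 1171 | ±14 | 3K | 3.2% | 4.6% | 71 tps | 3.7s | 400K | $1.25 | $10.00 |
| 59 | 60 | GPT-5.1 Instant | 1171 | ±8 | 8.3K | 4.1% | 1.3% | 50 tps | 1.9s | 400K | $1.25 | $10.00 |
| 60 | 60 | Grok 4.20 Beta Reasoning | 1167 | ±22 | 1.2K | 4.1% | 1.1% | 77 tps | 4.5s | 2M | $2.00 | $5.50 |
| 61 | 69 | Qwen3.5 35B A3B | 1164 | ±25 | 865 | 3.9% | 2.1% | 116 tps | 2.1s | 256K | $0.63 | $1.13 |
| 62 | 69 | GPT-5 Codex (Low) | 1163 | ±10 | 5K | 4.1% | 2.7% | 112 tps | 3.5s | 400K | $1.25 | $10.00 |
| 63 | 69 | GLM 4.7 | 1161 | ±7 | 16.8K | 3.7% | 5.8% | 40 tps | 1.5s | 200K | $0.77 | $1.73 |
| 64 | 69 | DeepSeek V3.1 Terminus Chat | 1158 | ±5 | 6.5K | 6.9% | 3.4% | 27 tps | 1.5s | 131K | $0.86 | $1.80 |
| 65 | 74 | Qwen Plus (Aug'24) | 1146 | ±5 | 17.2K | 4.7% | 1.4% | 53 tps | 1.3s | 30K | $0.40 | $1.20 |
| 66 | 74 | Qwen3.5 397B A17B | 1142 | ±14 | 2.5K | 2.9% | 4.3% | 57 tps | 1.4s | 256K | $0.52 | $3.00 |
| 67 | 74 | Gemini 2.5 Flash Preview 0925 | 1140 | ±6 | 7.6K | 6.0% | 1.2% | 5 tps | 0.9s | 1M | $0.13 | $0.97 |
| 68 | 77 | GPT-5 Mini | 1131 | ±5 | 8.6K | 5.4% | 2.6% | 66 tps | 14.2s | 400K | $0.25 | $2.00 |
| 69 | 77 | DeepSeek V3.1 Turbo | 1130 | ±7 | 4.8K | 5.3% | 0.9% | 173 tps | 1.3s | 164K | $2.00 | $3.75 |
| 70 | 77 | Grok 4.20 Multi Agent Beta | 1129 | ±19 | 945 | 3.6% | 1.2% | 56 tps | 8.8s | 2M | $2.00 | $6.00 |
| 71 | 77 | Qwen3 Max Thinking Preview | 1127 | ±10 | 6.3K | 5.7% | 3.1% | 40 tps | 2.1s | 256K | $1.20 | $6.00 |
| 72 | 77 | Grok 4 | 1125 | ±3 | 39.6K | 4.4% | 3.9% | 29 tps | 11.1s | 256K | $3.00 | $15.00 |
| 73 | 77 | GPT-4.1 | 1123 | ±5 | 32.8K | 5.2% | 3.7% | 112 tps | 1.3s | 1M | $2.00 | $8.00 |
| 74 | 77 | Gemini 2.5 Flash Lite Preview 0925 | 1122 | ±7 | 8.5K | 6.6% | 1.2% | 209 tps | 0.7s | 1M | $0.25 | $0.35 |
| 75 | 85 | Gemini 2.5 Flash Thinking | 1118 | ±4 | 13.7K | 3.6% | 2.2% | 88 tps | 6.4s | 1M | $0.30 | $2.50 |
| 76 | 85 | GPT-5 Mini Minimal | 1114 | ±12 | 3.2K | 8.5% | 1.2% | 63 tps | 1.4s | 400K | $0.25 | $2.00 |
| 77 | 85 | GPT-5.2 Codex (Low) | 1113 | ±19 | 1.2K | 3.2% | 4.5% | 41 tps | 5.0s | 400K | $1.75 | $14.00 |
| 78 | 85 | DeepSeek V3.1 Chat | 1110 | ±7 | 4.9K | 6.6% | 2.8% | 21 tps | 1.6s | 131K | $0.38 | $1.00 |
| 79 | 85 | Qwen3 Omni 30B A3B Thinking | 1110 | ±10 | 2.3K | 6.0% | 3.7% | 67 tps | 1.2s | 66K | $0.97 | $1.79 |
| 80 | 90 | Qwen Max | 1107 | ±4 | 18.3K | 4.2% | 1.5% | 49 tps | 1.5s | 33K | $1.60 | $6.40 |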
Showing ranks 41 to 80 of 210 models.
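
Because each VIBE Score is listed with a confidence interval, one rough way to read the leaderboard is to check whether two models' intervals overlap before treating a score gap as meaningful. The sketch below is only an illustration: it assumes the ± value is a symmetric half-width around the score, and it treats interval overlap as a simple heuristic. The page does not say how Yupp derives its ranks, and the names `Entry` and `intervals_overlap` are hypothetical. The scores and half-widths are taken from the leaderboard rows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One leaderboard row, reduced to the fields needed here (hypothetical type)."""
    name: str
    score: int       # VIBE Score column
    half_width: int  # number after the ± sign, assumed to be a symmetric half-width

    @property
    def low(self) -> int:
        return self.score - self.half_width

    @property
    def high(self) -> int:
        return self.score + self.half_width


def intervals_overlap(a: Entry, b: Entry) -> bool:
    """Return True when the two score intervals share at least one point."""
    return a.low <= b.high and b.low <= a.high


# Values taken from the leaderboard rows.
codex_max = Entry("GPT-5.1 Codex Max", 1200, 12)
lightning = Entry("MiniMax M2.1 Lightning", 1197, 23)
gemini_pro = Entry("Gemini 2.5 Pro", 1176, 3)
qwen_max = Entry("Qwen Max", 1107, 4)

if __name__ == "__main__":
    for a, b in [(codex_max, lightning), (codex_max, gemini_pro), (gemini_pro, qwen_max)]:
        verdict = "overlap" if intervals_overlap(a, b) else "do not overlap"
        print(f"{a.name} [{a.low}, {a.high}] and {b.name} [{b.low}, {b.high}] {verdict}")
```

Under these assumptions, a three-point gap such as 1200 versus 1197 falls well inside both intervals, while the gap between 1200 ±12 and 1176 ±3 does not. Overlap is a conservative reading and is not a formal significance test.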