Models

Last updated about 1 month ago

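| Rank | Name | VIBE Score | Confidence Interval | Votes | Downvote % | Abort % | Speed | Latency | Context | Cost (Input) | Cost (Output) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 201 | GPT-5 Mini Minimal | 1114 | ±12 | 3.2K | 8.5% | 1.2% | 63 tps | 1.4s | 400K | $0.25 | $2.00 |
| 202 | Gemini 2.5 Flash Thinking | 1118 | ±4 | 13.7K | 3.6% | 2.2% | 88 tps | 6.4s | 1M | $0.30 | $2.50 |
| 203 | Gemini 2.5 Flash Lite Preview 0925 | 1122 | ±7 | 8.5K | 6.6% | 1.2% | 209 tps | 0.7s | 1M | $0.25 | $0.35 |
| 204 | GPT-4.1 | 1123 | ±5 | 32.8K | 5.2% | 3.7% | 112 tps | 1.3s | 1M | $2.00 | $8.00 |
| 205 | Grok 4 | 1125 | ±3 | 39.6K | 4.4% | 3.9% | 29 tps | 11.1s | 256K | $3.00 | $15.00 |
| 206 | Qwen3 Max Thinking Preview | 1127 | ±10 | 6.3K | 5.7% | 3.1% | 40 tps | 2.1s | 256K | $1.20 | $6.00 |
| 207 | Grok 4.20 Multi Agent Beta | 1129 | ±19 | 945 | 3.6% | 1.2% | 56 tps | 8.8s | 2M | $2.00 | $6.00 |
| 208 | DeepSeek V3.1 Turbo | 1130 | ±7 | 4.8K | 5.3% | 0.9% | 173 tps | 1.3s | 164K | $2.00 | $3.75 |
| 209 | GPT-5 Mini | 1131 | ±5 | 8.6K | 5.4% | 2.6% | 66 tps | 14.2s | 400K | $0.25 | $2.00 |
| 210 | Mistral Large 3 | 1131 | ±8 | 5.4K | 5.8% | 2.1% | 51 tps | 1.0s | 256K | $0.50 | $1.50 |
| 211 | Gemini 2.5 Flash Preview 0925 | 1140 | ±6 | 7.6K | 6.0% | 1.2% | 5 tps | 0.9s | 1M | $0.13 | $0.97 |
| 212 | Qwen3.5 397B A17B | 1142 | ±14 | 2.5K | 2.9% | 4.3% | 57 tps | 1.4s | 256K | $0.52 | $3.00 |
| 213 | Qwen Plus (Aug'24) | 1146 | ±5 | 17.2K | 4.7% | 1.4% | 53 tps | 1.3s | 30K | $0.40 | $1.20 |
| 214 | DeepSeek V3.1 Terminus Chat | 1158 | ±5 | 6.5K | 6.9% | 3.4% | 27 tps | 1.5s | 131K | $0.86 | $1.80 |
| 215 | GLM 4.7 | 1161 | ±7 | 16.8K | 3.7% | 5.8% | 40 tps | 1.5s | 200K | $0.77 | $1.73 |
| 216 | GPT-5 Codex (Low) | 1163 | ±10 | 5K | 4.1% | 2.7% | 112 tps | 3.5s | 400K | $1.25 | $10.00 |
| 217 | Qwen3.5 35B A3B | 1164 | ±25 | 865 | 3.9% | 2.1% | 116 tps | 2.1s | 256K | $0.63 | $1.13 |
| 218 | gpt-oss-120b | 1165 | ±5 | 19.2K | 5.0% | 0.7% | 213 tps | 0.5s | 131K | $0.11 | $0.50 |
| 219 | Grok 4.20 Beta Reasoning | 1167 | ±22 | 1.2K | 4.1% | 1.1% | 77 tps | 4.5s | 2M | $2.00 | $5.50 |
| 220 | GPT-5.1 Instant | 1171 | ±8 | 8.3K | 4.1% | 1.3% | 50 tps | 1.9s | 400K | $1.25 | $10.00 |
| 221 | GPT-5.1 Codex (Medium) | 1171 | ±14 | 3K | 3.2% | 4.6% | 71 tps | 3.7s | 400K | $1.25 | $10.00 |
| 222 | Claude Sonnet 3.5 v2 | 1171 | ±6 | 5.5K | 3.4% | <0.1% | 46 tps | 1.4s | 200K | $3.00 | $15.00 |
| 223 | Qwen3 235B A22B Instruct 2507 | 1172 | ±4 | 12.6K | 6.4% | 6.8% | 13 tps | 1.9s | 262K | $0.13 | $0.52 |
| 224 | Gemini 2.5 Pro | 1176 | ±3 | 37.9K | 4.8% | 2.3% | 45 tps | 2.6s | 1M | $1.25 | $10.00 |
| 225 | Grok 4 Fast Reasoning | 1177 | ±3 | 14.5K | 5.0% | 2.1% | 102 tps | 3.1s | 2M | $0.30 | $0.75 |
| 226 | DeepSeek V3.2 Thinking | 1178 | ±9 | 23.3K | 4.0% | 9.0% | 30 tps | 2.6s | 131K | $0.28 | $0.42 |
| 227 | Grok 4.1 Fast Reasoning | 1178 | ±7 | 39.5K | 4.4% | 1.5% | 58 tps | 7.3s | 2M | $0.20 | $0.50 |
| 228 | GPT-5.3 Codex (Low) | 1178 | ±28 | 510 | 1.0% | 1.8% | 61 tps | 4.3s | 400K | $1.75 | $14.00 |
| 229 | GLM 4.6 | 1182 | ±7 | 17.2K | 4.4% | 5.4% | 39 tps | 1.5s | 200K | $0.42 | $1.66 |
| 230 | Nova Experimental Chat 12-10 | 1182 | ±9 | 2.9K | 3.8% | 2.4% | 84 tps | 12.9s | 98K | $0 | $0 |
| 231 | MiniMax M2 | 1183 | ±5 | 19.7K | 4.2% | 2.2% | 39 tps | 2.3s | 205K | $0.21 | $0.85 |
| 232 | Grok 4 Fast Non-Reasoning | 1185 | ±5 | 8.1K | 7.1% | 1.5% | 93 tps | 0.6s | 2M | $0.27 | $0.67 |
| 233 | GPT-5 | 1185 | ±4 | 21.3K | 5.3% | 3.1% | 78 tps | 23.1s | 400K | $1.25 | $9.67 |
| 234 | MiniMax M2.5 FP8 | 1185 | ±17 | 610 | 3.2% | 3.6% | 33 tps | 1.7s | 205K | $0.45 | $1.75 |
| 235 | DeepSeek V3.2 | 1189 | ±8 | 5.1K | 4.7% | 1.4% | 83 tps | 5.1s | 131K | $0.43 | $1.09 |
| 236 | MiniMax M2.1 | 1192 | ±8 | 19.4K | 3.6% | 2.1% | 66 tps | 2.6s | 205K | $0.30 | $1.20 |
| 237 | Kimi K2 Thinking Turbo | 1192 | ±6 | 20.3K | 3.4% | 2.0% | 75 tps | 1.4s | 262K | $1.15 | $8.00 |
| 238 | Qwen3 30B A3B Instruct 2507 | 1194 | ±5 | 12.7K | 5.7% | 1.2% | 55 tps | 1.3s | 131K | $0.13 | $0.72 |
| 239 | MiniMax M2.1 Lightning | 1197 | ±23 | 875 | 3.3% | 1.7% | 52 tps | 2.1s | 205K | $0.30 | $2.40 |
| 240 | GPT-5.1 Codex Max | 1200 | ±12 | 6.4K | 3.9% | 3.0% | 118 tps | 4.1s | 400K | $1.25 | $10.00 |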
Showing ranks 201 to 240 of 286 models.
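
One way to read the Confidence Interval column is to treat each entry as a range of score minus interval to score plus interval, and to regard two models whose ranges overlap as not clearly separated by the ranking. The sketch below illustrates that reading in Python. It assumes the interval is a symmetric half-width around the VIBE Score, which the page does not state, and the function name `intervals_overlap` is hypothetical. Overlap is only a rough heuristic, not a formal significance test.

```python
def intervals_overlap(score_a, half_width_a, score_b, half_width_b):
    """Return True if the two [score - hw, score + hw] ranges share any point.

    Assumption: the leaderboard's confidence interval is a symmetric
    half-width around the score.
    """
    low_a, high_a = score_a - half_width_a, score_a + half_width_a
    low_b, high_b = score_b - half_width_b, score_b + half_width_b
    return low_a <= high_b and low_b <= high_a


# Scores and intervals copied from the leaderboard rows.
# GPT-5 Mini (1131 ±5) vs Mistral Large 3 (1131 ±8)
print(intervals_overlap(1131, 5, 1131, 8))  # True

# Gemini 2.5 Flash Lite Preview 0925 (1122 ±7) vs Gemini 2.5 Flash Preview 0925 (1140 ±6)
print(intervals_overlap(1122, 7, 1140, 6))  # False
```

Under this reading, the first pair is effectively tied, while the second pair is separated by more than the combined interval widths.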