Last updated about 1 month ago

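| Rank | Overall | Name | VIBE Score | Confidence Interval | Votes | Downvote % | Abort % | Speed | Latency | Context | Cost (Input) | Cost (Output) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 41 | 33 | Kimi K2.5 | 1147 | ±3 | 16.1K | 1.2% | 6.5% | 33 tps | 1.7s | 262K | $0.34 | $2.57 |
| 42 | 7 | Claude Opus 4.5 (Thinking) | 1147 | ±4 | 21.9K | 1.6% | 1.8% | 49 tps | 1.4s | 200K | $5.00 | $25.00 |
| 43 | 40 | Qwen3 235B A22B Instruct 2507 | 1147 | ±2 | 24.5K | 1.4% | 6.8% | 13 tps | 1.9s | 262K | $0.13 | $0.52 |
| 44 | 42 | GPT-5.2 (Extra High) | 1147 | ±2 | 15.6K | 1.4% | 13.2% | 17 tps | 20.5s | 400K | $1.75 | $14.00 |
| 45 | 44 | Kimi K2 Thinking Turbo | 1145 | ±2 | 10.9K | 1.6% | 2.0% | 75 tps | 1.4s | 262K | $1.15 | $8.00 |
| 46 | 56 | DeepSeek V3.2 Thinking | 1144 | ±4 | 16.9K | 1.3% | 9.0% | 30 tps | 2.6s | 131K | $0.28 | $0.42 |
| 47 | 29 | MiniMax M2.7 | 1142 | ±13 | 700 | 1.4% | 3.0% | 34 tps | 2.5s | 205K | $0.30 | $1.20 |
| 48 | 48 | Grok 4 Fast Reasoning | 1142 | ±3 | 14.5K | 2.0% | 2.1% | 102 tps | 3.1s | 2M | $0.30 | $0.75 |
| 49 | 17 | GPT-5.4 mini | 1141 | ±14 | 545 | 1.8% | 0.8% | 148 tps | 0.5s | 400K | $0.75 | $4.50 |
| 50 | 56 | DeepSeek V3.1 Turbo | 1134 | ±3 | 9.5K | 1.2% | 0.9% | 173 tps | 1.3s | 164K | $2.00 | $3.75 |
| 51 | 65 | Mistral Large 3 | 1133 | ±4 | 10.8K | 2.6% | 2.1% | 51 tps | 1.0s | 256K | $0.50 | $1.50 |
| 52 | 56 | MiniMax M2.1 Lightning | 1129 | ±5 | 3.6K | 1.4% | 1.7% | 52 tps | 2.1s | 205K | $0.30 | $2.40 |
| 53 | 86 | Qwen3 235B A22B | 1129 | ±3 | 7.8K | 2.1% | 5.3% | 71 tps | 0.9s | 41K | $0.23 | $0.63 |
| 54 | 71 | DeepSeek V3.1 | 1125 | ±4 | 4.4K | 1.1% | 0.8% | 197 tps | 0.4s | 164K | $0.55 | $1.60 |
| 55 | 65 | DeepSeek V3.2 Exp Chat | 1125 | ±3 | 11.5K | 1.9% | 2.6% | 29 tps | 1.5s | 131K | $0.27 | $0.39 |
| 56 | 60 | MiniMax M2.1 | 1124 | ±3 | 24.4K | 1.0% | 2.1% | 66 tps | 2.6s | 205K | $0.30 | $1.20 |
| 57 | 86 | Nemotron 3 Nano (Thinking) | 1123 | ±3 | 5.9K | 1.5% | 2.0% | 200 tps | 0.5s | 256K | $0 | $0 |
| 58 | 52 | Qwen3.5 122B A17B | 1123 | ±5 | 2.6K | 1.3% | 1.5% | 82 tps | 1.4s | 256K | $0.40 | $3.20 |
| 59 | 26 | Claude Haiku 4.5 (Extended Thinking) | 1121 | ±3 | 14.1K | 1.8% | 1.4% | 115 tps | 0.7s | 200K | $1.00 | $5.00 |
| 60 | 60 | Gemini 2.5 Flash Preview 0925 | 1118 | ±3 | 14.4K | 2.2% | 1.2% | 5 tps | 0.9s | 1M | $0.13 | $0.97 |
| 61 | 52 | GPT-5 | 1117 | ±2 | 31.1K | 1.7% | 3.1% | 78 tps | 23.1s | 400K | $1.25 | $9.67 |
| 62 | 81 | OpenAI o3-pro | 1116 | ±5 | 3.2K | 2.8% | 5.2% | 22 tps | 70.8s | 200K | $20.00 | $80.00 |
| 63 | 68 | Grok 4 | 1110 | ±1 | 98.8K | 0.9% | 3.9% | 29 tps | 11.1s | 256K | $3.00 | $15.00 |
| 64 | 17 | Claude Opus 4.5 | 1110 | ±4 | 12.9K | 2.2% | 1.5% | 45 tps | 1.5s | 200K | $5.00 | $25.00 |
| 65 | 62 | MiniMax M2 | 1110 | ±3 | 17.2K | 2.5% | 2.2% | 39 tps | 2.3s | 205K | $0.21 | $0.85 |
| 66 | 71 | Qwen3.5 397B A17B | 1107 | ±6 | 5.1K | 1.6% | 4.3% | 57 tps | 1.4s | 256K | $0.52 | $3.00 |
| 67 | 86 | DeepSeek V3.1 Nex N1 | 1107 | ±8 | 1.5K | 1.3% | 3.4% | 24 tps | 7.2s | 131K | $0.14 | $0.50 |
| 68 | 79 | Qwen3 Max Thinking Preview | 1106 | ±4 | 13.3K | 2.0% | 3.1% | 40 tps | 2.1s | 256K | $1.20 | $6.00 |
| 69 | 101 | DeepSeek V3 (Turbo) | 1105 | ±5 | 3.7K | 1.5% | 1.5% | 32 tps | 1.5s | 64K | $0.40 | $1.30 |
| 70 | 56 | Gemini 3.1 Flash Lite Preview Thinking | 1105 | ±8 | 2K | 1.7% | 1.7% | 75 tps | 4.7s | 1M | $0.25 | $1.50 |
| 71 | 68 | GLM 4.7 | 1105 | ±3 | 21K | 1.0% | 5.8% | 40 tps | 1.5s | 200K | $0.77 | $1.73 |
| 72 | 95 | DeepSeek-R1 Turbo | 1104 | ±5 | 4.8K | 1.5% | 2.6% | 29 tps | 1.8s | 64K | $2.85 | $4.75 |
| 73 | 86 | Amazon Nova 2 Lite | 1099 | ±4 | 10.5K | 2.7% | 1.0% | 137 tps | 0.6s | 300K | $0.35 | $2.95 |
| 74 | 68 | Qwen Plus (Aug'24) | 1098 | ±2 | 50.5K | 1.1% | 1.4% | 53 tps | 1.3s | 30K | $0.40 | $1.20 |
| 75 | 101 | GPT-5 (Low) | 1097 | ±7 | 1.5K | 1.0% | 1.8% | 75 tps | 8.2s | 400K | $1.25 | $10.00 |
| 76 | 62 | GPT-5.1 Instant | 1096 | ±3 | 14.9K | 1.5% | 1.3% | 50 tps | 1.9s | 400K | $1.25 | $10.00 |
| 77 | 84 | GPT-5 Mini Minimal | 1094 | ±3 | 4.9K | 3.0% | 1.2% | 63 tps | 1.4s | 400K | $0.25 | $2.00 |
| 78 | 95 | Kimi K2 Thinking | 1092 | ±4 | 5.4K | 2.0% | 4.2% | 61 tps | 5.9s | 262K | $0.24 | $1.03 |
| 79 | 37 | Claude Sonnet 4.5 | 1092 | ±2 | 25.2K | 2.2% | 1.4% | 41 tps | 1.3s | 200K | $1.80 | $9.00 |
| 80 | 71 | MiniMax M2.5 FP8 | 1092 | ±10 | 2.1K | 1.6% | 3.6% | 33 tps | 1.7s | 205K | $0.45 | $1.75 |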
View All (283 models)