
Last updated about 1 month ago

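| Rank | Overall | Name | VIBE Score | Confidence Interval | Votes | Downvote % | Abort % | Speed | Latency | Context | Cost (Input) | Cost (Output) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 41 | 33 | Qwen3 30B A3B Instruct 2507 | 1158 | ±2 | 31.6K | 4.1% | 1.2% | 55 tps | 1.3s | 131K | $0.13 | $0.72 |
| 42 | 52 | Qwen3.5 122B A17B | 1151 | ±4 | 4.7K | 1.6% | 1.5% | 82 tps | 1.4s | 256K | $0.40 | $3.20 |
| 43 | 33 | Kimi K2.5 | 1151 | ±3 | 32.5K | 1.8% | 6.5% | 33 tps | 1.7s | 262K | $0.34 | $2.57 |
| 44 | 44 | DeepSeek V3.1 Terminus Chat | 1150 | ±3 | 17.8K | 4.2% | 3.4% | 27 tps | 1.5s | 131K | $0.86 | $1.80 |
| 45 | 42 | Qwen3 Max Instruct Preview | 1148 | ±2 | 36.6K | 3.5% | 1.1% | 31 tps | 1.7s | 256K | $1.43 | $6.61 |
| 46 | 40 | Qwen3 235B A22B Instruct 2507 | 1147 | ±2 | 32.2K | 4.7% | 6.8% | 13 tps | 1.9s | 262K | $0.13 | $0.52 |
| 47 | 40 | DeepSeek V3.2 | 1144 | ±3 | 20.7K | 1.9% | 1.4% | 83 tps | 5.1s | 131K | $0.43 | $1.09 |
| 48 | 48 | gpt-oss-120b | 1144 | ±2 | 40.7K | 3.7% | 0.7% | 213 tps | 0.5s | 131K | $0.11 | $0.50 |
| 49 | 71 | MiniMax M2.5 FP8 | 1141 | ±4 | 2.9K | 1.7% | 3.6% | 33 tps | 1.7s | 205K | $0.45 | $1.75 |
| 50 | 56 | DeepSeek V3.1 Turbo | 1140 | ±2 | 14.5K | 2.3% | 0.9% | 173 tps | 1.3s | 164K | $2.00 | $3.75 |
| 51 | 37 | Claude Sonnet 4.5 | 1139 | ±2 | 37.7K | 4.3% | 1.4% | 41 tps | 1.3s | 200K | $1.80 | $9.00 |
| 52 | 48 | Claude Sonnet 4 (Thinking) | 1138 | ±2 | 30.7K | 2.6% | 1.5% | 52 tps | 1.5s | 200K | $3.00 | $13.67 |
| 53 | 44 | Kimi K2 Thinking Turbo | 1137 | ±3 | 29.8K | 2.5% | 2.0% | 75 tps | 1.4s | 262K | $1.15 | $8.00 |
| 54 | 44 | Gemini 2.5 Pro | 1136 | ±1 | 68.8K | 3.9% | 2.3% | 45 tps | 2.6s | 1M | $1.25 | $10.00 |
| 55 | 42 | GPT-5.2 (Extra High) | 1133 | ±3 | 20.9K | 1.9% | 13.2% | 17 tps | 20.5s | 400K | $1.75 | $14.00 |
| 56 | 60 | MiniMax M2.1 | 1129 | ±2 | 41.8K | 2.0% | 2.1% | 66 tps | 2.6s | 205K | $0.30 | $1.20 |
| 57 | 48 | Grok 4 Fast Reasoning | 1128 | ±2 | 25.9K | 3.9% | 2.1% | 102 tps | 3.1s | 2M | $0.30 | $0.75 |
| 58 | 52 | Claude Haiku 4.5 | 1128 | ±2 | 31.4K | 3.7% | 1.1% | 100 tps | 0.9s | 200K | $1.00 | $5.00 |
| 59 | 44 | Grok 4.1 Fast Reasoning | 1128 | ±2 | 57K | 3.1% | 1.5% | 58 tps | 7.3s | 2M | $0.20 | $0.50 |
| 60 | 52 | Grok 4 Fast Non-Reasoning | 1128 | ±3 | 21.3K | 4.7% | 1.5% | 93 tps | 0.6s | 2M | $0.27 | $0.67 |
| 61 | 56 | DeepSeek V3.2 Thinking | 1127 | ±3 | 37.6K | 2.6% | 9.0% | 30 tps | 2.6s | 131K | $0.28 | $0.42 |
| 62 | 86 | Nemotron 3 Nano (Thinking) | 1127 | ±3 | 7.5K | 2.4% | 2.0% | 200 tps | 0.5s | 256K | $0 | $0 |
| 63 | 62 | MiniMax M2 | 1125 | ±2 | 33.6K | 3.5% | 2.2% | 39 tps | 2.3s | 205K | $0.21 | $0.85 |
| 64 | 65 | DeepSeek V3.2 Exp Chat | 1124 | ±3 | 14.3K | 4.0% | 2.6% | 29 tps | 1.5s | 131K | $0.27 | $0.39 |
| 65 | 71 | DeepSeek V3.1 | 1124 | ±3 | 6.8K | 2.0% | 0.8% | 197 tps | 0.4s | 164K | $0.55 | $1.60 |
| 66 | 65 | Mistral Large 3 | 1122 | ±3 | 14.3K | 3.3% | 2.1% | 51 tps | 1.0s | 256K | $0.50 | $1.50 |
| 67 | 79 | MiniMax M2.5 Lightning | 1121 | ±4 | 5.6K | 1.3% | 1.5% | 51 tps | 2.0s | 205K | $0.60 | $2.40 |
| 68 | 52 | GPT-5 | 1119 | ±2 | 44.3K | 3.9% | 3.1% | 78 tps | 23.1s | 400K | $1.25 | $9.67 |
| 69 | 95 | Qwen3 32B | 1117 | ±6 | 3.3K | 2.8% | 3.9% | 30 tps | 3.1s | 41K | $0.12 | $0.42 |
| 70 | 86 | DeepSeek V3.1 Nex N1 | 1112 | ±6 | 2.1K | 1.7% | 3.4% | 24 tps | 7.2s | 131K | $0.14 | $0.50 |
| 71 | 65 | GLM 4.6 | 1108 | ±3 | 25.8K | 4.3% | 5.4% | 39 tps | 1.5s | 200K | $0.42 | $1.66 |
| 72 | 84 | MiniMax M2.5 | 1105 | ±8 | 2.1K | 1.6% | 1.4% | 70 tps | 1.9s | 205K | $0.28 | $1.20 |
| 73 | 71 | Seed 1.8 251228 | 1104 | ±3 | 19K | 1.5% | 3.7% | 41 tps | 2.1s | 256K | $0.25 | $2.00 |
| 74 | 86 | DeepSeek V3.1 Chat | 1102 | ±3 | 13.4K | 4.1% | 2.8% | 21 tps | 1.6s | 131K | $0.38 | $1.00 |
| 75 | 60 | Gemini 2.5 Flash Preview 0925 | 1102 | ±2 | 19.5K | 4.3% | 1.2% | 5 tps | 0.9s | 1M | $0.13 | $0.97 |
| 76 | 68 | GLM 4.7 | 1101 | ±3 | 35.7K | 2.1% | 5.8% | 40 tps | 1.5s | 200K | $0.77 | $1.73 |
| 77 | 68 | Grok 4 | 1100 | ±1 | 120.3K | 2.1% | 3.9% | 29 tps | 11.1s | 256K | $3.00 | $15.00 |
| 78 | 101 | DeepSeek V3 (Turbo) | 1100 | ±3 | 4.8K | 2.5% | 1.5% | 32 tps | 1.5s | 64K | $0.40 | $1.30 |
| 79 | 86 | Amazon Nova 2 Lite | 1099 | ±3 | 12.6K | 3.1% | 1.0% | 137 tps | 0.6s | 300K | $0.35 | $2.95 |
| 80 | 68 | Qwen Plus (Aug'24) | 1098 | ±2 | 60.9K | 2.4% | 1.4% | 53 tps | 1.3s | 30K | $0.40 | $1.20 |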
Showing ranks 41 to 80 of 288 models.
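
Scores that sit close together are easier to compare once the Confidence Interval column is taken into account. The sketch below checks whether two models' score ranges overlap, using three rows from the leaderboard. It assumes the "±" value is a symmetric half-width around the VIBE Score; the page does not state how the interval is computed or at what confidence level, so treat the result as a rough reading rather than a statistical test. The `Entry` class and `intervals_overlap` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    name: str
    score: int       # VIBE Score column
    half_width: int  # the "±" value; assumed to be a symmetric half-width

    @property
    def low(self) -> int:
        return self.score - self.half_width

    @property
    def high(self) -> int:
        return self.score + self.half_width


def intervals_overlap(a: Entry, b: Entry) -> bool:
    """Return True when the two score ranges share at least one value."""
    return a.low <= b.high and b.low <= a.high


# Values copied from the leaderboard rows for ranks 41 to 43.
qwen3_30b = Entry("Qwen3 30B A3B Instruct 2507", 1158, 2)
qwen35_122b = Entry("Qwen3.5 122B A17B", 1151, 4)
kimi_k25 = Entry("Kimi K2.5", 1151, 3)

# 1156..1160 vs 1147..1155: the ranges do not touch.
print(intervals_overlap(qwen3_30b, qwen35_122b))  # False
# 1147..1155 vs 1148..1154: one range contains the other.
print(intervals_overlap(qwen35_122b, kimi_k25))   # True
```

Under this reading, the gap between ranks 41 and 42 is larger than either interval, while ranks 42 and 43 cannot be separated on score alone.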