Last updated about 1 month ago

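| Rank | Overall | Name | VIBE Score | Confidence Interval | Votes | Downvote % | Abort % | Speed | Latency | Context | Cost (Input) | Cost (Output) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 41 | 62 | Qwen3 Omni 30B A3B Instruct | 1157 | ±6 | 2.3K | 1.9% | 3.9% | 65 tps | 1.2s | 66K | $0.35 | $0.97 |
| 42 | 26 | GPT-5 (High) | 1156 | ±3 | 12.3K | 2.3% | 4.5% | 81 tps | 35.9s | 400K | $1.25 | $10.00 |
| 43 | 52 | Grok 4 Fast Non-Reasoning | 1156 | ±2 | 16.7K | 2.2% | 1.5% | 93 tps | 0.6s | 2M | $0.27 | $0.67 |
| 44 | 44 | Gemini 2.5 Pro | 1153 | ±2 | 38.9K | 1.8% | 2.3% | 45 tps | 2.6s | 1M | $1.25 | $10.00 |
| 45 | 44 | DeepSeek V3.1 Terminus Chat | 1152 | ±2 | 14.5K | 1.8% | 3.4% | 27 tps | 1.5s | 131K | $0.86 | $1.80 |
| 46 | 40 | DeepSeek V3.2 | 1152 | ±3 | 16.5K | 1.1% | 1.4% | 83 tps | 5.1s | 131K | $0.43 | $1.09 |
| 47 | 84 | Nova Experimental Chat 10-09 | 1151 | ±4 | 5.3K | 4.0% | <0.1% | 59 tps | 6.1s | 98K | $0 | $0 |
| 48 | 43 | Gemini 2.5 Flash Thinking Preview 0925 | 1151 | ±3 | 13.9K | 2.1% | <0.1% | 111 tps | 4.7s | 1M | $0.30 | $2.50 |
| 49 | 42 | Qwen3 Max Instruct Preview | 1151 | ±2 | 26.4K | 2.0% | 1.1% | 31 tps | 1.7s | 256K | $1.43 | $6.61 |
| 50 | 48 | Step 3.5 Flash | 1150 | ±6 | 2.9K | 1.7% | 2.2% | 109 tps | 0.6s | 256K | $0.05 | $0.15 |
| 51 | 22 | GLM 5 | 1149 | ±4 | 6.4K | 1.2% | 3.4% | 36 tps | 2.7s | 200K | $0.72 | $2.55 |
| 52 | 10 | Claude Sonnet 4.5 (Thinking) | 1148 | ±2 | 27.2K | 2.5% | 1.9% | 44 tps | 1.1s | 200K | $3.00 | $15.00 |
| 53 | 182 | MAI-DS-R1 FP8 | 1148 | ±10 | 605 | 2.4% | <0.1% | 79 tps | 2.8s | 164K | $0.25 | $1.00 |
| 54 | 33 | Kimi K2.5 | 1147 | ±3 | 16.1K | 1.2% | 6.5% | 33 tps | 1.7s | 262K | $0.34 | $2.57 |
| 55 | 7 | Claude Opus 4.5 (Thinking) | 1147 | ±4 | 21.9K | 1.6% | 1.8% | 49 tps | 1.4s | 200K | $5.00 | $25.00 |
| 56 | 40 | Qwen3 235B A22B Instruct 2507 | 1147 | ±2 | 24.5K | 1.4% | 6.8% | 13 tps | 1.9s | 262K | $0.13 | $0.52 |
| 57 | 42 | GPT-5.2 (Extra High) | 1147 | ±2 | 15.6K | 1.4% | 13.2% | 17 tps | 20.5s | 400K | $1.75 | $14.00 |
| 58 | 48 | Polaris Alpha | 1146 | ±5 | 1.6K | 1.9% | <0.1% | 48 tps | 1.1s | 256K | $0 | $0 |
| 59 | 44 | Kimi K2 Thinking Turbo | 1145 | ±2 | 10.9K | 1.6% | 2.0% | 75 tps | 1.4s | 262K | $1.15 | $8.00 |
| 60 | 56 | DeepSeek V3.2 Thinking | 1144 | ±4 | 16.9K | 1.3% | 9.0% | 30 tps | 2.6s | 131K | $0.28 | $0.42 |
| 61 | 29 | MiniMax M2.7 | 1142 | ±13 | 700 | 1.4% | 3.0% | 34 tps | 2.5s | 205K | $0.30 | $1.20 |
| 62 | 48 | Grok 4 Fast Reasoning | 1142 | ±3 | 14.5K | 2.0% | 2.1% | 102 tps | 3.1s | 2M | $0.30 | $0.75 |
| 63 | 17 | GPT-5.4 mini | 1141 | ±14 | 545 | 1.8% | 0.8% | 148 tps | 0.5s | 400K | $0.75 | $4.50 |
| 64 | 56 | DeepSeek V3.1 Turbo | 1134 | ±3 | 9.5K | 1.2% | 0.9% | 173 tps | 1.3s | 164K | $2.00 | $3.75 |
| 65 | 65 | Mistral Large 3 | 1133 | ±4 | 10.8K | 2.6% | 2.1% | 51 tps | 1.0s | 256K | $0.50 | $1.50 |
| 66 | 80 | GPT-5 (Minimal) | 1132 | ±3 | 12.9K | 2.2% | <0.1% | 67 tps | 1.4s | 400K | $1.25 | $10.00 |
| 67 | 84 | Claude Sonnet 3.7 (Thinking) | 1130 | ±4 | 2.3K | 2.7% | <0.1% | 41 tps | 2.6s | 200K | $3.00 | $15.00 |
| 68 | 56 | MiniMax M2.1 Lightning | 1129 | ±5 | 3.6K | 1.4% | 1.7% | 52 tps | 2.1s | 205K | $0.30 | $2.40 |
| 69 | 86 | Qwen3 235B A22B | 1129 | ±3 | 7.8K | 2.1% | 5.3% | 71 tps | 0.9s | 41K | $0.23 | $0.63 |
| 70 | 71 | DeepSeek V3.1 | 1125 | ±4 | 4.4K | 1.1% | 0.8% | 197 tps | 0.4s | 164K | $0.55 | $1.60 |
| 71 | 65 | DeepSeek V3.2 Exp Chat | 1125 | ±3 | 11.5K | 1.9% | 2.6% | 29 tps | 1.5s | 131K | $0.27 | $0.39 |
| 72 | 60 | MiniMax M2.1 | 1124 | ±3 | 24.4K | 1.0% | 2.1% | 66 tps | 2.6s | 205K | $0.30 | $1.20 |
| 73 | 86 | Nemotron 3 Nano (Thinking) | 1123 | ±3 | 5.9K | 1.5% | 2.0% | 200 tps | 0.5s | 256K | $0 | $0 |
| 74 | 52 | Qwen3.5 122B A17B | 1123 | ±5 | 2.6K | 1.3% | 1.5% | 82 tps | 1.4s | 256K | $0.40 | $3.20 |
| 75 | 100 | Qwen Plus 0728 (Thinking) | 1123 | ±5 | 3K | 2.1% | <0.1% | 56 tps | 1.1s | 1M | $0.40 | $4.00 |
| 76 | 26 | Claude Haiku 4.5 (Extended Thinking) | 1121 | ±3 | 14.1K | 1.8% | 1.4% | 115 tps | 0.7s | 200K | $1.00 | $5.00 |
| 77 | 60 | Gemini 2.5 Flash Preview 0925 | 1118 | ±3 | 14.4K | 2.2% | 1.2% | 5 tps | 0.9s | 1M | $0.13 | $0.97 |
| 78 | 52 | GPT-5 | 1117 | ±2 | 31.1K | 1.7% | 3.1% | 78 tps | 23.1s | 400K | $1.25 | $9.67 |
| 79 | 81 | OpenAI o3-pro | 1116 | ±5 | 3.2K | 2.8% | 5.2% | 22 tps | 70.8s | 200K | $20.00 | $80.00 |
| 80 | 68 | Grok 4 | 1110 | ±1 | 98.8K | 0.9% | 3.9% | 29 tps | 11.1s | 256K | $3.00 | $15.00 |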

Showing ranks 41 to 80 of 410 models.