


Last updated about 1 month ago

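| Rank | Overall | Name | VIBE Score | Confidence Interval | Votes | Downvote % | Abort % | Speed | Latency | Context | Cost (Input) | Cost (Output) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 41 | 86 | Seed 2.0 Lite (Medium) | 1166 | ±14 | 575 | 2.5% | 6.6% | 33 tps | 1.6s | 256K | $0.25 | $2.00 |
| 42 | 71 | MiniMax M2.5 FP8 | 1162 | ±17 | 575 | 3.4% | 3.6% | 33 tps | 1.7s | 205K | $0.45 | $1.75 |
| 43 | 77 | Claude Opus 4.1 | 1161 | ±5 | 4.1K | 3.9% | 3.0% | 17 tps | 3.7s | 200K | $15.00 | $75.00 |
| 44 | 29 | Qwen3 VL 235B A22B Instruct | 1161 | ±6 | 4.5K | 8.8% | 3.1% | 75 tps | 1.9s | 129K | $0.37 | $1.81 |
| 45 | 43 | Gemini 2.5 Flash Thinking Preview 0925 | 1161 | ±3 | 7.2K | 9.1% | <0.1% | 111 tps | 4.7s | 1M | $0.30 | $2.50 |
| 46 | 44 | Kimi K2 Thinking Turbo | 1160 | ±8 | 13.2K | 3.5% | 2.0% | 75 tps | 1.4s | 262K | $1.15 | $8.00 |
| 47 | 56 | MiniMax M2.1 Lightning | 1157 | ±13 | 970 | 1.0% | 1.7% | 52 tps | 2.1s | 205K | $0.30 | $2.40 |
| 48 | 80 | GPT-5 (Minimal) | 1153 | ±6 | 6.8K | 10.0% | <0.1% | 67 tps | 1.4s | 400K | $1.25 | $10.00 |
| 49 | 71 | Gemini 2.5 Flash Thinking | 1153 | ±7 | 3.7K | 3.6% | 2.2% | 88 tps | 6.4s | 1M | $0.30 | $2.50 |
| 50 | 26 | Claude Haiku 4.5 (Extended Thinking) | 1152 | ±5 | 7K | 6.6% | 1.4% | 115 tps | 0.7s | 200K | $1.00 | $5.00 |
| 51 | 42 | Qwen3 Max Instruct Preview | 1150 | ±4 | 13.5K | 5.8% | 1.1% | 31 tps | 1.7s | 256K | $1.43 | $6.61 |
| 52 | 60 | MiniMax M2.1 | 1149 | ±6 | 10.4K | 4.3% | 2.1% | 66 tps | 2.6s | 205K | $0.30 | $1.20 |
| 53 | 44 | Grok 4.1 Fast Reasoning | 1149 | ±6 | 21.2K | 4.2% | 1.5% | 58 tps | 7.3s | 2M | $0.20 | $0.50 |
| 54 | 56 | Gemini 2.5 Pro Low | 1146 | ±4 | 7.5K | 13.0% | <0.1% | 89 tps | 2.4s | 1M | $1.25 | $10.00 |
| 55 | 40 | Qwen3 235B A22B Instruct 2507 | 1146 | ±3 | 8.8K | 12.2% | 6.8% | 13 tps | 1.9s | 262K | $0.13 | $0.52 |
| 56 | 104 | Grok 3 Beta | 1145 | ±9 | 1.8K | 0.6% | <0.1% | 58 tps | 0.8s | 131K | $3.00 | $15.00 |
| 57 | 33 | Grok 4.20 Multi Agent Beta | 1143 | ±16 | 765 | 1.9% | 1.2% | 56 tps | 8.8s | 2M | $2.00 | $6.00 |
| 58 | 100 | Gemini 2.5 Flash Preview | 1141 | ±8 | 2.1K | 1.0% | <0.1% | 138 tps | 6.9s | 1M | $0.15 | $0.60 |
| 59 | 33 | Qwen3 Next 80B A3B Instruct | 1141 | ±4 | 7.6K | 7.7% | 0.6% | 84 tps | 1.1s | 256K | $0.20 | $1.42 |
| 60 | 84 | GPT-5 Mini Minimal | 1139 | ±8 | 2.8K | 9.7% | 1.2% | 63 tps | 1.4s | 400K | $0.25 | $2.00 |
| 61 | 52 | GPT-5 | 1138 | ±4 | 14K | 7.9% | 3.1% | 78 tps | 23.1s | 400K | $1.25 | $9.67 |
| 62 | 26 | Grok 4.1 Fast Non-Reasoning | 1137 | ±5 | 7.4K | 6.6% | 0.9% | 101 tps | 0.5s | 2M | $0.20 | $0.50 |
| 63 | 65 | GLM 4.6 | 1136 | ±5 | 14.1K | 4.7% | 5.4% | 39 tps | 1.5s | 200K | $0.42 | $1.66 |
| 64 | 111 | Claude Sonnet 3.7 | 1135 | ±4 | 6.5K | 6.3% | <0.1% | 39 tps | 1.6s | 200K | $3.00 | $15.00 |
| 65 | 52 | Claude Haiku 4.5 | 1134 | ±5 | 9.9K | 6.9% | 1.1% | 100 tps | 0.9s | 200K | $1.00 | $5.00 |
| 66 | 62 | GPT-5.1 Instant | 1134 | ±5 | 5.5K | 5.7% | 1.3% | 50 tps | 1.9s | 400K | $1.25 | $10.00 |
| 67 | 68 | Grok 4 | 1130 | ±2 | 23.2K | 6.4% | 3.9% | 29 tps | 11.1s | 256K | $3.00 | $15.00 |
| 68 | 71 | GPT-5 Mini | 1130 | ±4 | 6.1K | 7.9% | 2.6% | 66 tps | 14.2s | 400K | $0.25 | $2.00 |
| 69 | 40 | DeepSeek V3.2 | 1130 | ±5 | 4.4K | 5.1% | 1.4% | 83 tps | 5.1s | 131K | $0.43 | $1.09 |
| 70 | 79 | MiniMax M2.5 Lightning | 1128 | ±14 | 995 | 2.5% | 1.5% | 51 tps | 2.0s | 205K | $0.60 | $2.40 |
| 71 | 84 | Nova Experimental Chat 10-09 | 1128 | ±6 | 2.2K | 14.1% | <0.1% | 59 tps | 6.1s | 98K | $0 | $0 |
| 72 | 48 | Grok 4 Fast Reasoning | 1125 | ±5 | 11.8K | 5.5% | 2.1% | 102 tps | 3.1s | 2M | $0.30 | $0.75 |
| 73 | 52 | Qwen3.5 122B A17B | 1124 | ±17 | 1.1K | 3.2% | 1.5% | 82 tps | 1.4s | 256K | $0.40 | $3.20 |
| 74 | 37 | Kimi K2.5 Instant | 1124 | ±13 | 1.4K | 2.4% | 2.9% | 32 tps | 3.0s | 262K | $0.50 | $3.00 |
| 75 | 81 | GPT-4o | 1124 | ±5 | 6.5K | 6.1% | 1.0% | 49 tps | 2.4s | 128K | $3.71 | $12.57 |
| 76 | 56 | DeepSeek V3.2 Thinking | 1117 | ±6 | 10K | 3.8% | 9.0% | 30 tps | 2.6s | 131K | $0.28 | $0.42 |
| 77 | 68 | Qwen Plus (Aug'24) | 1116 | ±5 | 8.9K | 9.4% | 1.4% | 53 tps | 1.3s | 30K | $0.40 | $1.20 |
| 78 | 29 | Nova Experimental Chat 12-10 | 1115 | ±8 | 2.2K | 4.8% | 2.4% | 84 tps | 12.9s | 98K | $0 | $0 |
| 79 | 71 | Gemini 3.1 Flash Lite Preview | 1114 | ±27 | 630 | 2.3% | 1.0% | 8 tps | 1.2s | 1M | $0.25 | $1.50 |
| 80 | 68 | GLM 4.7 | 1112 | ±8 | 8.8K | 4.7% | 5.8% | 40 tps | 1.5s | 200K | $0.77 | $1.73 |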
Showing ranks 41 to 80 of 312 models.