Last updated about 1 month ago

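| Rank | Overall | Name | VIBE Score | Confidence Interval | Votes | Downvote % | Abort % | Speed | Latency | Context | Cost (Input) | Cost (Output) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 201 | 133 | Nemotron 3 Nano | 1087 | ±5 | 1.9K | 2.5% | 1.3% | 216 tps | 0.8s | 256K | $0.05 | $4.94 |
| 202 | 81 | GPT-4o | 1088 | ±2 | 30.3K | 2.1% | 1.0% | 49 tps | 2.4s | 128K | $3.71 | $12.57 |
| 203 | 79 | Qwen3 Max Thinking Preview | 1089 | ±2 | 17.8K | 3.3% | 3.1% | 40 tps | 2.1s | 256K | $1.20 | $6.00 |
| 204 | 86 | Qwen3 235B A22B | 1090 | ±3 | 11.9K | 5.1% | 5.3% | 71 tps | 0.9s | 41K | $0.23 | $0.63 |
| 205 | 71 | Qwen3.5 397B A17B | 1092 | ±5 | 7.2K | 1.8% | 4.3% | 57 tps | 1.4s | 256K | $0.52 | $3.00 |
| 206 | 101 | gpt-oss-20b | 1093 | ±2 | 20.3K | 4.6% | 0.5% | 216 tps | 0.5s | 131K | $0.06 | $0.26 |
| 207 | 81 | Qwen3.5 27B | 1097 | ±6 | 2.3K | 2.4% | 3.7% | 55 tps | 2.6s | 256K | $0.30 | $2.40 |
| 208 | 62 | GPT-5.1 Instant | 1098 | ±2 | 21.7K | 2.4% | 1.3% | 50 tps | 1.9s | 400K | $1.25 | $10.00 |
| 209 | 68 | Qwen Plus (Aug'24) | 1098 | ±2 | 60.9K | 2.4% | 1.4% | 53 tps | 1.3s | 30K | $0.40 | $1.20 |
| 210 | 86 | Amazon Nova 2 Lite | 1099 | ±3 | 12.6K | 3.1% | 1.0% | 137 tps | 0.6s | 300K | $0.35 | $2.95 |
| 211 | 101 | DeepSeek V3 (Turbo) | 1100 | ±3 | 4.8K | 2.5% | 1.5% | 32 tps | 1.5s | 64K | $0.40 | $1.30 |
| 212 | 68 | Grok 4 | 1100 | ±1 | 120.3K | 2.1% | 3.9% | 29 tps | 11.1s | 256K | $3.00 | $15.00 |
| 213 | 68 | GLM 4.7 | 1101 | ±3 | 35.7K | 2.1% | 5.8% | 40 tps | 1.5s | 200K | $0.77 | $1.73 |
| 214 | 60 | Gemini 2.5 Flash Preview 0925 | 1102 | ±2 | 19.5K | 4.3% | 1.2% | 5 tps | 0.9s | 1M | $0.13 | $0.97 |
| 215 | 86 | DeepSeek V3.1 Chat | 1102 | ±3 | 13.4K | 4.1% | 2.8% | 21 tps | 1.6s | 131K | $0.38 | $1.00 |
| 216 | 71 | Seed 1.8 251228 | 1104 | ±3 | 19K | 1.5% | 3.7% | 41 tps | 2.1s | 256K | $0.25 | $2.00 |
| 217 | 84 | MiniMax M2.5 | 1105 | ±8 | 2.1K | 1.6% | 1.4% | 70 tps | 1.9s | 205K | $0.28 | $1.20 |
| 218 | 65 | GLM 4.6 | 1108 | ±3 | 25.8K | 4.3% | 5.4% | 39 tps | 1.5s | 200K | $0.42 | $1.66 |
| 219 | 86 | DeepSeek V3.1 Nex N1 | 1112 | ±6 | 2.1K | 1.7% | 3.4% | 24 tps | 7.2s | 131K | $0.14 | $0.50 |
| 220 | 95 | Qwen3 32B | 1117 | ±6 | 3.3K | 2.8% | 3.9% | 30 tps | 3.1s | 41K | $0.12 | $0.42 |
| 221 | 52 | GPT-5 | 1119 | ±2 | 44.3K | 3.9% | 3.1% | 78 tps | 23.1s | 400K | $1.25 | $9.67 |
| 222 | 79 | MiniMax M2.5 Lightning | 1121 | ±4 | 5.6K | 1.3% | 1.5% | 51 tps | 2.0s | 205K | $0.60 | $2.40 |
| 223 | 65 | Mistral Large 3 | 1122 | ±3 | 14.3K | 3.3% | 2.1% | 51 tps | 1.0s | 256K | $0.50 | $1.50 |
| 224 | 71 | DeepSeek V3.1 | 1124 | ±3 | 6.8K | 2.0% | 0.8% | 197 tps | 0.4s | 164K | $0.55 | $1.60 |
| 225 | 65 | DeepSeek V3.2 Exp Chat | 1124 | ±3 | 14.3K | 4.0% | 2.6% | 29 tps | 1.5s | 131K | $0.27 | $0.39 |
| 226 | 62 | MiniMax M2 | 1125 | ±2 | 33.6K | 3.5% | 2.2% | 39 tps | 2.3s | 205K | $0.21 | $0.85 |
| 227 | 86 | Nemotron 3 Nano (Thinking) | 1127 | ±3 | 7.5K | 2.4% | 2.0% | 200 tps | 0.5s | 256K | $0 | $0 |
| 228 | 56 | DeepSeek V3.2 Thinking | 1127 | ±3 | 37.6K | 2.6% | 9.0% | 30 tps | 2.6s | 131K | $0.28 | $0.42 |
| 229 | 52 | Grok 4 Fast Non-Reasoning | 1128 | ±3 | 21.3K | 4.7% | 1.5% | 93 tps | 0.6s | 2M | $0.27 | $0.67 |
| 230 | 44 | Grok 4.1 Fast Reasoning | 1128 | ±2 | 57K | 3.1% | 1.5% | 58 tps | 7.3s | 2M | $0.20 | $0.50 |
| 231 | 52 | Claude Haiku 4.5 | 1128 | ±2 | 31.4K | 3.7% | 1.1% | 100 tps | 0.9s | 200K | $1.00 | $5.00 |
| 232 | 48 | Grok 4 Fast Reasoning | 1128 | ±2 | 25.9K | 3.9% | 2.1% | 102 tps | 3.1s | 2M | $0.30 | $0.75 |
| 233 | 60 | MiniMax M2.1 | 1129 | ±2 | 41.8K | 2.0% | 2.1% | 66 tps | 2.6s | 205K | $0.30 | $1.20 |
| 234 | 42 | GPT-5.2 (Extra High) | 1133 | ±3 | 20.9K | 1.9% | 13.2% | 17 tps | 20.5s | 400K | $1.75 | $14.00 |
| 235 | 44 | Gemini 2.5 Pro | 1136 | ±1 | 68.8K | 3.9% | 2.3% | 45 tps | 2.6s | 1M | $1.25 | $10.00 |
| 236 | 44 | Kimi K2 Thinking Turbo | 1137 | ±3 | 29.8K | 2.5% | 2.0% | 75 tps | 1.4s | 262K | $1.15 | $8.00 |
| 237 | 48 | Claude Sonnet 4 (Thinking) | 1138 | ±2 | 30.7K | 2.6% | 1.5% | 52 tps | 1.5s | 200K | $3.00 | $13.67 |
| 238 | 37 | Claude Sonnet 4.5 | 1139 | ±2 | 37.7K | 4.3% | 1.4% | 41 tps | 1.3s | 200K | $1.80 | $9.00 |
| 239 | 56 | DeepSeek V3.1 Turbo | 1140 | ±2 | 14.5K | 2.3% | 0.9% | 173 tps | 1.3s | 164K | $2.00 | $3.75 |
| 240 | 71 | MiniMax M2.5 FP8 | 1141 | ±4 | 2.9K | 1.7% | 3.6% | 33 tps | 1.7s | 205K | $0.45 | $1.75 |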
Ranks 201 to 240 shown, of 288 models in total.