Last updated about 1 month ago

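| Rank | Overall | Name | VIBE Score | Confidence Interval | Votes | Downvote % | Abort % | Speed | Latency | Context | Cost (Input) | Cost (Output) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 41 | 48 | Grok 4 Fast Reasoning | 1154 | ±10 | 2.2K | 3.0% | 2.1% | 102 tps | 3.1s | 2M | $0.30 | $0.75 |
| 42 | 68 | Qwen Plus (Aug'24) | 1152 | ±9 | 4.8K | 1.1% | 1.4% | 53 tps | 1.3s | 30K | $0.40 | $1.20 |
| 43 | 60 | Gemini 2.5 Flash Preview 0925 | 1151 | ±9 | 1.8K | 3.3% | 1.2% | 5 tps | 0.9s | 1M | $0.13 | $0.97 |
| 44 | 37 | Qwen3 Omni 30B A3B Thinking | 1150 | ±16 | 845 | 3.4% | 3.7% | 67 tps | 1.2s | 66K | $0.97 | $1.79 |
| 45 | 37 | Claude Sonnet 4.5 | 1148 | ±9 | 3.3K | 2.7% | 1.4% | 41 tps | 1.3s | 200K | $1.80 | $9.00 |
| 46 | 106 | Claude Sonnet 3.5 v2 | 1147 | ±14 | 1K | 2.0% | <0.1% | 46 tps | 1.4s | 200K | $3.00 | $15.00 |
| 47 | 133 | Kimi K2 0905 | 1145 | ±10 | 1.5K | 2.0% | 4.0% | 30 tps | 1.4s | 262K | $0.63 | $2.39 |
| 48 | 26 | GPT-5 (High) | 1131 | ±9 | 2.7K | 3.2% | 4.5% | 81 tps | 35.9s | 400K | $1.25 | $10.00 |
| 49 | 40 | DeepSeek V3.2 | 1125 | ±10 | 2.1K | 1.6% | 1.4% | 83 tps | 5.1s | 131K | $0.43 | $1.09 |
| 50 | 111 | LongCat Flash Chat | 1121 | ±17 | 725 | 4.6% | 0.8% | 85 tps | 0.9s | 131K | $0.14 | $0.68 |
| 51 | 79 | MiniMax M2.5 Lightning | 1119 | ±16 | 650 | 0.8% | 1.5% | 51 tps | 2.0s | 205K | $0.60 | $2.40 |
| 52 | 71 | Gemini 2.5 Flash Lite Preview 0925 | 1116 | ±10 | 2K | 2.4% | 1.2% | 209 tps | 0.7s | 1M | $0.25 | $0.35 |
| 53 | 101 | Gemini 2.5 Flash Lite | 1113 | ±6 | 4.4K | 1.8% | 1.3% | 210 tps | 0.7s | 1M | $0.10 | $0.40 |
| 54 | 52 | Grok 4 Fast Non-Reasoning | 1108 | ±12 | 1.8K | 3.0% | 1.5% | 93 tps | 0.6s | 2M | $0.27 | $0.67 |
| 55 | 79 | Qwen3 Max Thinking Preview | 1105 | ±14 | 1.9K | 4.0% | 3.1% | 40 tps | 2.1s | 256K | $1.20 | $6.00 |
| 56 | 124 | Qwen3 235B A22B Thinking 2507 | 1091 | ±20 | 705 | 2.1% | 2.5% | 53 tps | 1.6s | 131K | $0.59 | $5.70 |
| 57 | 101 | gpt-oss-20b | 1085 | ±12 | 2.1K | 2.6% | 0.5% | 216 tps | 0.5s | 131K | $0.06 | $0.26 |
| 58 | 52 | GPT-5 | 1084 | ±7 | 5K | 2.1% | 3.1% | 78 tps | 23.1s | 400K | $1.25 | $9.67 |
| 59 | 56 | DeepSeek V3.1 Turbo | 1082 | ±16 | 1.8K | 2.2% | 0.9% | 173 tps | 1.3s | 164K | $2.00 | $3.75 |
| 60 | 148 | Qwen3 30B A3B Thinking 2507 | 1081 | ±19 | 890 | 2.7% | 0.5% | 124 tps | 1.2s | 131K | $0.16 | $1.70 |
| 61 | 93 | Qwen Max | 1080 | ±7 | 5.4K | 1.1% | 1.5% | 49 tps | 1.5s | 33K | $1.60 | $6.40 |
| 62 | 22 | GLM 5 | 1080 | ±15 | 1.4K | 1.4% | 3.4% | 36 tps | 2.7s | 200K | $0.72 | $2.55 |
| 63 | 113 | Mistral Medium | 1079 | ±9 | 2.7K | 1.8% | 1.8% | 48 tps | 0.6s | 33K | $1.48 | $4.55 |
| 64 | 52 | Claude Haiku 4.5 | 1078 | ±11 | 2.6K | 3.7% | 1.1% | 100 tps | 0.9s | 200K | $1.00 | $5.00 |
| 65 | 95 | DeepSeek-R1 Turbo | 1073 | ±16 | 780 | 3.7% | 2.6% | 29 tps | 1.8s | 64K | $2.85 | $4.75 |
| 66 | 95 | Kimi K2 Thinking | 1072 | ±30 | 880 | 8.8% | 4.2% | 61 tps | 5.9s | 262K | $0.24 | $1.03 |
| 67 | 60 | MiniMax M2.1 | 1071 | ±11 | 2.6K | 1.1% | 2.1% | 66 tps | 2.6s | 205K | $0.30 | $1.20 |
| 68 | 113 | Kimi K2 Fast | 1069 | ±7 | 8.5K | 1.0% | 0.8% | 365 tps | 0.5s | 131K | $1.00 | $3.00 |
| 69 | 86 | DeepSeek V3.1 Chat | 1069 | ±11 | 1.3K | 3.0% | 2.8% | 21 tps | 1.6s | 131K | $0.38 | $1.00 |
| 70 | 93 | DeepSeek V3 0324 Turbo | 1068 | ±7 | 4.2K | 0.8% | 6.3% | 12 tps | 2.4s | 164K | $0.73 | $1.79 |
| 71 | 68 | Grok 4 | 1062 | ±6 | 10.5K | 1.3% | 3.9% | 29 tps | 11.1s | 256K | $3.00 | $15.00 |
| 72 | 95 | Gemini 2.5 Flash | 1061 | ±7 | 10K | 1.0% | 1.3% | 2 tps | 3.7s | 1M | $0.30 | $2.50 |
| 73 | 121 | Qwen3 32B Fast | 1061 | ±9 | 3K | 2.4% | 11.6% | 30 tps | 3.1s | 41K | $0.10 | $0.25 |
| 74 | 44 | DeepSeek V3.1 Terminus Chat | 1060 | ±13 | 1.4K | 3.3% | 3.4% | 27 tps | 1.5s | 131K | $0.86 | $1.80 |
| 75 | 86 | Nemotron 3 Nano (Thinking) | 1059 | ±18 | 825 | 1.8% | 2.0% | 200 tps | 0.5s | 256K | $0 | $0 |
| 76 | 84 | GPT-5 Mini Minimal | 1057 | ±11 | 745 | 5.7% | 1.2% | 63 tps | 1.4s | 400K | $0.25 | $2.00 |
| 77 | 71 | Gemini 2.5 Flash Thinking | 1055 | ±11 | 2.8K | 2.3% | 2.2% | 88 tps | 6.4s | 1M | $0.30 | $2.50 |
| 78 | 65 | DeepSeek V3.2 Exp Chat | 1054 | ±14 | 1.2K | 3.7% | 2.6% | 29 tps | 1.5s | 131K | $0.27 | $0.39 |
| 79 | 52 | Qwen3.5 122B A17B | 1053 | ±25 | 580 | 3.3% | 1.5% | 82 tps | 1.4s | 256K | $0.40 | $3.20 |
| 80 | 133 | Qwen3 14B | 1053 | ±13 | 1.7K | 2.9% | 1.7% | 109 tps | 0.8s | 41K | $0.04 | $0.15 |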

Showing ranks 41 to 80 of 173 models.