
Last updated about 1 month ago

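| Rank | Overall | Name | VIBE Score | Confidence Interval | Votes | Downvote % | Abort % | Speed | Latency | Context | Cost (Input) | Cost (Output) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 41 | 77 | GPT-4.5 Preview | 1173 | ±13 | 495 | 2.9% | <0.1% | 36 tps | 3.0s | 200K | $75.00 | $150.00 |
| 42 | 42 | GPT-5.2 (Extra High) | 1172 | ±13 | 3K | 1.6% | 13.2% | 17 tps | 20.5s | 400K | $1.75 | $14.00 |
| 43 | 44 | Kimi K2 Thinking Turbo | 1171 | ±11 | 1.8K | 4.2% | 2.0% | 75 tps | 1.4s | 262K | $1.15 | $8.00 |
| 44 | 80 | GPT-5 (Minimal) | 1169 | ±10 | 1.9K | 2.6% | <0.1% | 67 tps | 1.4s | 400K | $1.25 | $10.00 |
| 45 | 33 | Qwen Plus 0728 | 1167 | ±20 | 585 | 7.1% | <0.1% | 55 tps | 0.9s | 1M | $0.40 | $1.20 |
| 46 | 48 | Step 3.5 Flash | 1163 | ±23 | 640 | 0.8% | 2.2% | 109 tps | 0.6s | 256K | $0.05 | $0.15 |
| 47 | 56 | Gemini 3.1 Flash Lite Preview Thinking | 1162 | ±19 | 730 | 2.0% | 1.7% | 75 tps | 4.7s | 1M | $0.25 | $1.50 |
| 48 | 44 | Gemini 2.5 Pro | 1161 | ±6 | 12.1K | 1.4% | 2.3% | 45 tps | 2.6s | 1M | $1.25 | $10.00 |
| 49 | 44 | Grok 4.1 Fast Reasoning | 1159 | ±11 | 4.3K | 2.9% | 1.5% | 58 tps | 7.3s | 2M | $0.20 | $0.50 |
| 50 | 56 | DeepSeek V3.2 Thinking | 1158 | ±19 | 2.4K | 2.3% | 9.0% | 30 tps | 2.6s | 131K | $0.28 | $0.42 |
| 51 | 26 | Claude Haiku 4.5 (Extended Thinking) | 1154 | ±8 | 2.4K | 2.8% | 1.4% | 115 tps | 0.7s | 200K | $1.00 | $5.00 |
| 52 | 48 | Grok 4 Fast Reasoning | 1154 | ±10 | 2.2K | 3.0% | 2.1% | 102 tps | 3.1s | 2M | $0.30 | $0.75 |
| 53 | 68 | Qwen Plus (Aug'24) | 1152 | ±9 | 4.8K | 1.1% | 1.4% | 53 tps | 1.3s | 30K | $0.40 | $1.20 |
| 54 | 60 | Gemini 2.5 Flash Preview 0925 | 1151 | ±9 | 1.8K | 3.3% | 1.2% | 5 tps | 0.9s | 1M | $0.13 | $0.97 |
| 55 | 37 | Qwen3 Omni 30B A3B Thinking | 1150 | ±16 | 845 | 3.4% | 3.7% | 67 tps | 1.2s | 66K | $0.97 | $1.79 |
| 56 | 37 | Claude Sonnet 4.5 | 1148 | ±9 | 3.3K | 2.7% | 1.4% | 41 tps | 1.3s | 200K | $1.80 | $9.00 |
| 57 | 106 | Claude Sonnet 3.5 v2 | 1147 | ±14 | 1K | 2.0% | <0.1% | 46 tps | 1.4s | 200K | $3.00 | $15.00 |
| 58 | 111 | Claude Sonnet 3.7 | 1146 | ±9 | 2.1K | 2.1% | <0.1% | 39 tps | 1.6s | 200K | $3.00 | $15.00 |
| 59 | 133 | Kimi K2 0905 | 1145 | ±10 | 1.5K | 2.0% | 4.0% | 30 tps | 1.4s | 262K | $0.63 | $2.39 |
| 60 | 43 | Gemini 2.5 Flash Thinking Preview 0925 | 1137 | ±8 | 1.9K | 3.0% | <0.1% | 111 tps | 4.7s | 1M | $0.30 | $2.50 |
| 61 | 26 | GPT-5 (High) | 1131 | ±9 | 2.7K | 3.2% | 4.5% | 81 tps | 35.9s | 400K | $1.25 | $10.00 |
| 62 | 40 | DeepSeek V3.2 | 1125 | ±10 | 2.1K | 1.6% | 1.4% | 83 tps | 5.1s | 131K | $0.43 | $1.09 |
| 63 | 111 | LongCat Flash Chat | 1121 | ±17 | 725 | 4.6% | 0.8% | 85 tps | 0.9s | 131K | $0.14 | $0.68 |
| 64 | 21 | Claude Opus 4 | 1120 | ±13 | 920 | 2.1% | <0.1% | 25 tps | 1.5s | 200K | $15.00 | $75.00 |
| 65 | 79 | MiniMax M2.5 Lightning | 1119 | ±16 | 650 | 0.8% | 1.5% | 51 tps | 2.0s | 205K | $0.60 | $2.40 |
| 66 | 71 | Gemini 2.5 Flash Lite Preview 0925 | 1116 | ±10 | 2K | 2.4% | 1.2% | 209 tps | 0.7s | 1M | $0.25 | $0.35 |
| 67 | 101 | Gemini 2.5 Flash Lite | 1113 | ±6 | 4.4K | 1.8% | 1.3% | 210 tps | 0.7s | 1M | $0.10 | $0.40 |
| 68 | 52 | Grok 4 Fast Non-Reasoning | 1108 | ±12 | 1.8K | 3.0% | 1.5% | 93 tps | 0.6s | 2M | $0.27 | $0.67 |
| 69 | 79 | Qwen3 Max Thinking Preview | 1105 | ±14 | 1.9K | 4.0% | 3.1% | 40 tps | 2.1s | 256K | $1.20 | $6.00 |
| 70 | 77 | Claude Opus 4.1 | 1097 | ±13 | 965 | 4.0% | 3.0% | 17 tps | 3.7s | 200K | $15.00 | $75.00 |
| 71 | 48 | OpenAI o1-mini | 1095 | ±8 | 4.9K | 1.0% | <0.1% | 118 tps | N/A | 128K | $1.13 | $4.51 |
| 72 | 124 | Qwen3 235B A22B Thinking 2507 | 1091 | ±20 | 705 | 2.1% | 2.5% | 53 tps | 1.6s | 131K | $0.59 | $5.70 |
| 73 | 133 | Solar Pro 2 250710 | 1086 | ±11 | 2.8K | 0.9% | <0.1% | 9 tps | N/A | 66K | $0.50 | $0.50 |
| 74 | 101 | gpt-oss-20b | 1085 | ±12 | 2.1K | 2.6% | 0.5% | 216 tps | 0.5s | 131K | $0.06 | $0.26 |
| 75 | 52 | GPT-5 | 1084 | ±7 | 5K | 2.1% | 3.1% | 78 tps | 23.1s | 400K | $1.25 | $9.67 |
| 76 | 147 | Arcee AI Maestro Reasoning | 1084 | ±14 | 1.3K | 1.1% | <0.1% | 85 tps | 0.3s | 131K | $0.90 | $3.30 |
| 77 | 56 | DeepSeek V3.1 Turbo | 1082 | ±16 | 1.8K | 2.2% | 0.9% | 173 tps | 1.3s | 164K | $2.00 | $3.75 |
| 78 | 148 | Qwen3 30B A3B Thinking 2507 | 1081 | ±19 | 890 | 2.7% | 0.5% | 124 tps | 1.2s | 131K | $0.16 | $1.70 |
| 79 | 93 | Qwen Max | 1080 | ±7 | 5.4K | 1.1% | 1.5% | 49 tps | 1.5s | 33K | $1.60 | $6.40 |
| 80 | 22 | GLM 5 | 1080 | ±15 | 1.4K | 1.4% | 3.4% | 36 tps | 2.7s | 200K | $0.72 | $2.55 |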
Showing ranks 41 to 80 of 223 models.
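
Many of the scores above differ by less than their confidence intervals, so adjacent ranks may not reflect a real difference. The following is a minimal Python sketch of one rough way to check this, assuming the ± value is a symmetric half-width around the VIBE Score (the page does not state the confidence level or how the interval is computed). The `Entry` class and `intervals_overlap` function are hypothetical names introduced for illustration; the three data points are taken from the leaderboard rows for GPT-4.5 Preview, Kimi K2 Thinking Turbo, and GLM 5.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One leaderboard row, reduced to name, score, and interval half-width."""
    name: str
    score: int
    ci: int  # assumed symmetric half-width of the confidence interval

    @property
    def low(self) -> int:
        return self.score - self.ci

    @property
    def high(self) -> int:
        return self.score + self.ci


def intervals_overlap(a: Entry, b: Entry) -> bool:
    """Return True when the two score intervals share at least one point."""
    return a.low <= b.high and b.low <= a.high


# Values copied from the leaderboard rows.
gpt45 = Entry("GPT-4.5 Preview", 1173, 13)
kimi = Entry("Kimi K2 Thinking Turbo", 1171, 11)
glm5 = Entry("GLM 5", 1080, 15)

for other in (kimi, glm5):
    verdict = "overlap" if intervals_overlap(gpt45, other) else "do not overlap"
    print(f"{gpt45.name} and {other.name}: intervals {verdict}")
```

Overlapping intervals are only a heuristic for "too close to call", not a formal significance test; a proper comparison would need the underlying vote data.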