Last updated about 1 month ago

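| Rank | Overall | Name | VIBE Score | Confidence Interval | Votes | Downvote % | Abort % | Speed | Latency | Context | Cost (Input) | Cost (Output) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 22 | GPT-5 Chat | 1313 | ±8 | 2.4K | 0.6% | 1.3% | 95 tps | 0.9s | 400K | $1.25 | $10.00 |
| 2 | 19 | Mistral Medium 3.1 | 1254 | ±13 | 805 | <0.1% | <0.1% | 77 tps | 0.7s | 128K | $0.40 | $2.00 |
| 3 | 8 | GPT-5.1 | 1239 | ±21 | 960 | 0.5% | 2.3% | 71 tps | 1.4s | 400K | $1.42 | $11.33 |
| 4 | 10 | Gemini 3 Pro | 1235 | ±13 | 1.6K | 1.2% | 2.1% | 50 tps | 3.6s | 1M | $2.00 | $12.00 |
| 5 | 10 | GPT-5.2 Instant | 1214 | ±21 | 1.3K | 0.4% | 1.7% | 52 tps | 2.0s | 400K | $1.75 | $14.00 |
| 6 | 14 | Gemini 3 Pro (Low) | 1211 | ±21 | 1K | 0.9% | 2.4% | 51 tps | 3.5s | 1M | $2.00 | $12.00 |
| 7 | 16 | GPT-5.2 | 1203 | ±26 | 680 | 0.7% | 4.1% | 18 tps | 2.7s | 400K | $1.75 | $14.00 |
| 8 | 26 | Claude Haiku 4.5 (Extended Thinking) | 1200 | ±17 | 610 | 0.8% | 1.4% | 115 tps | 0.7s | 200K | $1.00 | $5.00 |
| 9 | 14 | Gemini 3 Flash Preview Thinking | 1195 | ±20 | 950 | 1.6% | 1.6% | 3 tps | 6.2s | 1M | $0.50 | $3.00 |
| 10 | 7 | Claude Opus 4.5 (Thinking) | 1188 | ±22 | 1.1K | 1.8% | 1.8% | 49 tps | 1.4s | 200K | $5.00 | $25.00 |
| 11 | 33 | Kimi K2.5 | 1186 | ±38 | 760 | 0.7% | 6.5% | 33 tps | 1.7s | 262K | $0.34 | $2.57 |
| 12 | 26 | Grok 4.1 Fast Non-Reasoning | 1183 | ±28 | 980 | 1.0% | 0.9% | 101 tps | 0.5s | 2M | $0.20 | $0.50 |
| 13 | 16 | Nova Experimental Chat 11-10 | 1180 | ±19 | 565 | 1.7% | 0.4% | 84 tps | 8.9s | 98K | $0 | $0 |
| 14 | 52 | Claude Haiku 4.5 | 1169 | ±17 | 960 | 1.0% | 1.1% | 100 tps | 0.9s | 200K | $1.00 | $5.00 |
| 15 | 8 | GPT-5.1 (High) | 1145 | ±19 | 1.1K | 3.6% | 3.2% | 76 tps | 6.9s | 400K | $1.25 | $10.00 |
| 16 | 37 | Claude Sonnet 4.5 | 1139 | ±17 | 1.3K | 1.1% | 1.4% | 41 tps | 1.3s | 200K | $1.80 | $9.00 |
| 17 | 33 | Qwen3 30B A3B Instruct 2507 | 1139 | ±16 | 780 | 1.3% | 1.2% | 55 tps | 1.3s | 131K | $0.13 | $0.72 |
| 18 | 17 | Gemini 3 Flash Preview | 1137 | ±20 | 615 | 0.8% | 1.3% | 138 tps | 1.4s | 1M | $0.50 | $3.00 |
| 19 | 48 | gpt-oss-120b | 1136 | ±19 | 775 | 1.3% | 0.7% | 213 tps | 0.5s | 131K | $0.11 | $0.50 |
| 20 | 33 | Qwen3 Next 80B A3B Instruct | 1135 | ±25 | 670 | 0.7% | 0.6% | 84 tps | 1.1s | 256K | $0.20 | $1.42 |
| 21 | 106 | DeepSeek V3 0324 | 1125 | ±15 | 1.4K | 1.8% | 5.8% | 12 tps | 2.7s | 164K | $0.38 | $0.93 |
| 22 | 81 | GPT-4o | 1121 | ±14 | 895 | 1.6% | 1.0% | 49 tps | 2.4s | 128K | $3.71 | $12.57 |
| 23 | 56 | Gemini 2.5 Pro Low | 1114 | ±23 | 535 | 1.8% | <0.1% | 89 tps | 2.4s | 1M | $1.25 | $10.00 |
| 24 | 32 | Gemini 2.5 Pro High | 1109 | ±20 | 1.2K | 1.3% | 1.5% | 48 tps | 2.3s | 1M | $1.25 | $10.00 |
| 25 | 17 | Claude Opus 4.5 | 1086 | ±21 | 625 | 0.8% | 1.5% | 45 tps | 1.5s | 200K | $5.00 | $25.00 |
| 26 | 40 | DeepSeek V3.2 | 1083 | ±25 | 640 | 0.8% | 1.4% | 83 tps | 5.1s | 131K | $0.43 | $1.09 |
| 27 | 10 | Claude Sonnet 4.5 (Thinking) | 1080 | ±15 | 1K | 1.0% | 1.9% | 44 tps | 1.1s | 200K | $3.00 | $15.00 |
| 28 | 44 | Gemini 2.5 Pro | 1076 | ±12 | 1.9K | 1.3% | 2.3% | 45 tps | 2.6s | 1M | $1.25 | $10.00 |
| 29 | 62 | GPT-5.1 Instant | 1070 | ±21 | 705 | 1.4% | 1.3% | 50 tps | 1.9s | 400K | $1.25 | $10.00 |
| 30 | 106 | Claude Sonnet 3.5 v2 | 1069 | ±23 | 630 | 1.6% | <0.1% | 46 tps | 1.4s | 200K | $3.00 | $15.00 |
| 31 | 43 | Gemini 2.5 Flash Thinking Preview 0925 | 1068 | ±21 | 660 | 1.5% | <0.1% | 111 tps | 4.7s | 1M | $0.30 | $2.50 |
| 32 | 42 | Qwen3 Max Instruct Preview | 1059 | ±19 | 980 | 1.0% | 1.1% | 31 tps | 1.7s | 256K | $1.43 | $6.61 |
| 33 | 93 | Qwen Max | 1056 | ±11 | 1.6K | 2.4% | 1.5% | 49 tps | 1.5s | 33K | $1.60 | $6.40 |
| 34 | 93 | DeepSeek V3 0324 Turbo | 1054 | ±16 | 1.5K | 2.0% | 6.3% | 12 tps | 2.4s | 164K | $0.73 | $1.79 |
| 35 | 56 | DeepSeek V3.2 Thinking | 1050 | ±23 | 695 | 1.4% | 9.0% | 30 tps | 2.6s | 131K | $0.28 | $0.42 |
| 36 | 40 | Qwen3 235B A22B Instruct 2507 | 1050 | ±17 | 795 | 1.2% | 6.8% | 13 tps | 1.9s | 262K | $0.13 | $0.52 |
| 37 | 68 | Qwen Plus (Aug'24) | 1046 | ±18 | 1.5K | 1.3% | 1.4% | 53 tps | 1.3s | 30K | $0.40 | $1.20 |
| 38 | 17 | GPT-5.2 (High) | 1036 | ±20 | 1.3K | 1.1% | 6.7% | 18 tps | 16.3s | 400K | $1.75 | $14.00 |
| 39 | 44 | Grok 4.1 Fast Reasoning | 1036 | ±25 | 1.2K | 1.2% | 1.5% | 58 tps | 7.3s | 2M | $0.20 | $0.50 |
| 40 | 44 | DeepSeek V3.1 Terminus Chat | 1034 | ±22 | 590 | 1.7% | 3.4% | 27 tps | 1.5s | 131K | $0.86 | $1.80 |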
Showing 40 of 88 models.