Last updated about 1 month ago

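| Rank | Overall | Name | VIBE Score | Confidence Interval | Votes | Downvote % | Abort % | Speed | Latency | Context | Cost (Input) | Cost (Output) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | Claude Opus 4.6 (Thinking) | 1524 | ±16 | 980 | 1.0% | 2.5% | 56 tps | 1.6s | 200K | $5.00 | $25.00 |
| 2 | 2 | Claude Opus 4.6 | 1424 | ±16 | 950 | 1.0% | 2.1% | 48 tps | 1.7s | 200K | $5.00 | $25.00 |
| 3 | 7 | Claude Opus 4.5 (Thinking) | 1280 | ±14 | 2.8K | 1.4% | 1.8% | 49 tps | 1.4s | 200K | $5.00 | $25.00 |
| 4 | 4 | Claude Sonnet 4.6 | 1266 | ±27 | 650 | 1.5% | 1.6% | 47 tps | 1.2s | 200K | $3.00 | $15.00 |
| 5 | 10 | GPT-5.2 Instant | 1256 | ±15 | 1.3K | 1.8% | 1.7% | 52 tps | 2.0s | 400K | $1.75 | $14.00 |
| 6 | 10 | Gemini 3 Pro | 1248 | ±16 | 3.5K | 1.5% | 2.1% | 50 tps | 3.6s | 1M | $2.00 | $12.00 |
| 7 | 6 | Gemini 3.1 Pro | 1244 | ±23 | 1.4K | 1.7% | 3.5% | 35 tps | 4.1s | 1M | $2.00 | $12.00 |
| 8 | 14 | Gemini 3 Pro (Low) | 1240 | ±19 | 1.2K | 0.8% | 2.4% | 51 tps | 3.5s | 1M | $2.00 | $12.00 |
| 9 | 8 | GPT-5.1 (High) | 1231 | ±15 | 1.8K | 1.7% | 3.2% | 76 tps | 6.9s | 400K | $1.25 | $10.00 |
| 10 | 8 | GPT-5.1 | 1230 | ±13 | 1.3K | 1.9% | 2.3% | 71 tps | 1.4s | 400K | $1.42 | $11.33 |
| 11 | 5 | Claude Sonnet 4.6 (Thinking) | 1222 | ±23 | 630 | 1.6% | 4.7% | 57 tps | 1.1s | 200K | $3.00 | $15.00 |
| 12 | 19 | Mistral Medium 3.1 | 1178 | ±10 | 1.5K | 1.6% | <0.1% | 77 tps | 0.7s | 128K | $0.40 | $2.00 |
| 13 | 14 | Gemini 3 Flash Preview Thinking | 1167 | ±17 | 1.4K | 1.7% | 1.6% | 3 tps | 6.2s | 1M | $0.50 | $3.00 |
| 14 | 17 | Gemini 3 Flash Preview | 1165 | ±21 | 675 | 1.5% | 1.3% | 138 tps | 1.4s | 1M | $0.50 | $3.00 |
| 15 | 22 | GPT-5 Chat | 1164 | ±12 | 3.5K | 1.6% | 1.3% | 95 tps | 0.9s | 400K | $1.25 | $10.00 |
| 16 | 16 | GPT-5.2 | 1162 | ±18 | 785 | 1.9% | 4.1% | 18 tps | 2.7s | 400K | $1.75 | $14.00 |
| 17 | 17 | GPT-5.2 (High) | 1145 | ±15 | 2.2K | 1.6% | 6.7% | 18 tps | 16.3s | 400K | $1.75 | $14.00 |
| 18 | 17 | Claude Opus 4.5 | 1135 | ±21 | 1.1K | 1.4% | 1.5% | 45 tps | 1.5s | 200K | $5.00 | $25.00 |
| 19 | 44 | Gemini 2.5 Pro | 1125 | ±6 | 3.1K | 3.7% | 2.3% | 45 tps | 2.6s | 1M | $1.25 | $10.00 |
| 20 | 26 | Claude Haiku 4.5 (Extended Thinking) | 1123 | ±19 | 1.1K | 1.9% | 1.4% | 115 tps | 0.7s | 200K | $1.00 | $5.00 |
| 21 | 16 | Nova Experimental Chat 11-10 | 1120 | ±20 | 500 | 2.0% | 0.4% | 84 tps | 8.9s | 98K | $0 | $0 |
| 22 | 32 | Gemini 2.5 Pro High | 1119 | ±10 | 2.5K | 2.4% | 1.5% | 48 tps | 2.3s | 1M | $1.25 | $10.00 |
| 23 | 13 | GPT-5.3 Instant | 1110 | ±33 | 515 | 1.0% | 0.9% | 63 tps | 0.8s | 400K | $1.75 | $14.00 |
| 24 | 33 | Kimi K2.5 | 1110 | ±26 | 720 | 2.0% | 6.5% | 33 tps | 1.7s | 262K | $0.34 | $2.57 |
| 25 | 42 | GPT-5.2 (Extra High) | 1107 | ±24 | 890 | 2.7% | 13.2% | 17 tps | 20.5s | 400K | $1.75 | $14.00 |
| 26 | 10 | Claude Sonnet 4.5 (Thinking) | 1102 | ±13 | 3.2K | 3.6% | 1.9% | 44 tps | 1.1s | 200K | $3.00 | $15.00 |
| 27 | 43 | Gemini 2.5 Flash Thinking Preview 0925 | 1097 | ±10 | 1.3K | 1.6% | <0.1% | 111 tps | 4.7s | 1M | $0.30 | $2.50 |
| 28 | 29 | Qwen3 VL 235B A22B Instruct | 1094 | ±15 | 675 | 2.2% | 3.1% | 75 tps | 1.9s | 129K | $0.37 | $1.81 |
| 29 | 48 | Claude Sonnet 4 (Thinking) | 1093 | ±14 | 1.6K | 2.4% | 1.5% | 52 tps | 1.5s | 200K | $3.00 | $13.67 |
| 30 | 42 | Qwen3 Max Instruct Preview | 1083 | ±17 | 1.1K | 1.7% | 1.1% | 31 tps | 1.7s | 256K | $1.43 | $6.61 |
| 31 | 44 | DeepSeek V3.1 Terminus Chat | 1078 | ±12 | 580 | 1.7% | 3.4% | 27 tps | 1.5s | 131K | $0.86 | $1.80 |
| 32 | 26 | GPT-5 (High) | 1061 | ±9 | 2.5K | 2.7% | 4.5% | 81 tps | 35.9s | 400K | $1.25 | $10.00 |
| 33 | 52 | Claude Haiku 4.5 | 1060 | ±13 | 1.6K | 3.1% | 1.1% | 100 tps | 0.9s | 200K | $1.00 | $5.00 |
| 34 | 65 | GLM 4.6 | 1059 | ±25 | 640 | 2.3% | 5.4% | 39 tps | 1.5s | 200K | $0.42 | $1.66 |
| 35 | 33 | Qwen3 30B A3B Instruct 2507 | 1056 | ±18 | 810 | 2.4% | 1.2% | 55 tps | 1.3s | 131K | $0.13 | $0.72 |
| 36 | 40 | Qwen3 235B A22B Instruct 2507 | 1053 | ±19 | 680 | 0.7% | 6.8% | 13 tps | 1.9s | 262K | $0.13 | $0.52 |
| 37 | 95 | Gemini 2.5 Flash | 1049 | ±18 | 2.1K | 1.9% | 1.3% | 2 tps | 3.7s | 1M | $0.30 | $2.50 |
| 38 | 68 | Qwen Plus (Aug'24) | 1048 | ±22 | 730 | 2.0% | 1.4% | 53 tps | 1.3s | 30K | $0.40 | $1.20 |
| 39 | 56 | Gemini 2.5 Pro Low | 1044 | ±16 | 1.3K | 2.3% | <0.1% | 89 tps | 2.4s | 1M | $1.25 | $10.00 |
| 40 | 84 | Claude Sonnet 3.7 (Thinking) | 1041 | ±22 | 575 | 2.5% | <0.1% | 41 tps | 2.6s | 200K | $3.00 | $15.00 |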
Showing 40 of 98 models.