More filters

Show inactive models

Hide models that are no longer actively available on Yupp.

Turns

Filter model performance by the number of turns in a conversation.

Open license models

Filter the leaderboard to only show models that have an open license.


Last updated about 1 month ago

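| Rank | Overall | Name | VIBE Score | Confidence Interval | Votes | Downvote % | Abort % | Speed | Latency | Context | Cost (Input) | Cost (Output) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | Claude Opus 4.6 (Thinking) | 1569 | ±10 | 1.5K | 1.7% | 2.5% | 56 tps | 1.6s | 200K | $5.00 | $25.00 |
| 2 | 2 | GPT-5.4 | 1493 | ±14 | 695 | 1.4% | 2.6% | 55 tps | 0.8s | 1M | $2.50 | $15.00 |
| 3 | 2 | Claude Opus 4.6 | 1469 | ±16 | 1.6K | 2.4% | 2.1% | 48 tps | 1.7s | 200K | $5.00 | $25.00 |
| 4 | 6 | Gemini 3.1 Pro | 1418 | ±11 | 4.9K | 1.0% | 3.5% | 35 tps | 4.1s | 1M | $2.00 | $12.00 |
| 5 | 8 | GPT-5.1 (High) | 1368 | ±10 | 4.9K | 2.2% | 3.2% | 76 tps | 6.9s | 400K | $1.25 | $10.00 |
| 6 | 4 | Claude Sonnet 4.6 | 1364 | ±19 | 1.9K | 1.3% | 1.6% | 47 tps | 1.2s | 200K | $3.00 | $15.00 |
| 7 | 10 | GPT-5.2 Instant | 1361 | ±11 | 4K | 1.1% | 1.7% | 52 tps | 2.0s | 400K | $1.75 | $14.00 |
| 8 | 8 | GPT-5.1 | 1360 | ±9 | 2.6K | 2.4% | 2.3% | 71 tps | 1.4s | 400K | $1.42 | $11.33 |
| 9 | 33 | Qwen3 30B A3B Instruct 2507 | 1345 | ±6 | 3.4K | 1.4% | 1.2% | 55 tps | 1.3s | 131K | $0.13 | $0.72 |
| 10 | 10 | Gemini 3 Pro | 1343 | ±7 | 16.2K | 1.4% | 2.1% | 50 tps | 3.6s | 1M | $2.00 | $12.00 |
| 11 | 16 | GPT-5.2 | 1329 | ±9 | 3.2K | 1.2% | 4.1% | 18 tps | 2.7s | 400K | $1.75 | $14.00 |
| 12 | 7 | Claude Opus 4.5 (Thinking) | 1313 | ±9 | 6K | 2.5% | 1.8% | 49 tps | 1.4s | 200K | $5.00 | $25.00 |
| 13 | 14 | Gemini 3 Pro (Low) | 1300 | ±8 | 4.1K | 1.8% | 2.4% | 51 tps | 3.5s | 1M | $2.00 | $12.00 |
| 14 | 17 | Gemini 3 Flash Preview | 1281 | ±11 | 2K | 1.2% | 1.3% | 138 tps | 1.4s | 1M | $0.50 | $3.00 |
| 15 | 5 | Claude Sonnet 4.6 (Thinking) | 1280 | ±14 | 1.4K | 2.2% | 4.7% | 57 tps | 1.1s | 200K | $3.00 | $15.00 |
| 16 | 17 | GPT-5.2 (High) | 1275 | ±12 | 6.5K | 1.4% | 6.7% | 18 tps | 16.3s | 400K | $1.75 | $14.00 |
| 17 | 29 | Qwen3 VL 235B A22B Instruct | 1273 | ±14 | 1.1K | 2.3% | 3.1% | 75 tps | 1.9s | 129K | $0.37 | $1.81 |
| 18 | 33 | Qwen3 Next 80B A3B Instruct | 1270 | ±10 | 2.3K | 2.8% | 0.6% | 84 tps | 1.1s | 256K | $0.20 | $1.42 |
| 19 | 22 | GPT-5 Chat | 1269 | ±5 | 7.9K | 1.6% | 1.3% | 95 tps | 0.9s | 400K | $1.25 | $10.00 |
| 20 | 48 | gpt-oss-120b | 1269 | ±7 | 4.6K | 1.4% | 0.7% | 213 tps | 0.5s | 131K | $0.11 | $0.50 |
| 21 | 40 | Qwen3 235B A22B Instruct 2507 | 1261 | ±8 | 3.1K | 1.4% | 6.8% | 13 tps | 1.9s | 262K | $0.13 | $0.52 |
| 22 | 10 | Claude Sonnet 4.5 (Thinking) | 1261 | ±7 | 5.5K | 1.9% | 1.9% | 44 tps | 1.1s | 200K | $3.00 | $15.00 |
| 23 | 26 | Grok 4.1 Fast Non-Reasoning | 1260 | ±16 | 2.5K | 3.7% | 0.9% | 101 tps | 0.5s | 2M | $0.20 | $0.50 |
| 24 | 14 | Gemini 3 Flash Preview Thinking | 1248 | ±10 | 3.7K | 1.3% | 1.6% | 3 tps | 6.2s | 1M | $0.50 | $3.00 |
| 25 | 32 | Gemini 2.5 Pro High | 1234 | ±6 | 4.6K | 2.4% | 1.5% | 48 tps | 2.3s | 1M | $1.25 | $10.00 |
| 26 | 62 | GPT-5.1 Instant | 1233 | ±12 | 2.6K | 2.7% | 1.3% | 50 tps | 1.9s | 400K | $1.25 | $10.00 |
| 27 | 81 | GPT-4o | 1228 | ±11 | 2.9K | 1.7% | 1.0% | 49 tps | 2.4s | 128K | $3.71 | $12.57 |
| 28 | 13 | GPT-5.3 Instant | 1225 | ±13 | 2.3K | 1.3% | 0.9% | 63 tps | 0.8s | 400K | $1.75 | $14.00 |
| 29 | 42 | Qwen3 Max Instruct Preview | 1192 | ±9 | 2.8K | 3.0% | 1.1% | 31 tps | 1.7s | 256K | $1.43 | $6.61 |
| 30 | 29 | Nova Experimental Chat 12-10 | 1192 | ±11 | 1.4K | 0.7% | 2.4% | 84 tps | 12.9s | 98K | $0 | $0 |
| 31 | 17 | Claude Opus 4.5 | 1177 | ±15 | 1.9K | 4.7% | 1.5% | 45 tps | 1.5s | 200K | $5.00 | $25.00 |
| 32 | 33 | Kimi K2.5 | 1175 | ±12 | 3.1K | 1.6% | 6.5% | 33 tps | 1.7s | 262K | $0.34 | $2.57 |
| 33 | 42 | GPT-5.2 (Extra High) | 1172 | ±13 | 3K | 1.6% | 13.2% | 17 tps | 20.5s | 400K | $1.75 | $14.00 |
| 34 | 44 | Kimi K2 Thinking Turbo | 1171 | ±11 | 1.8K | 4.2% | 2.0% | 75 tps | 1.4s | 262K | $1.15 | $8.00 |
| 35 | 48 | Step 3.5 Flash | 1163 | ±23 | 640 | 0.8% | 2.2% | 109 tps | 0.6s | 256K | $0.05 | $0.15 |
| 36 | 56 | Gemini 3.1 Flash Lite Preview Thinking | 1162 | ±19 | 730 | 2.0% | 1.7% | 75 tps | 4.7s | 1M | $0.25 | $1.50 |
| 37 | 44 | Gemini 2.5 Pro | 1161 | ±6 | 12.1K | 1.4% | 2.3% | 45 tps | 2.6s | 1M | $1.25 | $10.00 |
| 38 | 44 | Grok 4.1 Fast Reasoning | 1159 | ±11 | 4.3K | 2.9% | 1.5% | 58 tps | 7.3s | 2M | $0.20 | $0.50 |
| 39 | 56 | DeepSeek V3.2 Thinking | 1158 | ±19 | 2.4K | 2.3% | 9.0% | 30 tps | 2.6s | 131K | $0.28 | $0.42 |
| 40 | 26 | Claude Haiku 4.5 (Extended Thinking) | 1154 | ±8 | 2.4K | 2.8% | 1.4% | 115 tps | 0.7s | 200K | $1.00 | $5.00 |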
Showing the top 40 of 173 models.
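
One way to read the Confidence Interval column is to check whether two models' score ranges overlap. The sketch below is a rough illustration only. It assumes the ± value is a symmetric half-width around the VIBE Score, which the page does not state, and the `Entry` type and `intervals_overlap` helper are hypothetical names introduced here. Overlap is a quick heuristic for "too close to call", not a formal significance test, and it is not necessarily how Yupp derives its rankings.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    name: str
    score: float       # value from the VIBE Score column
    half_width: float  # the "±" value, assumed to be a symmetric half-width


def interval(e: Entry) -> tuple[float, float]:
    """Return the (low, high) score range implied by score ± half_width."""
    return (e.score - e.half_width, e.score + e.half_width)


def intervals_overlap(a: Entry, b: Entry) -> bool:
    """True if the two score ranges share at least one point."""
    a_lo, a_hi = interval(a)
    b_lo, b_hi = interval(b)
    return a_lo <= b_hi and b_lo <= a_hi


# Values copied from the top three rows of the leaderboard.
opus_thinking = Entry("Claude Opus 4.6 (Thinking)", 1569, 10)
gpt_54 = Entry("GPT-5.4", 1493, 14)
opus = Entry("Claude Opus 4.6", 1469, 16)

print(intervals_overlap(opus_thinking, gpt_54))  # False: 1559..1579 vs 1479..1507
print(intervals_overlap(gpt_54, opus))           # True: 1479..1507 vs 1453..1485
```

Under these assumptions, the first-place score is clearly separated from second place, while the ranges for GPT-5.4 and Claude Opus 4.6 overlap.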