Last updated about 1 month ago

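| Rank | Overall | Name | VIBE Score | Confidence Interval | Votes | Downvote % | Abort % | Speed | Latency | Context | Cost (Input) | Cost (Output) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 4 | GPT-5.4 (High) | 1398 | ±10 | 2.6K | 1.2% | 4.6% | 68 tps | 7.9s | 1M | $2.50 | $15.00 |
| 2 | 2 | GPT-5.4 | 1364 | ±7 | 2.3K | 1.1% | 2.6% | 55 tps | 0.8s | 1M | $2.50 | $15.00 |
| 3 | 1 | Claude Opus 4.6 (Thinking) | 1342 | ±5 | 5.7K | 0.9% | 2.5% | 56 tps | 1.6s | 200K | $5.00 | $25.00 |
| 4 | 22 | Grok 4.20 Beta Non-reasoning | 1316 | ±13 | 630 | 3.8% | 1.1% | 151 tps | 0.6s | 2M | $2.00 | $6.00 |
| 5 | 8 | GPT-5.1 (High) | 1289 | ±2 | 23K | 1.4% | 3.2% | 76 tps | 6.9s | 400K | $1.25 | $10.00 |
| 6 | 8 | GPT-5.1 | 1288 | ±2 | 19.7K | 1.3% | 2.3% | 71 tps | 1.4s | 400K | $1.42 | $11.33 |
| 7 | 8 | GPT-5.1 (Medium) | 1286 | ±3 | 6.6K | 1.7% | <0.1% | 86 tps | 3.8s | 400K | $0.83 | $6.67 |
| 8 | 2 | Claude Opus 4.6 | 1274 | ±4 | 7.1K | 1.4% | 2.1% | 48 tps | 1.7s | 200K | $5.00 | $25.00 |
| 9 | 6 | Gemini 3.1 Pro | 1270 | ±5 | 12.3K | 1.1% | 3.5% | 35 tps | 4.1s | 1M | $2.00 | $12.00 |
| 10 | 10 | GPT-5.2 Instant | 1260 | ±3 | 27.1K | 0.8% | 1.7% | 52 tps | 2.0s | 400K | $1.75 | $14.00 |
| 11 | 5 | Claude Sonnet 4.6 (Thinking) | 1256 | ±5 | 5.8K | 1.4% | 4.7% | 57 tps | 1.1s | 200K | $3.00 | $15.00 |
| 12 | 16 | Nova Experimental Chat 11-10 | 1249 | ±3 | 14.2K | 1.6% | 0.4% | 84 tps | 8.9s | 98K | $0 | $0 |
| 13 | 19 | Mistral Medium 3.1 | 1249 | ±2 | 25.6K | 1.6% | <0.1% | 77 tps | 0.7s | 128K | $0.40 | $2.00 |
| 14 | 10 | Gemini 3 Pro | 1222 | ±3 | 44.5K | 1.1% | 2.1% | 50 tps | 3.6s | 1M | $2.00 | $12.00 |
| 15 | 37 | Nova Experimental Chat 10-20 | 1219 | ±3 | 10.2K | 2.8% | <0.1% | 30 tps | 0.5s | 98K | $0 | $0 |
| 16 | 17 | Grok 4.20 Beta Reasoning | 1216 | ±7 | 2.1K | 1.7% | 1.1% | 77 tps | 4.5s | 2M | $2.00 | $5.50 |
| 17 | 28 | Ministral 8B 2512 | 1213 | ±7 | 1.5K | 2.6% | <0.1% | 174 tps | 0.5s | 128K | $0.15 | $0.15 |
| 18 | 33 | Qwen Plus 0728 | 1213 | ±3 | 5.2K | 2.3% | <0.1% | 55 tps | 0.9s | 1M | $0.40 | $1.20 |
| 19 | 4 | Claude Sonnet 4.6 | 1212 | ±5 | 6K | 1.1% | 1.6% | 47 tps | 1.2s | 200K | $3.00 | $15.00 |
| 20 | 29 | Qwen3 VL 235B A22B Instruct | 1211 | ±3 | 10.2K | 2.5% | 3.1% | 75 tps | 1.9s | 129K | $0.37 | $1.81 |
| 21 | 22 | GPT-5 Chat | 1208 | ±2 | 58K | 1.4% | 1.3% | 95 tps | 0.9s | 400K | $1.25 | $10.00 |
| 22 | 26 | Grok 4.1 Fast Non-Reasoning | 1207 | ±3 | 20.1K | 1.7% | 0.9% | 101 tps | 0.5s | 2M | $0.20 | $0.50 |
| 23 | 13 | GPT-5.3 Instant | 1206 | ±4 | 5.5K | 1.0% | 0.9% | 63 tps | 0.8s | 400K | $1.75 | $14.00 |
| 24 | 29 | Nova Experimental Chat 12-10 | 1206 | ±3 | 9K | 1.2% | 2.4% | 84 tps | 12.9s | 98K | $0 | $0 |
| 25 | 37 | Sherlock Dash Alpha | 1205 | ±5 | 1.7K | 1.7% | <0.1% | 68 tps | 0.7s | 2M | $0 | $0 |
| 26 | 33 | Grok 4.20 Multi Agent Beta | 1197 | ±7 | 1.7K | 1.8% | 1.2% | 56 tps | 8.8s | 2M | $2.00 | $6.00 |
| 27 | 14 | Gemini 3 Flash Preview Thinking | 1195 | ±3 | 19.7K | 1.0% | 1.6% | 3 tps | 6.2s | 1M | $0.50 | $3.00 |
| 28 | 14 | Gemini 3 Pro (Low) | 1195 | ±3 | 20.3K | 1.1% | 2.4% | 51 tps | 3.5s | 1M | $2.00 | $12.00 |
| 29 | 16 | GPT-5.2 | 1193 | ±2 | 16.3K | 0.9% | 4.1% | 18 tps | 2.7s | 400K | $1.75 | $14.00 |
| 30 | 37 | Qwen3 Omni 30B A3B Thinking | 1188 | ±5 | 5.3K | 1.2% | 3.7% | 67 tps | 1.2s | 66K | $0.97 | $1.79 |
| 31 | 32 | Gemini 2.5 Pro High | 1175 | ±1 | 27.6K | 2.0% | 1.5% | 48 tps | 2.3s | 1M | $1.25 | $10.00 |
| 32 | 17 | Gemini 3 Flash Preview | 1173 | ±3 | 12.8K | 0.7% | 1.3% | 138 tps | 1.4s | 1M | $0.50 | $3.00 |
| 33 | 17 | GPT-5.2 (High) | 1168 | ±2 | 31.4K | 1.1% | 6.7% | 18 tps | 16.3s | 400K | $1.75 | $14.00 |
| 34 | 33 | Qwen3 30B A3B Instruct 2507 | 1167 | ±2 | 24.4K | 1.7% | 1.2% | 55 tps | 1.3s | 131K | $0.13 | $0.72 |
| 35 | 56 | Gemini 2.5 Pro Low | 1166 | ±2 | 16K | 2.1% | <0.1% | 89 tps | 2.4s | 1M | $1.25 | $10.00 |
| 36 | 48 | OpenAI o1-mini | 1160 | ±2 | 17K | 1.8% | <0.1% | 118 tps | N/A | 128K | $1.13 | $4.51 |
| 37 | 44 | Grok 4.1 Fast Reasoning | 1160 | ±2 | 23.1K | 1.8% | 1.5% | 58 tps | 7.3s | 2M | $0.20 | $0.50 |
| 38 | 62 | Qwen3 Omni 30B A3B Instruct | 1157 | ±6 | 2.3K | 1.9% | 3.9% | 65 tps | 1.2s | 66K | $0.35 | $0.97 |
| 39 | 26 | GPT-5 (High) | 1156 | ±3 | 12.3K | 2.3% | 4.5% | 81 tps | 35.9s | 400K | $1.25 | $10.00 |
| 40 | 52 | Grok 4 Fast Non-Reasoning | 1156 | ±2 | 16.7K | 2.2% | 1.5% | 93 tps | 0.6s | 2M | $0.27 | $0.67 |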
Showing the top 40 of 303 models.
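Adjacent ranks on this leaderboard are often separated by fewer points than their confidence intervals span, so it can help to check whether two intervals overlap before reading much into a rank difference. The following is a minimal sketch, assuming the ± figure is a symmetric half-width around the VIBE Score; the page does not say how the interval is computed. The `Entry` type and the `interval` and `intervals_overlap` helpers are hypothetical names introduced for illustration, and the four entries use scores and ± values from the leaderboard rows above.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    name: str
    score: float       # VIBE Score as listed on the leaderboard
    half_width: float  # the ± value; assumed to be a symmetric half-width


def interval(e: Entry) -> tuple[float, float]:
    """Return the (low, high) bounds implied by score ± half_width."""
    return (e.score - e.half_width, e.score + e.half_width)


def intervals_overlap(a: Entry, b: Entry) -> bool:
    """True when the two score intervals share at least one point."""
    a_lo, a_hi = interval(a)
    b_lo, b_hi = interval(b)
    return a_lo <= b_hi and b_lo <= a_hi


# Values taken from the leaderboard rows (ranks 1, 2, 16, and 17).
gpt_54_high = Entry("GPT-5.4 (High)", 1398, 10)
gpt_54 = Entry("GPT-5.4", 1364, 7)
grok_reasoning = Entry("Grok 4.20 Beta Reasoning", 1216, 7)
ministral = Entry("Ministral 8B 2512", 1213, 7)

print(intervals_overlap(gpt_54_high, gpt_54))        # False: 1388..1408 vs 1357..1371
print(intervals_overlap(grok_reasoning, ministral))  # True: 1209..1223 vs 1206..1220
```

Under that assumption, the gap between ranks 1 and 2 is larger than both intervals combined, while ranks 16 and 17 are not clearly separated. Overlap is only a rough heuristic here, not a formal significance test.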