


Last updated about 1 month ago

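| Rank | Overall | Name | VIBE Score | Confidence Interval | Votes | Downvote % | Abort % | Speed | Latency | Context | Cost (Input) | Cost (Output) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 41 | 40 | Qwen3 235B A22B Instruct 2507 | 1126 | ±8 | 2.1K | 2.5% | 6.8% | 13 tps | 1.9s | 262K | $0.13 | $0.52 |
| 42 | 60 | Gemini 2.5 Flash Preview 0925 | 1124 | ±9 | 3.4K | 3.4% | 1.2% | 5 tps | 0.9s | 1M | $0.13 | $0.97 |
| 43 | 56 | Claude Opus 4.1 (Thinking) | 1119 | ±10 | 2.1K | 5.0% | <0.1% | 20 tps | 3.9s | 200K | $15.00 | $75.00 |
| 44 | 48 | Claude Sonnet 4 (Thinking) | 1116 | ±7 | 5.3K | 4.2% | 1.5% | 52 tps | 1.5s | 200K | $3.00 | $13.67 |
| 45 | 26 | GPT-5 (High) | 1115 | ±7 | 3.7K | 3.6% | 4.5% | 81 tps | 35.9s | 400K | $1.25 | $10.00 |
| 46 | 52 | GPT-5 | 1115 | ±8 | 5.4K | 3.7% | 3.1% | 78 tps | 23.1s | 400K | $1.25 | $9.67 |
| 47 | 81 | GPT-4o | 1113 | ±8 | 2.2K | 3.6% | 1.0% | 49 tps | 2.4s | 128K | $3.71 | $12.57 |
| 48 | 29 | Nova Experimental Chat 12-10 | 1110 | ±25 | 710 | 1.4% | 2.4% | 84 tps | 12.9s | 98K | $0 | $0 |
| 49 | 77 | GPT-4.5 Preview | 1106 | ±16 | 520 | 2.8% | <0.1% | 36 tps | 3.0s | 200K | $75.00 | $150.00 |
| 50 | 95 | Gemini 2.5 Flash | 1104 | ±7 | 9.8K | 2.7% | 1.3% | 2 tps | 3.7s | 1M | $0.30 | $2.50 |
| 51 | 42 | Qwen3 Max Instruct Preview | 1103 | ±13 | 2.2K | 2.0% | 1.1% | 31 tps | 1.7s | 256K | $1.43 | $6.61 |
| 52 | 68 | Grok 4 | 1102 | ±7 | 7.8K | 3.3% | 3.9% | 29 tps | 11.1s | 256K | $3.00 | $15.00 |
| 53 | 19 | Mistral Medium 3.1 | 1101 | ±10 | 1.9K | 3.5% | <0.1% | 77 tps | 0.7s | 128K | $0.40 | $2.00 |
| 54 | 21 | Claude Opus 4 | 1101 | ±8 | 2.3K | 3.2% | <0.1% | 25 tps | 1.5s | 200K | $15.00 | $75.00 |
| 55 | 37 | Kimi K2.5 Instant | 1093 | ±12 | 1.4K | 2.7% | 2.9% | 32 tps | 3.0s | 262K | $0.50 | $3.00 |
| 56 | 33 | Kimi K2.5 | 1090 | ±13 | 4.3K | 2.1% | 6.5% | 33 tps | 1.7s | 262K | $0.34 | $2.57 |
| 57 | 16 | Nova Experimental Chat 11-10 | 1089 | ±23 | 750 | 2.6% | 0.4% | 84 tps | 8.9s | 98K | $0 | $0 |
| 58 | 95 | Gemini 2.5 Flash Lite Thinking Preview 0925 | 1086 | ±8 | 3.4K | 2.7% | 1.5% | 152 tps | 3.0s | 1M | $0.10 | $0.40 |
| 59 | 48 | OpenAI o1-mini | 1086 | ±8 | 2.2K | 2.2% | <0.1% | 118 tps | N/A | 128K | $1.13 | $4.51 |
| 60 | 71 | DeepSeek V3.1 | 1085 | ±14 | 690 | 3.5% | 0.8% | 197 tps | 0.4s | 164K | $0.55 | $1.60 |
| 61 | 77 | Claude Opus 4.1 | 1084 | ±6 | 2.4K | 4.3% | 3.0% | 17 tps | 3.7s | 200K | $15.00 | $75.00 |
| 62 | 33 | Qwen3 30B A3B Instruct 2507 | 1084 | ±8 | 2.3K | 3.2% | 1.2% | 55 tps | 1.3s | 131K | $0.13 | $0.72 |
| 63 | 33 | Qwen3 Next 80B A3B Instruct | 1083 | ±16 | 1.5K | 2.6% | 0.6% | 84 tps | 1.1s | 256K | $0.20 | $1.42 |
| 64 | 84 | Claude Sonnet 3.7 (Thinking) | 1078 | ±8 | 3.8K | 4.7% | <0.1% | 41 tps | 2.6s | 200K | $3.00 | $15.00 |
| 65 | 48 | gpt-oss-120b | 1074 | ±6 | 3K | 2.6% | 0.7% | 213 tps | 0.5s | 131K | $0.11 | $0.50 |
| 66 | 44 | Kimi K2 Thinking Turbo | 1072 | ±17 | 1.3K | 2.2% | 2.0% | 75 tps | 1.4s | 262K | $1.15 | $8.00 |
| 67 | 65 | DeepSeek V3.2 Exp Chat | 1072 | ±12 | 755 | 2.6% | 2.6% | 29 tps | 1.5s | 131K | $0.27 | $0.39 |
| 68 | 68 | GLM 4.7 | 1071 | ±12 | 1.9K | 2.1% | 5.8% | 40 tps | 1.5s | 200K | $0.77 | $1.73 |
| 69 | 113 | Gemini 2.5 Flash Lite Thinking | 1071 | ±10 | 2.3K | 3.2% | 1.0% | 118 tps | 4.4s | 1M | $0.03 | $0.13 |
| 70 | 56 | DeepSeek V3.1 Turbo | 1070 | ±12 | 1.3K | 2.6% | 0.9% | 173 tps | 1.3s | 164K | $2.00 | $3.75 |
| 71 | 48 | Step 3.5 Flash | 1067 | ±20 | 630 | 2.3% | 2.2% | 109 tps | 0.6s | 256K | $0.05 | $0.15 |
| 72 | 71 | Qwen3.5 397B A17B | 1067 | ±15 | 1.3K | 2.2% | 4.3% | 57 tps | 1.4s | 256K | $0.52 | $3.00 |
| 73 | 52 | Qwen3.5 122B A17B | 1063 | ±14 | 980 | 1.5% | 1.5% | 82 tps | 1.4s | 256K | $0.40 | $3.20 |
| 74 | 71 | Gemini 3.1 Flash Lite Preview | 1060 | ±22 | 1.2K | 3.3% | 1.0% | 8 tps | 1.2s | 1M | $0.25 | $1.50 |
| 75 | 26 | Grok 4.1 Fast Non-Reasoning | 1058 | ±19 | 2K | 4.1% | 0.9% | 101 tps | 0.5s | 2M | $0.20 | $0.50 |
| 76 | 44 | DeepSeek V3.1 Terminus Chat | 1056 | ±13 | 955 | 2.1% | 3.4% | 27 tps | 1.5s | 131K | $0.86 | $1.80 |
| 77 | 81 | Qwen3.5 27B | 1056 | ±17 | 665 | 2.9% | 3.7% | 55 tps | 2.6s | 256K | $0.30 | $2.40 |
| 78 | 40 | DeepSeek V3.2 | 1056 | ±15 | 1.4K | 1.4% | 1.4% | 83 tps | 5.1s | 131K | $0.43 | $1.09 |
| 79 | 106 | Claude Sonnet 3.5 v2 | 1055 | ±22 | 770 | 3.8% | <0.1% | 46 tps | 1.4s | 200K | $3.00 | $15.00 |
| 80 | 100 | Gemini 2.5 Flash Preview | 1050 | ±18 | 565 | 2.6% | <0.1% | 138 tps | 6.9s | 1M | $0.15 | $0.60 |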
Showing ranks 41 to 80 of 188 models.
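
Many adjacent rows in the leaderboard have VIBE Scores only a few points apart, so the Confidence Interval column matters when comparing them. The sketch below is one way to check whether two entries' score ranges overlap. It assumes the ± value is a symmetric half-width around the score, which the page does not state; the `Entry` class and `intervals_overlap` function are hypothetical names introduced for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One leaderboard row, reduced to the fields needed here."""
    name: str
    score: int       # value from the VIBE Score column
    half_width: int  # the "±" value; ASSUMED to be a symmetric half-width

    @property
    def low(self) -> int:
        return self.score - self.half_width

    @property
    def high(self) -> int:
        return self.score + self.half_width


def intervals_overlap(a: Entry, b: Entry) -> bool:
    """Return True when the two score ranges share at least one point."""
    return a.low <= b.high and b.low <= a.high


# Values copied from the leaderboard rows for ranks 41, 43 and 50.
qwen = Entry("Qwen3 235B A22B Instruct 2507", 1126, 8)   # range 1118..1134
opus = Entry("Claude Opus 4.1 (Thinking)", 1119, 10)     # range 1109..1129
flash = Entry("Gemini 2.5 Flash", 1104, 7)               # range 1097..1111

print(intervals_overlap(qwen, opus))   # True
print(intervals_overlap(qwen, flash))  # False
```

Under that assumption, overlapping ranges suggest that the ordering of two models could plausibly flip with more votes, while non-overlapping ranges indicate a clearer gap. This is a rough reading aid, not a formal significance test.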