Models



Last updated about 1 month ago

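| Rank | Overall | Name | VIBE Score | Confidence Interval | Votes | Downvote % | Abort % | Speed | Latency | Context | Cost (Input) | Cost (Output) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 81 | 60 | Claude Sonnet 3.5 v2 | 1171 | ±6 | 5.5K | 3.4% | <0.1% | 46 tps | 1.4s | 200K | $3.00 | $15.00 |
| 82 | 60 | GPT-5.1 Codex (Medium) | 1171 | ±14 | 3K | 3.2% | 4.6% | 71 tps | 3.7s | 400K | $1.25 | $10.00 |
| 83 | 60 | GPT-5.1 Instant | 1171 | ±8 | 8.3K | 4.1% | 1.3% | 50 tps | 1.9s | 400K | $1.25 | $10.00 |
| 84 | 75 | Gemini 2.5 Pro Low | 1170 | ±4 | 9.6K | 8.1% | <0.1% | 89 tps | 2.4s | 1M | $1.25 | $10.00 |
| 85 | 60 | Grok 4.20 Beta Reasoning | 1167 | ±22 | 1.2K | 4.1% | 1.1% | 77 tps | 4.5s | 2M | $2.00 | $5.50 |
| 86 | 69 | gpt-oss-120b | 1165 | ±5 | 19.2K | 5.0% | 0.7% | 213 tps | 0.5s | 131K | $0.11 | $0.50 |
| 87 | 69 | Qwen3.5 35B A3B | 1164 | ±25 | 865 | 3.9% | 2.1% | 116 tps | 2.1s | 256K | $0.63 | $1.13 |
| 88 | 69 | GPT-5 Codex (Low) | 1163 | ±10 | 5K | 4.1% | 2.7% | 112 tps | 3.5s | 400K | $1.25 | $10.00 |
| 89 | 69 | GLM 4.7 | 1161 | ±7 | 16.8K | 3.7% | 5.8% | 40 tps | 1.5s | 200K | $0.77 | $1.73 |
| 90 | 86 | Gemini 2.5 Flash Preview | 1161 | ±8 | 3K | 1.1% | <0.1% | 138 tps | 6.9s | 1M | $0.15 | $0.60 |
| 91 | 86 | GPT-5 (Minimal) | 1158 | ±5 | 8.3K | 7.4% | <0.1% | 67 tps | 1.4s | 400K | $1.25 | $10.00 |
| 92 | 69 | DeepSeek V3.1 Terminus Chat | 1158 | ±5 | 6.5K | 6.9% | 3.4% | 27 tps | 1.5s | 131K | $0.86 | $1.80 |
| 93 | 74 | Qwen Plus (Aug'24) | 1146 | ±5 | 17.2K | 4.7% | 1.4% | 53 tps | 1.3s | 30K | $0.40 | $1.20 |
| 94 | 74 | Qwen3.5 397B A17B | 1142 | ±14 | 2.5K | 2.9% | 4.3% | 57 tps | 1.4s | 256K | $0.52 | $3.00 |
| 95 | 74 | Gemini 2.5 Flash Preview 0925 | 1140 | ±6 | 7.6K | 6.0% | 1.2% | 5 tps | 0.9s | 1M | $0.13 | $0.97 |
| 96 | 93 | Gemini 2.5 Flash Preview Thinking | 1136 | ±10 | 1.4K | 1.8% | <0.1% | 26 tps | 1.8s | 1M | $0.15 | $1.76 |
| 97 | 97 | Grok 3 Beta | 1134 | ±9 | 2K | 0.8% | <0.1% | 58 tps | 0.8s | 131K | $3.00 | $15.00 |
| 98 | 77 | Mistral Large 3 | 1131 | ±8 | 5.4K | 5.8% | 2.1% | 51 tps | 1.0s | 256K | $0.50 | $1.50 |
| 99 | 77 | GPT-5 Mini | 1131 | ±5 | 8.6K | 5.4% | 2.6% | 66 tps | 14.2s | 400K | $0.25 | $2.00 |
| 100 | 77 | DeepSeek V3.1 Turbo | 1130 | ±7 | 4.8K | 5.3% | 0.9% | 173 tps | 1.3s | 164K | $2.00 | $3.75 |
| 101 | 77 | Grok 4.20 Multi Agent Beta | 1129 | ±19 | 945 | 3.6% | 1.2% | 56 tps | 8.8s | 2M | $2.00 | $6.00 |
| 102 | 77 | Qwen3 Max Thinking Preview | 1127 | ±10 | 6.3K | 5.7% | 3.1% | 40 tps | 2.1s | 256K | $1.20 | $6.00 |
| 103 | 77 | Grok 4 | 1125 | ±3 | 39.6K | 4.4% | 3.9% | 29 tps | 11.1s | 256K | $3.00 | $15.00 |
| 104 | 97 | Ministral 8B 2512 | 1125 | ±15 | 510 | 7.3% | <0.1% | 174 tps | 0.5s | 128K | $0.15 | $0.15 |
| 105 | 77 | GPT-4.1 | 1123 | ±5 | 32.8K | 5.2% | 3.7% | 112 tps | 1.3s | 1M | $2.00 | $8.00 |
| 106 | 77 | Gemini 2.5 Flash Lite Preview 0925 | 1122 | ±7 | 8.5K | 6.6% | 1.2% | 209 tps | 0.7s | 1M | $0.25 | $0.35 |
| 107 | 97 | Gemini 2.5 Pro Preview 0605 | 1121 | ±10 | 1.7K | 2.3% | <0.1% | 0 tps | 3.7s | 1M | $1.25 | $10.00 |
| 108 | 85 | Gemini 2.5 Flash Thinking | 1118 | ±4 | 13.7K | 3.6% | 2.2% | 88 tps | 6.4s | 1M | $0.30 | $2.50 |
| 109 | 85 | GPT-5 Mini Minimal | 1114 | ±12 | 3.2K | 8.5% | 1.2% | 63 tps | 1.4s | 400K | $0.25 | $2.00 |
| 110 | 85 | GPT-5.2 Codex (Low) | 1113 | ±19 | 1.2K | 3.2% | 4.5% | 41 tps | 5.0s | 400K | $1.75 | $14.00 |
| 111 | 108 | Gemini 2.5 Pro Preview 0325 | 1111 | ±11 | 1.5K | 3.2% | <0.1% | 3 tps | 16.6s | 1M | $1.25 | $10.00 |
| 112 | 85 | DeepSeek V3.1 Chat | 1110 | ±7 | 4.9K | 6.6% | 2.8% | 21 tps | 1.6s | 131K | $0.38 | $1.00 |
| 113 | 85 | Qwen3 Omni 30B A3B Thinking | 1110 | ±10 | 2.3K | 6.0% | 3.7% | 67 tps | 1.2s | 66K | $0.97 | $1.79 |
| 114 | 90 | DeepSeek V3.2 Exp Chat | 1107 | ±4 | 5.5K | 6.1% | 2.6% | 29 tps | 1.5s | 131K | $0.27 | $0.39 |
| 115 | 90 | Qwen Max | 1107 | ±4 | 18.3K | 4.2% | 1.5% | 49 tps | 1.5s | 33K | $1.60 | $6.40 |
| 116 | 114 | GPT-5 Mini Low | 1104 | ±8 | 2.8K | 7.2% | <0.1% | 69 tps | 3.2s | 400K | $0.25 | $2.00 |
| 117 | 90 | Gemini 2.5 Flash Lite | 1103 | ±5 | 21.3K | 6.2% | 1.3% | 210 tps | 0.7s | 1M | $0.10 | $0.40 |
| 118 | 90 | Grok 3 Fast | 1102 | ±14 | 2.5K | 4.7% | 1.7% | 52 tps | 2.4s | 131K | $5.00 | $25.00 |
| 119 | 90 | GPT-4o | 1102 | ±5 | 8.5K | 3.7% | 1.0% | 49 tps | 2.4s | 128K | $3.71 | $12.57 |
| 120 | 90 | Step 3.5 Flash | 1102 | ±24 | 810 | 3.6% | 2.2% | 109 tps | 0.6s | 256K | $0.05 | $0.15 |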
Ranks 81 to 120 shown, out of 404 models in total.
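
Many neighbouring rows above have scores only a few points apart, so the Confidence Interval column matters when comparing them. The sketch below is one rough way to read it, under the assumption that the "±" value is a symmetric half-width around the VIBE Score; the page does not say how the interval is computed. The `Entry` class and `intervals_overlap` helper are hypothetical names used for illustration, and an overlap check is only a quick heuristic, not a formal significance test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    name: str
    score: float       # value from the VIBE Score column
    half_width: float  # the "±" value, assumed to be a symmetric half-width

    @property
    def low(self) -> float:
        return self.score - self.half_width

    @property
    def high(self) -> float:
        return self.score + self.half_width


def intervals_overlap(a: Entry, b: Entry) -> bool:
    """Return True when the two score intervals share at least one point."""
    return a.low <= b.high and b.low <= a.high


# Score and "±" values copied from three rows of the leaderboard.
claude = Entry("Claude Sonnet 3.5 v2", 1171, 6)
gemini = Entry("Gemini 2.5 Pro Low", 1170, 4)
gpt4o = Entry("GPT-4o", 1102, 5)

print(intervals_overlap(claude, gemini))  # True: 1165..1177 vs 1166..1174
print(intervals_overlap(claude, gpt4o))   # False: 1165..1177 vs 1097..1107
```

Under this reading, the one-point gap between Claude Sonnet 3.5 v2 and Gemini 2.5 Pro Low falls well inside both intervals, while the gap to GPT-4o does not.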