Show inactive models

Hide models that are no longer actively available on Yupp.

Turns

Filter model performance by the number of turns in a conversation.

Open license models

Filter the leaderboard to only show models that have an open license.

Last updated about 1 month ago

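| Rank | Overall | Name | VIBE Score | Confidence Interval | Votes | Downvote % | Abort % | Speed | Latency | Context | Cost (Input) | Cost (Output) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 41 | 81 | GPT-4o | 1113 | ±8 | 2.2K | 3.6% | 1.0% | 49 tps | 2.4s | 128K | $3.71 | $12.57 |
| 42 | 29 | Nova Experimental Chat 12-10 | 1110 | ±25 | 710 | 1.4% | 2.4% | 84 tps | 12.9s | 98K | $0 | $0 |
| 43 | 95 | Gemini 2.5 Flash | 1104 | ±7 | 9.8K | 2.7% | 1.3% | 2 tps | 3.7s | 1M | $0.30 | $2.50 |
| 44 | 42 | Qwen3 Max Instruct Preview | 1103 | ±13 | 2.2K | 2.0% | 1.1% | 31 tps | 1.7s | 256K | $1.43 | $6.61 |
| 45 | 68 | Grok 4 | 1102 | ±7 | 7.8K | 3.3% | 3.9% | 29 tps | 11.1s | 256K | $3.00 | $15.00 |
| 46 | 37 | Kimi K2.5 Instant | 1093 | ±12 | 1.4K | 2.7% | 2.9% | 32 tps | 3.0s | 262K | $0.50 | $3.00 |
| 47 | 33 | Kimi K2.5 | 1090 | ±13 | 4.3K | 2.1% | 6.5% | 33 tps | 1.7s | 262K | $0.34 | $2.57 |
| 48 | 95 | Gemini 2.5 Flash Lite Thinking Preview 0925 | 1086 | ±8 | 3.4K | 2.7% | 1.5% | 152 tps | 3.0s | 1M | $0.10 | $0.40 |
| 49 | 71 | DeepSeek V3.1 | 1085 | ±14 | 690 | 3.5% | 0.8% | 197 tps | 0.4s | 164K | $0.55 | $1.60 |
| 50 | 33 | Qwen3 30B A3B Instruct 2507 | 1084 | ±8 | 2.3K | 3.2% | 1.2% | 55 tps | 1.3s | 131K | $0.13 | $0.72 |
| 51 | 33 | Qwen3 Next 80B A3B Instruct | 1083 | ±16 | 1.5K | 2.6% | 0.6% | 84 tps | 1.1s | 256K | $0.20 | $1.42 |
| 52 | 48 | gpt-oss-120b | 1074 | ±6 | 3K | 2.6% | 0.7% | 213 tps | 0.5s | 131K | $0.11 | $0.50 |
| 53 | 44 | Kimi K2 Thinking Turbo | 1072 | ±17 | 1.3K | 2.2% | 2.0% | 75 tps | 1.4s | 262K | $1.15 | $8.00 |
| 54 | 65 | DeepSeek V3.2 Exp Chat | 1072 | ±12 | 755 | 2.6% | 2.6% | 29 tps | 1.5s | 131K | $0.27 | $0.39 |
| 55 | 68 | GLM 4.7 | 1071 | ±12 | 1.9K | 2.1% | 5.8% | 40 tps | 1.5s | 200K | $0.77 | $1.73 |
| 56 | 113 | Gemini 2.5 Flash Lite Thinking | 1071 | ±10 | 2.3K | 3.2% | 1.0% | 118 tps | 4.4s | 1M | $0.03 | $0.13 |
| 57 | 56 | DeepSeek V3.1 Turbo | 1070 | ±12 | 1.3K | 2.6% | 0.9% | 173 tps | 1.3s | 164K | $2.00 | $3.75 |
| 58 | 48 | Step 3.5 Flash | 1067 | ±20 | 630 | 2.3% | 2.2% | 109 tps | 0.6s | 256K | $0.05 | $0.15 |
| 59 | 71 | Qwen3.5 397B A17B | 1067 | ±15 | 1.3K | 2.2% | 4.3% | 57 tps | 1.4s | 256K | $0.52 | $3.00 |
| 60 | 52 | Qwen3.5 122B A17B | 1063 | ±14 | 980 | 1.5% | 1.5% | 82 tps | 1.4s | 256K | $0.40 | $3.20 |
| 61 | 71 | Gemini 3.1 Flash Lite Preview | 1060 | ±22 | 1.2K | 3.3% | 1.0% | 8 tps | 1.2s | 1M | $0.25 | $1.50 |
| 62 | 26 | Grok 4.1 Fast Non-Reasoning | 1058 | ±19 | 2K | 4.1% | 0.9% | 101 tps | 0.5s | 2M | $0.20 | $0.50 |
| 63 | 44 | DeepSeek V3.1 Terminus Chat | 1056 | ±13 | 955 | 2.1% | 3.4% | 27 tps | 1.5s | 131K | $0.86 | $1.80 |
| 64 | 81 | Qwen3.5 27B | 1056 | ±17 | 665 | 2.9% | 3.7% | 55 tps | 2.6s | 256K | $0.30 | $2.40 |
| 65 | 40 | DeepSeek V3.2 | 1056 | ±15 | 1.4K | 1.4% | 1.4% | 83 tps | 5.1s | 131K | $0.43 | $1.09 |
| 66 | 106 | Claude Sonnet 3.5 v2 | 1055 | ±22 | 770 | 3.8% | <0.1% | 46 tps | 1.4s | 200K | $3.00 | $15.00 |
| 67 | 48 | Grok 4 Fast Reasoning | 1049 | ±10 | 2.3K | 3.6% | 2.1% | 102 tps | 3.1s | 2M | $0.30 | $0.75 |
| 68 | 118 | GPT-4.1 mini | 1045 | ±8 | 3.4K | 2.5% | 1.1% | 67 tps | 0.9s | 1M | $0.34 | $1.60 |
| 69 | 113 | Mistral Medium | 1043 | ±11 | 1.1K | 3.1% | 1.8% | 48 tps | 0.6s | 33K | $1.48 | $4.55 |
| 70 | 86 | Amazon Nova 2 Lite | 1042 | ±23 | 690 | 2.1% | 1.0% | 137 tps | 0.6s | 300K | $0.35 | $2.95 |
| 71 | 101 | Gemini 2.5 Flash Lite | 1042 | ±6 | 7.8K | 4.3% | 1.3% | 210 tps | 0.7s | 1M | $0.10 | $0.40 |
| 72 | 37 | Qwen3 Omni 30B A3B Thinking | 1040 | ±20 | 750 | 2.0% | 3.7% | 67 tps | 1.2s | 66K | $0.97 | $1.79 |
| 73 | 113 | GLM 4.5 | 1038 | ±12 | 915 | 3.2% | 3.7% | 46 tps | 1.4s | 131K | $0.43 | $1.63 |
| 74 | 93 | DeepSeek V3 0324 Turbo | 1038 | ±9 | 2.2K | 1.8% | 6.3% | 12 tps | 2.4s | 164K | $0.73 | $1.79 |
| 75 | 95 | DeepSeek V3.2 Exp Thinking | 1038 | ±17 | 655 | 3.7% | 7.2% | 26 tps | 3.0s | 131K | $0.28 | $0.42 |
| 76 | 86 | DeepSeek V3.1 Chat | 1038 | ±13 | 975 | 2.5% | 2.8% | 21 tps | 1.6s | 131K | $0.38 | $1.00 |
| 77 | 29 | Qwen3 VL 235B A22B Instruct | 1036 | ±16 | 1.3K | 4.2% | 3.1% | 75 tps | 1.9s | 129K | $0.37 | $1.81 |
| 78 | 106 | Grok 3 | 1034 | ±8 | 2.8K | 2.8% | 1.5% | 53 tps | 0.6s | 1M | $3.67 | $18.33 |
| 79 | 79 | MiniMax M2.5 Lightning | 1031 | ±20 | 820 | 1.8% | 1.5% | 51 tps | 2.0s | 205K | $0.60 | $2.40 |
| 80 | 52 | Grok 4 Fast Non-Reasoning | 1030 | ±17 | 1.5K | 4.1% | 1.5% | 93 tps | 0.6s | 2M | $0.27 | $0.67 |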
Showing ranks 41 to 80 of 154 models.
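
Adjacent scores on the leaderboard are often closer together than their confidence intervals are wide, so a difference in rank does not always indicate a clear difference in score. The sketch below is one rough way to check this, assuming the ± value is a symmetric half-width around the VIBE Score. The `Entry` type and `intervals_overlap` helper are hypothetical names introduced for illustration, and interval overlap is only a heuristic, not a formal significance test or the leaderboard's own method.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    """One leaderboard row, reduced to the fields needed here (hypothetical type)."""
    name: str
    score: int       # VIBE Score
    half_width: int  # the "±" value, assumed to be a symmetric half-width

    @property
    def low(self) -> int:
        return self.score - self.half_width

    @property
    def high(self) -> int:
        return self.score + self.half_width


def intervals_overlap(a: Entry, b: Entry) -> bool:
    """Return True when the two score intervals share at least one point."""
    return a.low <= b.high and b.low <= a.high


# Values taken from the leaderboard rows.
gpt_4o = Entry("GPT-4o", 1113, 8)            # interval [1105, 1121]
grok_4 = Entry("Grok 4", 1102, 7)            # interval [1095, 1109]
gpt_oss = Entry("gpt-oss-120b", 1074, 6)     # interval [1068, 1080]

if __name__ == "__main__":
    for other in (grok_4, gpt_oss):
        verdict = "overlap" if intervals_overlap(gpt_4o, other) else "do not overlap"
        print(f"{gpt_4o.name} and {other.name}: intervals {verdict}")
```

Under this heuristic, GPT-4o and Grok 4 are not clearly separated despite being four ranks apart, while GPT-4o and gpt-oss-120b are.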