Last updated about 1 month ago

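| Rank | Overall | Name | VIBE Score | Confidence Interval | Votes | Downvote % | Abort % | Speed | Latency | Context | Cost (Input) | Cost (Output) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 81 | 48 | Grok 4 Fast Reasoning | 1049 | ±10 | 2.3K | 3.6% | 2.1% | 102 tps | 3.1s | 2M | $0.30 | $0.75 |
| 82 | 118 | GPT-4.1 mini | 1045 | ±8 | 3.4K | 2.5% | 1.1% | 67 tps | 0.9s | 1M | $0.34 | $1.60 |
| 83 | 113 | Mistral Medium | 1043 | ±11 | 1.1K | 3.1% | 1.8% | 48 tps | 0.6s | 33K | $1.48 | $4.55 |
| 84 | 86 | Amazon Nova 2 Lite | 1042 | ±23 | 690 | 2.1% | 1.0% | 137 tps | 0.6s | 300K | $0.35 | $2.95 |
| 85 | 101 | Gemini 2.5 Flash Lite | 1042 | ±6 | 7.8K | 4.3% | 1.3% | 210 tps | 0.7s | 1M | $0.10 | $0.40 |
| 86 | 37 | Qwen3 Omni 30B A3B Thinking | 1040 | ±20 | 750 | 2.0% | 3.7% | 67 tps | 1.2s | 66K | $0.97 | $1.79 |
| 87 | 113 | GLM 4.5 | 1038 | ±12 | 915 | 3.2% | 3.7% | 46 tps | 1.4s | 131K | $0.43 | $1.63 |
| 88 | 93 | DeepSeek V3 0324 Turbo | 1038 | ±9 | 2.2K | 1.8% | 6.3% | 12 tps | 2.4s | 164K | $0.73 | $1.79 |
| 89 | 95 | DeepSeek V3.2 Exp Thinking | 1038 | ±17 | 655 | 3.7% | 7.2% | 26 tps | 3.0s | 131K | $0.28 | $0.42 |
| 90 | 86 | DeepSeek V3.1 Chat | 1038 | ±13 | 975 | 2.5% | 2.8% | 21 tps | 1.6s | 131K | $0.38 | $1.00 |
| 91 | 29 | Qwen3 VL 235B A22B Instruct | 1036 | ±16 | 1.3K | 4.2% | 3.1% | 75 tps | 1.9s | 129K | $0.37 | $1.81 |
| 92 | 106 | Grok 3 | 1034 | ±8 | 2.8K | 2.8% | 1.5% | 53 tps | 0.6s | 1M | $3.67 | $18.33 |
| 93 | 79 | MiniMax M2.5 Lightning | 1031 | ±20 | 820 | 1.8% | 1.5% | 51 tps | 2.0s | 205K | $0.60 | $2.40 |
| 94 | 52 | Grok 4 Fast Non-Reasoning | 1030 | ±17 | 1.5K | 4.1% | 1.5% | 93 tps | 0.6s | 2M | $0.27 | $0.67 |
| 95 | 111 | Claude Sonnet 3.7 | 1027 | ±9 | 4K | 4.9% | <0.1% | 39 tps | 1.6s | 200K | $3.00 | $15.00 |
| 96 | 68 | Qwen Plus (Aug'24) | 1023 | ±9 | 2.4K | 2.9% | 1.4% | 53 tps | 1.3s | 30K | $0.40 | $1.20 |
| 97 | 56 | DeepSeek V3.2 Thinking | 1021 | ±13 | 1.9K | 1.8% | 9.0% | 30 tps | 2.6s | 131K | $0.28 | $0.42 |
| 98 | 44 | Grok 4.1 Fast Reasoning | 1020 | ±7 | 3.7K | 3.0% | 1.5% | 58 tps | 7.3s | 2M | $0.20 | $0.50 |
| 99 | 56 | MiniMax M2.1 Lightning | 1019 | ±24 | 830 | 1.8% | 1.7% | 52 tps | 2.1s | 205K | $0.30 | $2.40 |
| 100 | 71 | GPT-5 Mini | 1017 | ±10 | 3.1K | 5.2% | 2.6% | 66 tps | 14.2s | 400K | $0.25 | $2.00 |
| 101 | 106 | DeepSeek V3 0324 | 1013 | ±11 | 2.1K | 3.1% | 5.8% | 12 tps | 2.7s | 164K | $0.38 | $0.93 |
| 102 | 124 | Qwen3 235B A22B Thinking 2507 | 1010 | ±16 | 745 | 3.2% | 2.5% | 53 tps | 1.6s | 131K | $0.59 | $5.70 |
| 103 | 95 | DeepSeek-R1 Turbo | 1009 | ±20 | 660 | 3.6% | 2.6% | 29 tps | 1.8s | 64K | $2.85 | $4.75 |
| 104 | 93 | Qwen Max | 1009 | ±11 | 2.7K | 2.7% | 1.5% | 49 tps | 1.5s | 33K | $1.60 | $6.40 |
| 105 | 80 | GPT-5 (Minimal) | 1003 | ±10 | 1.9K | 5.4% | <0.1% | 67 tps | 1.4s | 400K | $1.25 | $10.00 |
| 106 | 133 | DeepSeek-R1 0528 | 1001 | ±15 | 1.1K | 4.1% | 1.3% | 93 tps | 0.5s | 64K | $1.60 | $3.67 |
| 107 | 106 | DeepSeek V3.1 Terminus Thinking | 1000 | ±14 | 745 | 2.6% | 5.9% | 27 tps | 1.8s | 131K | $0.56 | $1.68 |
| 108 | 65 | GLM 4.6 | 991 | ±15 | 945 | 3.6% | 5.4% | 39 tps | 1.5s | 200K | $0.42 | $1.66 |
| 109 | 147 | GLM 4.5 Air | 991 | ±16 | 1.1K | 2.7% | <0.1% | 22 tps | 1.4s | 131K | $0.10 | $0.38 |
| 110 | 37 | Nova Experimental Chat 10-20 | 984 | ±20 | 555 | 5.1% | <0.1% | 30 tps | 0.5s | 98K | $0 | $0 |
| 111 | 71 | Seed 1.8 251228 | 983 | ±10 | 3K | 2.6% | 3.7% | 41 tps | 2.1s | 256K | $0.25 | $2.00 |
| 112 | 113 | Kimi K2 Fast | 975 | ±10 | 4.8K | 2.3% | 0.8% | 365 tps | 0.5s | 131K | $1.00 | $3.00 |
| 113 | 143 | Gemini 2.0 Flash | 974 | ±19 | 1.9K | 4.7% | <0.1% | 76 tps | 0.5s | 1M | $0.14 | $0.56 |
| 114 | 133 | GPT-4.1 nano | 974 | ±11 | 2.3K | 3.4% | 0.6% | 175 tps | 0.5s | 1M | $0.10 | $0.40 |
| 115 | 148 | OpenAI o3 | 970 | ±10 | 1.2K | 3.1% | 0.9% | 85 tps | 6.8s | 128K | $7.33 | $29.33 |
| 116 | 129 | Command A | 965 | ±8 | 3K | 2.9% | 2.2% | 42 tps | 0.8s | 256K | $2.00 | $7.33 |
| 117 | 111 | LongCat Flash Chat | 963 | ±25 | 560 | 4.3% | 0.8% | 85 tps | 0.9s | 131K | $0.14 | $0.68 |
| 118 | 111 | Solar Pro 3 (Reasoning) | 960 | ±23 | 505 | 1.0% | 3.2% | 118 tps | 1.2s | 131K | $0.15 | $0.60 |
| 119 | 153 | OpenAI o1 | 960 | ±11 | 2.3K | 2.4% | 4.2% | 92 tps | 5.5s | 200K | $15.00 | $60.00 |
| 120 | 126 | DeepSeek V3 | 960 | ±7 | 3.4K | 2.3% | 0.9% | 69 tps | 1.1s | 64K | $0.59 | $1.49 |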
188 models in total; ranks 81 to 120 are listed here.