Last updated about 1 month ago

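| Rank | Overall | Name | VIBE Score | Confidence Interval | Votes | Downvote % | Abort % | Speed | Latency | Context | Cost (Input) | Cost (Output) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 41 | 126 | Qwen3 VL 235B A22B Thinking | 922 | ±18 | 745 | 4.5% | 4.3% | 47 tps | 3.0s | 127K | $0.47 | $3.31 |
| 42 | 133 | Kimi K2 0905 | 922 | ±21 | 805 | 4.2% | 4.0% | 30 tps | 1.4s | 262K | $0.63 | $2.39 |
| 43 | 124 | Kimi K2 0905 Turbo | 925 | ±13 | 1.5K | 4.7% | 0.7% | 373 tps | 0.5s | 262K | $1.70 | $6.50 |
| 44 | 95 | Kimi K2 Thinking | 926 | ±17 | 740 | 2.0% | 4.2% | 61 tps | 5.9s | 262K | $0.24 | $1.03 |
| 45 | 143 | Seed 1.6 250615 | 928 | ±21 | 635 | 5.2% | 3.1% | 46 tps | 2.2s | 256K | $0.25 | $2.00 |
| 46 | 65 | Mistral Large 3 | 947 | ±20 | 1.3K | 4.4% | 2.1% | 51 tps | 1.0s | 256K | $0.50 | $1.50 |
| 47 | 101 | Qwen3.5 35B A3B | 949 | ±27 | 530 | 2.8% | 2.1% | 116 tps | 2.1s | 256K | $0.63 | $1.13 |
| 48 | 101 | gpt-oss-20b | 950 | ±18 | 1.4K | 4.7% | 0.5% | 216 tps | 0.5s | 131K | $0.06 | $0.26 |
| 49 | 81 | OpenAI o3-pro | 951 | ±19 | 1.6K | 3.4% | 5.2% | 22 tps | 70.8s | 200K | $20.00 | $80.00 |
| 50 | 79 | Qwen3 Max Thinking Preview | 952 | ±20 | 1.1K | 2.2% | 3.1% | 40 tps | 2.1s | 256K | $1.20 | $6.00 |
| 51 | 129 | DeepSeek V3.1 Thinking | 955 | ±14 | 1.1K | 2.2% | 7.1% | 18 tps | 1.8s | 131K | $0.23 | $0.75 |
| 52 | 139 | OpenAI o4-mini | 956 | ±16 | 1.4K | 2.8% | 1.4% | 97 tps | 7.0s | 128K | $1.10 | $4.40 |
| 53 | 148 | OpenAI o4-mini-high | 958 | ±11 | 2.2K | 3.1% | 1.9% | 117 tps | 15.9s | 200K | $1.10 | $4.40 |
| 54 | 126 | DeepSeek V3 | 960 | ±7 | 3.4K | 2.3% | 0.9% | 69 tps | 1.1s | 64K | $0.59 | $1.49 |
| 55 | 153 | OpenAI o1 | 960 | ±11 | 2.3K | 2.4% | 4.2% | 92 tps | 5.5s | 200K | $15.00 | $60.00 |
| 56 | 111 | LongCat Flash Chat | 963 | ±25 | 560 | 4.3% | 0.8% | 85 tps | 0.9s | 131K | $0.14 | $0.68 |
| 57 | 129 | Command A | 965 | ±8 | 3K | 2.9% | 2.2% | 42 tps | 0.8s | 256K | $2.00 | $7.33 |
| 58 | 148 | OpenAI o3 | 970 | ±10 | 1.2K | 3.1% | 0.9% | 85 tps | 6.8s | 128K | $7.33 | $29.33 |
| 59 | 133 | GPT-4.1 nano | 974 | ±11 | 2.3K | 3.4% | 0.6% | 175 tps | 0.5s | 1M | $0.10 | $0.40 |
| 60 | 143 | Gemini 2.0 Flash | 974 | ±19 | 1.9K | 4.7% | <0.1% | 76 tps | 0.5s | 1M | $0.14 | $0.56 |
| 61 | 113 | Kimi K2 Fast | 975 | ±10 | 4.8K | 2.3% | 0.8% | 365 tps | 0.5s | 131K | $1.00 | $3.00 |
| 62 | 71 | Seed 1.8 251228 | 983 | ±10 | 3K | 2.6% | 3.7% | 41 tps | 2.1s | 256K | $0.25 | $2.00 |
| 63 | 65 | GLM 4.6 | 991 | ±15 | 945 | 3.6% | 5.4% | 39 tps | 1.5s | 200K | $0.42 | $1.66 |
| 64 | 106 | DeepSeek V3.1 Terminus Thinking | 1000 | ±14 | 745 | 2.6% | 5.9% | 27 tps | 1.8s | 131K | $0.56 | $1.68 |
| 65 | 133 | DeepSeek-R1 0528 | 1001 | ±15 | 1.1K | 4.1% | 1.3% | 93 tps | 0.5s | 64K | $1.60 | $3.67 |
| 66 | 93 | Qwen Max | 1009 | ±11 | 2.7K | 2.7% | 1.5% | 49 tps | 1.5s | 33K | $1.60 | $6.40 |
| 67 | 95 | DeepSeek-R1 Turbo | 1009 | ±20 | 660 | 3.6% | 2.6% | 29 tps | 1.8s | 64K | $2.85 | $4.75 |
| 68 | 124 | Qwen3 235B A22B Thinking 2507 | 1010 | ±16 | 745 | 3.2% | 2.5% | 53 tps | 1.6s | 131K | $0.59 | $5.70 |
| 69 | 106 | DeepSeek V3 0324 | 1013 | ±11 | 2.1K | 3.1% | 5.8% | 12 tps | 2.7s | 164K | $0.38 | $0.93 |
| 70 | 71 | GPT-5 Mini | 1017 | ±10 | 3.1K | 5.2% | 2.6% | 66 tps | 14.2s | 400K | $0.25 | $2.00 |
| 71 | 56 | MiniMax M2.1 Lightning | 1019 | ±24 | 830 | 1.8% | 1.7% | 52 tps | 2.1s | 205K | $0.30 | $2.40 |
| 72 | 44 | Grok 4.1 Fast Reasoning | 1020 | ±7 | 3.7K | 3.0% | 1.5% | 58 tps | 7.3s | 2M | $0.20 | $0.50 |
| 73 | 56 | DeepSeek V3.2 Thinking | 1021 | ±13 | 1.9K | 1.8% | 9.0% | 30 tps | 2.6s | 131K | $0.28 | $0.42 |
| 74 | 68 | Qwen Plus (Aug'24) | 1023 | ±9 | 2.4K | 2.9% | 1.4% | 53 tps | 1.3s | 30K | $0.40 | $1.20 |
| 75 | 52 | Grok 4 Fast Non-Reasoning | 1030 | ±17 | 1.5K | 4.1% | 1.5% | 93 tps | 0.6s | 2M | $0.27 | $0.67 |
| 76 | 79 | MiniMax M2.5 Lightning | 1031 | ±20 | 820 | 1.8% | 1.5% | 51 tps | 2.0s | 205K | $0.60 | $2.40 |
| 77 | 106 | Grok 3 | 1034 | ±8 | 2.8K | 2.8% | 1.5% | 53 tps | 0.6s | 1M | $3.67 | $18.33 |
| 78 | 29 | Qwen3 VL 235B A22B Instruct | 1036 | ±16 | 1.3K | 4.2% | 3.1% | 75 tps | 1.9s | 129K | $0.37 | $1.81 |
| 79 | 86 | DeepSeek V3.1 Chat | 1038 | ±13 | 975 | 2.5% | 2.8% | 21 tps | 1.6s | 131K | $0.38 | $1.00 |
| 80 | 95 | DeepSeek V3.2 Exp Thinking | 1038 | ±17 | 655 | 3.7% | 7.2% | 26 tps | 3.0s | 131K | $0.28 | $0.42 |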
Showing ranks 41 to 80 of 154 models.
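
Many adjacent rows have VIBE Scores that differ by less than their "±" values, so it can help to check whether two models' intervals overlap before reading much into their order. The sketch below is only an illustration: it assumes the "±" figure is a symmetric half-width around the VIBE Score (the page does not state the confidence level or how the interval is computed), and the `Entry` type and helper names are hypothetical. The three entries use values read from the rows above.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    name: str
    score: float       # VIBE Score column
    half_width: float  # the "±" value (assumed to be a symmetric half-width)


def interval(e: Entry) -> tuple[float, float]:
    """Return the (low, high) bounds implied by score ± half_width."""
    return (e.score - e.half_width, e.score + e.half_width)


def intervals_overlap(a: Entry, b: Entry) -> bool:
    """True when the two score intervals share at least one point."""
    a_lo, a_hi = interval(a)
    b_lo, b_hi = interval(b)
    return a_lo <= b_hi and b_lo <= a_hi


# Values as read from the leaderboard rows above.
kimi = Entry("Kimi K2 0905", 922, 21)
deepseek_v3 = Entry("DeepSeek V3", 960, 7)
o1 = Entry("OpenAI o1", 960, 11)

print(intervals_overlap(kimi, deepseek_v3))  # False: [901, 943] vs [953, 967]
print(intervals_overlap(deepseek_v3, o1))    # True: [953, 967] vs [949, 971]
```

Overlapping intervals are only a rough signal that two scores may not be distinguishable. This is not a formal significance test.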