Last updated about 1 month ago

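| Rank | Overall | Name | VIBE Score | Confidence Interval | Votes | Downvote % | Abort % | Speed | Latency | Context | Cost (Input) | Cost (Output) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 41 | 95 | Gemini 2.5 Flash | 1098 | ±9 | 4.8K | 2.7% | 1.3% | 2 tps | 3.7s | 1M | $0.30 | $2.50 |
| 42 | 68 | Grok 4 | 1093 | ±5 | 5.7K | 4.0% | 3.9% | 29 tps | 11.1s | 256K | $3.00 | $15.00 |
| 43 | 48 | Grok 4 Fast Reasoning | 1090 | ±11 | 2.1K | 3.1% | 2.1% | 102 tps | 3.1s | 2M | $0.30 | $0.75 |
| 44 | 56 | Gemini 3.1 Flash Lite Preview Thinking | 1083 | ±32 | 485 | 3.0% | 1.7% | 75 tps | 4.7s | 1M | $0.25 | $1.50 |
| 45 | 33 | Kimi K2.5 | 1083 | ±16 | 1.7K | 3.2% | 6.5% | 33 tps | 1.7s | 262K | $0.34 | $2.57 |
| 46 | 48 | gpt-oss-120b | 1083 | ±7 | 3.5K | 3.0% | 0.7% | 213 tps | 0.5s | 131K | $0.11 | $0.50 |
| 47 | 44 | Grok 4.1 Fast Reasoning | 1076 | ±10 | 2.6K | 4.2% | 1.5% | 58 tps | 7.3s | 2M | $0.20 | $0.50 |
| 48 | 71 | GPT-5 Mini | 1075 | ±9 | 2.1K | 4.3% | 2.6% | 66 tps | 14.2s | 400K | $0.25 | $2.00 |
| 49 | 62 | GPT-5.1 Instant | 1075 | ±9 | 2.2K | 2.6% | 1.3% | 50 tps | 1.9s | 400K | $1.25 | $10.00 |
| 50 | 93 | DeepSeek V3 0324 Turbo | 1074 | ±14 | 2.1K | 1.9% | 6.3% | 12 tps | 2.4s | 164K | $0.73 | $1.79 |
| 51 | 71 | Gemini 2.5 Flash Lite Preview 0925 | 1066 | ±11 | 2.2K | 2.8% | 1.2% | 209 tps | 0.7s | 1M | $0.25 | $0.35 |
| 52 | 86 | Claude Sonnet 4 | 1066 | ±8 | 5.3K | 2.5% | 1.8% | 49 tps | 1.3s | 200K | $3.00 | $15.00 |
| 53 | 42 | Qwen3 Max Instruct Preview | 1063 | ±7 | 2.7K | 1.5% | 1.1% | 31 tps | 1.7s | 256K | $1.43 | $6.61 |
| 54 | 40 | DeepSeek V3.2 | 1063 | ±16 | 1.1K | 2.5% | 1.4% | 83 tps | 5.1s | 131K | $0.43 | $1.09 |
| 55 | 52 | Claude Haiku 4.5 | 1057 | ±6 | 3.4K | 3.4% | 1.1% | 100 tps | 0.9s | 200K | $1.00 | $5.00 |
| 56 | 52 | Grok 4 Fast Non-Reasoning | 1054 | ±8 | 1.6K | 2.5% | 1.5% | 93 tps | 0.6s | 2M | $0.27 | $0.67 |
| 57 | 81 | GPT-4o | 1046 | ±15 | 1.4K | 2.5% | 1.0% | 49 tps | 2.4s | 128K | $3.71 | $12.57 |
| 58 | 60 | MiniMax M2.1 | 1044 | ±12 | 1.7K | 2.8% | 2.1% | 66 tps | 2.6s | 205K | $0.30 | $1.20 |
| 59 | 95 | Gemini 2.5 Flash Lite Thinking Preview 0925 | 1044 | ±9 | 1.7K | 3.5% | 1.5% | 152 tps | 3.0s | 1M | $0.10 | $0.40 |
| 60 | 62 | MiniMax M2 | 1043 | ±9 | 1.8K | 4.2% | 2.2% | 39 tps | 2.3s | 205K | $0.21 | $0.85 |
| 61 | 65 | GLM 4.6 | 1041 | ±11 | 1.6K | 2.9% | 5.4% | 39 tps | 1.5s | 200K | $0.42 | $1.66 |
| 62 | 44 | DeepSeek V3.1 Terminus Chat | 1037 | ±9 | 1.3K | 2.2% | 3.4% | 27 tps | 1.5s | 131K | $0.86 | $1.80 |
| 63 | 56 | DeepSeek V3.2 Thinking | 1033 | ±15 | 1.7K | 2.0% | 9.0% | 30 tps | 2.6s | 131K | $0.28 | $0.42 |
| 64 | 129 | Qwen3 Max Thinking | 1029 | ±31 | 600 | 2.4% | 13.5% | 32 tps | 2.3s | 256K | $1.20 | $6.00 |
| 65 | 65 | Mistral Large 3 | 1026 | ±22 | 1.1K | 4.1% | 2.1% | 51 tps | 1.0s | 256K | $0.50 | $1.50 |
| 66 | 95 | DeepSeek-R1 Turbo | 1021 | ±13 | 485 | 3.0% | 2.6% | 29 tps | 1.8s | 64K | $2.85 | $4.75 |
| 67 | 93 | Qwen Max | 1021 | ±14 | 1.8K | 2.7% | 1.5% | 49 tps | 1.5s | 33K | $1.60 | $6.40 |
| 68 | 79 | Qwen3 Max Thinking Preview | 1020 | ±10 | 1.2K | 2.4% | 3.1% | 40 tps | 2.1s | 256K | $1.20 | $6.00 |
| 69 | 86 | DeepSeek V3.1 Chat | 1018 | ±12 | 1.1K | 3.1% | 2.8% | 21 tps | 1.6s | 131K | $0.38 | $1.00 |
| 70 | 133 | Kimi K2 0905 | 1014 | ±13 | 810 | 2.4% | 4.0% | 30 tps | 1.4s | 262K | $0.63 | $2.39 |
| 71 | 101 | Gemini 2.5 Flash Lite | 1014 | ±9 | 5.3K | 3.9% | 1.3% | 210 tps | 0.7s | 1M | $0.10 | $0.40 |
| 72 | 86 | Amazon Nova 2 Lite | 1013 | ±18 | 815 | 4.7% | 1.0% | 137 tps | 0.6s | 300K | $0.35 | $2.95 |
| 73 | 106 | Grok 3 | 1003 | ±9 | 2K | 2.6% | 1.5% | 53 tps | 0.6s | 1M | $3.67 | $18.33 |
| 74 | 170 | Kimi K2 0711 | 1002 | ±15 | 720 | 3.4% | 1.6% | 29 tps | 1.3s | 131K | $0.72 | $2.60 |
| 75 | 95 | DeepSeek V3.2 Exp Thinking | 999 | ±22 | 775 | 1.9% | 7.2% | 26 tps | 3.0s | 131K | $0.28 | $0.42 |
| 76 | 113 | Gemini 2.5 Flash Lite Thinking | 996 | ±11 | 2.5K | 3.7% | 1.0% | 118 tps | 4.4s | 1M | $0.03 | $0.13 |
| 77 | 124 | Kimi K2 0905 Turbo | 991 | ±12 | 1.6K | 1.8% | 0.7% | 373 tps | 0.5s | 262K | $1.70 | $6.50 |
| 78 | 106 | DeepSeek V3 0324 | 990 | ±11 | 2.1K | 3.0% | 5.8% | 12 tps | 2.7s | 164K | $0.38 | $0.93 |
| 79 | 113 | Kimi K2 Fast | 989 | ±10 | 7.4K | 2.2% | 0.8% | 365 tps | 0.5s | 131K | $1.00 | $3.00 |
| 80 | 86 | Qwen3 235B A22B | 989 | ±19 | 740 | 3.9% | 5.3% | 71 tps | 0.9s | 41K | $0.23 | $0.63 |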
View All (133 models)