

Last updated about 1 month ago

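| Rank | Overall | Name | VIBE Score | Confidence Interval | Votes | Downvote % | Abort % | Speed | Latency | Context | Cost (Input) | Cost (Output) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 41 | 62 | GPT-5.1 Instant | 1040 | ±13 | 915 | 2.7% | 1.3% | 50 tps | 1.9s | 400K | $1.25 | $10.00 |
| 42 | 37 | Claude Sonnet 4.5 | 1040 | ±8 | 2K | 3.2% | 1.4% | 41 tps | 1.3s | 200K | $1.80 | $9.00 |
| 43 | 111 | Claude Sonnet 3.7 | 1039 | ±19 | 800 | 2.4% | <0.1% | 39 tps | 1.6s | 200K | $3.00 | $15.00 |
| 44 | 33 | Qwen3 Next 80B A3B Instruct | 1038 | ±15 | 920 | 2.6% | 0.6% | 84 tps | 1.1s | 256K | $0.20 | $1.42 |
| 45 | 60 | MiniMax M2.1 | 1036 | ±22 | 695 | 1.4% | 2.1% | 66 tps | 2.6s | 205K | $0.30 | $1.20 |
| 46 | 60 | Gemini 2.5 Flash Preview 0925 | 1025 | ±13 | 1.2K | 2.0% | 1.2% | 5 tps | 0.9s | 1M | $0.13 | $0.97 |
| 47 | 26 | Grok 4.1 Fast Non-Reasoning | 1023 | ±21 | 820 | 1.8% | 0.9% | 101 tps | 0.5s | 2M | $0.20 | $0.50 |
| 48 | 52 | Grok 4 Fast Non-Reasoning | 1023 | ±16 | 870 | 1.7% | 1.5% | 93 tps | 0.6s | 2M | $0.27 | $0.67 |
| 49 | 68 | Grok 4 | 1022 | ±10 | 2.1K | 2.5% | 3.9% | 29 tps | 11.1s | 256K | $3.00 | $15.00 |
| 50 | 48 | Grok 4 Fast Reasoning | 1022 | ±14 | 1.2K | 2.0% | 2.1% | 102 tps | 3.1s | 2M | $0.30 | $0.75 |
| 51 | 44 | Grok 4.1 Fast Reasoning | 1016 | ±18 | 1.4K | 2.0% | 1.5% | 58 tps | 7.3s | 2M | $0.20 | $0.50 |
| 52 | 86 | Claude Sonnet 4 | 1011 | ±19 | 1.8K | 1.4% | 1.8% | 49 tps | 1.3s | 200K | $3.00 | $15.00 |
| 53 | 71 | Gemini 2.5 Flash Thinking | 1000 | ±18 | 1K | 1.9% | 2.2% | 88 tps | 6.4s | 1M | $0.30 | $2.50 |
| 54 | 48 | gpt-oss-120b | 1000 | ±15 | 1.1K | 1.3% | 0.7% | 213 tps | 0.5s | 131K | $0.11 | $0.50 |
| 55 | 56 | Claude Opus 4.1 (Thinking) | 997 | ±14 | 740 | 3.9% | <0.1% | 20 tps | 3.9s | 200K | $15.00 | $75.00 |
| 56 | 68 | GLM 4.7 | 992 | ±33 | 635 | 2.3% | 5.8% | 40 tps | 1.5s | 200K | $0.77 | $1.73 |
| 57 | 93 | Qwen Max | 979 | ±19 | 695 | 2.1% | 1.5% | 49 tps | 1.5s | 33K | $1.60 | $6.40 |
| 58 | 56 | DeepSeek V3.1 Turbo | 969 | ±37 | 665 | 1.5% | 0.9% | 173 tps | 1.3s | 164K | $2.00 | $3.75 |
| 59 | 108 | GPT-5 Mini Low | 968 | ±14 | 590 | 3.3% | <0.1% | 69 tps | 3.2s | 400K | $0.25 | $2.00 |
| 60 | 77 | Claude Opus 4.1 | 964 | ±22 | 610 | 4.7% | 3.0% | 17 tps | 3.7s | 200K | $15.00 | $75.00 |
| 61 | 52 | GPT-5 | 957 | ±20 | 1.6K | 2.9% | 3.1% | 78 tps | 23.1s | 400K | $1.25 | $9.67 |
| 62 | 84 | GPT-5 Mini Minimal | 953 | ±16 | 595 | 3.3% | 1.2% | 63 tps | 1.4s | 400K | $0.25 | $2.00 |
| 63 | 101 | Gemini 2.5 Flash Lite | 948 | ±16 | 1.6K | 2.7% | 1.3% | 210 tps | 0.7s | 1M | $0.10 | $0.40 |
| 64 | 71 | Gemini 2.5 Flash Lite Preview 0925 | 948 | ±16 | 1.1K | 2.2% | 1.2% | 209 tps | 0.7s | 1M | $0.25 | $0.35 |
| 65 | 81 | GPT-4o | 945 | ±31 | 505 | 3.8% | 1.0% | 49 tps | 2.4s | 128K | $3.71 | $12.57 |
| 66 | 56 | DeepSeek V3.2 Thinking | 942 | ±26 | 705 | 2.8% | 9.0% | 30 tps | 2.6s | 131K | $0.28 | $0.42 |
| 67 | 48 | OpenAI o1-mini | 937 | ±27 | 580 | 2.5% | <0.1% | 118 tps | N/A | 128K | $1.13 | $4.51 |
| 68 | 79 | Qwen3 Max Thinking Preview | 925 | ±26 | 530 | 1.9% | 3.1% | 40 tps | 2.1s | 256K | $1.20 | $6.00 |
| 69 | 126 | Qwen3 VL 235B A22B Thinking | 922 | ±19 | 645 | 3.0% | 4.3% | 47 tps | 3.0s | 127K | $0.47 | $3.31 |
| 70 | 80 | GPT-5 (Minimal) | 922 | ±13 | 955 | 4.0% | <0.1% | 67 tps | 1.4s | 400K | $1.25 | $10.00 |
| 71 | 106 | DeepSeek V3 0324 | 920 | ±25 | 570 | 0.9% | 5.8% | 12 tps | 2.7s | 164K | $0.38 | $0.93 |
| 72 | 44 | Kimi K2 Thinking Turbo | 917 | ±27 | 530 | 1.9% | 2.0% | 75 tps | 1.4s | 262K | $1.15 | $8.00 |
| 73 | 126 | DeepSeek V3 | 910 | ±38 | 565 | 1.7% | 0.9% | 69 tps | 1.1s | 64K | $0.59 | $1.49 |
| 74 | 118 | GPT-4.1 mini | 900 | ±16 | 950 | 1.0% | 1.1% | 67 tps | 0.9s | 1M | $0.34 | $1.60 |
| 75 | 62 | MiniMax M2 | 900 | ±24 | 720 | 2.7% | 2.2% | 39 tps | 2.3s | 205K | $0.21 | $0.85 |
| 76 | 113 | Kimi K2 Fast | 887 | ±14 | 1.6K | 1.0% | 0.8% | 365 tps | 0.5s | 131K | $1.00 | $3.00 |
| 77 | 129 | DeepSeek V3.1 Thinking | 886 | ±16 | 510 | 2.9% | 7.1% | 18 tps | 1.8s | 131K | $0.23 | $0.75 |
| 78 | 65 | Mistral Large 3 | 881 | ±27 | 495 | 3.9% | 2.1% | 51 tps | 1.0s | 256K | $0.50 | $1.50 |
| 79 | 124 | Kimi K2 0905 Turbo | 881 | ±17 | 710 | 2.1% | 0.7% | 373 tps | 0.5s | 262K | $1.70 | $6.50 |
| 80 | 106 | Grok 3 | 872 | ±26 | 745 | 1.3% | 1.5% | 53 tps | 0.6s | 1M | $3.67 | $18.33 |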
Ranks 41 to 80 of 98 models are shown.