
Last updated about 1 month ago

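| Rank | Overall | Name | VIBE Score | Confidence Interval | Votes | Downvote % | Abort % | Speed | Latency | Context | Cost (Input) | Cost (Output) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 186 | Grok 3 Mini | 725 | ±29 | 625 | 3.1% | 1.2% | 43 tps | 0.5s | 131K | $0.30 | $0.50 |
| 2 | 186 | Grok 3 Mini Fast | 733 | ±25 | 540 | 3.6% | 1.6% | 44 tps | 0.5s | 131K | $0.60 | $4.00 |
| 3 | 175 | OpenAI o3-mini-low | 736 | ±28 | 555 | 2.6% | 0.7% | 139 tps | 1.5s | 200K | $1.10 | $4.40 |
| 4 | 177 | OpenAI o3-mini | 738 | ±21 | 700 | 2.1% | 0.8% | 143 tps | 3.3s | 200K | $1.10 | $4.40 |
| 5 | 186 | Gemma 3n E4B | 805 | ±25 | 545 | 1.8% | 2.0% | 30 tps | 0.5s | 8K | $0.01 | $0.02 |
| 6 | 148 | OpenAI o4-mini-high | 877 | ±19 | 685 | 2.1% | 1.9% | 117 tps | 15.9s | 200K | $1.10 | $4.40 |
| 7 | 143 | Gemini 2.0 Flash Lite | 883 | ±22 | 1.6K | 1.2% | <0.1% | 42 tps | 0.5s | 1M | $0.08 | $0.30 |
| 8 | 71 | Gemini 2.5 Flash Thinking | 886 | ±22 | 630 | 1.6% | 2.2% | 88 tps | 6.4s | 1M | $0.30 | $2.50 |
| 9 | 143 | Gemini 2.0 Flash | 887 | ±23 | 680 | 2.2% | <0.1% | 76 tps | 0.5s | 1M | $0.14 | $0.56 |
| 10 | 161 | Llama 4 Maverick | 890 | ±14 | 2.2K | 1.8% | 1.2% | 88 tps | 2.4s | 1M | $0.23 | $0.83 |
| 11 | 71 | Seed 1.8 251228 | 894 | ±25 | 660 | 1.5% | 3.7% | 41 tps | 2.1s | 256K | $0.25 | $2.00 |
| 12 | 79 | Qwen3 Max Thinking Preview | 903 | ±33 | 535 | 0.9% | 3.1% | 40 tps | 2.1s | 256K | $1.20 | $6.00 |
| 13 | 113 | Mistral Medium | 908 | ±21 | 845 | 1.7% | 1.8% | 48 tps | 0.6s | 33K | $1.48 | $4.55 |
| 14 | 65 | Mistral Large 3 | 911 | ±30 | 535 | 3.6% | 2.1% | 51 tps | 1.0s | 256K | $0.50 | $1.50 |
| 15 | 139 | OpenAI o4-mini | 917 | ±15 | 650 | 2.3% | 1.4% | 97 tps | 7.0s | 128K | $1.10 | $4.40 |
| 16 | 129 | Command A | 924 | ±17 | 2K | 2.0% | 2.2% | 42 tps | 0.8s | 256K | $2.00 | $7.33 |
| 17 | 170 | Kimi K2 0711 | 925 | ±20 | 510 | 3.8% | 1.6% | 29 tps | 1.3s | 131K | $0.72 | $2.60 |
| 18 | 68 | GLM 4.7 | 933 | ±20 | 870 | 1.7% | 5.8% | 40 tps | 1.5s | 200K | $0.77 | $1.73 |
| 19 | 133 | GPT-4.1 nano | 944 | ±13 | 1.5K | 2.0% | 0.6% | 175 tps | 0.5s | 1M | $0.10 | $0.40 |
| 20 | 124 | Kimi K2 0905 Turbo | 946 | ±19 | 795 | 1.9% | 0.7% | 373 tps | 0.5s | 262K | $1.70 | $6.50 |
| 21 | 160 | Llama 4 Scout | 947 | ±14 | 1.6K | 1.9% | 0.6% | 88 tps | 5.1s | 131K | $0.18 | $0.46 |
| 22 | 44 | Kimi K2 Thinking Turbo | 954 | ±28 | 505 | 1.0% | 2.0% | 75 tps | 1.4s | 262K | $1.15 | $8.00 |
| 23 | 86 | Claude Sonnet 4 | 958 | ±12 | 3.4K | 1.0% | 1.8% | 49 tps | 1.3s | 200K | $3.00 | $15.00 |
| 24 | 118 | GPT-4.1 mini | 963 | ±13 | 1.9K | 1.3% | 1.1% | 67 tps | 0.9s | 1M | $0.34 | $1.60 |
| 25 | 60 | Gemini 2.5 Flash Preview 0925 | 964 | ±24 | 700 | 1.4% | 1.2% | 5 tps | 0.9s | 1M | $0.13 | $0.97 |
| 26 | 121 | QwQ 32B | 968 | ±20 | 650 | 1.5% | 5.4% | 41 tps | 2.1s | 16K | $0.43 | $0.56 |
| 27 | 71 | Gemini 2.5 Flash Lite Preview 0925 | 968 | ±18 | 680 | 1.4% | 1.2% | 209 tps | 0.7s | 1M | $0.25 | $0.35 |
| 28 | 62 | MiniMax M2 | 974 | ±21 | 760 | 1.3% | 2.2% | 39 tps | 2.3s | 205K | $0.21 | $0.85 |
| 29 | 126 | DeepSeek V3 | 974 | ±16 | 1.5K | 1.4% | 0.9% | 69 tps | 1.1s | 64K | $0.59 | $1.49 |
| 30 | 95 | Gemini 2.5 Flash | 976 | ±14 | 3.7K | 1.6% | 1.3% | 2 tps | 3.7s | 1M | $0.30 | $2.50 |
| 31 | 119 | ERNIE 4.5 300B A47B | 976 | ±15 | 1.2K | 2.0% | 4.7% | 23 tps | 2.3s | 123K | $0.28 | $1.10 |
| 32 | 71 | GPT-5 Mini | 979 | ±27 | 535 | 0.9% | 2.6% | 66 tps | 14.2s | 400K | $0.25 | $2.00 |
| 33 | 113 | Gemini 2.5 Flash Lite Thinking | 986 | ±19 | 580 | 1.7% | 1.0% | 118 tps | 4.4s | 1M | $0.03 | $0.13 |
| 34 | 48 | Grok 4 Fast Reasoning | 988 | ±17 | 635 | 0.8% | 2.1% | 102 tps | 3.1s | 2M | $0.30 | $0.75 |
| 35 | 68 | Grok 4 | 989 | ±11 | 3.3K | 0.9% | 3.9% | 29 tps | 11.1s | 256K | $3.00 | $15.00 |
| 36 | 113 | Kimi K2 Fast | 992 | ±10 | 3.5K | 2.8% | 0.8% | 365 tps | 0.5s | 131K | $1.00 | $3.00 |
| 37 | 106 | Grok 3 | 996 | ±12 | 1.5K | 1.6% | 1.5% | 53 tps | 0.6s | 1M | $3.67 | $18.33 |
| 38 | 60 | MiniMax M2.1 | 1007 | ±20 | 955 | 1.0% | 2.1% | 66 tps | 2.6s | 205K | $0.30 | $1.20 |
| 39 | 101 | Gemini 2.5 Flash Lite | 1012 | ±16 | 1.5K | 2.2% | 1.3% | 210 tps | 0.7s | 1M | $0.10 | $0.40 |
| 40 | 42 | GPT-5.2 (Extra High) | 1020 | ±24 | 600 | 0.8% | 13.2% | 17 tps | 20.5s | 400K | $1.75 | $14.00 |
| 41 | 52 | GPT-5 | 1023 | ±14 | 1.1K | 1.7% | 3.1% | 78 tps | 23.1s | 400K | $1.25 | $9.67 |
| 42 | 48 | Claude Sonnet 4 (Thinking) | 1031 | ±20 | 775 | 1.3% | 1.5% | 52 tps | 1.5s | 200K | $3.00 | $13.67 |
| 43 | 52 | Grok 4 Fast Non-Reasoning | 1031 | ±29 | 600 | 0.8% | 1.5% | 93 tps | 0.6s | 2M | $0.27 | $0.67 |
| 44 | 44 | DeepSeek V3.1 Terminus Chat | 1034 | ±22 | 590 | 1.7% | 3.4% | 27 tps | 1.5s | 131K | $0.86 | $1.80 |
| 45 | 44 | Grok 4.1 Fast Reasoning | 1036 | ±25 | 1.2K | 1.2% | 1.5% | 58 tps | 7.3s | 2M | $0.20 | $0.50 |
| 46 | 17 | GPT-5.2 (High) | 1036 | ±20 | 1.3K | 1.1% | 6.7% | 18 tps | 16.3s | 400K | $1.75 | $14.00 |
| 47 | 68 | Qwen Plus (Aug'24) | 1046 | ±18 | 1.5K | 1.3% | 1.4% | 53 tps | 1.3s | 30K | $0.40 | $1.20 |
| 48 | 40 | Qwen3 235B A22B Instruct 2507 | 1050 | ±17 | 795 | 1.2% | 6.8% | 13 tps | 1.9s | 262K | $0.13 | $0.52 |
| 49 | 56 | DeepSeek V3.2 Thinking | 1050 | ±23 | 695 | 1.4% | 9.0% | 30 tps | 2.6s | 131K | $0.28 | $0.42 |
| 50 | 93 | DeepSeek V3 0324 Turbo | 1054 | ±16 | 1.5K | 2.0% | 6.3% | 12 tps | 2.4s | 164K | $0.73 | $1.79 |
| 51 | 93 | Qwen Max | 1056 | ±11 | 1.6K | 2.4% | 1.5% | 49 tps | 1.5s | 33K | $1.60 | $6.40 |
| 52 | 42 | Qwen3 Max Instruct Preview | 1059 | ±19 | 980 | 1.0% | 1.1% | 31 tps | 1.7s | 256K | $1.43 | $6.61 |
| 53 | 106 | Claude Sonnet 3.5 v2 | 1069 | ±23 | 630 | 1.6% | <0.1% | 46 tps | 1.4s | 200K | $3.00 | $15.00 |
| 54 | 62 | GPT-5.1 Instant | 1070 | ±21 | 705 | 1.4% | 1.3% | 50 tps | 1.9s | 400K | $1.25 | $10.00 |
| 55 | 44 | Gemini 2.5 Pro | 1076 | ±12 | 1.9K | 1.3% | 2.3% | 45 tps | 2.6s | 1M | $1.25 | $10.00 |
| 56 | 10 | Claude Sonnet 4.5 (Thinking) | 1080 | ±15 | 1K | 1.0% | 1.9% | 44 tps | 1.1s | 200K | $3.00 | $15.00 |
| 57 | 40 | DeepSeek V3.2 | 1083 | ±25 | 640 | 0.8% | 1.4% | 83 tps | 5.1s | 131K | $0.43 | $1.09 |
| 58 | 17 | Claude Opus 4.5 | 1086 | ±21 | 625 | 0.8% | 1.5% | 45 tps | 1.5s | 200K | $5.00 | $25.00 |
| 59 | 32 | Gemini 2.5 Pro High | 1109 | ±20 | 1.2K | 1.3% | 1.5% | 48 tps | 2.3s | 1M | $1.25 | $10.00 |
| 60 | 81 | GPT-4o | 1121 | ±14 | 895 | 1.6% | 1.0% | 49 tps | 2.4s | 128K | $3.71 | $12.57 |
| 61 | 106 | DeepSeek V3 0324 | 1125 | ±15 | 1.4K | 1.8% | 5.8% | 12 tps | 2.7s | 164K | $0.38 | $0.93 |
| 62 | 33 | Qwen3 Next 80B A3B Instruct | 1135 | ±25 | 670 | 0.7% | 0.6% | 84 tps | 1.1s | 256K | $0.20 | $1.42 |
| 63 | 48 | gpt-oss-120b | 1136 | ±19 | 775 | 1.3% | 0.7% | 213 tps | 0.5s | 131K | $0.11 | $0.50 |
| 64 | 17 | Gemini 3 Flash Preview | 1137 | ±20 | 615 | 0.8% | 1.3% | 138 tps | 1.4s | 1M | $0.50 | $3.00 |
| 65 | 33 | Qwen3 30B A3B Instruct 2507 | 1139 | ±16 | 780 | 1.3% | 1.2% | 55 tps | 1.3s | 131K | $0.13 | $0.72 |
| 66 | 37 | Claude Sonnet 4.5 | 1139 | ±17 | 1.3K | 1.1% | 1.4% | 41 tps | 1.3s | 200K | $1.80 | $9.00 |
| 67 | 8 | GPT-5.1 (High) | 1145 | ±19 | 1.1K | 3.6% | 3.2% | 76 tps | 6.9s | 400K | $1.25 | $10.00 |
| 68 | 52 | Claude Haiku 4.5 | 1169 | ±17 | 960 | 1.0% | 1.1% | 100 tps | 0.9s | 200K | $1.00 | $5.00 |
| 69 | 26 | Grok 4.1 Fast Non-Reasoning | 1183 | ±28 | 980 | 1.0% | 0.9% | 101 tps | 0.5s | 2M | $0.20 | $0.50 |
| 70 | 33 | Kimi K2.5 | 1186 | ±38 | 760 | 0.7% | 6.5% | 33 tps | 1.7s | 262K | $0.34 | $2.57 |
| 71 | 7 | Claude Opus 4.5 (Thinking) | 1188 | ±22 | 1.1K | 1.8% | 1.8% | 49 tps | 1.4s | 200K | $5.00 | $25.00 |
| 72 | 14 | Gemini 3 Flash Preview Thinking | 1195 | ±20 | 950 | 1.6% | 1.6% | 3 tps | 6.2s | 1M | $0.50 | $3.00 |
| 73 | 26 | Claude Haiku 4.5 (Extended Thinking) | 1200 | ±17 | 610 | 0.8% | 1.4% | 115 tps | 0.7s | 200K | $1.00 | $5.00 |
| 74 | 16 | GPT-5.2 | 1203 | ±26 | 680 | 0.7% | 4.1% | 18 tps | 2.7s | 400K | $1.75 | $14.00 |
| 75 | 14 | Gemini 3 Pro (Low) | 1211 | ±21 | 1K | 0.9% | 2.4% | 51 tps | 3.5s | 1M | $2.00 | $12.00 |
| 76 | 10 | GPT-5.2 Instant | 1214 | ±21 | 1.3K | 0.4% | 1.7% | 52 tps | 2.0s | 400K | $1.75 | $14.00 |
| 77 | 10 | Gemini 3 Pro | 1235 | ±13 | 1.6K | 1.2% | 2.1% | 50 tps | 3.6s | 1M | $2.00 | $12.00 |
| 78 | 8 | GPT-5.1 | 1239 | ±21 | 960 | 0.5% | 2.3% | 71 tps | 1.4s | 400K | $1.42 | $11.33 |
| 79 | 22 | GPT-5 Chat | 1313 | ±8 | 2.4K | 0.6% | 1.3% | 95 tps | 0.9s | 400K | $1.25 | $10.00 |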