Last updated about 1 month ago

Rank | Overall | Name | VIBE Score | Confidence Interval | Votes | Downvote % | Abort % | Speed | Latency | Context | Cost (Input) | Cost (Output)
11Claude Opus 4.6 (Thinking)1504±93.2K1.5%2.5%56 tps1.6s200K$5.00$25.00
24GPT-5.4 (High)1478±121.4K1.8%4.6%68 tps7.9s1M$2.50$15.00
32Claude Opus 4.61445±84.3K1.2%2.1%48 tps1.7s200K$5.00$25.00
42GPT-5.41371±131.1K1.3%2.6%55 tps0.8s1M$2.50$15.00
54Claude Sonnet 4.61364±83K1.1%1.6%47 tps1.2s200K$3.00$15.00
66Gemini 3.1 Pro1359±95.8K2.3%3.5%35 tps4.1s1M$2.00$12.00
75Claude Sonnet 4.6 (Thinking)1324±93K2.6%4.7%57 tps1.1s200K$3.00$15.00
810Gemini 3 Pro1282±437.7K3.1%2.1%50 tps3.6s1M$2.00$12.00
97Claude Opus 4.5 (Thinking)1280±423K2.2%1.8%49 tps1.4s200K$5.00$25.00
108GPT-5.11278±58.5K4.7%2.3%71 tps1.4s400K$1.42$11.33
118GPT-5.1 (High)1273±416.6K4.0%3.2%76 tps6.9s400K$1.25$10.00
1214Gemini 3 Pro (Low)1266±58.4K4.6%2.4%51 tps3.5s1M$2.00$12.00
1310Claude Sonnet 4.5 (Thinking)1264±225.1K4.5%1.9%44 tps1.1s200K$3.00$15.00
1410GPT-5.2 Instant1254±59K3.9%1.7%52 tps2.0s400K$1.75$14.00
158GPT-5.1 (Medium)1246±63.1K6.8%<0.1%86 tps3.8s400K$0.83$6.67
1648Polaris Alpha1244±76257.4%<0.1%48 tps1.1s256K$0$0
1722GLM 51231±122.3K2.5%3.4%36 tps2.7s200K$0.72$2.55
1848Claude Sonnet 4 (Thinking)1228±65.2K3.4%1.5%52 tps1.5s200K$3.00$13.67
1917Gemini 3 Flash Preview1226±65.3K4.2%1.3%138 tps1.4s1M$0.50$3.00
2014Gemini 3 Flash Preview Thinking1225±613.8K3.4%1.6%3 tps6.2s1M$0.50$3.00
2117GPT-5.2 (High)1221±419.4K3.1%6.7%18 tps16.3s400K$1.75$14.00
2237Claude Sonnet 4.51219±513.7K6.6%1.4%41 tps1.3s200K$1.80$9.00
2316GPT-5.21219±66.2K3.8%4.1%18 tps2.7s400K$1.75$14.00
2484Claude Sonnet 3.7 (Thinking)1214±72K2.5%<0.1%41 tps2.6s200K$3.00$15.00
2521Claude Opus 41211±63.5K1.9%<0.1%25 tps1.5s200K$15.00$75.00
2617Grok 4.20 Beta Reasoning1209±218501.7%1.1%77 tps4.5s2M$2.00$5.50
2717Claude Opus 4.51207±56.6K3.6%1.5%45 tps1.5s200K$5.00$25.00
2816Nova Experimental Chat 11-101202±83.7K8.0%0.4%84 tps8.9s98K$0$0
2956Gemini 3.1 Flash Lite Preview Thinking1199±139503.1%1.7%75 tps4.7s1M$0.25$1.50
3022GPT-5 Chat1193±323.9K6.8%1.3%95 tps0.9s400K$1.25$10.00
3113GPT-5.3 Instant1191±82.2K1.8%0.9%63 tps0.8s400K$1.75$14.00
3242GPT-5.2 (Extra High) 1189±66.7K3.5%13.2%17 tps20.5s400K$1.75$14.00
3319Mistral Medium 3.11185±514.2K6.6%<0.1%77 tps0.7s128K$0.40$2.00
3456Claude Opus 4.1 (Thinking)1185±64.1K4.2%<0.1%20 tps3.9s200K$15.00$75.00
3521Claude Opus 4 (Thinking)1183±51.7K2.0%<0.1%28 tps1.3s200K$15.00$75.00
3626GPT-5 (High)1171±412K3.7%4.5%81 tps35.9s400K$1.25$10.00
3737Sherlock Dash Alpha1170±147509.6%<0.1%68 tps0.7s2M$0$0
3832Gemini 2.5 Pro High1169±416.8K7.8%1.5%48 tps2.3s1M$1.25$10.00
3933Kimi K2.51169±65.3K2.8%6.5%33 tps1.7s262K$0.34$2.57
4044Gemini 2.5 Pro1168±419.1K9.1%2.3%45 tps2.6s1M$1.25$10.00
4186Seed 2.0 Lite (Medium)1166±145752.5%6.6%33 tps1.6s256K$0.25$2.00
4271MiniMax M2.5 FP81162±175753.4%3.6%33 tps1.7s205K$0.45$1.75
4377Claude Opus 4.11161±54.1K3.9%3.0%17 tps3.7s200K$15.00$75.00
4429Qwen3 VL 235B A22B Instruct1161±64.5K8.8%3.1%75 tps1.9s129K$0.37$1.81
4543Gemini 2.5 Flash Thinking Preview 09251161±37.2K9.1%<0.1%111 tps4.7s1M$0.30$2.50
4644Kimi K2 Thinking Turbo1160±813.2K3.5%2.0%75 tps1.4s262K$1.15$8.00
4756MiniMax M2.1 Lightning1157±139701.0%1.7%52 tps2.1s205K$0.30$2.40
4880GPT-5 (Minimal)1153±66.8K10.0%<0.1%67 tps1.4s400K$1.25$10.00
4971Gemini 2.5 Flash Thinking1153±73.7K3.6%2.2%88 tps6.4s1M$0.30$2.50
5026Claude Haiku 4.5 (Extended Thinking)1152±57K6.6%1.4%115 tps0.7s200K$1.00$5.00
5142Qwen3 Max Instruct Preview1150±413.5K5.8%1.1%31 tps1.7s256K$1.43$6.61
5260MiniMax M2.11149±610.4K4.3%2.1%66 tps2.6s205K$0.30$1.20
5344Grok 4.1 Fast Reasoning1149±621.2K4.2%1.5%58 tps7.3s2M$0.20$0.50
5456Gemini 2.5 Pro Low1146±47.5K13.0%<0.1%89 tps2.4s1M$1.25$10.00
5540Qwen3 235B A22B Instruct 25071146±38.8K12.2%6.8%13 tps1.9s262K$0.13$0.52
56104Grok 3 Beta1145±91.8K0.6%<0.1%58 tps0.8s131K$3.00$15.00
5733Grok 4.20 Multi Agent Beta1143±167651.9%1.2%56 tps8.8s2M$2.00$6.00
58100Gemini 2.5 Flash Preview1141±82.1K1.0%<0.1%138 tps6.9s1M$0.15$0.60
5933Qwen3 Next 80B A3B Instruct1141±47.6K7.7%0.6%84 tps1.1s256K$0.20$1.42
6084GPT-5 Mini Minimal1139±82.8K9.7%1.2%63 tps1.4s400K$0.25$2.00
6152GPT-51138±414K7.9%3.1%78 tps23.1s400K$1.25$9.67
6226Grok 4.1 Fast Non-Reasoning1137±57.4K6.6%0.9%101 tps0.5s2M$0.20$0.50
6365GLM 4.61136±514.1K4.7%5.4%39 tps1.5s200K$0.42$1.66
64111Claude Sonnet 3.71135±46.5K6.3%<0.1%39 tps1.6s200K$3.00$15.00
6552Claude Haiku 4.51134±59.9K6.9%1.1%100 tps0.9s200K$1.00$5.00
6662GPT-5.1 Instant1134±55.5K5.7%1.3%50 tps1.9s400K$1.25$10.00
6768Grok 41130±223.2K6.4%3.9%29 tps11.1s256K$3.00$15.00
6871GPT-5 Mini1130±46.1K7.9%2.6%66 tps14.2s400K$0.25$2.00
6940DeepSeek V3.21130±54.4K5.1%1.4%83 tps5.1s131K$0.43$1.09
7079MiniMax M2.5 Lightning1128±149952.5%1.5%51 tps2.0s205K$0.60$2.40
7184Nova Experimental Chat 10-091128±62.2K14.1%<0.1%59 tps6.1s98K$0$0
7248Grok 4 Fast Reasoning1125±511.8K5.5%2.1%102 tps3.1s2M$0.30$0.75
7352Qwen3.5 122B A17B1124±171.1K3.2%1.5%82 tps1.4s256K$0.40$3.20
7437Kimi K2.5 Instant1124±131.4K2.4%2.9%32 tps3.0s262K$0.50$3.00
7581GPT-4o1124±56.5K6.1%1.0%49 tps2.4s128K$3.71$12.57
7656DeepSeek V3.2 Thinking1117±610K3.8%9.0%30 tps2.6s131K$0.28$0.42
7768Qwen Plus (Aug'24)1116±58.9K9.4%1.4%53 tps1.3s30K$0.40$1.20
7829Nova Experimental Chat 12-101115±82.2K4.8%2.4%84 tps12.9s98K$0$0
7971Gemini 3.1 Flash Lite Preview1114±276302.3%1.0%8 tps1.2s1M$0.25$1.50
8068GLM 4.71112±88.8K4.7%5.8%40 tps1.5s200K$0.77$1.73
8133Qwen Plus 07281110±101.7K9.8%<0.1%55 tps0.9s1M$0.40$1.20
8260Gemini 2.5 Flash Preview 09251110±46.7K7.5%1.2%5 tps0.9s1M$0.13$0.97
8352Grok 4 Fast Non-Reasoning1110±57.1K8.3%1.5%93 tps0.6s2M$0.27$0.67
84106Claude Sonnet 3.5 v21109±72.9K8.2%<0.1%46 tps1.4s200K$3.00$15.00
8565Mistral Large 31108±74K6.3%2.1%51 tps1.0s256K$0.50$1.50
86108GPT-5 Mini Low1108±42.5K9.0%<0.1%69 tps3.2s400K$0.25$2.00
8733Qwen3 30B A3B Instruct 25071108±58.5K9.7%1.2%55 tps1.3s131K$0.13$0.72
8877GPT-4.5 Preview1108±101.7K2.1%<0.1%36 tps3.0s200K$75.00$150.00
8956DeepSeek V3.1 Turbo1106±92.6K5.1%0.9%173 tps1.3s164K$2.00$3.75
9095Gemini 2.5 Flash Lite Thinking Preview 09251104±54.9K7.8%1.5%152 tps3.0s1M$0.10$0.40
91106Grok 31102±59.3K9.3%1.5%53 tps0.6s1M$3.67$18.33
9286Claude Sonnet 41102±418.3K7.0%1.8%49 tps1.3s200K$3.00$15.00
9344DeepSeek V3.1 Terminus Chat1100±45.1K9.6%3.4%27 tps1.5s131K$0.86$1.80
94159Gemini 2.5 Pro Preview 03251100±184755.0%<0.1%3 tps16.6s1M$1.25$10.00
9595Gemini 2.5 Flash1098±521.4K5.2%1.3%2 tps3.7s1M$0.30$2.50
96179Switchpoint Router1097±111.1K9.5%1.7%71 tps4.9s131K$0.85$3.40
9771Seed 1.8 2512281096±64.1K3.4%3.7%41 tps2.1s256K$0.25$2.00
98106DeepSeek V3 03241090±59.7K8.2%5.8%12 tps2.7s164K$0.38$0.93
99113Mistral Medium1087±45.3K9.0%1.8%48 tps0.6s33K$1.48$4.55
10062MiniMax M21087±616.5K5.2%2.2%39 tps2.3s205K$0.21$0.85
10148gpt-oss-120b1086±415.1K7.5%0.7%213 tps0.5s131K$0.11$0.50
10262Qwen3 Omni 30B A3B Instruct1085±145706.6%3.9%65 tps1.2s66K$0.35$0.97
10386DeepSeek V3.1 Chat1084±63.7K10.1%2.8%21 tps1.6s131K$0.38$1.00
10481Qwen3.5 27B1082±265502.7%3.7%55 tps2.6s256K$0.30$2.40
10548Step 3.5 Flash1079±236452.3%2.2%109 tps0.6s256K$0.05$0.15
10665DeepSeek V3.2 Exp Chat1079±44.3K8.8%2.6%29 tps1.5s131K$0.27$0.39
10793Qwen Max1077±58.8K9.1%1.5%49 tps1.5s33K$1.60$6.40
108182GLM 4.6 FP81076±1085016.7%<0.1%56 tps1.8s200K$0.40$1.75
10937Nova Experimental Chat 10-201076±43.6K11.6%<0.1%30 tps0.5s98K$0$0
11086Qwen3 235B A22B1074±102.8K14.4%5.3%71 tps0.9s41K$0.23$0.63
11171Gemini 2.5 Flash Lite Preview 09251070±56.7K8.6%1.2%209 tps0.7s1M$0.25$0.35
11295Qwen3 32B1070±186207.5%3.9%30 tps3.1s41K$0.12$0.42
11395Kimi K2 Thinking1064±121.6K6.8%4.2%61 tps5.9s262K$0.24$1.03
11495DeepSeek V3.2 Exp Thinking1063±85K3.4%7.2%26 tps3.0s131K$0.28$0.42
11548OpenAI o1-mini1062±46.2K12.1%<0.1%118 tpsN/A128K$1.13$4.51
116118GPT-4.1 mini1060±411.7K6.8%1.1%67 tps0.9s1M$0.34$1.60
117113Gemini 2.5 Flash Lite Thinking1059±56.6K9.5%1.0%118 tps4.4s1M$0.03$0.13
11893DeepSeek V3 0324 Turbo1055±59.3K10.3%6.3%12 tps2.4s164K$0.73$1.79
11971Qwen3.5 397B A17B1055±111.6K2.1%4.3%57 tps1.4s256K$0.52$3.00
120139Seed 2.0 Mini (Medium)1053±216054.0%11.9%33 tps1.7s256K$0.15$0.60
121111Grok 3 Fast1051±171.1K2.6%1.7%52 tps2.4s131K$5.00$25.00
122147Grok 4 0709 EU1051±1292010.7%<0.1%33 tps8.2s128K$3.00$15.00
12381OpenAI o3-pro1048±82.2K3.5%5.2%22 tps70.8s200K$20.00$80.00
12437Qwen3 Omni 30B A3B Thinking1047±111.3K5.9%3.7%67 tps1.2s66K$0.97$1.79
125106DeepSeek V3.1 Terminus Thinking1047±72.5K11.6%5.9%27 tps1.8s131K$0.56$1.68
126119GLM 4.7 FP81046±194903.0%6.9%40 tps1.3s200K$0.30$1.20
127100Qwen Plus 0728 (Thinking)1045±1089011.0%<0.1%56 tps1.1s1M$0.40$4.00
12879Qwen3 Max Thinking Preview1044±55.1K7.7%3.1%40 tps2.1s256K$1.20$6.00
129143Seed 1.6 2506151042±131.2K4.8%3.1%46 tps2.2s256K$0.25$2.00
130165Pixtral Large1042±82.5K5.1%2.5%57 tps1.3s128K$1.50$4.50
131133Solar Pro 2 2507101041±66.4K14.2%<0.1%9 tpsN/A66K$0.50$0.50
132124Kimi K2 0905 Turbo1041±46.8K12.4%0.7%373 tps0.5s262K$1.70$6.50
133143Gemini 2.0 Flash1038±63.7K8.9%<0.1%76 tps0.5s1M$0.14$0.56
134101Gemini 2.5 Flash Lite1038±512.8K12.6%1.3%210 tps0.7s1M$0.10$0.40
135153OpenAI o11037±151.2K4.8%4.2%92 tps5.5s200K$15.00$60.00
136157GPT-5 Nano1035±63.8K10.6%3.2%113 tps20.9s400K$0.05$0.40
137101DeepSeek V3 (Turbo)1034±111K5.9%1.5%32 tps1.5s64K$0.40$1.30
138119ERNIE 4.5 300B A47B1032±66.1K8.7%4.7%23 tps2.3s123K$0.28$1.10
13995DeepSeek-R1 Turbo1032±101.4K5.5%2.6%29 tps1.8s64K$2.85$4.75
14071DeepSeek V3.11029±101.1K4.5%0.8%197 tps0.4s164K$0.55$1.60
141129Command A1029±511K8.4%2.2%42 tps0.8s256K$2.00$7.33
142126DeepSeek V31028±75.9K5.7%0.9%69 tps1.1s64K$0.59$1.49
14386Amazon Nova 2 Lite1027±102.6K7.9%1.0%137 tps0.6s300K$0.35$2.95
144200Claude Sonnet 3.51023±91.6K8.4%1.0%40 tps2.7s200K$3.00$15.00
145241GPT-5 Mini High1021±72.4K13.6%<0.1%33 tps3.9s400K$0.25$2.00
146113GLM 4.5 AirX1020±108059.0%3.3%75 tps1.2s131K$1.10$4.50
147133GPT-4.1 nano1020±48.8K9.7%0.6%175 tps0.5s1M$0.10$0.40
148124Qwen3 235B A22B Thinking 25071018±111.1K4.2%2.5%53 tps1.6s131K$0.59$5.70
149200NVIDIA Llama 3.1 Nemotron 70B1018±82.4K5.9%<0.1%9 tps0.1s128K$0.33$0.39
150111Solar Pro 3 (Reasoning)1017±156201.6%3.2%118 tps1.2s131K$0.15$0.60
151148OpenAI o31016±111.3K4.6%0.9%85 tps6.8s128K$7.33$29.33
152129DeepSeek V3.1 Thinking1014±73.9K14.0%7.1%18 tps1.8s131K$0.23$0.75
153139Qwen3 VL 30B A3B Instruct1012±171K6.5%1.8%80 tps2.6s129K$0.18$0.67
154153GLM 4.5 FP81012±1746012.4%<0.1%59 tps1.2s131K$0.41$1.65
155143Gemini 2.0 Flash Lite1011±65.7K6.9%<0.1%42 tps0.5s1M$0.08$0.30
156113Kimi K2 Fast1006±426.2K13.8%0.8%365 tps0.5s131K$1.00$3.00
157219NVIDIA Llama 3.3 Nemotron Super 49B v11002±131.2K9.7%<0.1%13 tpsN/A131K$0.07$0.20
158113GLM 4.51002±53.7K14.3%3.7%46 tps1.4s131K$0.43$1.63
159121NVIDIA Llama 3.3 Nemotron Super 49B v1.51000±161K9.9%2.0%50 tps0.6s131K$0.09$0.33
160139OpenAI o4-mini1000±54.8K10.2%1.4%97 tps7.0s128K$1.10$4.40
161200K2 Think999±141.1K6.2%<0.1%418 tps2.8sN/A$0$0
162170Llama 3.1 8B Turbo998±121.1K2.8%2.1%650 tps0.5s128K$0.13$0.14
163111LongCat Flash Chat996±149307.0%0.8%85 tps0.9s131K$0.14$0.68
164219Arcee AI Virtuoso-Large992±101.4K15.2%<0.1%64 tps0.5s131K$0.75$1.20
165133Kimi K2 0905991±67.5K5.6%4.0%30 tps1.4s262K$0.63$2.39
166129Qwen3 Max Thinking990±131.7K2.3%13.5%32 tps2.3s256K$1.20$6.00
167159Qwen Turbo987±44.8K12.9%<0.1%53 tps1.1s1M$0.05$0.20
168157Cogito v2.1 671B984±177155.9%0.8%85 tps0.5s128K$1.25$1.25
169133DeepSeek-R1 0528983±121.3K4.6%1.3%93 tps0.5s64K$1.60$3.67
170201Gemma 3 27B IT983±1590510.4%2.0%60 tps0.8s128K$0.17$0.29
171147GLM 4.5 Air981±64.6K15.0%<0.1%22 tps1.4s131K$0.10$0.38
172147Arcee AI Maestro Reasoning980±91.8K12.0%<0.1%85 tps0.3s131K$0.90$3.30
173241Arcee AI Blitz979±136105.4%<0.1%6 tpsN/A33K$0.45$0.75
174126Qwen3 VL 235B A22B Thinking979±63.5K11.5%4.3%47 tps3.0s127K$0.47$3.31
175165DeepSeek R1T2 Chimera978±101.1K11.0%3.0%28 tps1.8s164K$0.13$0.45
176214Qwen 2.5 VL 32B Instruct977±208507.6%6.3%43 tps3.2s128K$0.35$0.62
177209Seed 1.6 Flash 250715974±169806.2%2.5%108 tps1.6s256K$0.07$0.30
178265Llama 3.1 405B Instruct Turbo973±1862510.1%<0.1%26 tps0.8s131K$3.50$3.50
179222Sky T1 32B Preview972±1680510.6%7.8%73 tps0.6s16K$0.12$0.18
180302YouTube968±123.7K3.4%<0.1%34 tps2.7s32K$0.99$0.99
181177Mistral Small 3.1 24B Instruct966±121K10.6%7.5%15 tps2.4s131K$0.06$0.18
182161Mistral Small 3.1960±1691511.2%7.4%13 tps2.6s32K$0.17$0.28
183302OLMo 2 0425 1B Instruct956±195701.7%<0.1%68 tps0.0s4K$0$0
184161Llama 4 Maverick956±411.2K8.2%1.2%88 tps2.4s1M$0.23$0.83
185121QwQ 32B955±75K15.3%5.4%41 tps2.1s16K$0.43$0.56
186253Magistral Medium955±1479518.0%<0.1%95 tps0.5s41K$2.00$5.00
187101gpt-oss-20b954±56.1K10.8%0.5%216 tps0.5s131K$0.06$0.26
188233Llama 3.1 70B Instruct Turbo951±101.9K10.3%<0.1%110 tps0.8s128K$0.88$0.88
189165Qwen3 VL 30B A3B Thinking949±81.5K11.2%4.5%84 tps2.9s127K$0.20$1.47
190213Claude Haiku 3.5949±83.4K9.7%0.8%40 tps2.8s200K$0.80$4.00
191177OpenAI o3-mini946±66.7K12.3%0.8%143 tps3.3s200K$1.10$4.40
192153Ministral 14B 3.0945±2849011.7%2.0%119 tps0.5s128K$0.20$0.20
193170Devstral Medium945±111.6K14.7%1.5%77 tps0.6s131K$0.40$2.00
194292GPT-5 Nano Minimal945±111.3K12.9%<0.1%88 tps0.8s400K$0.05$0.40
195241Claude Haiku 3944±1188010.7%0.4%62 tps0.5s200K$0.25$1.25
196194Llama 3.2 11B Instruct943±1474514.4%1.5%152 tps0.5s8K$0.16$0.16
197219EXAONE Deep 32B941±175254.5%<0.1%24 tpsN/A33K$0$0
198126Qwen3 30B A3B939±53.7K12.1%5.1%163 tps1.0s41K$0.06$0.21
199201GPT-4o mini939±91.4K9.2%2.1%71 tps1.7s128K$0.15$0.60
200148DeepSeek-R1939±121.6K5.5%0.8%133 tps0.6s64K$0.91$3.07
20186Nemotron 3 Nano (Thinking)938±141.3K7.6%2.0%200 tps0.5s256K$0$0
202139GLM 4.6V938±112.5K6.1%6.4%21 tps1.8s128K$0.38$0.90
203148OpenAI o4-mini-high937±66K14.9%1.9%117 tps15.9s200K$1.10$4.40
204175OpenAI o3-mini-low937±46.1K13.6%0.7%139 tps1.5s200K$1.10$4.40
205177Llama 3 70B Turbo935±91.9K2.1%<0.1%31 tps0.0s8K$0.73$0.83
206133Qwen3 14B933±112.7K17.1%1.7%109 tps0.8s41K$0.04$0.15
207270Arcee AI Virtuoso-Medium932±154906.7%<0.1%3 tpsN/A131K$0.50$0.80
208121Qwen3 32B Fast932±54.5K12.9%11.6%30 tps3.1s41K$0.10$0.25
209265Qwen 2.5 VL 72B Instruct929±121.2K7.9%5.3%25 tps3.7s128K$1.01$2.79
210209Qwen 2.5 14B Instruct928±1391011.7%2.4%40 tps1.6s1M$0.40$1.61
211219Grok 3 Mini Beta926±111.1K1.3%<0.1%75 tps0.5s131K$0.45$2.25
212133DeepSeek V3.2 Speciale924±121.6K6.3%6.0%43 tps1.4s131K$0.84$1.52
213201Devstral Small924±1657012.3%2.4%180 tps0.6s131K$0.10$0.30
214211Gemini 1.5 Pro922±81.8K3.0%<0.1%15 tps0.0s2M$0.78$3.13
215194Magistral Small 2506920±152K6.9%1.6%156 tps0.5s40K$0.37$1.10
216179Inception Mercury919±102.8K11.5%0.4%257 tps1.1s32K$0.25$1.00
217148Qwen3 30B A3B Thinking 2507919±101.3K3.6%0.5%124 tps1.2s131K$0.16$1.70
218229Magistral Medium 2509918±82.1K11.3%4.0%58 tps0.9s131K$2.00$5.00
219153Qwen 2.5 32B Instruct916±91.9K18.0%2.5%48 tps1.0s131K$0.21$0.25
220179Amazon Nova Pro 1.0916±162.1K10.3%0.9%96 tps0.7s300K$0.80$1.70
221186Mistral Small 3.2 24B Instruct915±225259.5%1.9%113 tps1.1s131K$0.02$0.08
222214Llama 3.3 70B Instruct Turbo914±2464011.7%2.0%78 tps1.0s131K$0.88$0.88
223160Llama 4 Scout911±68K9.6%0.6%88 tps5.1s131K$0.18$0.46
224179Baichuan-M2-32B911±2550513.7%<0.1%32 tps3.3s131K$0.07$0.07
225170Kimi K2 0711911±83.2K9.2%1.6%29 tps1.3s131K$0.72$2.60
226170Mistral Small 3.2 24B911±102K12.4%2.8%141 tps0.7s33K$0.02$0.08
227182Fauna Fox909±112.5K10.1%<0.1%194 tps0.3s128K$0.04$0.15
228253R1 1776908±1787014.3%<0.1%61 tps1.0s128K$2.00$8.00
229214OpenAI o3-mini-high908±141.1K6.5%2.4%231 tps10.5s200K$1.10$4.40
230246DeepSeek-R1 Distill Llama 70B907±235357.0%3.6%27 tps1.6s32K$0.73$0.95
231186Gemma 3n E4B905±102.6K8.4%2.0%30 tps0.5s8K$0.01$0.02
232201ERNIE 4.5 VL 424B A47B905±128057.5%4.9%36 tps3.5s123K$0.42$1.25
233292NVIDIA Llama 3.1 Nemotron Ultra 253B v1905±1478511.3%<0.1%40 tps0.8s128K$0.30$0.90
234209Llama 3.3 Swallow 70B Instruct904±81.6K15.2%1.4%153 tps1.3s131K$0.13$0.39
235186Grok 3 Mini903±56K12.8%1.2%43 tps0.5s131K$0.30$0.50
236157Qwen3 Next 80B A3B Thinking903±74.9K11.2%0.6%175 tps1.3s256K$0.21$2.26
237161Qwen3 8B902±122K17.8%2.4%61 tps1.4s41K$0.02$0.07
238186Gemma 3 27B902±1861513.4%1.8%35 tps1.1s66K$0.06$0.10
239186Grok 3 Mini Fast897±75.2K14.9%1.6%44 tps0.5s131K$0.60$4.00
240277Dobby Unhinged Llama 3.3 70B896±227751.9%<0.1%41 tps0.4s128K$0.90$0.90
241186Jamba 1.6 Large895±118809.7%2.0%59 tps1.2s256K$1.33$5.33
242277Jamba 1.7 Mini895±1762515.5%<0.1%84 tps0.9s256K$0.20$0.40
243194Llama 3.3 70B893±102.2K9.2%0.3%500 tps0.5s8K$0.48$0.66
244235GLM 4 32B893±121.2K11.1%2.6%40 tps1.6s33K$0.14$0.14
245214C4AI Aya Expanse 32B889±111.3K10.0%1.5%43 tps0.5s128K$0.50$1.50
246179GLM 4.7 Flash887±138452.9%5.8%61 tps2.8s128K$0.07$0.39
247246Mixtral 8x22B880±2346512.3%1.2%140 tps0.6s64K$2.00$6.00
248361Magistral Small 2507879±1858018.3%<0.1%148 tps0.4s41K$0.50$1.50
249186Jamba 1.7 Large877±1762015.1%1.3%58 tps1.0s256K$1.33$5.33
250214Gemma 3 12B874±1591011.2%4.2%73 tps0.8s131K$0.05$0.12
251225Command R872±1677012.5%5.8%54 tps0.6s128K$0.30$0.99
252165Qwen3 4B871±73.3K16.3%1.9%94 tps1.5s128K$0.01$0.01
253256Solar Mini 250422869±1564016.3%1.8%90 tps1.7s33K$0.15$0.15
254229Ministral 8B868±1580013.5%1.4%177 tps0.4s128K$0.14$0.14
255361Meridian867±2349014.0%<0.1%92 tps1.2s131K$0$0
256246Mixtral 8x22B Instruct866±3157010.2%1.8%142 tps0.7s66K$0.45$0.45
257274Moonshot V1 128k Vision862±175857.1%3.1%44 tps3.8s131K$2.00$5.00
258277Grok 2862±177158.9%<0.1%55 tps1.1s131K$2.00$10.00
259270AFM 4.5B Preview861±191.8K3.3%<0.1%32 tps0.0s66K$0$0
260194Mistral Small 3 24B Instruct858±1858512.7%2.6%77 tps0.6s33K$0.07$0.14
261314MAI-DS-R1855±92.4K19.9%<0.1%73 tps3.2s64K$0.10$0.40
262179Qwen 2.5 72B854±2653511.6%1.2%96 tps1.2s131K$0.14$0.26
263222Rnj-1 Instruct853±245907.8%0.6%103 tps0.3s33K$0.15$0.15
264213DeepSeek R1T Chimera852±167459.7%<0.1%46 tps1.1s164K$0.09$0.36
265240Hermes 4 405B FP8850±1650013.0%3.5%31 tps0.9s131K$0.52$1.73
266222Jamba 1.5 Large844±1391512.0%1.7%48 tps0.9s256K$1.50$6.00
267260Hermes 4 405B Reasoning FP8844±122.1K18.8%3.6%32 tps0.8s131K$1.00$3.00
268265Magistral Small 2509844±267458.0%2.7%116 tps0.6s131K$0.50$1.50
269277Wikipedia843±161.2K7.7%<0.1%47 tps2.1s32K$0$0
270201Llama 3 8B840±1983515.7%6.0%85 tps0.7s8K$0.12$0.16
271186GLM 4.6V Flash837±92K9.0%3.7%64 tps2.1s128K$0.04$0.40
272369Magistral Medium 2507837±1752020.0%<0.1%86 tps0.7s41K$2.00$5.00
273292Arcee AI Spotlight833±91.7K16.2%<0.1%121 tps0.4s131K$0.18$0.18
274214Krutrim 2832±145602.6%12.5%33 tps2.1s128K$1.00$1.00
275235Command R+828±2166510.1%2.8%36 tps0.7s128K$2.08$9.45
276274Pixtral 12B828±112.2K6.8%2.2%101 tps1.2s131K$0.08$0.08
277235Gemma 3 4B827±111.2K9.9%1.3%138 tps0.7s131K$0.02$0.04
278214Qwen 2.5 7B823±1568511.6%3.7%40 tps1.9s131K$0.08$0.27
279225Open Mistral Nemo818±2459012.6%1.5%171 tps0.5s131K$0.15$0.15
280256Mixtral 8x7B Instruct815±2851012.1%0.2%79 tps0.7s33K$0.23$0.31
281314DeepSeek-R1 0528 Qwen3 8B812±151.2K12.3%<0.1%45 tps2.4s128K$0.05$0.09
282225Command R 7B808±131.3K12.9%1.1%76 tps0.4s128K$0.04$0.15
283225GPT-3.5 Turbo 16k807±111.1K10.6%<0.1%22 tps0.6s16K$3.00$4.00
284361Zenith803±2146014.0%<0.1%36 tps1.8s131K$0$0
285235Mixtral 8x7B801±2952512.5%2.2%142 tps0.6s33K$0.23$0.23
286260Mistral Small799±2346512.3%1.7%142 tps0.6s32K$0.43$1.30
287246WizardLM-2 8x22B795±185157.2%11.6%11 tps2.5s66K$0.77$0.77
288241OLMo 3 7B Think793±135807.9%4.2%77 tps0.4s66K$0.12$0.20
289271Hermes 3 405B Instruct792±2149010.1%2.3%20 tps1.1s131K$0.80$0.80
290265LFM2 2.6B787±1754515.5%6.7%184 tps0.4s33K$0.01$0.02
291256Phi 4780±1756511.7%5.1%28 tps1.3s128K$0.10$0.32
292277GLM Z1 32B773±91.1K20.7%<0.1%18 tps9.3s33K$0.09$0.11
293246Ministral 3B772±1778512.3%0.8%248 tps0.4s131K$0.08$0.08
294260Open Mistral 7B770±2851012.8%0.7%176 tps0.4s33K$0.25$0.25
295284MiniMax M1769±161.1K17.9%<0.1%31 tps2.8s1M$0.55$2.20
296256Gemma 3 1B758±2396513.1%0.6%176 tps1.0s33K$0.06$0.10
297201Mistral Small 24B Instruct752±2248015.0%1.5%84 tps0.4s33K$0.80$0.80
298240GPT-3.5 Turbo Instruct751±2153010.9%<0.1%46 tps1.2s4K$1.50$2.00
299399Phi 4 Multimodal Instruct746±2087012.1%<0.1%17 tps1.4s128K$0.03$0.05
300271Inflection 3 Pi738±2154511.4%1.1%33 tps3.4s8K$2.50$10.00
301292AFM 4.5B736±72.5K23.4%<0.1%81 tps0.3s66K$0.05$0.20
302253Gemma 2 27B727±1558511.4%1.4%44 tps1.4s8K$0.80$0.80
303274LFM2 8B A1B726±2657017.4%<0.1%142 tps0.3s33K$0.01$0.02
304339Refuel LLM 2 Small722±111K14.0%<0.1%116 tps0.5s8K$0.20$0.20
305265Inflection 3 Productivity713±1660011.8%0.6%50 tps3.2s8K$2.50$10.00
306271Mistral Large706±1750012.3%1.5%54 tps0.7s33K$2.00$6.00
307265Mixtral-8x7B Instruct v0.1675±3946512.3%1.3%54 tps0.4s33K$0.60$0.60
308288Qwen 2.5 VL 3B Instruct670±122.5K9.2%3.0%44 tps2.5s128K$0.21$0.63
309281MythoMax L2 13B569±3278515.1%1.2%22 tps1.1s4K$0.18$0.18
310285Hunyuan A13B Instruct485±2765021.2%2.3%67 tps2.0s33K$0.01$0.01
311291Phi 4 Mini Reasoning475±121.9K22.5%9.7%30 tps0.9s128K$0.07$0.30
312434QwQ 32B RpR v1467±2850021.9%<0.1%34 tps3.3s33K$0.02$0.07