Models

Last updated about 1 month ago

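| Rank | Name | VIBE Score | Confidence Interval | Votes | Downvote % | Abort % | Speed | Latency | Context | Cost (Input) | Cost (Output) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Claude Opus 4.6 | 1596 | ±6 | 21.6K | 1.1% | 2.1% | 48 tps | 1.7s | 200K | $5.00 | $25.00 |
| 2 | Claude Sonnet 4.6 | 1594 | ±10 | 15.7K | 1.4% | 1.6% | 47 tps | 1.2s | 200K | $3.00 | $15.00 |
| 3 | GPT-5.4 | 1594 | ±14 | 4.4K | 1.6% | 2.6% | 55 tps | 0.8s | 1M | $2.50 | $15.00 |
| 4 | Claude Opus 4.6 (Thinking) | 1566 | ±8 | 16.5K | 1.6% | 2.5% | 56 tps | 1.6s | 200K | $5.00 | $25.00 |
| 5 | Claude Sonnet 4.6 (Thinking) | 1506 | ±8 | 16.2K | 3.5% | 4.7% | 57 tps | 1.1s | 200K | $3.00 | $15.00 |
| 6 | Claude Opus 4.5 (Thinking) | 1446 | ±4 | 60.8K | 1.9% | 1.8% | 49 tps | 1.4s | 200K | $5.00 | $25.00 |
| 7 | Gemini 3.1 Pro | 1418 | ±9 | 22K | 2.5% | 3.5% | 35 tps | 4.1s | 1M | $2.00 | $12.00 |
| 8 | Claude Opus 4.5 | 1409 | ±5 | 15.1K | 2.2% | 1.5% | 45 tps | 1.5s | 200K | $5.00 | $25.00 |
| 9 | GPT-5.3 Codex (High) | 1381 | ±9 | 3.2K | 1.2% | 2.0% | 61 tps | 17.8s | 400K | $1.75 | $14.00 |
| 10 | Claude Sonnet 4.5 (Thinking) | 1362 | ±4 | 58.2K | 3.3% | 1.9% | 44 tps | 1.1s | 200K | $3.00 | $15.00 |
| 11 | GPT-5.2 Instant | 1358 | ±6 | 15.7K | 3.3% | 1.7% | 52 tps | 2.0s | 400K | $1.75 | $14.00 |
| 12 | Claude Haiku 4.5 (Extended Thinking) | 1353 | ±4 | 14.3K | 3.8% | 1.4% | 115 tps | 0.7s | 200K | $1.00 | $5.00 |
| 13 | GPT-5.2 | 1340 | ±8 | 11.3K | 3.2% | 4.1% | 18 tps | 2.7s | 400K | $1.75 | $14.00 |
| 14 | Gemini 3 Pro | 1337 | ±5 | 59.4K | 2.6% | 2.1% | 50 tps | 3.6s | 1M | $2.00 | $12.00 |
| 15 | GLM 5 | 1324 | ±14 | 11.7K | 3.3% | 3.4% | 36 tps | 2.7s | 200K | $0.72 | $2.55 |
| 16 | GPT-5.1 | 1319 | ±7 | 12.9K | 3.4% | 2.3% | 71 tps | 1.4s | 400K | $1.42 | $11.33 |
| 17 | Claude Sonnet 4.5 | 1307 | ±3 | 20.9K | 5.0% | 1.4% | 41 tps | 1.3s | 200K | $1.80 | $9.00 |
| 18 | GPT-5.2 (High) | 1297 | ±8 | 30.7K | 2.8% | 6.7% | 18 tps | 16.3s | 400K | $1.75 | $14.00 |
| 19 | Kimi K2.5 | 1291 | ±11 | 16.5K | 3.4% | 6.5% | 33 tps | 1.7s | 262K | $0.34 | $2.57 |
| 20 | Gemini 3 Pro (Low) | 1291 | ±6 | 11.9K | 4.2% | 2.4% | 51 tps | 3.5s | 1M | $2.00 | $12.00 |
| 21 | GPT-5.1 (High) | 1290 | ±6 | 19.1K | 3.5% | 3.2% | 76 tps | 6.9s | 400K | $1.25 | $10.00 |
| 22 | Gemini 3 Flash Preview Thinking | 1286 | ±6 | 32.7K | 3.3% | 1.6% | 3 tps | 6.2s | 1M | $0.50 | $3.00 |
| 23 | Claude Haiku 4.5 | 1283 | ±3 | 16.4K | 4.5% | 1.1% | 100 tps | 0.9s | 200K | $1.00 | $5.00 |
| 24 | MiniMax M2.5 | 1283 | ±28 | 510 | 3.8% | 1.4% | 70 tps | 1.9s | 205K | $0.28 | $1.20 |
| 25 | GPT-5.3 Codex (Medium) | 1278 | ±27 | 1.1K | 2.3% | 2.3% | 62 tps | 10.3s | 400K | $1.75 | $14.00 |
| 26 | GPT-5.3 Instant | 1271 | ±12 | 4.2K | 2.5% | 0.9% | 63 tps | 0.8s | 400K | $1.75 | $14.00 |
| 27 | Claude Sonnet 4 (Thinking) | 1261 | ±3 | 25.9K | 2.9% | 1.5% | 52 tps | 1.5s | 200K | $3.00 | $13.67 |
| 28 | GPT-5 Codex (High) | 1260 | ±7 | 18.5K | 3.3% | 3.2% | 122 tps | 7.1s | 400K | $1.25 | $10.00 |
| 29 | GPT-5 (High) | 1259 | ±4 | 16.2K | 3.5% | 4.5% | 81 tps | 35.9s | 400K | $1.25 | $10.00 |
| 30 | GPT-5.2 Codex (High) | 1257 | ±12 | 3.1K | 2.8% | 8.8% | 41 tps | 12.9s | 400K | $1.75 | $14.00 |
| 31 | GPT-5.1 Codex (High) | 1240 | ±8 | 37K | 3.3% | 3.2% | 96 tps | 3.9s | 400K | $1.25 | $10.00 |
| 32 | Grok 4.1 Fast Non-Reasoning | 1239 | ±6 | 9.4K | 5.4% | 0.9% | 101 tps | 0.5s | 2M | $0.20 | $0.50 |
| 33 | GPT-5 Chat | 1231 | ±4 | 35K | 4.5% | 1.3% | 95 tps | 0.9s | 400K | $1.25 | $10.00 |
| 34 | Qwen3 Next 80B A3B Instruct | 1231 | ±5 | 8.8K | 5.8% | 0.6% | 84 tps | 1.1s | 256K | $0.20 | $1.42 |
| 35 | MiniMax M2.5 Lightning | 1228 | ±14 | 1.7K | 3.2% | 1.5% | 51 tps | 2.0s | 205K | $0.60 | $2.40 |
| 36 | GPT-5.2 (Extra High) | 1221 | ±9 | 8K | 3.5% | 13.2% | 17 tps | 20.5s | 400K | $1.75 | $14.00 |
| 37 | Qwen3 VL 235B A22B Instruct | 1220 | ±7 | 5.6K | 6.7% | 3.1% | 75 tps | 1.9s | 129K | $0.37 | $1.81 |
| 38 | Qwen3.5 122B A17B | 1216 | ±15 | 1.9K | 3.1% | 1.5% | 82 tps | 1.4s | 256K | $0.40 | $3.20 |
| 39 | GPT-5 Codex (Medium) | 1214 | ±6 | 8.8K | 3.9% | 4.1% | 122 tps | 5.2s | 400K | $1.25 | $10.00 |
| 40 | GPT-5.2 Codex (Medium) | 1211 | ±12 | 2.4K | 3.0% | 5.7% | 37 tps | 6.3s | 400K | $1.75 | $14.00 |
| 41 | Qwen3.5 27B | 1211 | ±16 | 910 | 4.7% | 3.7% | 55 tps | 2.6s | 256K | $0.30 | $2.40 |
| 42 | Kimi K2.5 Instant | 1210 | ±8 | 1.8K | 3.2% | 2.9% | 32 tps | 3.0s | 262K | $0.50 | $3.00 |
| 43 | Claude Sonnet 4 | 1205 | ±3 | 43.2K | 3.7% | 1.8% | 49 tps | 1.3s | 200K | $3.00 | $15.00 |
| 44 | Gemini 3 Flash Preview | 1205 | ±11 | 7.2K | 3.7% | 1.3% | 138 tps | 1.4s | 1M | $0.50 | $3.00 |
| 45 | Gemini 2.5 Pro High | 1204 | ±3 | 21.1K | 5.7% | 1.5% | 48 tps | 2.3s | 1M | $1.25 | $10.00 |
| 46 | Qwen3 Max Instruct Preview | 1203 | ±6 | 16.1K | 4.6% | 1.1% | 31 tps | 1.7s | 256K | $1.43 | $6.61 |
| 47 | GPT-5.1 Codex Max | 1200 | ±12 | 6.4K | 3.9% | 3.0% | 118 tps | 4.1s | 400K | $1.25 | $10.00 |
| 48 | MiniMax M2.1 Lightning | 1197 | ±23 | 875 | 3.3% | 1.7% | 52 tps | 2.1s | 205K | $0.30 | $2.40 |
| 49 | Qwen3 30B A3B Instruct 2507 | 1194 | ±5 | 12.7K | 5.7% | 1.2% | 55 tps | 1.3s | 131K | $0.13 | $0.72 |
| 50 | Kimi K2 Thinking Turbo | 1192 | ±6 | 20.3K | 3.4% | 2.0% | 75 tps | 1.4s | 262K | $1.15 | $8.00 |
| 51 | MiniMax M2.1 | 1192 | ±8 | 19.4K | 3.6% | 2.1% | 66 tps | 2.6s | 205K | $0.30 | $1.20 |
| 52 | DeepSeek V3.2 | 1189 | ±8 | 5.1K | 4.7% | 1.4% | 83 tps | 5.1s | 131K | $0.43 | $1.09 |
| 53 | MiniMax M2.5 FP8 | 1185 | ±17 | 610 | 3.2% | 3.6% | 33 tps | 1.7s | 205K | $0.45 | $1.75 |
| 54 | GPT-5 | 1185 | ±4 | 21.3K | 5.3% | 3.1% | 78 tps | 23.1s | 400K | $1.25 | $9.67 |
| 55 | Grok 4 Fast Non-Reasoning | 1185 | ±5 | 8.1K | 7.1% | 1.5% | 93 tps | 0.6s | 2M | $0.27 | $0.67 |
| 56 | MiniMax M2 | 1183 | ±5 | 19.7K | 4.2% | 2.2% | 39 tps | 2.3s | 205K | $0.21 | $0.85 |
| 57 | Nova Experimental Chat 12-10 | 1182 | ±9 | 2.9K | 3.8% | 2.4% | 84 tps | 12.9s | 98K | $0 | $0 |
| 58 | GLM 4.6 | 1182 | ±7 | 17.2K | 4.4% | 5.4% | 39 tps | 1.5s | 200K | $0.42 | $1.66 |
| 59 | GPT-5.3 Codex (Low) | 1178 | ±28 | 510 | 1.0% | 1.8% | 61 tps | 4.3s | 400K | $1.75 | $14.00 |
| 60 | Grok 4.1 Fast Reasoning | 1178 | ±7 | 39.5K | 4.4% | 1.5% | 58 tps | 7.3s | 2M | $0.20 | $0.50 |
| 61 | DeepSeek V3.2 Thinking | 1178 | ±9 | 23.3K | 4.0% | 9.0% | 30 tps | 2.6s | 131K | $0.28 | $0.42 |
| 62 | Grok 4 Fast Reasoning | 1177 | ±3 | 14.5K | 5.0% | 2.1% | 102 tps | 3.1s | 2M | $0.30 | $0.75 |
| 63 | Gemini 2.5 Pro | 1176 | ±3 | 37.9K | 4.8% | 2.3% | 45 tps | 2.6s | 1M | $1.25 | $10.00 |
| 64 | Qwen3 235B A22B Instruct 2507 | 1172 | ±4 | 12.6K | 6.4% | 6.8% | 13 tps | 1.9s | 262K | $0.13 | $0.52 |
| 65 | Claude Sonnet 3.5 v2 | 1171 | ±6 | 5.5K | 3.4% | <0.1% | 46 tps | 1.4s | 200K | $3.00 | $15.00 |
| 66 | GPT-5.1 Codex (Medium) | 1171 | ±14 | 3K | 3.2% | 4.6% | 71 tps | 3.7s | 400K | $1.25 | $10.00 |
| 67 | GPT-5.1 Instant | 1171 | ±8 | 8.3K | 4.1% | 1.3% | 50 tps | 1.9s | 400K | $1.25 | $10.00 |
| 68 | Grok 4.20 Beta Reasoning | 1167 | ±22 | 1.2K | 4.1% | 1.1% | 77 tps | 4.5s | 2M | $2.00 | $5.50 |
| 69 | gpt-oss-120b | 1165 | ±5 | 19.2K | 5.0% | 0.7% | 213 tps | 0.5s | 131K | $0.11 | $0.50 |
| 70 | Qwen3.5 35B A3B | 1164 | ±25 | 865 | 3.9% | 2.1% | 116 tps | 2.1s | 256K | $0.63 | $1.13 |
| 71 | GPT-5 Codex (Low) | 1163 | ±10 | 5K | 4.1% | 2.7% | 112 tps | 3.5s | 400K | $1.25 | $10.00 |
| 72 | GLM 4.7 | 1161 | ±7 | 16.8K | 3.7% | 5.8% | 40 tps | 1.5s | 200K | $0.77 | $1.73 |
| 73 | DeepSeek V3.1 Terminus Chat | 1158 | ±5 | 6.5K | 6.9% | 3.4% | 27 tps | 1.5s | 131K | $0.86 | $1.80 |
| 74 | Qwen Plus (Aug'24) | 1146 | ±5 | 17.2K | 4.7% | 1.4% | 53 tps | 1.3s | 30K | $0.40 | $1.20 |
| 75 | Qwen3.5 397B A17B | 1142 | ±14 | 2.5K | 2.9% | 4.3% | 57 tps | 1.4s | 256K | $0.52 | $3.00 |
| 76 | Gemini 2.5 Flash Preview 0925 | 1140 | ±6 | 7.6K | 6.0% | 1.2% | 5 tps | 0.9s | 1M | $0.13 | $0.97 |
| 77 | Mistral Large 3 | 1131 | ±8 | 5.4K | 5.8% | 2.1% | 51 tps | 1.0s | 256K | $0.50 | $1.50 |
| 78 | GPT-5 Mini | 1131 | ±5 | 8.6K | 5.4% | 2.6% | 66 tps | 14.2s | 400K | $0.25 | $2.00 |
| 79 | DeepSeek V3.1 Turbo | 1130 | ±7 | 4.8K | 5.3% | 0.9% | 173 tps | 1.3s | 164K | $2.00 | $3.75 |
| 80 | Grok 4.20 Multi Agent Beta | 1129 | ±19 | 945 | 3.6% | 1.2% | 56 tps | 8.8s | 2M | $2.00 | $6.00 |
| 81 | Qwen3 Max Thinking Preview | 1127 | ±10 | 6.3K | 5.7% | 3.1% | 40 tps | 2.1s | 256K | $1.20 | $6.00 |
| 82 | Grok 4 | 1125 | ±3 | 39.6K | 4.4% | 3.9% | 29 tps | 11.1s | 256K | $3.00 | $15.00 |
| 83 | GPT-4.1 | 1123 | ±5 | 32.8K | 5.2% | 3.7% | 112 tps | 1.3s | 1M | $2.00 | $8.00 |
| 84 | Gemini 2.5 Flash Lite Preview 0925 | 1122 | ±7 | 8.5K | 6.6% | 1.2% | 209 tps | 0.7s | 1M | $0.25 | $0.35 |
| 85 | Gemini 2.5 Flash Thinking | 1118 | ±4 | 13.7K | 3.6% | 2.2% | 88 tps | 6.4s | 1M | $0.30 | $2.50 |
| 86 | GPT-5 Mini Minimal | 1114 | ±12 | 3.2K | 8.5% | 1.2% | 63 tps | 1.4s | 400K | $0.25 | $2.00 |
| 87 | GPT-5.2 Codex (Low) | 1113 | ±19 | 1.2K | 3.2% | 4.5% | 41 tps | 5.0s | 400K | $1.75 | $14.00 |
| 88 | DeepSeek V3.1 Chat | 1110 | ±7 | 4.9K | 6.6% | 2.8% | 21 tps | 1.6s | 131K | $0.38 | $1.00 |
| 89 | Qwen3 Omni 30B A3B Thinking | 1110 | ±10 | 2.3K | 6.0% | 3.7% | 67 tps | 1.2s | 66K | $0.97 | $1.79 |
| 90 | DeepSeek V3.2 Exp Chat | 1107 | ±4 | 5.5K | 6.1% | 2.6% | 29 tps | 1.5s | 131K | $0.27 | $0.39 |
| 91 | Qwen Max | 1107 | ±4 | 18.3K | 4.2% | 1.5% | 49 tps | 1.5s | 33K | $1.60 | $6.40 |
| 92 | Gemini 2.5 Flash Lite | 1103 | ±5 | 21.3K | 6.2% | 1.3% | 210 tps | 0.7s | 1M | $0.10 | $0.40 |
| 93 | Grok 3 Fast | 1102 | ±14 | 2.5K | 4.7% | 1.7% | 52 tps | 2.4s | 131K | $5.00 | $25.00 |
| 94 | GPT-4o | 1102 | ±5 | 8.5K | 3.7% | 1.0% | 49 tps | 2.4s | 128K | $3.71 | $12.57 |
| 95 | Step 3.5 Flash | 1102 | ±24 | 810 | 3.6% | 2.2% | 109 tps | 0.6s | 256K | $0.05 | $0.15 |
| 96 | DeepSeek V3 0324 | 1100 | ±4 | 15.1K | 4.3% | 5.8% | 12 tps | 2.7s | 164K | $0.38 | $0.93 |
| 97 | Qwen3 Coder 480B A35B Instruct | 1099 | ±8 | 3.1K | 4.5% | 3.3% | 61 tps | 2.0s | 262K | $0.71 | $1.34 |
| 98 | Gemini 2.5 Flash | 1098 | ±4 | 35.9K | 3.2% | 1.3% | 2 tps | 3.7s | 1M | $0.30 | $2.50 |
| 99 | Grok 3 | 1098 | ±4 | 19.1K | 5.5% | 1.5% | 53 tps | 0.6s | 1M | $3.67 | $18.33 |
| 100 | DeepSeek V3 0324 Turbo | 1093 | ±5 | 15.5K | 5.7% | 6.3% | 12 tps | 2.4s | 164K | $0.73 | $1.79 |
| 101 | Qwen3 235B A22B | 1093 | ±6 | 4.5K | 8.0% | 5.3% | 71 tps | 0.9s | 41K | $0.23 | $0.63 |
| 102 | OpenAI o3-pro | 1090 | ±8 | 5.4K | 4.3% | 5.2% | 22 tps | 70.8s | 200K | $20.00 | $80.00 |
| 103 | DeepSeek V3.1 | 1089 | ±12 | 2.3K | 4.7% | 0.8% | 197 tps | 0.4s | 164K | $0.55 | $1.60 |
| 104 | DeepSeek V3.2 Exp Thinking | 1089 | ±7 | 5.9K | 3.5% | 7.2% | 26 tps | 3.0s | 131K | $0.28 | $0.42 |
| 105 | GPT-4.1 mini | 1087 | ±5 | 19.7K | 4.2% | 1.1% | 67 tps | 0.9s | 1M | $0.34 | $1.60 |
| 106 | GPT-4.1 nano | 1085 | ±5 | 17K | 5.0% | 0.6% | 175 tps | 0.5s | 1M | $0.10 | $0.40 |
| 107 | Qwen3 Omni 30B A3B Instruct | 1085 | ±13 | 775 | 4.3% | 3.9% | 65 tps | 1.2s | 66K | $0.35 | $0.97 |
| 108 | DeepSeek V3 (Turbo) | 1082 | ±20 | 1.5K | 5.1% | 1.5% | 32 tps | 1.5s | 64K | $0.40 | $1.30 |
| 109 | Seed 1.8 251228 | 1081 | ±10 | 3.2K | 3.1% | 3.7% | 41 tps | 2.1s | 256K | $0.25 | $2.00 |
| 110 | Mistral Medium | 1080 | ±4 | 9.6K | 5.6% | 1.8% | 48 tps | 0.6s | 33K | $1.48 | $4.55 |
| 111 | Qwen3 Max Thinking | 1080 | ±18 | 1.5K | 2.0% | 13.5% | 32 tps | 2.3s | 256K | $1.20 | $6.00 |
| 112 | GLM 4.5 | 1075 | ±5 | 6K | 7.0% | 3.7% | 46 tps | 1.4s | 131K | $0.43 | $1.63 |
| 113 | Kimi K2 0905 | 1074 | ±7 | 8.7K | 4.3% | 4.0% | 30 tps | 1.4s | 262K | $0.63 | $2.39 |
| 114 | Kimi K2 Fast | 1073 | ±5 | 35K | 6.4% | 0.8% | 365 tps | 0.5s | 131K | $1.00 | $3.00 |
| 115 | GPT-5 (Low) | 1070 | ±14 | 690 | 3.5% | 1.8% | 75 tps | 8.2s | 400K | $1.25 | $10.00 |
| 116 | Kimi K2 0905 Turbo | 1070 | ±6 | 7.5K | 9.1% | 0.7% | 373 tps | 0.5s | 262K | $1.70 | $6.50 |
| 117 | gpt-oss-20b | 1066 | ±6 | 7.7K | 7.1% | 0.5% | 216 tps | 0.5s | 131K | $0.06 | $0.26 |
| 118 | Grok 4.20 Beta Non-reasoning | 1063 | ±36 | 500 | 4.8% | 1.1% | 151 tps | 0.6s | 2M | $2.00 | $6.00 |
| 119 | OpenAI o1 | 1062 | ±6 | 9.9K | 3.3% | 4.2% | 92 tps | 5.5s | 200K | $15.00 | $60.00 |
| 120 | DeepSeek V3.1 Terminus Thinking | 1061 | ±9 | 2.9K | 9.4% | 5.9% | 27 tps | 1.8s | 131K | $0.56 | $1.68 |
| 121 | OpenAI o1-pro | 1061 | ±20 | 680 | 7.5% | 5.2% | 33 tps | 72.8s | 200K | $150.00 | $600.00 |
| 122 | Gemini 2.5 Flash Lite Thinking | 1061 | ±4 | 9.8K | 6.2% | 1.0% | 118 tps | 4.4s | 1M | $0.03 | $0.13 |
| 123 | Seed 2.0 Lite (Medium) | 1058 | ±20 | 525 | 3.7% | 6.6% | 33 tps | 1.6s | 256K | $0.25 | $2.00 |
| 124 | LongCat Flash Chat | 1058 | ±12 | 2.7K | 5.9% | 0.8% | 85 tps | 0.9s | 131K | $0.14 | $0.68 |
| 125 | GPT-5.1 Codex Mini (Medium) | 1057 | ±15 | 1.9K | 4.9% | 4.6% | 69 tps | 4.1s | 400K | $0.25 | $2.00 |
| 126 | GPT-5.1 Codex Mini (High) | 1054 | ±15 | 2.2K | 3.9% | 5.9% | 70 tps | 4.6s | 400K | $0.25 | $2.00 |
| 127 | Qwen3 32B Fast | 1052 | ±6 | 11.4K | 5.2% | 11.6% | 30 tps | 3.1s | 41K | $0.10 | $0.25 |
| 128 | ERNIE 4.5 300B A47B | 1049 | ±4 | 13.5K | 3.9% | 4.7% | 23 tps | 2.3s | 123K | $0.28 | $1.10 |
| 129 | Cogito v2.1 671B | 1044 | ±19 | 1.2K | 4.6% | 0.8% | 85 tps | 0.5s | 128K | $1.25 | $1.25 |
| 130 | Qwen3 32B | 1044 | ±19 | 850 | 6.6% | 3.9% | 30 tps | 3.1s | 41K | $0.12 | $0.42 |
| 131 | GLM 4.5 AirX | 1042 | ±15 | 1.1K | 6.9% | 3.3% | 75 tps | 1.2s | 131K | $1.10 | $4.50 |
| 132 | Kimi K2 Thinking | 1042 | ±10 | 3.3K | 5.1% | 4.2% | 61 tps | 5.9s | 262K | $0.24 | $1.03 |
| 133 | OpenAI o4-mini | 1042 | ±5 | 8.5K | 6.4% | 1.4% | 97 tps | 7.0s | 128K | $1.10 | $4.40 |
| 134 | Gemini 3.1 Flash Lite Preview Thinking | 1039 | ±16 | 1.4K | 4.2% | 1.7% | 75 tps | 4.7s | 1M | $0.25 | $1.50 |
| 135 | QwQ 32B | 1035 | ±4 | 11.6K | 6.4% | 5.4% | 41 tps | 2.1s | 16K | $0.43 | $0.56 |
| 136 | Qwen3 Next 80B A3B Thinking | 1035 | ±5 | 6.2K | 7.4% | 0.6% | 175 tps | 1.3s | 256K | $0.21 | $2.26 |
| 137 | Gemini 2.5 Flash Lite Thinking Preview 0925 | 1035 | ±7 | 5.8K | 6.8% | 1.5% | 152 tps | 3.0s | 1M | $0.10 | $0.40 |
| 138 | Gemini 3.1 Flash Lite Preview | 1034 | ±21 | 980 | 4.4% | 1.0% | 8 tps | 1.2s | 1M | $0.25 | $1.50 |
| 139 | Qwen3 VL 30B A3B Instruct | 1034 | ±15 | 1K | 6.7% | 1.8% | 80 tps | 2.6s | 129K | $0.18 | $0.67 |
| 140 | DeepSeek V3 | 1032 | ±5 | 17.6K | 3.7% | 0.9% | 69 tps | 1.1s | 64K | $0.59 | $1.49 |
| 141 | DeepSeek V3.2 Speciale | 1030 | ±10 | 2.3K | 6.1% | 6.0% | 43 tps | 1.4s | 131K | $0.84 | $1.52 |
| 142 | Gemini 2.0 Flash Lite | 1029 | ±5 | 14.7K | 9.5% | <0.1% | 42 tps | 0.5s | 1M | $0.08 | $0.30 |
| 143 | Amazon Nova 2 Lite | 1026 | ±10 | 3.6K | 6.0% | 1.0% | 137 tps | 0.6s | 300K | $0.35 | $2.95 |
| 144 | Command A | 1024 | ±4 | 22.4K | 4.8% | 2.2% | 42 tps | 0.8s | 256K | $2.00 | $7.33 |
| 145 | DeepSeek V3.1 Nex N1 | 1021 | ±19 | 565 | 5.0% | 3.4% | 24 tps | 7.2s | 131K | $0.14 | $0.50 |
| 146 | OpenAI o3 | 1020 | ±7 | 5.9K | 4.0% | 0.9% | 85 tps | 6.8s | 128K | $7.33 | $29.33 |
| 147 | Gemini 2.0 Flash | 1018 | ±7 | 8.2K | 3.8% | <0.1% | 76 tps | 0.5s | 1M | $0.14 | $0.56 |
| 148 | Nemotron 3 Nano (Thinking) | 1012 | ±13 | 2K | 6.7% | 2.0% | 200 tps | 0.5s | 256K | $0 | $0 |
| 149 | Qwen3 VL 235B A22B Thinking | 1009 | ±6 | 4.6K | 8.3% | 4.3% | 47 tps | 3.0s | 127K | $0.47 | $3.31 |
| 150 | Qwen3 Coder Plus | 1007 | ±22 | 610 | 4.7% | 5.1% | 56 tps | 2.3s | 128K | $1.80 | $9.80 |
| 151 | DeepSeek-R1 Turbo | 1003 | ±9 | 2.5K | 5.6% | 2.6% | 29 tps | 1.8s | 64K | $2.85 | $4.75 |
| 152 | Qwen 2.5 VL 32B Instruct | 1001 | ±21 | 865 | 4.9% | 6.3% | 43 tps | 3.2s | 128K | $0.35 | $0.62 |
| 153 | Qwen3 235B A22B Thinking 2507 | 1000 | ±10 | 2.8K | 4.4% | 2.5% | 53 tps | 1.6s | 131K | $0.59 | $5.70 |
| 154 | OpenAI o3-mini-high | 999 | ±5 | 8.3K | 4.1% | 2.4% | 231 tps | 10.5s | 200K | $1.10 | $4.40 |
| 155 | OpenAI o3-mini | 999 | ±4 | 15K | 5.5% | 0.8% | 143 tps | 3.3s | 200K | $1.10 | $4.40 |
| 156 | OpenAI o4-mini-high | 995 | ±7 | 13.6K | 6.2% | 1.9% | 117 tps | 15.9s | 200K | $1.10 | $4.40 |
| 157 | Seed 1.6 250615 | 995 | ±21 | 1.6K | 6.0% | 3.1% | 46 tps | 2.2s | 256K | $0.25 | $2.00 |
| 158 | Qwen3 30B A3B | 994 | ±8 | 6.3K | 6.9% | 5.1% | 163 tps | 1.0s | 41K | $0.06 | $0.21 |
| 159 | GPT-5 Nano | 989 | ±6 | 4.6K | 8.0% | 3.2% | 113 tps | 20.9s | 400K | $0.05 | $0.40 |
| 160 | OpenAI o3-mini-low | 988 | ±6 | 12.2K | 6.4% | 0.7% | 139 tps | 1.5s | 200K | $1.10 | $4.40 |
| 161 | Grok Code Fast 1 | 987 | ±9 | 2.5K | 6.0% | 5.9% | 294 tps | 0.5s | 256K | $0.20 | $1.50 |
| 162 | GLM 4.6V | 986 | ±8 | 3K | 5.5% | 6.4% | 21 tps | 1.8s | 128K | $0.38 | $0.90 |
| 163 | Kimi K2 0711 | 981 | ±6 | 7K | 4.5% | 1.6% | 29 tps | 1.3s | 131K | $0.72 | $2.60 |
| 164 | Seed 2.0 Mini (Medium) | 981 | ±35 | 570 | 5.8% | 11.9% | 33 tps | 1.7s | 256K | $0.15 | $0.60 |
| 165 | Mistral Small 3.1 24B Instruct | 980 | ±11 | 2.9K | 4.3% | 7.5% | 15 tps | 2.4s | 131K | $0.06 | $0.18 |
| 166 | DeepSeek-R1 0528 | 979 | ±6 | 5.5K | 3.5% | 1.3% | 93 tps | 0.5s | 64K | $1.60 | $3.67 |
| 167 | DeepSeek V3.1 Thinking | 976 | ±9 | 5.2K | 9.5% | 7.1% | 18 tps | 1.8s | 131K | $0.23 | $0.75 |
| 168 | Nemotron 3 Nano | 974 | ±46 | 580 | 6.5% | 1.3% | 216 tps | 0.8s | 256K | $0.05 | $4.94 |
| 169 | Qwen 2.5 32B Instruct | 972 | ±7 | 4.1K | 6.5% | 2.5% | 48 tps | 1.0s | 131K | $0.21 | $0.25 |
| 170 | Llama 4 Maverick | 971 | ±5 | 21K | 5.0% | 1.2% | 88 tps | 2.4s | 1M | $0.23 | $0.83 |
| 171 | Mistral Small 3.2 24B | 970 | ±13 | 4.6K | 4.9% | 2.8% | 141 tps | 0.7s | 33K | $0.02 | $0.08 |
| 172 | Pixtral Large | 969 | ±14 | 3.5K | 3.9% | 2.5% | 57 tps | 1.3s | 128K | $1.50 | $4.50 |
| 173 | Qwen3 VL 30B A3B Thinking | 967 | ±11 | 1.9K | 8.9% | 4.5% | 84 tps | 2.9s | 127K | $0.20 | $1.47 |
| 174 | Llama 4 Scout | 965 | ±5 | 17.5K | 5.3% | 0.6% | 88 tps | 5.1s | 131K | $0.18 | $0.46 |
| 175 | Devstral Medium | 962 | ±11 | 3.5K | 5.2% | 1.5% | 77 tps | 0.6s | 131K | $0.40 | $2.00 |
| 176 | Qwen3 14B | 962 | ±8 | 5.3K | 8.4% | 1.7% | 109 tps | 0.8s | 41K | $0.04 | $0.15 |
| 177 | Qwen 2.5 72B | 960 | ±15 | 1.4K | 4.5% | 1.2% | 96 tps | 1.2s | 131K | $0.14 | $0.26 |
| 178 | Llama 3.1 8B Turbo | 958 | ±14 | 2.2K | 2.0% | 2.1% | 650 tps | 0.5s | 128K | $0.13 | $0.14 |
| 179 | Qwen3 30B A3B Thinking 2507 | 953 | ±10 | 3.5K | 4.7% | 0.5% | 124 tps | 1.2s | 131K | $0.16 | $1.70 |
| 180 | Switchpoint Router | 949 | ±10 | 2.7K | 3.6% | 1.7% | 71 tps | 4.9s | 131K | $0.85 | $3.40 |
| 181 | Qwen3 8B | 948 | ±9 | 4.2K | 8.2% | 2.4% | 61 tps | 1.4s | 41K | $0.02 | $0.07 |
| 182 | Ministral 14B 3.0 | 948 | ±16 | 805 | 8.5% | 2.0% | 119 tps | 0.5s | 128K | $0.20 | $0.20 |
| 183 | Grok 3 Mini Fast | 943 | ±7 | 9K | 7.0% | 1.6% | 44 tps | 0.5s | 131K | $0.60 | $4.00 |
| 184 | ERNIE 4.5 21B A3B | 943 | ±28 | 540 | 6.9% | 2.3% | 78 tps | 1.5s | 120K | $0.05 | $0.19 |
| 185 | ERNIE 4.5 VL 424B A47B | 942 | ±18 | 725 | 6.5% | 4.9% | 36 tps | 3.5s | 123K | $0.42 | $1.25 |
| 186 | NVIDIA Llama 3.3 Nemotron Super 49B v1.5 | 941 | ±19 | 1.4K | 6.9% | 2.0% | 50 tps | 0.6s | 131K | $0.09 | $0.33 |
| 187 | DeepSeek-R1 | 939 | ±6 | 6.4K | 4.3% | 0.8% | 133 tps | 0.6s | 64K | $0.91 | $3.07 |
| 188 | DeepSeek Prover v2 | 935 | ±10 | 1.3K | 3.0% | 5.2% | 14 tps | 1.3s | 164K | $0.40 | $1.56 |
| 189 | Llama 3.3 70B | 933 | ±10 | 3K | 6.6% | 0.3% | 500 tps | 0.5s | 8K | $0.48 | $0.66 |
| 190 | Codestral | 931 | ±19 | 855 | 6.0% | 5.2% | 151 tps | 0.9s | 262K | $0.15 | $0.45 |
| 191 | Grok 3 Mini | 927 | ±6 | 9.9K | 6.4% | 1.2% | 43 tps | 0.5s | 131K | $0.30 | $0.50 |
| 192 | Mistral Small 3.1 | 925 | ±14 | 2.4K | 4.0% | 7.4% | 13 tps | 2.6s | 32K | $0.17 | $0.28 |
| 193 | Jamba 1.6 Large | 924 | ±9 | 3.3K | 3.8% | 2.0% | 59 tps | 1.2s | 256K | $1.33 | $5.33 |
| 194 | Llama 3.3 Swallow 70B Instruct | 919 | ±8 | 3.5K | 5.5% | 1.4% | 153 tps | 1.3s | 131K | $0.13 | $0.39 |
| 195 | Open Mistral Nemo | 918 | ±21 | 1.7K | 4.5% | 1.5% | 171 tps | 0.5s | 131K | $0.15 | $0.15 |
| 196 | Jamba 1.7 Large | 917 | ±16 | 975 | 8.0% | 1.3% | 58 tps | 1.0s | 256K | $1.33 | $5.33 |
| 197 | Devstral Small | 916 | ±17 | 1.4K | 5.0% | 2.4% | 180 tps | 0.6s | 131K | $0.10 | $0.30 |
| 198 | Rnj-1 Instruct | 915 | ±21 | 970 | 6.7% | 0.6% | 103 tps | 0.3s | 33K | $0.15 | $0.15 |
| 199 | Seed 1.6 Flash 250715 | 913 | ±16 | 1.2K | 5.7% | 2.5% | 108 tps | 1.6s | 256K | $0.07 | $0.30 |
| 200 | Inception Mercury Coder Small Beta | 912 | ±20 | 610 | 3.2% | 1.7% | 270 tps | 1.4s | 32K | $0.25 | $1.00 |
| 201 | Magistral Small 2506 | 908 | ±11 | 4.3K | 3.1% | 1.6% | 156 tps | 0.5s | 40K | $0.37 | $1.10 |
| 202 | GPT-3.5 Turbo | 908 | ±15 | 1.2K | 2.5% | 1.3% | 74 tps | 0.9s | 16K | $0.75 | $1.75 |
| 203 | Llama 3 8B | 904 | ±10 | 3.2K | 3.5% | 6.0% | 85 tps | 0.7s | 8K | $0.12 | $0.16 |
| 204 | Mistral Small 3.2 24B Instruct | 903 | ±20 | 850 | 8.6% | 1.9% | 113 tps | 1.1s | 131K | $0.02 | $0.08 |
| 205 | GPT-4o mini | 900 | ±14 | 2.8K | 5.3% | 2.1% | 71 tps | 1.7s | 128K | $0.15 | $0.60 |
| 206 | Moonshot V1 Auto | 899 | ±22 | 930 | 4.1% | 1.2% | 54 tps | 1.5s | 8K | $2.00 | $5.00 |
| 207 | Amazon Nova Pro 1.0 | 894 | ±10 | 5.7K | 4.0% | 0.9% | 96 tps | 0.7s | 300K | $0.80 | $1.70 |
| 208 | GLM 4.6V Flash | 892 | ±10 | 2.5K | 7.6% | 3.7% | 64 tps | 2.1s | 128K | $0.04 | $0.40 |
| 209 | Llama 3.2 11B Instruct | 885 | ±15 | 2.1K | 4.1% | 1.5% | 152 tps | 0.5s | 8K | $0.16 | $0.16 |
| 210 | Magistral Medium 2509 | 883 | ±16 | 2.6K | 9.5% | 4.0% | 58 tps | 0.9s | 131K | $2.00 | $5.00 |
| 211 | Gemma 3n E4B | 882 | ±7 | 6K | 4.5% | 2.0% | 30 tps | 0.5s | 8K | $0.01 | $0.02 |
| 212 | Qwen3 4B | 880 | ±8 | 5.1K | 9.6% | 1.9% | 94 tps | 1.5s | 128K | $0.01 | $0.01 |
| 213 | Mistral Small 3 24B Instruct | 880 | ±10 | 1.7K | 3.6% | 2.6% | 77 tps | 0.6s | 33K | $0.07 | $0.14 |
| 214 | Moonshot V1 128k | 879 | ±19 | 1.1K | 4.6% | 1.4% | 54 tps | 1.5s | 131K | $2.00 | $5.00 |
| 215 | Inception Mercury | 878 | ±5 | 6.9K | 3.7% | 0.4% | 257 tps | 1.1s | 32K | $0.25 | $1.00 |
| 216 | DeepSeek R1T2 Chimera | 876 | ±10 | 2.1K | 5.9% | 3.0% | 28 tps | 1.8s | 164K | $0.13 | $0.45 |
| 217 | Mistral Medium 3 | 875 | ±23 | 485 | 6.7% | 2.4% | 47 tps | 0.8s | 33K | $0.40 | $2.00 |
| 218 | Mistral Nemo | 875 | ±15 | 915 | 2.7% | <0.1% | 112 tps | 0.4s | 131K | $0.07 | $0.13 |
| 219 | Solar Mini 250422 | 874 | ±17 | 1.3K | 5.9% | 1.8% | 90 tps | 1.7s | 33K | $0.15 | $0.15 |
| 220 | GLM 4.7 Flash | 871 | ±28 | 610 | 4.7% | 5.8% | 61 tps | 2.8s | 128K | $0.07 | $0.39 |
| 221 | Mixtral 8x22B | 871 | ±22 | 1.3K | 5.0% | 1.2% | 140 tps | 0.6s | 64K | $2.00 | $6.00 |
| 222 | Qwen 2.5 7B Turbo | 870 | ±25 | 615 | 6.1% | 0.5% | 125 tps | 0.4s | 131K | $0.30 | $0.30 |
| 223 | Krutrim Spectre V2 | 868 | ±16 | 1.3K | 3.6% | <0.1% | 33 tps | 3.1s | 4K | $0.19 | $0.19 |
| 224 | GLM 4 32B | 868 | ±12 | 2.9K | 4.9% | 2.6% | 40 tps | 1.6s | 33K | $0.14 | $0.14 |
| 225 | Gemma 3 12B | 867 | ±11 | 2.5K | 4.9% | 4.2% | 73 tps | 0.8s | 131K | $0.05 | $0.12 |
| 226 | Hermes 2 Pro Llama 3 8B | 864 | ±21 | 1.8K | 2.5% | <0.1% | 76 tps | 1.0s | 131K | $0.08 | $0.09 |
| 227 | Mistral Small 24B Instruct | 864 | ±16 | 1.5K | 4.1% | 1.5% | 84 tps | 0.4s | 33K | $0.80 | $0.80 |
| 228 | Moonshot V1 8k | 863 | ±13 | 915 | 5.2% | 1.0% | 55 tps | 1.5s | 8K | $0.20 | $2.00 |
| 229 | Qwen 2.5 14B Instruct | 861 | ±16 | 2.4K | 5.7% | 2.4% | 40 tps | 1.6s | 1M | $0.40 | $1.61 |
| 230 | Gemma 3 27B | 856 | ±27 | 1.1K | 6.9% | 1.8% | 35 tps | 1.1s | 66K | $0.06 | $0.10 |
| 231 | Mixtral 8x7B | 855 | ±18 | 1.3K | 5.1% | 2.2% | 142 tps | 0.6s | 33K | $0.23 | $0.23 |
| 232 | Ministral 3B 2512 | 854 | ±57 | 515 | 8.0% | 2.8% | 339 tps | 0.6s | 131K | $0.10 | $0.10 |
| 233 | Mixtral 8x7B Instruct | 854 | ±16 | 1.4K | 4.4% | 0.2% | 79 tps | 0.7s | 33K | $0.23 | $0.31 |
| 234 | Gemma 3 27B IT | 853 | ±10 | 2.3K | 3.9% | 2.0% | 60 tps | 0.8s | 128K | $0.17 | $0.29 |
| 235 | Jamba 1.5 Large | 851 | ±9 | 2.9K | 4.0% | 1.7% | 48 tps | 0.9s | 256K | $1.50 | $6.00 |
| 236 | Llama 3.3 70B Instruct Turbo | 851 | ±19 | 1.2K | 6.0% | 2.0% | 78 tps | 1.0s | 131K | $0.88 | $0.88 |
| 237 | Command R 7B | 849 | ±15 | 3.3K | 4.8% | 1.1% | 76 tps | 0.4s | 128K | $0.04 | $0.15 |
| 238 | GPT-3.5 Turbo 16k | 838 | ±10 | 2.7K | 3.6% | <0.1% | 22 tps | 0.6s | 16K | $3.00 | $4.00 |
| 239 | ERNIE 4.5 21B A3B Thinking | 838 | ±23 | 1.1K | 6.9% | 1.8% | 87 tps | 1.5s | 120K | $0.07 | $0.28 |
| 240 | DeepSeek-R1 Distill Llama 70B | 835 | ±9 | 3.4K | 5.2% | 3.6% | 27 tps | 1.6s | 32K | $0.73 | $0.95 |
| 241 | GLM 4.5 Flash | 834 | ±37 | 520 | 8.8% | 12.2% | 15 tps | 2.2s | 131K | $0 | $0 |
| 242 | Mixtral-8x7B Instruct v0.1 | 832 | ±23 | 1.3K | 4.6% | 1.3% | 54 tps | 0.4s | 33K | $0.60 | $0.60 |
| 243 | Qwen 2.5 7B | 831 | ±17 | 2K | 5.1% | 3.7% | 40 tps | 1.9s | 131K | $0.08 | $0.27 |
| 244 | Sky T1 32B Preview | 829 | ±14 | 2.4K | 4.5% | 7.8% | 73 tps | 0.6s | 16K | $0.12 | $0.18 |
| 245 | LFM2 2.6B | 826 | ±26 | 810 | 10.0% | 6.7% | 184 tps | 0.4s | 33K | $0.01 | $0.02 |
| 246 | Krutrim 2 | 825 | ±10 | 2.3K | 2.3% | 12.5% | 33 tps | 2.1s | 128K | $1.00 | $1.00 |
| 247 | Ministral 8B | 825 | ±17 | 2.2K | 5.5% | 1.4% | 177 tps | 0.4s | 128K | $0.14 | $0.14 |
| 248 | C4AI Aya Expanse 32B | 821 | ±7 | 3.8K | 4.0% | 1.5% | 43 tps | 0.5s | 128K | $0.50 | $1.50 |
| 249 | Moonshot V1 32k | 820 | ±17 | 950 | 3.1% | 1.4% | 53 tps | 1.4s | 33K | $1.00 | $3.00 |
| 250 | LFM2 8B A1B | 818 | ±18 | 825 | 11.3% | <0.1% | 142 tps | 0.3s | 33K | $0.01 | $0.02 |
| 251 | Gemma 2 27B | 815 | ±17 | 1.5K | 4.1% | 1.4% | 44 tps | 1.4s | 8K | $0.80 | $0.80 |
| 252 | Ministral 3B | 806 | ±16 | 2.3K | 5.1% | 0.8% | 248 tps | 0.4s | 131K | $0.08 | $0.08 |
| 253 | Magistral Small 2509 | 802 | ±18 | 1.8K | 7.5% | 2.7% | 116 tps | 0.6s | 131K | $0.50 | $1.50 |
| 254 | Gemma 3 1B | 802 | ±11 | 2K | 6.1% | 0.6% | 176 tps | 1.0s | 33K | $0.06 | $0.10 |
| 255 | WizardLM-2 8x22B | 801 | ±12 | 1.9K | 3.1% | 11.6% | 11 tps | 2.5s | 66K | $0.77 | $0.77 |
| 256 | Phi 4 | 798 | ±16 | 1.7K | 3.4% | 5.1% | 28 tps | 1.3s | 128K | $0.10 | $0.32 |
| 257 | Hermes 4 405B FP8 | 797 | ±21 | 815 | 8.4% | 3.5% | 31 tps | 0.9s | 131K | $0.52 | $1.73 |
| 258 | Mercury Coder | 793 | ±27 | 510 | 3.8% | <0.1% | 247 tps | 2.2s | 32K | $0.25 | $1.00 |
| 259 | GPT-3.5 Turbo Instruct | 787 | ±9 | 2K | 2.7% | <0.1% | 46 tps | 1.2s | 4K | $1.50 | $2.00 |
| 260 | Mistral Large | 785 | ±16 | 1.1K | 5.8% | 1.5% | 54 tps | 0.7s | 33K | $2.00 | $6.00 |
| 261 | Hermes 4 70B | 781 | ±29 | 460 | 8.9% | 1.1% | 67 tps | 0.6s | 131K | $0.12 | $0.39 |
| 262 | Command R | 778 | ±18 | 2.2K | 4.9% | 5.8% | 54 tps | 0.6s | 128K | $0.30 | $0.99 |
| 263 | Baichuan-M2-32B | 770 | ±30 | 740 | 10.8% | <0.1% | 32 tps | 3.3s | 131K | $0.07 | $0.07 |
| 264 | Mistral Small | 770 | ±12 | 1.2K | 4.5% | 1.7% | 142 tps | 0.6s | 32K | $0.43 | $1.30 |
| 265 | Open Mistral 7B | 762 | ±18 | 1.3K | 4.7% | 0.7% | 176 tps | 0.4s | 33K | $0.25 | $0.25 |
| 266 | Hermes 4 405B Reasoning FP8 | 759 | ±11 | 2.7K | 12.8% | 3.6% | 32 tps | 0.8s | 131K | $1.00 | $3.00 |
| 267 | Goliath 120B | 754 | ±24 | 745 | 5.7% | 2.7% | 21 tps | 2.2s | 6K | $6.56 | $9.38 |
| 268 | Qwen 2.5 VL 72B Instruct | 746 | ±20 | 2.1K | 6.0% | 5.3% | 25 tps | 3.7s | 128K | $1.01 | $2.79 |
| 269 | Gemma 3 4B | 742 | ±10 | 3.3K | 4.7% | 1.3% | 138 tps | 0.7s | 131K | $0.02 | $0.04 |
| 270 | Mixtral 8x22B Instruct | 738 | ±17 | 1.4K | 5.6% | 1.8% | 142 tps | 0.7s | 66K | $0.45 | $0.45 |
| 271 | Command R+ | 738 | ±15 | 1.6K | 5.6% | 2.8% | 36 tps | 0.7s | 128K | $2.08 | $9.45 |
| 272 | Inflection 3 Productivity | 737 | ±24 | 1.5K | 5.0% | 0.6% | 50 tps | 3.2s | 8K | $2.50 | $10.00 |
| 273 | Pixtral 12B | 722 | ±21 | 3K | 6.3% | 2.2% | 101 tps | 1.2s | 131K | $0.08 | $0.08 |
| 274 | Inflection 3 Pi | 719 | ±18 | 1.5K | 4.1% | 1.1% | 33 tps | 3.4s | 8K | $2.50 | $10.00 |
| 275 | DeepHermes 3 Mistral 24B Preview | 706 | ±30 | 715 | 5.9% | 2.5% | 50 tps | 1.0s | 33K | $0.06 | $0.25 |
| 276 | Hermes 3 405B Instruct | 702 | ±20 | 1.4K | 4.1% | 2.3% | 20 tps | 1.1s | 131K | $0.80 | $0.80 |
| 277 | DeepSeek-R1 Distill Qwen 32B | 696 | ±20 | 2K | 5.5% | 6.2% | 22 tps | 1.8s | 131K | $0.37 | $0.39 |
| 278 | MiniMax M1 | 686 | ±13 | 3.8K | 5.3% | <0.1% | 31 tps | 2.8s | 1M | $0.55 | $2.20 |
| 279 | UI-TARS 1.5 7B | 610 | ±40 | 530 | 11.7% | 4.0% | 75 tps | 0.9s | 128K | $0.10 | $0.20 |
| 280 | MythoMax L2 13B | 600 | ±21 | 2.3K | 5.8% | 1.2% | 22 tps | 1.1s | 4K | $0.18 | $0.18 |
| 281 | Phi 4 Mini Instruct | 599 | ±21 | 1K | 7.1% | 7.4% | 40 tps | 1.1s | 128K | $0.07 | $0.30 |
| 282 | Hunyuan A13B Instruct | 588 | ±22 | 1.6K | 9.2% | 2.3% | 67 tps | 2.0s | 33K | $0.01 | $0.01 |
| 283 | Phi 4 Reasoning | 573 | ±17 | 2.1K | 5.5% | 21.0% | 29 tps | 1.0s | 33K | $0.06 | $0.25 |
| 284 | Qwen 2.5 VL 3B Instruct | 523 | ±25 | 4.1K | 6.1% | 3.0% | 44 tps | 2.5s | 128K | $0.21 | $0.63 |
| 285 | CodeLlama 7B Instruct Solidity | 463 | ±54 | 485 | 8.5% | 3.6% | 33 tps | 0.7s | 16K | $0.80 | $1.20 |
| 286 | Phi 4 Mini Reasoning | 447 | ±15 | 3.4K | 12.0% | 9.7% | 30 tps | 0.9s | 128K | $0.07 | $0.30 |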