
Last updated about 1 month ago

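| Rank | Overall | Name | VIBE Score | Confidence Interval | Votes | Downvote % | Abort % | Speed | Latency | Context | Cost (Input) | Cost (Output) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | GPT-5.4 | 1635 | ±13 | 4K | 1.6% | 2.6% | 55 tps | 0.8s | 1M | $2.50 | $15.00 |
| 2 | 1 | Claude Sonnet 4.6 | 1616 | ±9 | 13.6K | 1.3% | 1.6% | 47 tps | 1.2s | 200K | $3.00 | $15.00 |
| 3 | 1 | Claude Opus 4.6 | 1611 | ±7 | 18.4K | 0.9% | 2.1% | 48 tps | 1.7s | 200K | $5.00 | $25.00 |
| 4 | 4 | Claude Opus 4.6 (Thinking) | 1596 | ±10 | 12.8K | 1.2% | 2.5% | 56 tps | 1.6s | 200K | $5.00 | $25.00 |
| 5 | 5 | Claude Sonnet 4.6 (Thinking) | 1540 | ±8 | 11.3K | 2.6% | 4.7% | 57 tps | 1.1s | 200K | $3.00 | $15.00 |
| 6 | 7 | Claude Opus 4.5 | 1485 | ±6 | 11.7K | 1.8% | 1.5% | 45 tps | 1.5s | 200K | $5.00 | $25.00 |
| 7 | 7 | Gemini 3.1 Pro | 1464 | ±9 | 15.6K | 1.9% | 3.5% | 35 tps | 4.1s | 1M | $2.00 | $12.00 |
| 8 | 6 | Claude Opus 4.5 (Thinking) | 1462 | ±5 | 43.2K | 1.6% | 1.8% | 49 tps | 1.4s | 200K | $5.00 | $25.00 |
| 9 | 12 | Claude Haiku 4.5 (Extended Thinking) | 1425 | ±7 | 10K | 3.7% | 1.4% | 115 tps | 0.7s | 200K | $1.00 | $5.00 |
| 10 | 10 | Claude Sonnet 4.5 (Thinking) | 1417 | ±5 | 37.7K | 2.9% | 1.9% | 44 tps | 1.1s | 200K | $3.00 | $15.00 |
| 11 | 17 | Claude Sonnet 4.5 | 1404 | ±6 | 12.9K | 5.0% | 1.4% | 41 tps | 1.3s | 200K | $1.80 | $9.00 |
| 12 | 10 | GPT-5.2 Instant | 1393 | ±9 | 12.1K | 3.2% | 1.7% | 52 tps | 2.0s | 400K | $1.75 | $14.00 |
| 13 | 9 | GPT-5.3 Codex (High) | 1393 | ±14 | 2.8K | 0.9% | 2.0% | 61 tps | 17.8s | 400K | $1.75 | $14.00 |
| 14 | 13 | GPT-5.2 | 1378 | ±8 | 8.5K | 2.9% | 4.1% | 18 tps | 2.7s | 400K | $1.75 | $14.00 |
| 15 | 15 | GPT-5.1 | 1370 | ±7 | 9.1K | 3.6% | 2.3% | 71 tps | 1.4s | 400K | $1.42 | $11.33 |
| 16 | 13 | Gemini 3 Pro | 1362 | ±9 | 34.5K | 2.5% | 2.1% | 50 tps | 3.6s | 1M | $2.00 | $12.00 |
| 17 | 17 | GPT-5.2 (High) | 1344 | ±9 | 16.4K | 2.8% | 6.7% | 18 tps | 16.3s | 400K | $1.75 | $14.00 |
| 18 | 19 | Claude Haiku 4.5 | 1344 | ±7 | 10.5K | 4.5% | 1.1% | 100 tps | 0.9s | 200K | $1.00 | $5.00 |
| 19 | 15 | GLM 5 | 1329 | ±14 | 3.9K | 3.1% | 3.4% | 36 tps | 2.7s | 200K | $0.72 | $2.55 |
| 20 | 19 | GPT-5.3 Instant | 1321 | ±15 | 3.4K | 2.2% | 0.9% | 63 tps | 0.8s | 400K | $1.75 | $14.00 |
| 21 | 19 | Gemini 3 Pro (Low) | 1319 | ±8 | 8.9K | 4.2% | 2.4% | 51 tps | 3.5s | 1M | $2.00 | $12.00 |
| 22 | 19 | Kimi K2.5 | 1307 | ±15 | 6.4K | 3.2% | 6.5% | 33 tps | 1.7s | 262K | $0.34 | $2.57 |
| 23 | 27 | Claude Sonnet 4 (Thinking) | 1304 | ±5 | 19.2K | 2.6% | 1.5% | 52 tps | 1.5s | 200K | $3.00 | $13.67 |
| 24 | 19 | GPT-5.3 Codex (Medium) | 1299 | ±15 | 890 | 2.2% | 2.3% | 62 tps | 10.3s | 400K | $1.75 | $14.00 |
| 25 | 19 | Gemini 3 Flash Preview Thinking | 1289 | ±9 | 14.1K | 3.6% | 1.6% | 3 tps | 6.2s | 1M | $0.50 | $3.00 |
| 26 | 19 | GPT-5.1 (High) | 1289 | ±10 | 9.9K | 4.1% | 3.2% | 76 tps | 6.9s | 400K | $1.25 | $10.00 |
| 27 | 27 | GPT-5 Codex (High) | 1282 | ±5 | 11K | 3.5% | 3.2% | 122 tps | 7.1s | 400K | $1.25 | $10.00 |
| 28 | 36 | Qwen3.5 122B A17B | 1282 | ±18 | 1.3K | 2.7% | 1.5% | 82 tps | 1.4s | 256K | $0.40 | $3.20 |
| 29 | 31 | GPT-5 Chat | 1272 | ±4 | 20.8K | 4.7% | 1.3% | 95 tps | 0.9s | 400K | $1.25 | $10.00 |
| 30 | 31 | GPT-5.1 Codex (High) | 1271 | ±8 | 14.1K | 3.4% | 3.2% | 96 tps | 3.9s | 400K | $1.25 | $10.00 |
| 31 | 43 | Claude Sonnet 4 | 1270 | ±6 | 23.4K | 3.2% | 1.8% | 49 tps | 1.3s | 200K | $3.00 | $15.00 |
| 32 | 36 | GPT-5.2 Codex (Medium) | 1264 | ±16 | 2.1K | 2.6% | 5.7% | 37 tps | 6.3s | 400K | $1.75 | $14.00 |
| 33 | 27 | GPT-5.2 Codex (High) | 1255 | ±14 | 2.6K | 2.6% | 8.8% | 41 tps | 12.9s | 400K | $1.75 | $14.00 |
| 34 | 43 | GPT-5.1 Codex Max | 1254 | ±9 | 4.3K | 4.0% | 3.0% | 118 tps | 4.1s | 400K | $1.25 | $10.00 |
| 35 | 43 | Gemini 3 Flash Preview | 1245 | ±11 | 5.5K | 3.7% | 1.3% | 138 tps | 1.4s | 1M | $0.50 | $3.00 |
| 36 | 31 | Qwen3 Next 80B A3B Instruct | 1241 | ±11 | 3.5K | 8.9% | 0.6% | 84 tps | 1.1s | 256K | $0.20 | $1.42 |
| 37 | 77 | Grok 4.20 Multi Agent Beta | 1240 | ±23 | 685 | 3.5% | 1.2% | 56 tps | 8.8s | 2M | $2.00 | $6.00 |
| 38 | 49 | Grok 4 Fast Non-Reasoning | 1232 | ±7 | 3.9K | 9.8% | 1.5% | 93 tps | 0.6s | 2M | $0.27 | $0.67 |
| 39 | 74 | Qwen3.5 397B A17B | 1231 | ±20 | 1.7K | 2.8% | 4.3% | 57 tps | 1.4s | 256K | $0.52 | $3.00 |
| 40 | 36 | Qwen3.5 27B | 1230 | ±17 | 690 | 4.2% | 3.7% | 55 tps | 2.6s | 256K | $0.30 | $2.40 |
| 41 | 31 | Grok 4.1 Fast Non-Reasoning | 1226 | ±13 | 5.8K | 6.3% | 0.9% | 101 tps | 0.5s | 2M | $0.20 | $0.50 |
| 42 | 43 | Qwen3 Max Instruct Preview | 1226 | ±9 | 4.7K | 7.7% | 1.1% | 31 tps | 1.7s | 256K | $1.43 | $6.61 |
| 43 | 36 | GPT-5 Codex (Medium) | 1224 | ±10 | 6.2K | 3.9% | 4.1% | 122 tps | 5.2s | 400K | $1.25 | $10.00 |
| 44 | 49 | Qwen3 30B A3B Instruct 2507 | 1220 | ±7 | 5.7K | 7.3% | 1.2% | 55 tps | 1.3s | 131K | $0.13 | $0.72 |
| 45 | 36 | GPT-5.2 (Extra High) | 1218 | ±13 | 5.4K | 3.6% | 13.2% | 17 tps | 20.5s | 400K | $1.75 | $14.00 |
| 46 | 60 | GPT-5.1 Instant | 1211 | ±8 | 5.8K | 4.2% | 1.3% | 50 tps | 1.9s | 400K | $1.25 | $10.00 |
| 47 | 49 | GPT-5 | 1209 | ±6 | 12.4K | 6.7% | 3.1% | 78 tps | 23.1s | 400K | $1.25 | $9.67 |
| 48 | 49 | MiniMax M2.1 | 1204 | ±10 | 6.9K | 5.1% | 2.1% | 66 tps | 2.6s | 205K | $0.30 | $1.20 |
| 49 | 60 | GPT-5.1 Codex (Medium) | 1202 | ±20 | 2.5K | 2.9% | 4.6% | 71 tps | 3.7s | 400K | $1.25 | $10.00 |
| 50 | 36 | Qwen3 VL 235B A22B Instruct | 1197 | ±9 | 2.9K | 7.7% | 3.1% | 75 tps | 1.9s | 129K | $0.37 | $1.81 |
| 51 | 27 | GPT-5 (High) | 1194 | ±8 | 8.9K | 4.0% | 4.5% | 81 tps | 35.9s | 400K | $1.25 | $10.00 |
| 52 | 60 | Grok 4 Fast Reasoning | 1194 | ±8 | 7.7K | 6.5% | 2.1% | 102 tps | 3.1s | 2M | $0.30 | $0.75 |
| 53 | 36 | Kimi K2.5 Instant | 1191 | ±14 | 1.4K | 3.4% | 2.9% | 32 tps | 3.0s | 262K | $0.50 | $3.00 |
| 54 | 69 | GPT-5 Codex (Low) | 1188 | ±10 | 3.3K | 4.2% | 2.7% | 112 tps | 3.5s | 400K | $1.25 | $10.00 |
| 55 | 43 | MiniMax M2.1 Lightning | 1187 | ±31 | 655 | 3.0% | 1.7% | 52 tps | 2.1s | 205K | $0.30 | $2.40 |
| 56 | 60 | Claude Sonnet 3.5 v2 | 1186 | ±8 | 4.9K | 3.4% | <0.1% | 46 tps | 1.4s | 200K | $3.00 | $15.00 |
| 57 | 60 | Qwen3 235B A22B Instruct 2507 | 1181 | ±7 | 6.8K | 7.8% | 6.8% | 13 tps | 1.9s | 262K | $0.13 | $0.52 |
| 58 | 49 | GLM 4.6 | 1179 | ±10 | 4.4K | 8.3% | 5.4% | 39 tps | 1.5s | 200K | $0.42 | $1.66 |
| 59 | 90 | Grok 3 Fast | 1179 | ±22 | 1.1K | 1.7% | 1.7% | 52 tps | 2.4s | 131K | $5.00 | $25.00 |
| 60 | 49 | Kimi K2 Thinking Turbo | 1177 | ±13 | 5.3K | 4.5% | 2.0% | 75 tps | 1.4s | 262K | $1.15 | $8.00 |
| 61 | 49 | DeepSeek V3.2 | 1173 | ±13 | 3K | 5.9% | 1.4% | 83 tps | 5.1s | 131K | $0.43 | $1.09 |
| 62 | 69 | DeepSeek V3.1 Terminus Chat | 1171 | ±9 | 2.6K | 10.5% | 3.4% | 27 tps | 1.5s | 131K | $0.86 | $1.80 |
| 63 | 31 | MiniMax M2.5 Lightning | 1171 | ±27 | 1.1K | 2.3% | 1.5% | 51 tps | 2.0s | 205K | $0.60 | $2.40 |
| 64 | 60 | Gemini 2.5 Pro | 1167 | ±5 | 23K | 5.7% | 2.3% | 45 tps | 2.6s | 1M | $1.25 | $10.00 |
| 65 | 49 | MiniMax M2 | 1166 | ±8 | 5.4K | 7.5% | 2.2% | 39 tps | 2.3s | 205K | $0.21 | $0.85 |
| 66 | 43 | Gemini 2.5 Pro High | 1164 | ±6 | 10.4K | 7.1% | 1.5% | 48 tps | 2.3s | 1M | $1.25 | $10.00 |
| 67 | 60 | Grok 4.20 Beta Reasoning | 1157 | ±20 | 930 | 4.1% | 1.1% | 77 tps | 4.5s | 2M | $2.00 | $5.50 |
| 68 | 60 | DeepSeek V3.2 Thinking | 1153 | ±9 | 6.7K | 4.6% | 9.0% | 30 tps | 2.6s | 131K | $0.28 | $0.42 |
| 69 | 69 | gpt-oss-120b | 1152 | ±7 | 8.5K | 8.0% | 0.7% | 213 tps | 0.5s | 131K | $0.11 | $0.50 |
| 70 | 60 | Grok 4.1 Fast Reasoning | 1151 | ±6 | 12.8K | 5.4% | 1.5% | 58 tps | 7.3s | 2M | $0.20 | $0.50 |
| 71 | 74 | Qwen Plus (Aug'24) | 1134 | ±7 | 8.3K | 4.9% | 1.4% | 53 tps | 1.3s | 30K | $0.40 | $1.20 |
| 72 | 69 | GLM 4.7 | 1134 | ±9 | 5.8K | 5.5% | 5.8% | 40 tps | 1.5s | 200K | $0.77 | $1.73 |
| 73 | 77 | GPT-5 Mini | 1133 | ±6 | 6.5K | 6.0% | 2.6% | 66 tps | 14.2s | 400K | $0.25 | $2.00 |
| 74 | 85 | GPT-5.2 Codex (Low) | 1131 | ±27 | 1K | 3.3% | 4.5% | 41 tps | 5.0s | 400K | $1.75 | $14.00 |
| 75 | 77 | Grok 4 | 1129 | ±5 | 21.3K | 5.4% | 3.9% | 29 tps | 11.1s | 256K | $3.00 | $15.00 |
| 76 | 77 | GPT-4.1 | 1127 | ±5 | 16.2K | 1.8% | 3.7% | 112 tps | 1.3s | 1M | $2.00 | $8.00 |
| 77 | 85 | Gemini 2.5 Flash Thinking | 1123 | ±7 | 10.7K | 3.7% | 2.2% | 88 tps | 6.4s | 1M | $0.30 | $2.50 |
| 78 | 77 | Qwen3 Max Thinking Preview | 1122 | ±14 | 3K | 7.8% | 3.1% | 40 tps | 2.1s | 256K | $1.20 | $6.00 |
| 79 | 98 | Grok 3 | 1120 | ±8 | 9.2K | 4.8% | 1.5% | 53 tps | 0.6s | 1M | $3.67 | $18.33 |
| 80 | 90 | Qwen Max | 1119 | ±6 | 9K | 4.7% | 1.5% | 49 tps | 1.5s | 33K | $1.60 | $6.40 |
| 81 | 77 | Gemini 2.5 Flash Lite Preview 0925 | 1117 | ±8 | 4.3K | 7.4% | 1.2% | 209 tps | 0.7s | 1M | $0.25 | $0.35 |
| 82 | 105 | Qwen3 Max Thinking | 1115 | ±23 | 945 | 2.1% | 13.5% | 32 tps | 2.3s | 256K | $1.20 | $6.00 |
| 83 | 74 | Gemini 2.5 Flash Preview 0925 | 1115 | ±11 | 4.1K | 6.8% | 1.2% | 5 tps | 0.9s | 1M | $0.13 | $0.97 |
| 84 | 98 | Gemini 2.5 Flash | 1115 | ±5 | 17.3K | 3.5% | 1.3% | 2 tps | 3.7s | 1M | $0.30 | $2.50 |
| 85 | 90 | DeepSeek V3 0324 | 1113 | ±7 | 7.7K | 5.1% | 5.8% | 12 tps | 2.7s | 164K | $0.38 | $0.93 |
| 86 | 98 | DeepSeek V3 0324 Turbo | 1111 | ±5 | 6.5K | 6.3% | 6.3% | 12 tps | 2.4s | 164K | $0.73 | $1.79 |
| 87 | 85 | DeepSeek V3.1 Chat | 1108 | ±12 | 2.4K | 9.3% | 2.8% | 21 tps | 1.6s | 131K | $0.38 | $1.00 |
| 88 | 119 | GPT-5.1 Codex Mini (Medium) | 1104 | ±15 | 1.6K | 4.6% | 4.6% | 69 tps | 4.1s | 400K | $0.25 | $2.00 |
| 89 | 49 | Nova Experimental Chat 12-10 | 1101 | ±15 | 1.4K | 5.3% | 2.4% | 84 tps | 12.9s | 98K | $0 | $0 |
| 90 | 69 | Qwen3.5 35B A3B | 1095 | ±30 | 675 | 3.6% | 2.1% | 116 tps | 2.1s | 256K | $0.63 | $1.13 |
| 91 | 85 | Qwen3 Omni 30B A3B Thinking | 1091 | ±12 | 1.7K | 6.5% | 3.7% | 67 tps | 1.2s | 66K | $0.97 | $1.79 |
| 92 | 90 | GPT-4o | 1090 | ±8 | 6.5K | 3.8% | 1.0% | 49 tps | 2.4s | 128K | $3.71 | $12.57 |
| 93 | 105 | GPT-4.1 nano | 1086 | ±7 | 9.8K | 4.1% | 0.6% | 175 tps | 0.5s | 1M | $0.10 | $0.40 |
| 94 | 90 | Qwen3 Coder 480B A35B Instruct | 1085 | ±14 | 1.8K | 4.0% | 3.3% | 61 tps | 2.0s | 262K | $0.71 | $1.34 |
| 95 | 128 | Gemini 3.1 Flash Lite Preview Thinking | 1085 | ±31 | 1.1K | 4.0% | 1.7% | 75 tps | 4.7s | 1M | $0.25 | $1.50 |
| 96 | 98 | OpenAI o3-pro | 1085 | ±15 | 2.8K | 3.8% | 5.2% | 22 tps | 70.8s | 200K | $20.00 | $80.00 |
| 97 | 90 | Gemini 2.5 Flash Lite | 1083 | ±7 | 11.8K | 6.9% | 1.3% | 210 tps | 0.7s | 1M | $0.10 | $0.40 |
| 98 | 105 | Qwen3 Omni 30B A3B Instruct | 1083 | ±21 | 600 | 4.0% | 3.9% | 65 tps | 1.2s | 66K | $0.35 | $0.97 |
| 99 | 85 | GPT-5 Mini Minimal | 1082 | ±12 | 2.3K | 9.3% | 1.2% | 63 tps | 1.4s | 400K | $0.25 | $2.00 |
| 100 | 98 | DeepSeek V3.1 | 1078 | ±15 | 1.8K | 4.8% | 0.8% | 197 tps | 0.4s | 164K | $0.55 | $1.60 |
| 101 | 77 | DeepSeek V3.1 Turbo | 1068 | ±16 | 2.8K | 6.1% | 0.9% | 173 tps | 1.3s | 164K | $2.00 | $3.75 |
| 102 | 119 | GPT-5.1 Codex Mini (High) | 1065 | ±19 | 1.9K | 3.6% | 5.9% | 70 tps | 4.6s | 400K | $0.25 | $2.00 |
| 103 | 77 | Mistral Large 3 | 1065 | ±10 | 3.2K | 5.6% | 2.1% | 51 tps | 1.0s | 256K | $0.50 | $1.50 |
| 104 | 90 | DeepSeek V3.2 Exp Chat | 1065 | ±14 | 2.2K | 9.0% | 2.6% | 29 tps | 1.5s | 131K | $0.27 | $0.39 |
| 105 | 119 | Gemini 2.5 Flash Lite Thinking | 1064 | ±8 | 7.4K | 6.8% | 1.0% | 118 tps | 4.4s | 1M | $0.03 | $0.13 |
| 106 | 112 | Kimi K2 Fast | 1061 | ±5 | 16.4K | 9.2% | 0.8% | 365 tps | 0.5s | 131K | $1.00 | $3.00 |
| 107 | 148 | Qwen3 VL 235B A22B Thinking | 1055 | ±12 | 2.5K | 9.4% | 4.3% | 47 tps | 3.0s | 127K | $0.47 | $3.31 |
| 108 | 105 | Mistral Medium | 1053 | ±7 | 5.5K | 4.6% | 1.8% | 48 tps | 0.6s | 33K | $1.48 | $4.55 |
| 109 | 128 | Cogito v2.1 671B | 1053 | ±21 | 920 | 5.2% | 0.8% | 85 tps | 0.5s | 128K | $1.25 | $1.25 |
| 110 | 98 | Qwen3 235B A22B | 1050 | ±9 | 3.5K | 8.8% | 5.3% | 71 tps | 0.9s | 41K | $0.23 | $0.63 |
| 111 | 135 | Gemini 2.0 Flash Lite | 1050 | ±8 | 9.1K | 3.4% | <0.1% | 42 tps | 0.5s | 1M | $0.08 | $0.30 |
| 112 | 112 | Kimi K2 0905 Turbo | 1050 | ±13 | 3.2K | 13.0% | 0.7% | 373 tps | 0.5s | 262K | $1.70 | $6.50 |
| 113 | 112 | Kimi K2 0905 | 1047 | ±13 | 1.8K | 8.5% | 4.0% | 30 tps | 1.4s | 262K | $0.63 | $2.39 |
| 114 | 105 | GPT-4.1 mini | 1046 | ±8 | 8.7K | 4.0% | 1.1% | 67 tps | 0.9s | 1M | $0.34 | $1.60 |
| 115 | 98 | DeepSeek V3.2 Exp Thinking | 1046 | ±17 | 1.7K | 6.2% | 7.2% | 26 tps | 3.0s | 131K | $0.28 | $0.42 |
| 116 | 128 | Qwen3 32B | 1046 | ±34 | 535 | 7.8% | 3.9% | 30 tps | 3.1s | 41K | $0.12 | $0.42 |
| 117 | 128 | ERNIE 4.5 300B A47B | 1044 | ±7 | 8.7K | 3.6% | 4.7% | 23 tps | 2.3s | 123K | $0.28 | $1.10 |
| 118 | 135 | Qwen3 Next 80B A3B Thinking | 1043 | ±15 | 2.7K | 11.3% | 0.6% | 175 tps | 1.3s | 256K | $0.21 | $2.26 |
| 119 | 119 | Qwen3 32B Fast | 1043 | ±10 | 6.6K | 6.4% | 11.6% | 30 tps | 3.1s | 41K | $0.10 | $0.25 |
| 120 | 135 | Qwen3 VL 30B A3B Instruct | 1042 | ±21 | 860 | 7.0% | 1.8% | 80 tps | 2.6s | 129K | $0.18 | $0.67 |
| 121 | 112 | GLM 4.5 | 1038 | ±9 | 3.2K | 8.8% | 3.7% | 46 tps | 1.4s | 131K | $0.43 | $1.63 |
| 122 | 135 | Gemini 2.5 Flash Lite Thinking Preview 0925 | 1035 | ±10 | 4.6K | 7.3% | 1.5% | 152 tps | 3.0s | 1M | $0.10 | $0.40 |
| 123 | 135 | Gemini 3.1 Flash Lite Preview | 1034 | ±30 | 775 | 4.3% | 1.0% | 8 tps | 1.2s | 1M | $0.25 | $1.50 |
| 124 | 144 | Gemini 2.0 Flash | 1034 | ±7 | 7.7K | 3.6% | <0.1% | 76 tps | 0.5s | 1M | $0.14 | $0.56 |
| 125 | 135 | DeepSeek V3 | 1032 | ±6 | 8.4K | 2.7% | 0.9% | 69 tps | 1.1s | 64K | $0.59 | $1.49 |
| 126 | 119 | LongCat Flash Chat | 1032 | ±12 | 2.2K | 5.9% | 0.8% | 85 tps | 0.9s | 131K | $0.14 | $0.68 |
| 127 | 135 | QwQ 32B | 1030 | ±9 | 6.4K | 7.9% | 5.4% | 41 tps | 2.1s | 16K | $0.43 | $0.56 |
| 128 | 128 | OpenAI o4-mini | 1027 | ±11 | 4.1K | 8.7% | 1.4% | 97 tps | 7.0s | 128K | $1.10 | $4.40 |
| 129 | 135 | DeepSeek V3.2 Speciale | 1026 | ±18 | 1.6K | 6.9% | 6.0% | 43 tps | 1.4s | 131K | $0.84 | $1.52 |
| 130 | 148 | Qwen3 235B A22B Thinking 2507 | 1024 | ±22 | 1.9K | 5.1% | 2.5% | 53 tps | 1.6s | 131K | $0.59 | $5.70 |
| 131 | 144 | Command A | 1021 | ±7 | 11.9K | 4.1% | 2.2% | 42 tps | 0.8s | 256K | $2.00 | $7.33 |
| 132 | 105 | DeepSeek V3 (Turbo) | 1020 | ±25 | 865 | 6.0% | 1.5% | 32 tps | 1.5s | 64K | $0.40 | $1.30 |
| 133 | 119 | OpenAI o1 | 1019 | ±13 | 4.3K | 4.2% | 4.2% | 92 tps | 5.5s | 200K | $15.00 | $60.00 |
| 134 | 159 | Grok Code Fast 1 | 1018 | ±10 | 2K | 5.6% | 5.9% | 294 tps | 0.5s | 256K | $0.20 | $1.50 |
| 135 | 112 | GPT-5 (Low) | 1018 | ±28 | 480 | 5.0% | 1.8% | 75 tps | 8.2s | 400K | $1.25 | $10.00 |
| 136 | 148 | Nemotron 3 Nano (Thinking) | 1018 | ±17 | 1.4K | 6.3% | 2.0% | 200 tps | 0.5s | 256K | $0 | $0 |
| 137 | 119 | DeepSeek V3.1 Terminus Thinking | 1016 | ±15 | 2.1K | 10.7% | 5.9% | 27 tps | 1.8s | 131K | $0.56 | $1.68 |
| 138 | 128 | Kimi K2 Thinking | 1013 | ±13 | 2.5K | 5.4% | 4.2% | 61 tps | 5.9s | 262K | $0.24 | $1.03 |
| 139 | 105 | Seed 1.8 251228 | 1010 | ±19 | 2.1K | 3.0% | 3.7% | 41 tps | 2.1s | 256K | $0.25 | $2.00 |
| 140 | 148 | DeepSeek-R1 Turbo | 1005 | ±15 | 1.6K | 6.3% | 2.6% | 29 tps | 1.8s | 64K | $2.85 | $4.75 |
| 141 | 159 | GPT-5 Nano | 1001 | ±12 | 3.6K | 8.3% | 3.2% | 113 tps | 20.9s | 400K | $0.05 | $0.40 |
| 142 | 167 | Llama 3.1 8B Turbo | 999 | ±16 | 1.6K | 1.5% | 2.1% | 650 tps | 0.5s | 128K | $0.13 | $0.14 |
| 143 | 128 | GLM 4.5 AirX | 999 | ±39 | 635 | 8.6% | 3.3% | 75 tps | 1.2s | 131K | $1.10 | $4.50 |
| 144 | 144 | OpenAI o3 | 991 | ±13 | 4.7K | 3.7% | 0.9% | 85 tps | 6.8s | 128K | $7.33 | $29.33 |
| 145 | 167 | Mistral Small 3.2 24B | 990 | ±11 | 4.4K | 4.7% | 2.8% | 141 tps | 0.7s | 33K | $0.02 | $0.08 |
| 146 | 167 | Pixtral Large | 988 | ±20 | 2.3K | 3.6% | 2.5% | 57 tps | 1.3s | 128K | $1.50 | $4.50 |
| 147 | 148 | Seed 1.6 250615 | 988 | ±22 | 1.2K | 6.0% | 3.1% | 46 tps | 2.2s | 256K | $0.25 | $2.00 |
| 148 | 148 | Qwen3 30B A3B | 987 | ±10 | 4.5K | 8.3% | 5.1% | 163 tps | 1.0s | 41K | $0.06 | $0.21 |
| 149 | 112 | gpt-oss-20b | 983 | ±10 | 4.1K | 10.1% | 0.5% | 216 tps | 0.5s | 131K | $0.06 | $0.26 |
| 150 | 148 | Qwen3 Coder Plus | 981 | ±33 | 500 | 3.8% | 5.1% | 56 tps | 2.3s | 128K | $1.80 | $9.80 |
| 151 | 90 | Step 3.5 Flash | 981 | ±44 | 510 | 3.8% | 2.2% | 109 tps | 0.6s | 256K | $0.05 | $0.15 |
| 152 | 167 | Devstral Medium | 978 | ±17 | 3.4K | 5.0% | 1.5% | 77 tps | 0.6s | 131K | $0.40 | $2.00 |
| 153 | 167 | Qwen 2.5 32B Instruct | 976 | ±11 | 2.9K | 6.2% | 2.5% | 48 tps | 1.0s | 131K | $0.21 | $0.25 |
| 154 | 167 | DeepSeek V3.1 Thinking | 974 | ±11 | 3.7K | 11.1% | 7.1% | 18 tps | 1.8s | 131K | $0.23 | $0.75 |
| 155 | 167 | Qwen3 VL 30B A3B Thinking | 973 | ±17 | 1.5K | 9.3% | 4.5% | 84 tps | 2.9s | 127K | $0.20 | $1.47 |
| 156 | 148 | OpenAI o3-mini-high | 972 | ±8 | 3.9K | 5.1% | 2.4% | 231 tps | 10.5s | 200K | $1.10 | $4.40 |
| 157 | 159 | GLM 4.6V | 972 | ±18 | 2.3K | 5.8% | 6.4% | 21 tps | 1.8s | 128K | $0.38 | $0.90 |
| 158 | 159 | Mistral Small 3.1 24B Instruct | 970 | ±15 | 2.8K | 4.2% | 7.5% | 15 tps | 2.4s | 131K | $0.06 | $0.18 |
| 159 | 167 | Qwen 2.5 72B | 969 | ±22 | 1.3K | 4.0% | 1.2% | 96 tps | 1.2s | 131K | $0.14 | $0.26 |
| 160 | 167 | Llama 4 Scout | 958 | ±7 | 9K | 4.2% | 0.6% | 88 tps | 5.1s | 131K | $0.18 | $0.46 |
| 161 | 167 | Llama 4 Maverick | 958 | ±5 | 11K | 4.1% | 1.2% | 88 tps | 2.4s | 1M | $0.23 | $0.83 |
| 162 | 179 | Qwen3 30B A3B Thinking 2507 | 958 | ±12 | 2.3K | 5.2% | 0.5% | 124 tps | 1.2s | 131K | $0.16 | $1.70 |
| 163 | 148 | Qwen 2.5 VL 32B Instruct | 957 | ±24 | 690 | 3.5% | 6.3% | 43 tps | 3.2s | 128K | $0.35 | $0.62 |
| 164 | 179 | Switchpoint Router | 955 | ±10 | 2.5K | 3.4% | 1.7% | 71 tps | 4.9s | 131K | $0.85 | $3.40 |
| 165 | 159 | DeepSeek-R1 0528 | 954 | ±11 | 4.2K | 3.7% | 1.3% | 93 tps | 0.5s | 64K | $1.60 | $3.67 |
| 166 | 179 | DeepSeek-R1 | 953 | ±9 | 5.4K | 4.2% | 0.8% | 133 tps | 0.6s | 64K | $0.91 | $3.07 |
| 167 | 189 | Llama 3.3 Swallow 70B Instruct | 949 | ±12 | 3.4K | 5.4% | 1.4% | 153 tps | 1.3s | 131K | $0.13 | $0.39 |
| 168 | 189 | Open Mistral Nemo | 948 | ±29 | 1.6K | 4.6% | 1.5% | 171 tps | 0.5s | 131K | $0.15 | $0.15 |
| 169 | 189 | Jamba 1.6 Large | 945 | ±8 | 3K | 2.6% | 2.0% | 59 tps | 1.2s | 256K | $1.33 | $5.33 |
| 170 | 135 | Amazon Nova 2 Lite | 945 | ±18 | 2K | 7.3% | 1.0% | 137 tps | 0.6s | 300K | $0.35 | $2.95 |
| 171 | 148 | OpenAI o4-mini-high | 940 | ±11 | 5.9K | 8.3% | 1.9% | 117 tps | 15.9s | 200K | $1.10 | $4.40 |
| 172 | 148 | OpenAI o3-mini | 939 | ±9 | 6.4K | 8.1% | 0.8% | 143 tps | 3.3s | 200K | $1.10 | $4.40 |
| 173 | 201 | Llama 3 8B | 939 | ±11 | 3.1K | 3.4% | 6.0% | 85 tps | 0.7s | 8K | $0.12 | $0.16 |
| 174 | 179 | Ministral 14B 3.0 | 937 | ±19 | 760 | 8.4% | 2.0% | 119 tps | 0.5s | 128K | $0.20 | $0.20 |
| 175 | 201 | Moonshot V1 Auto | 937 | ±23 | 900 | 3.7% | 1.2% | 54 tps | 1.5s | 8K | $2.00 | $5.00 |
| 176 | 201 | Magistral Small 2506 | 937 | ±12 | 3.5K | 2.8% | 1.6% | 156 tps | 0.5s | 40K | $0.37 | $1.10 |
| 177 | 189 | Mistral Small 3.1 | 936 | ±12 | 2.4K | 3.8% | 7.4% | 13 tps | 2.6s | 32K | $0.17 | $0.28 |
| 178 | 159 | Kimi K2 0711 | 936 | ±10 | 3.3K | 5.7% | 1.6% | 29 tps | 1.3s | 131K | $0.72 | $2.60 |
| 179 | 167 | Qwen3 14B | 936 | ±10 | 4.2K | 9.1% | 1.7% | 109 tps | 0.8s | 41K | $0.04 | $0.15 |
| 180 | 189 | Devstral Small | 934 | ±32 | 1.3K | 5.0% | 2.4% | 180 tps | 0.6s | 131K | $0.10 | $0.30 |
| 181 | 189 | Grok 3 Mini | 934 | ±9 | 4.6K | 7.5% | 1.2% | 43 tps | 0.5s | 131K | $0.30 | $0.50 |
| 182 | 179 | Qwen3 8B | 929 | ±12 | 3.4K | 9.0% | 2.4% | 61 tps | 1.4s | 41K | $0.02 | $0.07 |
| 183 | 179 | Grok 3 Mini Fast | 928 | ±9 | 4.8K | 7.5% | 1.6% | 44 tps | 0.5s | 131K | $0.60 | $4.00 |
| 184 | 179 | DeepSeek Prover v2 | 926 | ±17 | 1.1K | 3.2% | 5.2% | 14 tps | 1.3s | 164K | $0.40 | $1.56 |
| 185 | 210 | Magistral Medium 2509 | 923 | ±17 | 2.1K | 9.6% | 4.0% | 58 tps | 0.9s | 131K | $2.00 | $5.00 |
| 186 | 159 | OpenAI o3-mini-low | 920 | ±9 | 5.5K | 8.9% | 0.7% | 139 tps | 1.5s | 200K | $1.10 | $4.40 |
| 187 | 179 | ERNIE 4.5 VL 424B A47B | 918 | ±19 | 545 | 5.2% | 4.9% | 36 tps | 3.5s | 123K | $0.42 | $1.25 |
| 188 | 210 | Qwen 2.5 7B Turbo | 918 | ±38 | 565 | 5.8% | 0.5% | 125 tps | 0.4s | 131K | $0.30 | $0.30 |
| 189 | 179 | NVIDIA Llama 3.3 Nemotron Super 49B v1.5 | 917 | ±33 | 820 | 7.9% | 2.0% | 50 tps | 0.6s | 131K | $0.09 | $0.33 |
| 190 | 201 | GPT-3.5 Turbo | 916 | ±15 | 1.1K | 2.2% | 1.3% | 74 tps | 0.9s | 16K | $0.75 | $1.75 |
| 191 | 189 | Inception Mercury Coder Small Beta | 914 | ±34 | 570 | 2.6% | 1.7% | 270 tps | 1.4s | 32K | $0.25 | $1.00 |
| 192 | 189 | Codestral | 911 | ±29 | 725 | 5.8% | 5.2% | 151 tps | 0.9s | 262K | $0.15 | $0.45 |
| 193 | 210 | Inception Mercury | 911 | ±6 | 6.6K | 3.2% | 0.4% | 257 tps | 1.1s | 32K | $0.25 | $1.00 |
| 194 | 179 | ERNIE 4.5 21B A3B | 910 | ±24 | 510 | 7.3% | 2.3% | 78 tps | 1.5s | 120K | $0.05 | $0.19 |
| 195 | 201 | Amazon Nova Pro 1.0 | 908 | ±9 | 5.2K | 3.6% | 0.9% | 96 tps | 0.7s | 300K | $0.80 | $1.70 |
| 196 | 210 | Mistral Small 3 24B Instruct | 907 | ±12 | 1.7K | 3.4% | 2.6% | 77 tps | 0.6s | 33K | $0.07 | $0.14 |
| 197 | 210 | Krutrim Spectre V2 | 904 | ±16 | 1.1K | 0.9% | <0.1% | 33 tps | 3.1s | 4K | $0.19 | $0.19 |
| 198 | 201 | Llama 3.2 11B Instruct | 902 | ±14 | 2K | 3.8% | 1.5% | 152 tps | 0.5s | 8K | $0.16 | $0.16 |
| 199 | 210 | Moonshot V1 128k | 899 | ±23 | 1.1K | 4.4% | 1.4% | 54 tps | 1.5s | 131K | $2.00 | $5.00 |
| 200 | 210 | Mistral Small 24B Instruct | 894 | ±13 | 1.5K | 4.2% | 1.5% | 84 tps | 0.4s | 33K | $0.80 | $0.80 |
| 201 | 201 | GPT-4o mini | 893 | ±16 | 2.3K | 5.4% | 2.1% | 71 tps | 1.7s | 128K | $0.15 | $0.60 |
| 202 | 201 | Mistral Small 3.2 24B Instruct | 891 | ±24 | 805 | 8.5% | 1.9% | 113 tps | 1.1s | 131K | $0.02 | $0.08 |
| 203 | 210 | GLM 4 32B | 888 | ±15 | 2.6K | 4.0% | 2.6% | 40 tps | 1.6s | 33K | $0.14 | $0.14 |
| 204 | 210 | Hermes 2 Pro Llama 3 8B | 888 | ±12 | 1.7K | 2.2% | <0.1% | 76 tps | 1.0s | 131K | $0.08 | $0.09 |
| 205 | 210 | Gemma 3 12B | 887 | ±14 | 2.5K | 4.8% | 4.2% | 73 tps | 0.8s | 131K | $0.05 | $0.12 |
| 206 | 210 | Mixtral 8x22B | 885 | ±32 | 1.2K | 4.3% | 1.2% | 140 tps | 0.6s | 64K | $2.00 | $6.00 |
| 207 | 252 | Hermes 4 405B FP8 | 883 | ±20 | 530 | 8.6% | 3.5% | 31 tps | 0.9s | 131K | $0.52 | $1.73 |
| 208 | 210 | Qwen 2.5 14B Instruct | 881 | ±19 | 2.2K | 5.2% | 2.4% | 40 tps | 1.6s | 1M | $0.40 | $1.61 |
| 209 | 210 | DeepSeek R1T2 Chimera | 878 | ±11 | 1.8K | 5.9% | 3.0% | 28 tps | 1.8s | 164K | $0.13 | $0.45 |
| 210 | 210 | Mistral Nemo | 878 | ±23 | 890 | 2.7% | <0.1% | 112 tps | 0.4s | 131K | $0.07 | $0.13 |
| 211 | 210 | Gemma 3 27B | 877 | ±37 | 1K | 6.8% | 1.8% | 35 tps | 1.1s | 66K | $0.06 | $0.10 |
| 212 | 189 | Llama 3.3 70B | 876 | ±21 | 1.4K | 8.4% | 0.3% | 500 tps | 0.5s | 8K | $0.48 | $0.66 |
| 213 | 234 | Gemma 3 27B IT | 875 | ±11 | 2.3K | 4.1% | 2.0% | 60 tps | 0.8s | 128K | $0.17 | $0.29 |
| 214 | 240 | Sky T1 32B Preview | 869 | ±17 | 2.1K | 3.6% | 7.8% | 73 tps | 0.6s | 16K | $0.12 | $0.18 |
| 215 | 234 | GPT-3.5 Turbo 16k | 869 | ±11 | 2.7K | 3.6% | <0.1% | 22 tps | 0.6s | 16K | $3.00 | $4.00 |
| 216 | 210 | Mixtral 8x7B | 868 | ±16 | 1.3K | 4.9% | 2.2% | 142 tps | 0.6s | 33K | $0.23 | $0.23 |
| 217 | 189 | Jamba 1.7 Large | 868 | ±28 | 810 | 8.0% | 1.3% | 58 tps | 1.0s | 256K | $1.33 | $5.33 |
| 218 | 210 | Mixtral 8x7B Instruct | 867 | ±17 | 1.3K | 3.8% | 0.2% | 79 tps | 0.7s | 33K | $0.23 | $0.31 |
| 219 | 189 | Seed 1.6 Flash 250715 | 864 | ±23 | 960 | 6.3% | 2.5% | 108 tps | 1.6s | 256K | $0.07 | $0.30 |
| 220 | 234 | Jamba 1.5 Large | 862 | ±14 | 2.6K | 3.4% | 1.7% | 48 tps | 0.9s | 256K | $1.50 | $6.00 |
| 221 | 234 | Command R 7B | 861 | ±14 | 3.1K | 4.4% | 1.1% | 76 tps | 0.4s | 128K | $0.04 | $0.15 |
| 222 | 201 | GLM 4.6V Flash | 860 | ±23 | 1.9K | 7.9% | 3.7% | 64 tps | 2.1s | 128K | $0.04 | $0.40 |
| 223 | 210 | Moonshot V1 8k | 859 | ±18 | 890 | 4.8% | 1.0% | 55 tps | 1.5s | 8K | $0.20 | $2.00 |
| 224 | 240 | LFM2 8B A1B | 858 | ±18 | 550 | 12.0% | <0.1% | 142 tps | 0.3s | 33K | $0.01 | $0.02 |
| 225 | 240 | Qwen 2.5 7B | 858 | ±15 | 2K | 5.2% | 3.7% | 40 tps | 1.9s | 131K | $0.08 | $0.27 |
| 226 | 210 | Qwen3 4B | 858 | ±13 | 3.7K | 10.7% | 1.9% | 94 tps | 1.5s | 128K | $0.01 | $0.01 |
| 227 | 210 | Solar Mini 250422 | 856 | ±17 | 1.2K | 5.7% | 1.8% | 90 tps | 1.7s | 33K | $0.15 | $0.15 |
| 228 | 240 | Krutrim 2 | 851 | ±14 | 2.1K | 0.7% | 12.5% | 33 tps | 2.1s | 128K | $1.00 | $1.00 |
| 229 | 189 | Rnj-1 Instruct | 846 | ±27 | 565 | 7.4% | 0.6% | 103 tps | 0.3s | 33K | $0.15 | $0.15 |
| 230 | 240 | Moonshot V1 32k | 845 | ±23 | 905 | 3.2% | 1.4% | 53 tps | 1.4s | 33K | $1.00 | $3.00 |
| 231 | 240 | Mixtral-8x7B Instruct v0.1 | 840 | ±19 | 1.2K | 4.0% | 1.3% | 54 tps | 0.4s | 33K | $0.60 | $0.60 |
| 232 | 240 | Gemma 2 27B | 839 | ±15 | 1.5K | 3.9% | 1.4% | 44 tps | 1.4s | 8K | $0.80 | $0.80 |
| 233 | 234 | Llama 3.3 70B Instruct Turbo | 836 | ±16 | 1.1K | 6.1% | 2.0% | 78 tps | 1.0s | 131K | $0.88 | $0.88 |
| 234 | 210 | Ministral 3B 2512 | 836 | ±60 | 505 | 7.3% | 2.8% | 339 tps | 0.6s | 131K | $0.10 | $0.10 |
| 235 | 240 | C4AI Aya Expanse 32B | 833 | ±9 | 3.5K | 3.1% | 1.5% | 43 tps | 0.5s | 128K | $0.50 | $1.50 |
| 236 | 252 | Phi 4 | 829 | ±16 | 1.6K | 3.3% | 5.1% | 28 tps | 1.3s | 128K | $0.10 | $0.32 |
| 237 | 252 | WizardLM-2 8x22B | 827 | ±15 | 1.6K | 1.8% | 11.6% | 11 tps | 2.5s | 66K | $0.77 | $0.77 |
| 238 | 252 | Magistral Small 2509 | 827 | ±27 | 1.5K | 7.3% | 2.7% | 116 tps | 0.6s | 131K | $0.50 | $1.50 |
| 239 | 210 | Gemma 3n E4B | 827 | ±10 | 4.4K | 4.7% | 2.0% | 30 tps | 0.5s | 8K | $0.01 | $0.02 |
| 240 | 252 | Ministral 3B | 824 | ±18 | 2.3K | 4.9% | 0.8% | 248 tps | 0.4s | 131K | $0.08 | $0.08 |
| 241 | 240 | Ministral 8B | 823 | ±23 | 2.2K | 5.4% | 1.4% | 177 tps | 0.4s | 128K | $0.14 | $0.14 |
| 242 | 210 | GLM 4.7 Flash | 822 | ±30 | 490 | 3.9% | 5.8% | 61 tps | 2.8s | 128K | $0.07 | $0.39 |
| 243 | 252 | Gemma 3 1B | 807 | ±16 | 1.9K | 5.9% | 0.6% | 176 tps | 1.0s | 33K | $0.06 | $0.10 |
| 244 | 252 | GPT-3.5 Turbo Instruct | 802 | ±15 | 2K | 2.7% | <0.1% | 46 tps | 1.2s | 4K | $1.50 | $2.00 |
| 245 | 240 | DeepSeek-R1 Distill Llama 70B | 798 | ±14 | 2.7K | 5.5% | 3.6% | 27 tps | 1.6s | 32K | $0.73 | $0.95 |
| 246 | 240 | LFM2 2.6B | 796 | ±22 | 645 | 10.4% | 6.7% | 184 tps | 0.4s | 33K | $0.01 | $0.02 |
| 247 | 240 | GLM 4.5 Flash | 796 | ±49 | 490 | 8.4% | 12.2% | 15 tps | 2.2s | 131K | $0 | $0 |
| 248 | 262 | Mistral Small | 789 | ±16 | 1.1K | 4.6% | 1.7% | 142 tps | 0.6s | 32K | $0.43 | $1.30 |
| 249 | 262 | Open Mistral 7B | 788 | ±19 | 1.3K | 4.8% | 0.7% | 176 tps | 0.4s | 33K | $0.25 | $0.25 |
| 250 | 234 | ERNIE 4.5 21B A3B Thinking | 779 | ±27 | 895 | 7.3% | 1.8% | 87 tps | 1.5s | 120K | $0.07 | $0.28 |
| 251 | 262 | Command R | 777 | ±18 | 2K | 3.8% | 5.8% | 54 tps | 0.6s | 128K | $0.30 | $0.99 |
| 252 | 252 | Mistral Large | 771 | ±25 | 1K | 5.1% | 1.5% | 54 tps | 0.7s | 33K | $2.00 | $6.00 |
| 253 | 262 | Baichuan-M2-32B | 769 | ±32 | 705 | 11.3% | <0.1% | 32 tps | 3.3s | 131K | $0.07 | $0.07 |
| 254 | 269 | Gemma 3 4B | 762 | ±13 | 3.3K | 4.6% | 1.3% | 138 tps | 0.7s | 131K | $0.02 | $0.04 |
| 255 | 269 | Pixtral 12B | 758 | ±27 | 2.5K | 5.7% | 2.2% | 101 tps | 1.2s | 131K | $0.08 | $0.08 |
| 256 | 269 | Mixtral 8x22B Instruct | 754 | ±25 | 1.3K | 4.8% | 1.8% | 142 tps | 0.7s | 66K | $0.45 | $0.45 |
| 257 | 269 | Command R+ | 753 | ±16 | 1.5K | 4.9% | 2.8% | 36 tps | 0.7s | 128K | $2.08 | $9.45 |
| 258 | 262 | Qwen 2.5 VL 72B Instruct | 748 | ±18 | 1.8K | 5.7% | 5.3% | 25 tps | 3.7s | 128K | $1.01 | $2.79 |
| 259 | 269 | Inflection 3 Pi | 744 | ±19 | 1.5K | 4.2% | 1.1% | 33 tps | 3.4s | 8K | $2.50 | $10.00 |
| 260 | 276 | Hermes 3 405B Instruct | 739 | ±23 | 1.4K | 3.9% | 2.3% | 20 tps | 1.1s | 131K | $0.80 | $0.80 |
| 261 | 262 | Hermes 4 405B Reasoning FP8 | 732 | ±24 | 2.1K | 14.5% | 3.6% | 32 tps | 0.8s | 131K | $1.00 | $3.00 |
| 262 | 269 | Inflection 3 Productivity | 721 | ±20 | 1.5K | 4.8% | 0.6% | 50 tps | 3.2s | 8K | $2.50 | $10.00 |
| 263 | 262 | Goliath 120B | 686 | ±28 | 625 | 5.3% | 2.7% | 21 tps | 2.2s | 6K | $6.56 | $9.38 |
| 264 | 269 | DeepHermes 3 Mistral 24B Preview | 677 | ±31 | 635 | 5.9% | 2.5% | 50 tps | 1.0s | 33K | $0.06 | $0.25 |
| 265 | 276 | DeepSeek-R1 Distill Qwen 32B | 672 | ±17 | 1.6K | 5.4% | 6.2% | 22 tps | 1.8s | 131K | $0.37 | $0.39 |
| 266 | 276 | MiniMax M1 | 653 | ±16 | 2.9K | 6.1% | <0.1% | 31 tps | 2.8s | 1M | $0.55 | $2.20 |
| 267 | 279 | Phi 4 Mini Instruct | 629 | ±26 | 1K | 6.9% | 7.4% | 40 tps | 1.1s | 128K | $0.07 | $0.30 |
| 268 | 279 | UI-TARS 1.5 7B | 620 | ±40 | 485 | 11.8% | 4.0% | 75 tps | 0.9s | 128K | $0.10 | $0.20 |
| 269 | 279 | MythoMax L2 13B | 618 | ±20 | 2.3K | 5.8% | 1.2% | 22 tps | 1.1s | 4K | $0.18 | $0.18 |
| 270 | 279 | Phi 4 Reasoning | 600 | ±11 | 1.9K | 5.0% | 21.0% | 29 tps | 1.0s | 33K | $0.06 | $0.25 |
| 271 | 279 | Hunyuan A13B Instruct | 574 | ±20 | 1.5K | 9.3% | 2.3% | 67 tps | 2.0s | 33K | $0.01 | $0.01 |
| 272 | 284 | Qwen 2.5 VL 3B Instruct | 564 | ±27 | 3.4K | 4.9% | 3.0% | 44 tps | 2.5s | 128K | $0.21 | $0.63 |
| 273 | 286 | Phi 4 Mini Reasoning | 411 | ±18 | 2.9K | 12.7% | 9.7% | 30 tps | 0.9s | 128K | $0.07 | $0.30 |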