Last updated about 1 month ago

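| Rank | Overall | Name | VIBE Score | Confidence Interval | Votes | Downvote % | Abort % | Speed | Latency | Context | Cost (Input) | Cost (Output) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | Claude Opus 4.6 (Thinking) | 1569 | ±10 | 1.5K | 1.7% | 2.5% | 56 tps | 1.6s | 200K | $5.00 | $25.00 |
| 2 | 2 | GPT-5.4 | 1493 | ±14 | 695 | 1.4% | 2.6% | 55 tps | 0.8s | 1M | $2.50 | $15.00 |
| 3 | 4 | GPT-5.4 (High) | 1471 | ±16 | 805 | 1.2% | 4.6% | 68 tps | 7.9s | 1M | $2.50 | $15.00 |
| 4 | 2 | Claude Opus 4.6 | 1469 | ±16 | 1.6K | 2.4% | 2.1% | 48 tps | 1.7s | 200K | $5.00 | $25.00 |
| 5 | 6 | Gemini 3.1 Pro | 1418 | ±11 | 4.9K | 1.0% | 3.5% | 35 tps | 4.1s | 1M | $2.00 | $12.00 |
| 6 | 8 | GPT-5.1 (High) | 1368 | ±10 | 4.9K | 2.2% | 3.2% | 76 tps | 6.9s | 400K | $1.25 | $10.00 |
| 7 | 8 | GPT-5.1 (Medium) | 1364 | ±12 | 995 | 4.3% | <0.1% | 86 tps | 3.8s | 400K | $0.83 | $6.67 |
| 8 | 4 | Claude Sonnet 4.6 | 1364 | ±19 | 1.9K | 1.3% | 1.6% | 47 tps | 1.2s | 200K | $3.00 | $15.00 |
| 9 | 10 | GPT-5.2 Instant | 1361 | ±11 | 4K | 1.1% | 1.7% | 52 tps | 2.0s | 400K | $1.75 | $14.00 |
| 10 | 8 | GPT-5.1 | 1360 | ±9 | 2.6K | 2.4% | 2.3% | 71 tps | 1.4s | 400K | $1.42 | $11.33 |
| 11 | 33 | Qwen3 30B A3B Instruct 2507 | 1345 | ±6 | 3.4K | 1.4% | 1.2% | 55 tps | 1.3s | 131K | $0.13 | $0.72 |
| 12 | 10 | Gemini 3 Pro | 1343 | ±7 | 16.2K | 1.4% | 2.1% | 50 tps | 3.6s | 1M | $2.00 | $12.00 |
| 13 | 16 | GPT-5.2 | 1329 | ±9 | 3.2K | 1.2% | 4.1% | 18 tps | 2.7s | 400K | $1.75 | $14.00 |
| 14 | 19 | Mistral Medium 3.1 | 1328 | ±8 | 3.1K | 1.9% | <0.1% | 77 tps | 0.7s | 128K | $0.40 | $2.00 |
| 15 | 7 | Claude Opus 4.5 (Thinking) | 1313 | ±9 | 6K | 2.5% | 1.8% | 49 tps | 1.4s | 200K | $5.00 | $25.00 |
| 16 | 37 | Nova Experimental Chat 10-20 | 1313 | ±11 | 1.4K | 7.9% | <0.1% | 30 tps | 0.5s | 98K | $0 | $0 |
| 17 | 14 | Gemini 3 Pro (Low) | 1300 | ±8 | 4.1K | 1.8% | 2.4% | 51 tps | 3.5s | 1M | $2.00 | $12.00 |
| 18 | 213 | DeepSeek R1T Chimera | 1289 | ±15 | 770 | 3.1% | <0.1% | 46 tps | 1.1s | 164K | $0.09 | $0.36 |
| 19 | 17 | Gemini 3 Flash Preview | 1281 | ±11 | 2K | 1.2% | 1.3% | 138 tps | 1.4s | 1M | $0.50 | $3.00 |
| 20 | 5 | Claude Sonnet 4.6 (Thinking) | 1280 | ±14 | 1.4K | 2.2% | 4.7% | 57 tps | 1.1s | 200K | $3.00 | $15.00 |
| 21 | 17 | GPT-5.2 (High) | 1275 | ±12 | 6.5K | 1.4% | 6.7% | 18 tps | 16.3s | 400K | $1.75 | $14.00 |
| 22 | 29 | Qwen3 VL 235B A22B Instruct | 1273 | ±14 | 1.1K | 2.3% | 3.1% | 75 tps | 1.9s | 129K | $0.37 | $1.81 |
| 23 | 33 | Qwen3 Next 80B A3B Instruct | 1270 | ±10 | 2.3K | 2.8% | 0.6% | 84 tps | 1.1s | 256K | $0.20 | $1.42 |
| 24 | 22 | GPT-5 Chat | 1269 | ±5 | 7.9K | 1.6% | 1.3% | 95 tps | 0.9s | 400K | $1.25 | $10.00 |
| 25 | 48 | gpt-oss-120b | 1269 | ±7 | 4.6K | 1.4% | 0.7% | 213 tps | 0.5s | 131K | $0.11 | $0.50 |
| 26 | 40 | Qwen3 235B A22B Instruct 2507 | 1261 | ±8 | 3.1K | 1.4% | 6.8% | 13 tps | 1.9s | 262K | $0.13 | $0.52 |
| 27 | 10 | Claude Sonnet 4.5 (Thinking) | 1261 | ±7 | 5.5K | 1.9% | 1.9% | 44 tps | 1.1s | 200K | $3.00 | $15.00 |
| 28 | 26 | Grok 4.1 Fast Non-Reasoning | 1260 | ±16 | 2.5K | 3.7% | 0.9% | 101 tps | 0.5s | 2M | $0.20 | $0.50 |
| 29 | 56 | Gemini 2.5 Pro Low | 1259 | ±9 | 2.4K | 2.4% | <0.1% | 89 tps | 2.4s | 1M | $1.25 | $10.00 |
| 30 | 16 | Nova Experimental Chat 11-10 | 1252 | ±14 | 1.7K | 3.6% | 0.4% | 84 tps | 8.9s | 98K | $0 | $0 |
| 31 | 14 | Gemini 3 Flash Preview Thinking | 1248 | ±10 | 3.7K | 1.3% | 1.6% | 3 tps | 6.2s | 1M | $0.50 | $3.00 |
| 32 | 32 | Gemini 2.5 Pro High | 1234 | ±6 | 4.6K | 2.4% | 1.5% | 48 tps | 2.3s | 1M | $1.25 | $10.00 |
| 33 | 62 | GPT-5.1 Instant | 1233 | ±12 | 2.6K | 2.7% | 1.3% | 50 tps | 1.9s | 400K | $1.25 | $10.00 |
| 34 | 81 | GPT-4o | 1228 | ±11 | 2.9K | 1.7% | 1.0% | 49 tps | 2.4s | 128K | $3.71 | $12.57 |
| 35 | 13 | GPT-5.3 Instant | 1225 | ±13 | 2.3K | 1.3% | 0.9% | 63 tps | 0.8s | 400K | $1.75 | $14.00 |
| 36 | 42 | Qwen3 Max Instruct Preview | 1192 | ±9 | 2.8K | 3.0% | 1.1% | 31 tps | 1.7s | 256K | $1.43 | $6.61 |
| 37 | 29 | Nova Experimental Chat 12-10 | 1192 | ±11 | 1.4K | 0.7% | 2.4% | 84 tps | 12.9s | 98K | $0 | $0 |
| 38 | 56 | Claude Opus 4.1 (Thinking) | 1177 | ±11 | 1.1K | 3.8% | <0.1% | 20 tps | 3.9s | 200K | $15.00 | $75.00 |
| 39 | 17 | Claude Opus 4.5 | 1177 | ±15 | 1.9K | 4.7% | 1.5% | 45 tps | 1.5s | 200K | $5.00 | $25.00 |
| 40 | 33 | Kimi K2.5 | 1175 | ±12 | 3.1K | 1.6% | 6.5% | 33 tps | 1.7s | 262K | $0.34 | $2.57 |
| 41 | 77 | GPT-4.5 Preview | 1173 | ±13 | 495 | 2.9% | <0.1% | 36 tps | 3.0s | 200K | $75.00 | $150.00 |
| 42 | 42 | GPT-5.2 (Extra High) | 1172 | ±13 | 3K | 1.6% | 13.2% | 17 tps | 20.5s | 400K | $1.75 | $14.00 |
| 43 | 44 | Kimi K2 Thinking Turbo | 1171 | ±11 | 1.8K | 4.2% | 2.0% | 75 tps | 1.4s | 262K | $1.15 | $8.00 |
| 44 | 80 | GPT-5 (Minimal) | 1169 | ±10 | 1.9K | 2.6% | <0.1% | 67 tps | 1.4s | 400K | $1.25 | $10.00 |
| 45 | 33 | Qwen Plus 0728 | 1167 | ±20 | 585 | 7.1% | <0.1% | 55 tps | 0.9s | 1M | $0.40 | $1.20 |
| 46 | 48 | Step 3.5 Flash | 1163 | ±23 | 640 | 0.8% | 2.2% | 109 tps | 0.6s | 256K | $0.05 | $0.15 |
| 47 | 56 | Gemini 3.1 Flash Lite Preview Thinking | 1162 | ±19 | 730 | 2.0% | 1.7% | 75 tps | 4.7s | 1M | $0.25 | $1.50 |
| 48 | 44 | Gemini 2.5 Pro | 1161 | ±6 | 12.1K | 1.4% | 2.3% | 45 tps | 2.6s | 1M | $1.25 | $10.00 |
| 49 | 44 | Grok 4.1 Fast Reasoning | 1159 | ±11 | 4.3K | 2.9% | 1.5% | 58 tps | 7.3s | 2M | $0.20 | $0.50 |
| 50 | 56 | DeepSeek V3.2 Thinking | 1158 | ±19 | 2.4K | 2.3% | 9.0% | 30 tps | 2.6s | 131K | $0.28 | $0.42 |
| 51 | 26 | Claude Haiku 4.5 (Extended Thinking) | 1154 | ±8 | 2.4K | 2.8% | 1.4% | 115 tps | 0.7s | 200K | $1.00 | $5.00 |
| 52 | 48 | Grok 4 Fast Reasoning | 1154 | ±10 | 2.2K | 3.0% | 2.1% | 102 tps | 3.1s | 2M | $0.30 | $0.75 |
| 53 | 68 | Qwen Plus (Aug'24) | 1152 | ±9 | 4.8K | 1.1% | 1.4% | 53 tps | 1.3s | 30K | $0.40 | $1.20 |
| 54 | 60 | Gemini 2.5 Flash Preview 0925 | 1151 | ±9 | 1.8K | 3.3% | 1.2% | 5 tps | 0.9s | 1M | $0.13 | $0.97 |
| 55 | 37 | Qwen3 Omni 30B A3B Thinking | 1150 | ±16 | 845 | 3.4% | 3.7% | 67 tps | 1.2s | 66K | $0.97 | $1.79 |
| 56 | 37 | Claude Sonnet 4.5 | 1148 | ±9 | 3.3K | 2.7% | 1.4% | 41 tps | 1.3s | 200K | $1.80 | $9.00 |
| 57 | 106 | Claude Sonnet 3.5 v2 | 1147 | ±14 | 1K | 2.0% | <0.1% | 46 tps | 1.4s | 200K | $3.00 | $15.00 |
| 58 | 111 | Claude Sonnet 3.7 | 1146 | ±9 | 2.1K | 2.1% | <0.1% | 39 tps | 1.6s | 200K | $3.00 | $15.00 |
| 59 | 133 | Kimi K2 0905 | 1145 | ±10 | 1.5K | 2.0% | 4.0% | 30 tps | 1.4s | 262K | $0.63 | $2.39 |
| 60 | 43 | Gemini 2.5 Flash Thinking Preview 0925 | 1137 | ±8 | 1.9K | 3.0% | <0.1% | 111 tps | 4.7s | 1M | $0.30 | $2.50 |
| 61 | 26 | GPT-5 (High) | 1131 | ±9 | 2.7K | 3.2% | 4.5% | 81 tps | 35.9s | 400K | $1.25 | $10.00 |
| 62 | 40 | DeepSeek V3.2 | 1125 | ±10 | 2.1K | 1.6% | 1.4% | 83 tps | 5.1s | 131K | $0.43 | $1.09 |
| 63 | 111 | LongCat Flash Chat | 1121 | ±17 | 725 | 4.6% | 0.8% | 85 tps | 0.9s | 131K | $0.14 | $0.68 |
| 64 | 21 | Claude Opus 4 | 1120 | ±13 | 920 | 2.1% | <0.1% | 25 tps | 1.5s | 200K | $15.00 | $75.00 |
| 65 | 79 | MiniMax M2.5 Lightning | 1119 | ±16 | 650 | 0.8% | 1.5% | 51 tps | 2.0s | 205K | $0.60 | $2.40 |
| 66 | 71 | Gemini 2.5 Flash Lite Preview 0925 | 1116 | ±10 | 2K | 2.4% | 1.2% | 209 tps | 0.7s | 1M | $0.25 | $0.35 |
| 67 | 101 | Gemini 2.5 Flash Lite | 1113 | ±6 | 4.4K | 1.8% | 1.3% | 210 tps | 0.7s | 1M | $0.10 | $0.40 |
| 68 | 52 | Grok 4 Fast Non-Reasoning | 1108 | ±12 | 1.8K | 3.0% | 1.5% | 93 tps | 0.6s | 2M | $0.27 | $0.67 |
| 69 | 79 | Qwen3 Max Thinking Preview | 1105 | ±14 | 1.9K | 4.0% | 3.1% | 40 tps | 2.1s | 256K | $1.20 | $6.00 |
| 70 | 77 | Claude Opus 4.1 | 1097 | ±13 | 965 | 4.0% | 3.0% | 17 tps | 3.7s | 200K | $15.00 | $75.00 |
| 71 | 48 | OpenAI o1-mini | 1095 | ±8 | 4.9K | 1.0% | <0.1% | 118 tps | N/A | 128K | $1.13 | $4.51 |
| 72 | 124 | Qwen3 235B A22B Thinking 2507 | 1091 | ±20 | 705 | 2.1% | 2.5% | 53 tps | 1.6s | 131K | $0.59 | $5.70 |
| 73 | 133 | Solar Pro 2 250710 | 1086 | ±11 | 2.8K | 0.9% | <0.1% | 9 tps | N/A | 66K | $0.50 | $0.50 |
| 74 | 101 | gpt-oss-20b | 1085 | ±12 | 2.1K | 2.6% | 0.5% | 216 tps | 0.5s | 131K | $0.06 | $0.26 |
| 75 | 52 | GPT-5 | 1084 | ±7 | 5K | 2.1% | 3.1% | 78 tps | 23.1s | 400K | $1.25 | $9.67 |
| 76 | 147 | Arcee AI Maestro Reasoning | 1084 | ±14 | 1.3K | 1.1% | <0.1% | 85 tps | 0.3s | 131K | $0.90 | $3.30 |
| 77 | 56 | DeepSeek V3.1 Turbo | 1082 | ±16 | 1.8K | 2.2% | 0.9% | 173 tps | 1.3s | 164K | $2.00 | $3.75 |
| 78 | 148 | Qwen3 30B A3B Thinking 2507 | 1081 | ±19 | 890 | 2.7% | 0.5% | 124 tps | 1.2s | 131K | $0.16 | $1.70 |
| 79 | 93 | Qwen Max | 1080 | ±7 | 5.4K | 1.1% | 1.5% | 49 tps | 1.5s | 33K | $1.60 | $6.40 |
| 80 | 22 | GLM 5 | 1080 | ±15 | 1.4K | 1.4% | 3.4% | 36 tps | 2.7s | 200K | $0.72 | $2.55 |
| 81 | 113 | Mistral Medium | 1079 | ±9 | 2.7K | 1.8% | 1.8% | 48 tps | 0.6s | 33K | $1.48 | $4.55 |
| 82 | 52 | Claude Haiku 4.5 | 1078 | ±11 | 2.6K | 3.7% | 1.1% | 100 tps | 0.9s | 200K | $1.00 | $5.00 |
| 83 | 100 | Gemini 2.5 Flash Preview | 1076 | ±14 | 640 | 0.8% | <0.1% | 138 tps | 6.9s | 1M | $0.15 | $0.60 |
| 84 | 159 | Qwen Turbo | 1074 | ±7 | 2.9K | 1.2% | <0.1% | 53 tps | 1.1s | 1M | $0.05 | $0.20 |
| 85 | 95 | DeepSeek-R1 Turbo | 1073 | ±16 | 780 | 3.7% | 2.6% | 29 tps | 1.8s | 64K | $2.85 | $4.75 |
| 86 | 95 | Kimi K2 Thinking | 1072 | ±30 | 880 | 8.8% | 4.2% | 61 tps | 5.9s | 262K | $0.24 | $1.03 |
| 87 | 60 | MiniMax M2.1 | 1071 | ±11 | 2.6K | 1.1% | 2.1% | 66 tps | 2.6s | 205K | $0.30 | $1.20 |
| 88 | 113 | Kimi K2 Fast | 1069 | ±7 | 8.5K | 1.0% | 0.8% | 365 tps | 0.5s | 131K | $1.00 | $3.00 |
| 89 | 86 | DeepSeek V3.1 Chat | 1069 | ±11 | 1.3K | 3.0% | 2.8% | 21 tps | 1.6s | 131K | $0.38 | $1.00 |
| 90 | 93 | DeepSeek V3 0324 Turbo | 1068 | ±7 | 4.2K | 0.8% | 6.3% | 12 tps | 2.4s | 164K | $0.73 | $1.79 |
| 91 | 68 | Grok 4 | 1062 | ±6 | 10.5K | 1.3% | 3.9% | 29 tps | 11.1s | 256K | $3.00 | $15.00 |
| 92 | 95 | Gemini 2.5 Flash | 1061 | ±7 | 10K | 1.0% | 1.3% | 2 tps | 3.7s | 1M | $0.30 | $2.50 |
| 93 | 121 | Qwen3 32B Fast | 1061 | ±9 | 3K | 2.4% | 11.6% | 30 tps | 3.1s | 41K | $0.10 | $0.25 |
| 94 | 44 | DeepSeek V3.1 Terminus Chat | 1060 | ±13 | 1.4K | 3.3% | 3.4% | 27 tps | 1.5s | 131K | $0.86 | $1.80 |
| 95 | 86 | Nemotron 3 Nano (Thinking) | 1059 | ±18 | 825 | 1.8% | 2.0% | 200 tps | 0.5s | 256K | $0 | $0 |
| 96 | 84 | GPT-5 Mini Minimal | 1057 | ±11 | 745 | 5.7% | 1.2% | 63 tps | 1.4s | 400K | $0.25 | $2.00 |
| 97 | 71 | Gemini 2.5 Flash Thinking | 1055 | ±11 | 2.8K | 2.3% | 2.2% | 88 tps | 6.4s | 1M | $0.30 | $2.50 |
| 98 | 65 | DeepSeek V3.2 Exp Chat | 1054 | ±14 | 1.2K | 3.7% | 2.6% | 29 tps | 1.5s | 131K | $0.27 | $0.39 |
| 99 | 52 | Qwen3.5 122B A17B | 1053 | ±25 | 580 | 3.3% | 1.5% | 82 tps | 1.4s | 256K | $0.40 | $3.20 |
| 100 | 133 | Qwen3 14B | 1053 | ±13 | 1.7K | 2.9% | 1.7% | 109 tps | 0.8s | 41K | $0.04 | $0.15 |
| 101 | 95 | Gemini 2.5 Flash Lite Thinking Preview 0925 | 1051 | ±15 | 1.5K | 4.2% | 1.5% | 152 tps | 3.0s | 1M | $0.10 | $0.40 |
| 102 | 56 | MiniMax M2.1 Lightning | 1050 | ±23 | 615 | 1.6% | 1.7% | 52 tps | 2.1s | 205K | $0.30 | $2.40 |
| 103 | 157 | Qwen3 Next 80B A3B Thinking | 1049 | ±9 | 2K | 2.6% | 0.6% | 175 tps | 1.3s | 256K | $0.21 | $2.26 |
| 104 | 147 | GLM 4.5 Air | 1047 | ±7 | 1.8K | 2.2% | <0.1% | 22 tps | 1.4s | 131K | $0.10 | $0.38 |
| 105 | 68 | GLM 4.7 | 1047 | ±15 | 2.4K | 1.2% | 5.8% | 40 tps | 1.5s | 200K | $0.77 | $1.73 |
| 106 | 71 | GPT-5 Mini | 1047 | ±9 | 2.2K | 2.7% | 2.6% | 66 tps | 14.2s | 400K | $0.25 | $2.00 |
| 107 | 37 | Kimi K2.5 Instant | 1046 | ±16 | 620 | 2.4% | 2.9% | 32 tps | 3.0s | 262K | $0.50 | $3.00 |
| 108 | 95 | DeepSeek V3.2 Exp Thinking | 1046 | ±18 | 735 | 4.5% | 7.2% | 26 tps | 3.0s | 131K | $0.28 | $0.42 |
| 109 | 106 | DeepSeek V3 0324 | 1045 | ±8 | 4.1K | 1.0% | 5.8% | 12 tps | 2.7s | 164K | $0.38 | $0.93 |
| 110 | 106 | Grok 3 | 1044 | ±7 | 6K | 1.1% | 1.5% | 53 tps | 0.6s | 1M | $3.67 | $18.33 |
| 111 | 113 | Gemini 2.5 Flash Lite Thinking | 1041 | ±9 | 2.2K | 1.8% | 1.0% | 118 tps | 4.4s | 1M | $0.03 | $0.13 |
| 112 | 133 | DeepSeek-R1 0528 | 1038 | ±13 | 1.7K | 2.0% | 1.3% | 93 tps | 0.5s | 64K | $1.60 | $3.67 |
| 113 | 81 | OpenAI o3-pro | 1037 | ±18 | 950 | 2.6% | 5.2% | 22 tps | 70.8s | 200K | $20.00 | $80.00 |
| 114 | 165 | DeepSeek R1T2 Chimera | 1031 | ±17 | 575 | 3.4% | 3.0% | 28 tps | 1.8s | 164K | $0.13 | $0.45 |
| 115 | 48 | Claude Sonnet 4 (Thinking) | 1028 | ±15 | 3.7K | 2.9% | 1.5% | 52 tps | 1.5s | 200K | $3.00 | $13.67 |
| 116 | 126 | Qwen3 VL 235B A22B Thinking | 1027 | ±13 | 935 | 4.1% | 4.3% | 47 tps | 3.0s | 127K | $0.47 | $3.31 |
| 117 | 62 | MiniMax M2 | 1027 | ±9 | 2.5K | 5.2% | 2.2% | 39 tps | 2.3s | 205K | $0.21 | $0.85 |
| 118 | 129 | Qwen3 Max Thinking | 1022 | ±12 | 1.3K | 1.1% | 13.5% | 32 tps | 2.3s | 256K | $1.20 | $6.00 |
| 119 | 71 | Qwen3.5 397B A17B | 1021 | ±22 | 910 | 1.1% | 4.3% | 57 tps | 1.4s | 256K | $0.52 | $3.00 |
| 120 | 124 | Kimi K2 0905 Turbo | 1017 | ±12 | 2.1K | 2.3% | 0.7% | 373 tps | 0.5s | 262K | $1.70 | $6.50 |
| 121 | 121 | QwQ 32B | 1015 | ±7 | 4.6K | 1.4% | 5.4% | 41 tps | 2.1s | 16K | $0.43 | $0.56 |
| 122 | 119 | ERNIE 4.5 300B A47B | 1014 | ±11 | 4K | 1.1% | 4.7% | 23 tps | 2.3s | 123K | $0.28 | $1.10 |
| 123 | 84 | Claude Sonnet 3.7 (Thinking) | 1014 | ±11 | 1.8K | 3.0% | <0.1% | 41 tps | 2.6s | 200K | $3.00 | $15.00 |
| 124 | 86 | Claude Sonnet 4 | 1013 | ±7 | 10.4K | 1.2% | 1.8% | 49 tps | 1.3s | 200K | $3.00 | $15.00 |
| 125 | 108 | GPT-5 Mini Low | 1011 | ±15 | 510 | 4.7% | <0.1% | 69 tps | 3.2s | 400K | $0.25 | $2.00 |
| 126 | 153 | Qwen 2.5 32B Instruct | 1004 | ±14 | 1.2K | 1.7% | 2.5% | 48 tps | 1.0s | 131K | $0.21 | $0.25 |
| 127 | 65 | GLM 4.6 | 1001 | ±15 | 1.3K | 4.4% | 5.4% | 39 tps | 1.5s | 200K | $0.42 | $1.66 |
| 128 | 118 | GPT-4.1 mini | 999 | ±10 | 5.1K | 1.4% | 1.1% | 67 tps | 0.9s | 1M | $0.34 | $1.60 |
| 129 | 86 | Qwen3 235B A22B | 998 | ±18 | 1.4K | 3.2% | 5.3% | 71 tps | 0.9s | 41K | $0.23 | $0.63 |
| 130 | 71 | Seed 1.8 251228 | 997 | ±13 | 2.4K | 1.3% | 3.7% | 41 tps | 2.1s | 256K | $0.25 | $2.00 |
| 131 | 277 | GLM Z1 32B | 994 | ±20 | 580 | 1.7% | <0.1% | 18 tps | 9.3s | 33K | $0.09 | $0.11 |
| 132 | 161 | Qwen3 8B | 991 | ±11 | 1.4K | 2.5% | 2.4% | 61 tps | 1.4s | 41K | $0.02 | $0.07 |
| 133 | 113 | GLM 4.5 | 986 | ±15 | 1.5K | 2.0% | 3.7% | 46 tps | 1.4s | 131K | $0.43 | $1.63 |
| 134 | 126 | Qwen3 30B A3B | 986 | ±16 | 1.9K | 3.4% | 5.1% | 163 tps | 1.0s | 41K | $0.06 | $0.21 |
| 135 | 129 | DeepSeek V3.1 Thinking | 984 | ±13 | 1.4K | 3.7% | 7.1% | 18 tps | 1.8s | 131K | $0.23 | $0.75 |
| 136 | 126 | DeepSeek V3 | 975 | ±10 | 5.5K | 0.5% | 0.9% | 69 tps | 1.1s | 64K | $0.59 | $1.49 |
| 137 | 65 | Mistral Large 3 | 974 | ±20 | 1.2K | 5.5% | 2.1% | 51 tps | 1.0s | 256K | $0.50 | $1.50 |
| 138 | 139 | OpenAI o4-mini | 971 | ±8 | 2.2K | 2.6% | 1.4% | 97 tps | 7.0s | 128K | $1.10 | $4.40 |
| 139 | 148 | OpenAI o3 | 968 | ±12 | 1.6K | 1.6% | 0.9% | 85 tps | 6.8s | 128K | $7.33 | $29.33 |
| 140 | 143 | Gemini 2.0 Flash | 965 | ±12 | 1.8K | 1.3% | <0.1% | 76 tps | 0.5s | 1M | $0.14 | $0.56 |
| 141 | 71 | DeepSeek V3.1 | 962 | ±20 | 750 | 3.2% | 0.8% | 197 tps | 0.4s | 164K | $0.55 | $1.60 |
| 142 | 129 | Command A | 959 | ±8 | 6.2K | 1.2% | 2.2% | 42 tps | 0.8s | 256K | $2.00 | $7.33 |
| 143 | 133 | GPT-4.1 nano | 958 | ±8 | 4K | 1.4% | 0.6% | 175 tps | 0.5s | 1M | $0.10 | $0.40 |
| 144 | 157 | GPT-5 Nano | 955 | ±18 | 1.2K | 4.1% | 3.2% | 113 tps | 20.9s | 400K | $0.05 | $0.40 |
| 145 | 211 | Gemini 1.5 Pro | 952 | ±45 | 590 | 3.3% | <0.1% | 15 tps | 0.0s | 2M | $0.78 | $3.13 |
| 146 | 86 | Amazon Nova 2 Lite | 948 | ±21 | 970 | 7.6% | 1.0% | 137 tps | 0.6s | 300K | $0.35 | $2.95 |
| 147 | 143 | Gemini 2.0 Flash Lite | 948 | ±7 | 3.5K | 1.7% | <0.1% | 42 tps | 0.5s | 1M | $0.08 | $0.30 |
| 148 | 153 | OpenAI o1 | 946 | ±11 | 3.6K | 1.0% | 4.2% | 92 tps | 5.5s | 200K | $15.00 | $60.00 |
| 149 | 165 | Pixtral Large | 941 | ±18 | 940 | 3.6% | 2.5% | 57 tps | 1.3s | 128K | $1.50 | $4.50 |
| 150 | 165 | Qwen3 4B | 940 | ±14 | 1.7K | 5.2% | 1.9% | 94 tps | 1.5s | 128K | $0.01 | $0.01 |
| 151 | 179 | GLM 4.7 Flash | 932 | ±26 | 690 | 2.1% | 5.8% | 61 tps | 2.8s | 128K | $0.07 | $0.39 |
| 152 | 170 | Kimi K2 0711 | 932 | ±12 | 2.4K | 1.7% | 1.6% | 29 tps | 1.3s | 131K | $0.72 | $2.60 |
| 153 | 170 | Mistral Small 3.2 24B | 929 | ±14 | 985 | 2.5% | 2.8% | 141 tps | 0.7s | 33K | $0.02 | $0.08 |
| 154 | 106 | DeepSeek V3.1 Terminus Thinking | 927 | ±17 | 840 | 4.5% | 5.9% | 27 tps | 1.8s | 131K | $0.56 | $1.68 |
| 155 | 161 | Mistral Small 3.1 | 924 | ±36 | 615 | 2.4% | 7.4% | 13 tps | 2.6s | 32K | $0.17 | $0.28 |
| 156 | 111 | Grok 3 Fast | 922 | ±12 | 530 | 0.9% | 1.7% | 52 tps | 2.4s | 131K | $5.00 | $25.00 |
| 157 | 133 | DeepSeek V3.2 Speciale | 922 | ±25 | 780 | 6.6% | 6.0% | 43 tps | 1.4s | 131K | $0.84 | $1.52 |
| 158 | 148 | OpenAI o4-mini-high | 915 | ±10 | 4.7K | 1.5% | 1.9% | 117 tps | 15.9s | 200K | $1.10 | $4.40 |
| 159 | 148 | DeepSeek-R1 | 904 | ±16 | 1.7K | 2.8% | 0.8% | 133 tps | 0.6s | 64K | $0.91 | $3.07 |
| 160 | 292 | AFM 4.5B | 903 | ±22 | 1.1K | 2.6% | <0.1% | 81 tps | 0.3s | 66K | $0.05 | $0.20 |
| 161 | 161 | Llama 4 Maverick | 900 | ±11 | 5.1K | 1.9% | 1.2% | 88 tps | 2.4s | 1M | $0.23 | $0.83 |
| 162 | 253 | R1 1776 | 899 | ±15 | 1.3K | 0.8% | <0.1% | 61 tps | 1.0s | 128K | $2.00 | $8.00 |
| 163 | 200 | Claude Sonnet 3.5 | 888 | ±27 | 495 | 2.9% | 1.0% | 40 tps | 2.7s | 200K | $3.00 | $15.00 |
| 164 | 143 | Seed 1.6 250615 | 888 | ±23 | 530 | 2.8% | 3.1% | 46 tps | 2.2s | 256K | $0.25 | $2.00 |
| 165 | 314 | DeepSeek-R1 0528 Qwen3 8B | 886 | ±19 | 1.1K | 5.7% | <0.1% | 45 tps | 2.4s | 128K | $0.05 | $0.09 |
| 166 | 219 | NVIDIA Llama 3.3 Nemotron Super 49B v1 | 883 | ±17 | 915 | 1.1% | <0.1% | 13 tps | N/A | 131K | $0.07 | $0.20 |
| 167 | 139 | GLM 4.6V | 880 | ±33 | 885 | 2.2% | 6.4% | 21 tps | 1.8s | 128K | $0.38 | $0.90 |
| 168 | 201 | Gemma 3 27B IT | 879 | ±21 | 560 | 1.8% | 2.0% | 60 tps | 0.8s | 128K | $0.17 | $0.29 |
| 169 | 157 | Cogito v2.1 671B | 876 | ±30 | 490 | 3.9% | 0.8% | 85 tps | 0.5s | 128K | $1.25 | $1.25 |
| 170 | 160 | Llama 4 Scout | 872 | ±11 | 4.4K | 1.4% | 0.6% | 88 tps | 5.1s | 131K | $0.18 | $0.46 |
| 171 | 194 | Magistral Small 2506 | 870 | ±16 | 1K | 2.8% | 1.6% | 156 tps | 0.5s | 40K | $0.37 | $1.10 |
| 172 | 241 | GPT-5 Mini High | 870 | ±17 | 770 | 3.1% | <0.1% | 33 tps | 3.9s | 400K | $0.25 | $2.00 |
| 173 | 213 | Claude Haiku 3.5 | 865 | ±12 | 1.3K | 3.1% | 0.8% | 40 tps | 2.8s | 200K | $0.80 | $4.00 |
| 174 | 170 | Devstral Medium | 865 | ±19 | 805 | 1.8% | 1.5% | 77 tps | 0.6s | 131K | $0.40 | $2.00 |
| 175 | 209 | Llama 3.3 Swallow 70B Instruct | 865 | ±19 | 820 | 1.8% | 1.4% | 153 tps | 1.3s | 131K | $0.13 | $0.39 |
| 176 | 219 | Arcee AI Virtuoso-Large | 863 | ±17 | 710 | 1.4% | <0.1% | 64 tps | 0.5s | 131K | $0.75 | $1.20 |
| 177 | 186 | GLM 4.6V Flash | 858 | ±23 | 750 | 2.6% | 3.7% | 64 tps | 2.1s | 128K | $0.04 | $0.40 |
| 178 | 175 | OpenAI o3-mini-low | 852 | ±8 | 4.4K | 1.8% | 0.7% | 139 tps | 1.5s | 200K | $1.10 | $4.40 |
| 179 | 186 | Grok 3 Mini | 852 | ±14 | 2.5K | 1.4% | 1.2% | 43 tps | 0.5s | 131K | $0.30 | $0.50 |
| 180 | 177 | OpenAI o3-mini | 851 | ±7 | 4.7K | 1.8% | 0.8% | 143 tps | 3.3s | 200K | $1.10 | $4.40 |
| 181 | 179 | Inception Mercury | 847 | ±13 | 1.4K | 1.0% | 0.4% | 257 tps | 1.1s | 32K | $0.25 | $1.00 |
| 182 | 177 | Mistral Small 3.1 24B Instruct | 839 | ±22 | 695 | 3.5% | 7.5% | 15 tps | 2.4s | 131K | $0.06 | $0.18 |
| 183 | 214 | OpenAI o3-mini-high | 833 | ±13 | 2.9K | 1.0% | 2.4% | 231 tps | 10.5s | 200K | $1.10 | $4.40 |
| 184 | 314 | MAI-DS-R1 | 823 | ±19 | 855 | 4.5% | <0.1% | 73 tps | 3.2s | 64K | $0.10 | $0.40 |
| 185 | 399 | Magistral Medium (Thinking) | 822 | ±24 | 600 | 1.6% | <0.1% | 67 tps | 0.8s | 41K | $2.00 | $5.00 |
| 186 | 277 | Wikipedia | 821 | ±13 | 1.7K | 5.9% | <0.1% | 47 tps | 2.1s | 32K | $0 | $0 |
| 187 | 222 | Sky T1 32B Preview | 821 | ±18 | 625 | 1.6% | 7.8% | 73 tps | 0.6s | 16K | $0.12 | $0.18 |
| 188 | 235 | GLM 4 32B | 820 | ±16 | 700 | 2.1% | 2.6% | 40 tps | 1.6s | 33K | $0.14 | $0.14 |
| 189 | 186 | Gemma 3n E4B | 814 | ±17 | 1.6K | 3.6% | 2.0% | 30 tps | 0.5s | 8K | $0.01 | $0.02 |
| 190 | 186 | Grok 3 Mini Fast | 807 | ±14 | 2.4K | 1.8% | 1.6% | 44 tps | 0.5s | 131K | $0.60 | $4.00 |
| 191 | 179 | Amazon Nova Pro 1.0 | 803 | ±13 | 1.2K | 2.0% | 0.9% | 96 tps | 0.7s | 300K | $0.80 | $1.70 |
| 192 | 277 | Grok 2 | 798 | ±18 | 555 | 0.9% | <0.1% | 55 tps | 1.1s | 131K | $2.00 | $10.00 |
| 193 | 270 | AFM 4.5B Preview | 797 | ±38 | 610 | 3.9% | <0.1% | 32 tps | 0.0s | 66K | $0 | $0 |
| 194 | 292 | Arcee AI Spotlight | 788 | ±15 | 1.1K | 1.3% | <0.1% | 121 tps | 0.4s | 131K | $0.18 | $0.18 |
| 195 | 225 | Command R 7B | 787 | ±26 | 660 | 2.9% | 1.1% | 76 tps | 0.4s | 128K | $0.04 | $0.15 |
| 196 | 219 | Grok 3 Mini Beta | 783 | ±17 | 605 | 0.8% | <0.1% | 75 tps | 0.5s | 131K | $0.45 | $2.25 |
| 197 | 200 | NVIDIA Llama 3.1 Nemotron 70B | 783 | ±17 | 1.2K | 2.4% | <0.1% | 9 tps | 0.1s | 128K | $0.33 | $0.39 |
| 198 | 229 | Magistral Medium 2509 | 782 | ±28 | 550 | 7.6% | 4.0% | 58 tps | 0.9s | 131K | $2.00 | $5.00 |
| 199 | 270 | Solar Pro 2 250710 (Reasoning) | 782 | ±22 | 605 | 1.6% | <0.1% | 9 tps | N/A | 66K | $0.50 | $0.50 |
| 200 | 200 | K2 Think | 763 | ±24 | 605 | 0.8% | <0.1% | 418 tps | 2.8s | N/A | $0 | $0 |
| 201 | 186 | Jamba 1.6 Large | 761 | ±15 | 780 | 1.9% | 2.0% | 59 tps | 1.2s | 256K | $1.33 | $5.33 |
| 202 | 246 | DeepSeek-R1 Distill Llama 70B | 755 | ±19 | 960 | 2.5% | 3.6% | 27 tps | 1.6s | 32K | $0.73 | $0.95 |
| 203 | 177 | Llama 3 70B Turbo | 755 | ±14 | 1.1K | 5.1% | <0.1% | 31 tps | 0.0s | 8K | $0.73 | $0.83 |
| 204 | 222 | Jamba 1.5 Large | 744 | ±24 | 715 | 2.1% | 1.7% | 48 tps | 0.9s | 256K | $1.50 | $6.00 |
| 205 | 233 | Llama 3.1 70B Instruct Turbo | 739 | ±20 | 900 | 3.2% | <0.1% | 110 tps | 0.8s | 128K | $0.88 | $0.88 |
| 206 | 274 | DeepSeek-R1 Distill Qwen 32B | 736 | ±22 | 570 | 1.7% | 6.2% | 22 tps | 1.8s | 131K | $0.37 | $0.39 |
| 207 | 225 | Command R | 730 | ±23 | 605 | 2.4% | 5.8% | 54 tps | 0.6s | 128K | $0.30 | $0.99 |
| 208 | 241 | Arcee AI Blitz | 728 | ±17 | 655 | 0.8% | <0.1% | 6 tps | N/A | 33K | $0.45 | $0.75 |
| 209 | 194 | Llama 3.3 70B | 726 | ±26 | 735 | 5.2% | 0.3% | 500 tps | 0.5s | 8K | $0.48 | $0.66 |
| 210 | 235 | Gemma 3 4B | 724 | ±24 | 660 | 3.6% | 1.3% | 138 tps | 0.7s | 131K | $0.02 | $0.04 |
| 211 | 406 | DeepSeek-R1 Distill Qwen 14B | 714 | ±33 | 570 | 2.6% | <0.1% | 44 tps | 1.7s | 64K | $0.63 | $0.63 |
| 212 | 179 | Switchpoint Router | 710 | ±26 | 620 | 3.1% | 1.7% | 71 tps | 4.9s | 131K | $0.85 | $3.40 |
| 213 | 214 | C4AI Aya Expanse 32B | 709 | ±22 | 925 | 1.6% | 1.5% | 43 tps | 0.5s | 128K | $0.50 | $1.50 |
| 214 | 182 | Fauna Fox | 708 | ±23 | 1.1K | 4.5% | <0.1% | 194 tps | 0.3s | 128K | $0.04 | $0.15 |
| 215 | 284 | MiniMax M1 | 707 | ±14 | 1.5K | 1.0% | <0.1% | 31 tps | 2.8s | 1M | $0.55 | $2.20 |
| 216 | 214 | Gemma 3 12B | 689 | ±23 | 550 | 4.3% | 4.2% | 73 tps | 0.8s | 131K | $0.05 | $0.12 |
| 217 | 260 | Hermes 4 405B Reasoning FP8 | 681 | ±31 | 725 | 5.2% | 3.6% | 32 tps | 0.8s | 131K | $1.00 | $3.00 |
| 218 | 209 | Qwen 2.5 14B Instruct | 679 | ±22 | 505 | 3.8% | 2.4% | 40 tps | 1.6s | 1M | $0.40 | $1.61 |
| 219 | 201 | GPT-4o mini | 654 | ±30 | 515 | 4.6% | 2.1% | 71 tps | 1.7s | 128K | $0.15 | $0.60 |
| 220 | 241 | Claude Haiku 3 | 647 | ±42 | 525 | 2.8% | 0.4% | 62 tps | 0.5s | 200K | $0.25 | $1.25 |
| 221 | 339 | Refuel LLM 2 Small | 565 | ±27 | 525 | 2.8% | <0.1% | 116 tps | 0.5s | 8K | $0.20 | $0.20 |
| 222 | 287 | Phi 4 Reasoning | 486 | ±32 | 530 | 2.8% | 21.0% | 29 tps | 1.0s | 33K | $0.06 | $0.25 |
| 223 | 291 | Phi 4 Mini Reasoning | 258 | ±22 | 935 | 7.0% | 9.7% | 30 tps | 0.9s | 128K | $0.07 | $0.30 |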
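
When reading the VIBE Score and Confidence Interval columns together, a rough way to judge whether two adjacent ranks are meaningfully separated is to check whether their score ranges overlap. The sketch below assumes the "±" value is a symmetric half-width around the score; the page does not say how the interval is computed, and the `Entry` class and `intervals_overlap` helper are hypothetical names introduced only for illustration. Overlap is a heuristic here, not a formal significance test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One leaderboard row, reduced to the fields needed for the comparison."""
    name: str
    score: float       # value from the VIBE Score column
    half_width: float  # the "±" value, ASSUMED to be a symmetric half-width

    def interval(self) -> tuple[float, float]:
        """Return (low, high) bounds of the score range."""
        return (self.score - self.half_width, self.score + self.half_width)


def intervals_overlap(a: Entry, b: Entry) -> bool:
    """True when the two score ranges share at least one point."""
    a_lo, a_hi = a.interval()
    b_lo, b_hi = b.interval()
    return a_lo <= b_hi and b_lo <= a_hi


# Values copied from the first four rows of the leaderboard.
opus_thinking = Entry("Claude Opus 4.6 (Thinking)", 1569, 10)
gpt_54 = Entry("GPT-5.4", 1493, 14)
gpt_54_high = Entry("GPT-5.4 (High)", 1471, 16)
opus = Entry("Claude Opus 4.6", 1469, 16)

# Ranks 1 and 2: ranges 1559..1579 and 1479..1507 do not overlap.
print(intervals_overlap(opus_thinking, gpt_54))  # False
# Ranks 3 and 4: ranges 1455..1487 and 1453..1485 overlap.
print(intervals_overlap(gpt_54_high, opus))      # True
```

Under this assumption, the gap between ranks 1 and 2 is larger than the stated uncertainty, while ranks 3 and 4 are close enough that their order could plausibly swap with more votes.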