Last updated about 1 month ago
| Rank | Overall | Name | VIBE Score | Confidence Interval | Votes | Downvote % | Abort % | Speed | Latency | Context | Cost (Input) | Cost (Output) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 161 | 201 | | 924 | ±16 | 570 | 12.3% | 2.4% | 180 tps | 0.6s | 131K | $0.10 | $0.30 |
| 162 | 194 | | 920 | ±15 | 2K | 6.9% | 1.6% | 156 tps | 0.5s | 40K | $0.37 | $1.10 |
| 163 | 179 | | 919 | ±10 | 2.8K | 11.5% | 0.4% | 257 tps | 1.1s | 32K | $0.25 | $1.00 |
| 164 | 148 | | 919 | ±10 | 1.3K | 3.6% | 0.5% | 124 tps | 1.2s | 131K | $0.16 | $1.70 |
| 165 | 229 | | 918 | ±8 | 2.1K | 11.3% | 4.0% | 58 tps | 0.9s | 131K | $2.00 | $5.00 |
| 166 | 153 | | 916 | ±9 | 1.9K | 18.0% | 2.5% | 48 tps | 1.0s | 131K | $0.21 | $0.25 |
| 167 | 179 | | 916 | ±16 | 2.1K | 10.3% | 0.9% | 96 tps | 0.7s | 300K | $0.80 | $1.70 |
| 168 | 186 | | 915 | ±22 | 525 | 9.5% | 1.9% | 113 tps | 1.1s | 131K | $0.02 | $0.08 |
| 169 | 214 | | 914 | ±24 | 640 | 11.7% | 2.0% | 78 tps | 1.0s | 131K | $0.88 | $0.88 |
| 170 | 160 | | 911 | ±6 | 8K | 9.6% | 0.6% | 88 tps | 5.1s | 131K | $0.18 | $0.46 |
| 171 | 179 | Baichuan-M2-32B | 911 | ±25 | 505 | 13.7% | <0.1% | 32 tps | 3.3s | 131K | $0.07 | $0.07 |
| 172 | 170 | | 911 | ±8 | 3.2K | 9.2% | 1.6% | 29 tps | 1.3s | 131K | $0.72 | $2.60 |
| 173 | 170 | | 911 | ±10 | 2K | 12.4% | 2.8% | 141 tps | 0.7s | 33K | $0.02 | $0.08 |
| 174 | 214 | | 908 | ±14 | 1.1K | 6.5% | 2.4% | 231 tps | 10.5s | 200K | $1.10 | $4.40 |
| 175 | 246 | | 907 | ±23 | 535 | 7.0% | 3.6% | 27 tps | 1.6s | 32K | $0.73 | $0.95 |
| 176 | 186 | | 905 | ±10 | 2.6K | 8.4% | 2.0% | 30 tps | 0.5s | 8K | $0.01 | $0.02 |
| 177 | 201 | | 905 | ±12 | 805 | 7.5% | 4.9% | 36 tps | 3.5s | 123K | $0.42 | $1.25 |
| 178 | 209 | Llama 3.3 Swallow 70B Instruct | 904 | ±8 | 1.6K | 15.2% | 1.4% | 153 tps | 1.3s | 131K | $0.13 | $0.39 |
| 179 | 186 | | 903 | ±5 | 6K | 12.8% | 1.2% | 43 tps | 0.5s | 131K | $0.30 | $0.50 |
| 180 | 157 | | 903 | ±7 | 4.9K | 11.2% | 0.6% | 175 tps | 1.3s | 256K | $0.21 | $2.26 |
| 181 | 161 | | 902 | ±12 | 2K | 17.8% | 2.4% | 61 tps | 1.4s | 41K | $0.02 | $0.07 |
| 182 | 186 | | 902 | ±18 | 615 | 13.4% | 1.8% | 35 tps | 1.1s | 66K | $0.06 | $0.10 |
| 183 | 186 | | 897 | ±7 | 5.2K | 14.9% | 1.6% | 44 tps | 0.5s | 131K | $0.60 | $4.00 |
| 184 | 186 | | 895 | ±11 | 880 | 9.7% | 2.0% | 59 tps | 1.2s | 256K | $1.33 | $5.33 |
| 185 | 194 | | 893 | ±10 | 2.2K | 9.2% | 0.3% | 500 tps | 0.5s | 8K | $0.48 | $0.66 |
| 186 | 235 | | 893 | ±12 | 1.2K | 11.1% | 2.6% | 40 tps | 1.6s | 33K | $0.14 | $0.14 |
| 187 | 214 | | 889 | ±11 | 1.3K | 10.0% | 1.5% | 43 tps | 0.5s | 128K | $0.50 | $1.50 |
| 188 | 179 | | 887 | ±13 | 845 | 2.9% | 5.8% | 61 tps | 2.8s | 128K | $0.07 | $0.39 |
| 189 | 246 | | 880 | ±23 | 465 | 12.3% | 1.2% | 140 tps | 0.6s | 64K | $2.00 | $6.00 |
| 190 | 186 | | 877 | ±17 | 620 | 15.1% | 1.3% | 58 tps | 1.0s | 256K | $1.33 | $5.33 |
| 191 | 214 | | 874 | ±15 | 910 | 11.2% | 4.2% | 73 tps | 0.8s | 131K | $0.05 | $0.12 |
| 192 | 225 | | 872 | ±16 | 770 | 12.5% | 5.8% | 54 tps | 0.6s | 128K | $0.30 | $0.99 |
| 193 | 165 | | 871 | ±7 | 3.3K | 16.3% | 1.9% | 94 tps | 1.5s | 128K | $0.01 | $0.01 |
| 194 | 256 | | 869 | ±15 | 640 | 16.3% | 1.8% | 90 tps | 1.7s | 33K | $0.15 | $0.15 |
| 195 | 229 | | 868 | ±15 | 800 | 13.5% | 1.4% | 177 tps | 0.4s | 128K | $0.14 | $0.14 |
| 196 | 246 | | 866 | ±31 | 570 | 10.2% | 1.8% | 142 tps | 0.7s | 66K | $0.45 | $0.45 |
| 197 | 274 | | 862 | ±17 | 585 | 7.1% | 3.1% | 44 tps | 3.8s | 131K | $2.00 | $5.00 |
| 198 | 194 | | 858 | ±18 | 585 | 12.7% | 2.6% | 77 tps | 0.6s | 33K | $0.07 | $0.14 |
| 199 | 179 | | 854 | ±26 | 535 | 11.6% | 1.2% | 96 tps | 1.2s | 131K | $0.14 | $0.26 |
| 200 | 222 | | 853 | ±24 | 590 | 7.8% | 0.6% | 103 tps | 0.3s | 33K | $0.15 | $0.15 |