Last updated about 1 month ago
| Rank | Overall | Name | VIBE Score | Confidence Interval | Votes | Downvote % | Abort % | Speed | Latency | Context | Cost (Input) | Cost (Output) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 241 | 240 | | 834 | ±37 | 520 | 8.8% | 12.2% | 15 tps | 2.2s | 131K | $0 | $0 |
| 242 | 240 | | 832 | ±23 | 1.3K | 4.6% | 1.3% | 54 tps | 0.4s | 33K | $0.60 | $0.60 |
| 243 | 240 | | 831 | ±17 | 2K | 5.1% | 3.7% | 40 tps | 1.9s | 131K | $0.08 | $0.27 |
| 244 | 240 | Sky T1 32B Preview | 829 | ±14 | 2.4K | 4.5% | 7.8% | 73 tps | 0.6s | 16K | $0.12 | $0.18 |
| 245 | 240 | | 826 | ±26 | 810 | 10.0% | 6.7% | 184 tps | 0.4s | 33K | $0.01 | $0.02 |
| 246 | 240 | | 825 | ±10 | 2.3K | 2.3% | 12.5% | 33 tps | 2.1s | 128K | $1.00 | $1.00 |
| 247 | 240 | | 825 | ±17 | 2.2K | 5.5% | 1.4% | 177 tps | 0.4s | 128K | $0.14 | $0.14 |
| 248 | 240 | | 821 | ±7 | 3.8K | 4.0% | 1.5% | 43 tps | 0.5s | 128K | $0.50 | $1.50 |
| 249 | 240 | | 820 | ±17 | 950 | 3.1% | 1.4% | 53 tps | 1.4s | 33K | $1.00 | $3.00 |
| 250 | 240 | | 818 | ±18 | 825 | 11.3% | <0.1% | 142 tps | 0.3s | 33K | $0.01 | $0.02 |
| 251 | 240 | | 815 | ±17 | 1.5K | 4.1% | 1.4% | 44 tps | 1.4s | 8K | $0.80 | $0.80 |
| 252 | 252 | | 806 | ±16 | 2.3K | 5.1% | 0.8% | 248 tps | 0.4s | 131K | $0.08 | $0.08 |
| 253 | 252 | | 802 | ±18 | 1.8K | 7.5% | 2.7% | 116 tps | 0.6s | 131K | $0.50 | $1.50 |
| 254 | 252 | | 802 | ±11 | 2K | 6.1% | 0.6% | 176 tps | 1.0s | 33K | $0.06 | $0.10 |
| 255 | 252 | | 801 | ±12 | 1.9K | 3.1% | 11.6% | 11 tps | 2.5s | 66K | $0.77 | $0.77 |
| 256 | 252 | | 798 | ±16 | 1.7K | 3.4% | 5.1% | 28 tps | 1.3s | 128K | $0.10 | $0.32 |
| 257 | 252 | | 797 | ±21 | 815 | 8.4% | 3.5% | 31 tps | 0.9s | 131K | $0.52 | $1.73 |
| 258 | 252 | | 793 | ±27 | 510 | 3.8% | <0.1% | 247 tps | 2.2s | 32K | $0.25 | $1.00 |
| 259 | 252 | | 787 | ±9 | 2K | 2.7% | <0.1% | 46 tps | 1.2s | 4K | $1.50 | $2.00 |
| 260 | 252 | | 785 | ±16 | 1.1K | 5.8% | 1.5% | 54 tps | 0.7s | 33K | $2.00 | $6.00 |
| 261 | 252 | | 781 | ±29 | 460 | 8.9% | 1.1% | 67 tps | 0.6s | 131K | $0.12 | $0.39 |
| 262 | 262 | | 778 | ±18 | 2.2K | 4.9% | 5.8% | 54 tps | 0.6s | 128K | $0.30 | $0.99 |
| 263 | 262 | Baichuan-M2-32B | 770 | ±30 | 740 | 10.8% | <0.1% | 32 tps | 3.3s | 131K | $0.07 | $0.07 |
| 264 | 262 | | 770 | ±12 | 1.2K | 4.5% | 1.7% | 142 tps | 0.6s | 32K | $0.43 | $1.30 |
| 265 | 262 | | 762 | ±18 | 1.3K | 4.7% | 0.7% | 176 tps | 0.4s | 33K | $0.25 | $0.25 |
| 266 | 262 | | 759 | ±11 | 2.7K | 12.8% | 3.6% | 32 tps | 0.8s | 131K | $1.00 | $3.00 |
| 267 | 262 | | 754 | ±24 | 745 | 5.7% | 2.7% | 21 tps | 2.2s | 6K | $6.56 | $9.38 |
| 268 | 262 | | 746 | ±20 | 2.1K | 6.0% | 5.3% | 25 tps | 3.7s | 128K | $1.01 | $2.79 |
| 269 | 269 | | 742 | ±10 | 3.3K | 4.7% | 1.3% | 138 tps | 0.7s | 131K | $0.02 | $0.04 |
| 270 | 269 | | 738 | ±17 | 1.4K | 5.6% | 1.8% | 142 tps | 0.7s | 66K | $0.45 | $0.45 |
| 271 | 269 | | 738 | ±15 | 1.6K | 5.6% | 2.8% | 36 tps | 0.7s | 128K | $2.08 | $9.45 |
| 272 | 269 | | 737 | ±24 | 1.5K | 5.0% | 0.6% | 50 tps | 3.2s | 8K | $2.50 | $10.00 |
| 273 | 269 | | 722 | ±21 | 3K | 6.3% | 2.2% | 101 tps | 1.2s | 131K | $0.08 | $0.08 |
| 274 | 269 | | 719 | ±18 | 1.5K | 4.1% | 1.1% | 33 tps | 3.4s | 8K | $2.50 | $10.00 |
| 275 | 269 | | 706 | ±30 | 715 | 5.9% | 2.5% | 50 tps | 1.0s | 33K | $0.06 | $0.25 |
| 276 | 276 | | 702 | ±20 | 1.4K | 4.1% | 2.3% | 20 tps | 1.1s | 131K | $0.80 | $0.80 |
| 277 | 276 | | 696 | ±20 | 2K | 5.5% | 6.2% | 22 tps | 1.8s | 131K | $0.37 | $0.39 |
| 278 | 276 | | 686 | ±13 | 3.8K | 5.3% | <0.1% | 31 tps | 2.8s | 1M | $0.55 | $2.20 |
| 279 | 279 | UI-TARS 1.5 7B | 610 | ±40 | 530 | 11.7% | 4.0% | 75 tps | 0.9s | 128K | $0.10 | $0.20 |
| 280 | 279 | | 600 | ±21 | 2.3K | 5.8% | 1.2% | 22 tps | 1.1s | 4K | $0.18 | $0.18 |