Last updated about 1 month ago
| Rank | Overall | Name | VIBE Score | Confidence Interval | Votes | Downvote % | Abort % | Speed | Latency | Context | Cost (Input) | Cost (Output) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 41 | 274 | | 862 | ±17 | 585 | 7.1% | 3.1% | 44 tps | 3.8s | 131K | $2.00 | $5.00 |
| 42 | 246 | | 866 | ±31 | 570 | 10.2% | 1.8% | 142 tps | 0.7s | 66K | $0.45 | $0.45 |
| 43 | 229 | | 868 | ±15 | 800 | 13.5% | 1.4% | 177 tps | 0.4s | 128K | $0.14 | $0.14 |
| 44 | 256 | | 869 | ±15 | 640 | 16.3% | 1.8% | 90 tps | 1.7s | 33K | $0.15 | $0.15 |
| 45 | 165 | | 871 | ±7 | 3.3K | 16.3% | 1.9% | 94 tps | 1.5s | 128K | $0.01 | $0.01 |
| 46 | 225 | | 872 | ±16 | 770 | 12.5% | 5.8% | 54 tps | 0.6s | 128K | $0.30 | $0.99 |
| 47 | 214 | | 874 | ±15 | 910 | 11.2% | 4.2% | 73 tps | 0.8s | 131K | $0.05 | $0.12 |
| 48 | 186 | | 877 | ±17 | 620 | 15.1% | 1.3% | 58 tps | 1.0s | 256K | $1.33 | $5.33 |
| 49 | 246 | | 880 | ±23 | 465 | 12.3% | 1.2% | 140 tps | 0.6s | 64K | $2.00 | $6.00 |
| 50 | 179 | | 887 | ±13 | 845 | 2.9% | 5.8% | 61 tps | 2.8s | 128K | $0.07 | $0.39 |
| 51 | 214 | | 889 | ±11 | 1.3K | 10.0% | 1.5% | 43 tps | 0.5s | 128K | $0.50 | $1.50 |
| 52 | 235 | | 893 | ±12 | 1.2K | 11.1% | 2.6% | 40 tps | 1.6s | 33K | $0.14 | $0.14 |
| 53 | 194 | | 893 | ±10 | 2.2K | 9.2% | 0.3% | 500 tps | 0.5s | 8K | $0.48 | $0.66 |
| 54 | 186 | | 895 | ±11 | 880 | 9.7% | 2.0% | 59 tps | 1.2s | 256K | $1.33 | $5.33 |
| 55 | 186 | | 897 | ±7 | 5.2K | 14.9% | 1.6% | 44 tps | 0.5s | 131K | $0.60 | $4.00 |
| 56 | 186 | | 902 | ±18 | 615 | 13.4% | 1.8% | 35 tps | 1.1s | 66K | $0.06 | $0.10 |
| 57 | 161 | | 902 | ±12 | 2K | 17.8% | 2.4% | 61 tps | 1.4s | 41K | $0.02 | $0.07 |
| 58 | 157 | | 903 | ±7 | 4.9K | 11.2% | 0.6% | 175 tps | 1.3s | 256K | $0.21 | $2.26 |
| 59 | 186 | | 903 | ±5 | 6K | 12.8% | 1.2% | 43 tps | 0.5s | 131K | $0.30 | $0.50 |
| 60 | 209 | Llama 3.3 Swallow 70B Instruct | 904 | ±8 | 1.6K | 15.2% | 1.4% | 153 tps | 1.3s | 131K | $0.13 | $0.39 |
| 61 | 201 | | 905 | ±12 | 805 | 7.5% | 4.9% | 36 tps | 3.5s | 123K | $0.42 | $1.25 |
| 62 | 186 | | 905 | ±10 | 2.6K | 8.4% | 2.0% | 30 tps | 0.5s | 8K | $0.01 | $0.02 |
| 63 | 246 | | 907 | ±23 | 535 | 7.0% | 3.6% | 27 tps | 1.6s | 32K | $0.73 | $0.95 |
| 64 | 214 | | 908 | ±14 | 1.1K | 6.5% | 2.4% | 231 tps | 10.5s | 200K | $1.10 | $4.40 |
| 65 | 170 | | 911 | ±10 | 2K | 12.4% | 2.8% | 141 tps | 0.7s | 33K | $0.02 | $0.08 |
| 66 | 170 | | 911 | ±8 | 3.2K | 9.2% | 1.6% | 29 tps | 1.3s | 131K | $0.72 | $2.60 |
| 67 | 179 | Baichuan-M2-32B | 911 | ±25 | 505 | 13.7% | <0.1% | 32 tps | 3.3s | 131K | $0.07 | $0.07 |
| 68 | 160 | | 911 | ±6 | 8K | 9.6% | 0.6% | 88 tps | 5.1s | 131K | $0.18 | $0.46 |
| 69 | 214 | | 914 | ±24 | 640 | 11.7% | 2.0% | 78 tps | 1.0s | 131K | $0.88 | $0.88 |
| 70 | 186 | | 915 | ±22 | 525 | 9.5% | 1.9% | 113 tps | 1.1s | 131K | $0.02 | $0.08 |
| 71 | 179 | | 916 | ±16 | 2.1K | 10.3% | 0.9% | 96 tps | 0.7s | 300K | $0.80 | $1.70 |
| 72 | 153 | | 916 | ±9 | 1.9K | 18.0% | 2.5% | 48 tps | 1.0s | 131K | $0.21 | $0.25 |
| 73 | 229 | | 918 | ±8 | 2.1K | 11.3% | 4.0% | 58 tps | 0.9s | 131K | $2.00 | $5.00 |
| 74 | 148 | | 919 | ±10 | 1.3K | 3.6% | 0.5% | 124 tps | 1.2s | 131K | $0.16 | $1.70 |
| 75 | 179 | | 919 | ±10 | 2.8K | 11.5% | 0.4% | 257 tps | 1.1s | 32K | $0.25 | $1.00 |
| 76 | 194 | | 920 | ±15 | 2K | 6.9% | 1.6% | 156 tps | 0.5s | 40K | $0.37 | $1.10 |
| 77 | 201 | | 924 | ±16 | 570 | 12.3% | 2.4% | 180 tps | 0.6s | 131K | $0.10 | $0.30 |
| 78 | 133 | | 924 | ±12 | 1.6K | 6.3% | 6.0% | 43 tps | 1.4s | 131K | $0.84 | $1.52 |
| 79 | 209 | | 928 | ±13 | 910 | 11.7% | 2.4% | 40 tps | 1.6s | 1M | $0.40 | $1.61 |
| 80 | 265 | | 929 | ±12 | 1.2K | 7.9% | 5.3% | 25 tps | 3.7s | 128K | $1.01 | $2.79 |