Last updated about 1 month ago
| Rank | Overall | Name | VIBE Score | Confidence Interval | Votes | Downvote % | Abort % | Speed | Latency | Context | Cost (Input) | Cost (Output) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 81 | 121 | | 932 | ±5 | 4.5K | 12.9% | 11.6% | 30 tps | 3.1s | 41K | $0.10 | $0.25 |
| 82 | 133 | | 933 | ±11 | 2.7K | 17.1% | 1.7% | 109 tps | 0.8s | 41K | $0.04 | $0.15 |
| 83 | 175 | | 937 | ±4 | 6.1K | 13.6% | 0.7% | 139 tps | 1.5s | 200K | $1.10 | $4.40 |
| 84 | 148 | | 937 | ±6 | 6K | 14.9% | 1.9% | 117 tps | 15.9s | 200K | $1.10 | $4.40 |
| 85 | 139 | | 938 | ±11 | 2.5K | 6.1% | 6.4% | 21 tps | 1.8s | 128K | $0.38 | $0.90 |
| 86 | 86 | | 938 | ±14 | 1.3K | 7.6% | 2.0% | 200 tps | 0.5s | 256K | $0 | $0 |
| 87 | 148 | | 939 | ±12 | 1.6K | 5.5% | 0.8% | 133 tps | 0.6s | 64K | $0.91 | $3.07 |
| 88 | 201 | | 939 | ±9 | 1.4K | 9.2% | 2.1% | 71 tps | 1.7s | 128K | $0.15 | $0.60 |
| 89 | 126 | | 939 | ±5 | 3.7K | 12.1% | 5.1% | 163 tps | 1.0s | 41K | $0.06 | $0.21 |
| 90 | 194 | | 943 | ±14 | 745 | 14.4% | 1.5% | 152 tps | 0.5s | 8K | $0.16 | $0.16 |
| 91 | 170 | | 945 | ±11 | 1.6K | 14.7% | 1.5% | 77 tps | 0.6s | 131K | $0.40 | $2.00 |
| 92 | 153 | | 945 | ±28 | 490 | 11.7% | 2.0% | 119 tps | 0.5s | 128K | $0.20 | $0.20 |
| 93 | 177 | | 946 | ±6 | 6.7K | 12.3% | 0.8% | 143 tps | 3.3s | 200K | $1.10 | $4.40 |
| 94 | 165 | | 949 | ±8 | 1.5K | 11.2% | 4.5% | 84 tps | 2.9s | 127K | $0.20 | $1.47 |
| 95 | 101 | | 954 | ±5 | 6.1K | 10.8% | 0.5% | 216 tps | 0.5s | 131K | $0.06 | $0.26 |
| 96 | 121 | | 955 | ±7 | 5K | 15.3% | 5.4% | 41 tps | 2.1s | 16K | $0.43 | $0.56 |
| 97 | 161 | | 956 | ±4 | 11.2K | 8.2% | 1.2% | 88 tps | 2.4s | 1M | $0.23 | $0.83 |
| 98 | 161 | | 960 | ±16 | 915 | 11.2% | 7.4% | 13 tps | 2.6s | 32K | $0.17 | $0.28 |
| 99 | 177 | | 966 | ±12 | 1K | 10.6% | 7.5% | 15 tps | 2.4s | 131K | $0.06 | $0.18 |
| 100 | 222 | Sky T1 32B Preview | 972 | ±16 | 805 | 10.6% | 7.8% | 73 tps | 0.6s | 16K | $0.12 | $0.18 |
| 101 | 209 | | 974 | ±16 | 980 | 6.2% | 2.5% | 108 tps | 1.6s | 256K | $0.07 | $0.30 |
| 102 | 214 | | 977 | ±20 | 850 | 7.6% | 6.3% | 43 tps | 3.2s | 128K | $0.35 | $0.62 |
| 103 | 165 | DeepSeek R1T2 Chimera | 978 | ±10 | 1.1K | 11.0% | 3.0% | 28 tps | 1.8s | 164K | $0.13 | $0.45 |
| 104 | 126 | | 979 | ±6 | 3.5K | 11.5% | 4.3% | 47 tps | 3.0s | 127K | $0.47 | $3.31 |
| 105 | 201 | | 983 | ±15 | 905 | 10.4% | 2.0% | 60 tps | 0.8s | 128K | $0.17 | $0.29 |
| 106 | 133 | | 983 | ±12 | 1.3K | 4.6% | 1.3% | 93 tps | 0.5s | 64K | $1.60 | $3.67 |
| 107 | 157 | | 984 | ±17 | 715 | 5.9% | 0.8% | 85 tps | 0.5s | 128K | $1.25 | $1.25 |
| 108 | 129 | | 990 | ±13 | 1.7K | 2.3% | 13.5% | 32 tps | 2.3s | 256K | $1.20 | $6.00 |
| 109 | 133 | | 991 | ±6 | 7.5K | 5.6% | 4.0% | 30 tps | 1.4s | 262K | $0.63 | $2.39 |
| 110 | 111 | LongCat Flash Chat | 996 | ±14 | 930 | 7.0% | 0.8% | 85 tps | 0.9s | 131K | $0.14 | $0.68 |
| 111 | 170 | | 998 | ±12 | 1.1K | 2.8% | 2.1% | 650 tps | 0.5s | 128K | $0.13 | $0.14 |
| 112 | 139 | | 1000 | ±5 | 4.8K | 10.2% | 1.4% | 97 tps | 7.0s | 128K | $1.10 | $4.40 |
| 113 | 121 | | 1000 | ±16 | 1K | 9.9% | 2.0% | 50 tps | 0.6s | 131K | $0.09 | $0.33 |
| 114 | 113 | | 1002 | ±5 | 3.7K | 14.3% | 3.7% | 46 tps | 1.4s | 131K | $0.43 | $1.63 |
| 115 | 113 | | 1006 | ±4 | 26.2K | 13.8% | 0.8% | 365 tps | 0.5s | 131K | $1.00 | $3.00 |
| 116 | 143 | | 1011 | ±6 | 5.7K | 6.9% | <0.1% | 42 tps | 0.5s | 1M | $0.08 | $0.30 |
| 117 | 139 | | 1012 | ±17 | 1K | 6.5% | 1.8% | 80 tps | 2.6s | 129K | $0.18 | $0.67 |
| 118 | 129 | | 1014 | ±7 | 3.9K | 14.0% | 7.1% | 18 tps | 1.8s | 131K | $0.23 | $0.75 |
| 119 | 148 | | 1016 | ±11 | 1.3K | 4.6% | 0.9% | 85 tps | 6.8s | 128K | $7.33 | $29.33 |
| 120 | 124 | | 1018 | ±11 | 1.1K | 4.2% | 2.5% | 53 tps | 1.6s | 131K | $0.59 | $5.70 |