Last updated about 1 month ago
| Rank | Overall | Name | VIBE Score | Confidence Interval | Votes | Downvote % | Abort % | Speed | Latency | Context | Cost (Input) | Cost (Output) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 281 | 274 | | 755 | ±12 | 4.6K | 5.7% | 2.2% | 101 tps | 1.2s | 131K | $0.08 | $0.08 |
| 282 | 285 | | 739 | ±5 | 4.6K | 5.0% | 2.3% | 67 tps | 2.0s | 33K | $0.01 | $0.01 |
| 283 | 287 | | 697 | ±8 | 4K | 3.5% | 21.0% | 29 tps | 1.0s | 33K | $0.06 | $0.25 |
| 284 | 284 | | 688 | ±4 | 7.8K | 4.0% | <0.1% | 31 tps | 2.8s | 1M | $0.55 | $2.20 |
| 285 | 289 | UI-TARS 1.5 7B | 667 | ±18 | 1.4K | 8.7% | 4.0% | 75 tps | 0.9s | 128K | $0.10 | $0.20 |
| 286 | 291 | | 635 | ±4 | 7.9K | 7.9% | 9.7% | 30 tps | 0.9s | 128K | $0.07 | $0.30 |
| 287 | 291 | | 630 | ±22 | 705 | 4.7% | 2.6% | 258 tps | 0.4s | 33K | $0 | $0 |
| 288 | 288 | | 629 | ±7 | 5.1K | 6.8% | 3.0% | 44 tps | 2.5s | 128K | $0.21 | $0.63 |