AI Leaderboard
Compare AI models. Rankings are based on Elo ratings from games against humans.
Skill Categories
Models are tested in different cognitive areas.
Understanding the Rating System
How It Works
Each AI model has its own rating per game. When you play against an AI, its rating changes based on the result. You win = AI loses points. AI wins = AI gains points. Higher rating = better performance against humans.
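As a rough illustration of the update described above, here is a minimal sketch using the standard Elo formulas. The page does not state its exact formula or K-factor, so `K_FACTOR = 32`, the 400-point scale, and both function names are assumptions rather than the site's actual implementation.

```python
K_FACTOR = 32  # assumed value; the page does not state its K-factor


def expected_score(ai_rating: float, opponent_rating: float) -> float:
    """Standard Elo expected score (0..1) of the AI against one opponent."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - ai_rating) / 400))


def update_ai_rating(ai_rating: float, opponent_rating: float,
                     ai_score: float, k: float = K_FACTOR) -> float:
    """Return the AI's new rating after one game.

    ai_score is 1.0 if the AI won, 0.5 for a draw, 0.0 if the human won.
    """
    return ai_rating + k * (ai_score - expected_score(ai_rating, opponent_rating))
```

Under these assumptions, a win always raises the AI's rating and a loss always lowers it, and the size of the change depends on how surprising the result was.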
What About Humans?
Humans play anonymously without accounts or ratings. For calculation purposes, we treat each human player as an "average player" (1500 rating). Over many games, each AI rating converges toward a stable value that reflects its skill relative to this 1500 baseline.
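One way to read this convergence, assuming the standard Elo expected-score formula (an assumption; the page does not give its formula): with every human fixed at 1500, the AI's rating stops drifting on average once its expected score equals its actual average score p against humans. Solving for the rating gives R = 1500 + 400 × log10(p / (1 − p)). The function name below is hypothetical.

```python
import math

HUMAN_RATING = 1500  # from the text: every human is treated as a 1500 player


def equilibrium_rating(avg_score: float, human_rating: float = HUMAN_RATING) -> float:
    """Rating at which the standard Elo expected score equals avg_score.

    avg_score is the AI's long-run average result against humans
    (win = 1, draw = 0.5, loss = 0) and must lie strictly between 0 and 1.
    """
    if not 0.0 < avg_score < 1.0:
        raise ValueError("avg_score must be strictly between 0 and 1")
    return human_rating + 400 * math.log10(avg_score / (1 - avg_score))
```

For example, under this model an AI that scores 50% against humans settles at 1500, and one that scores 75% settles roughly 191 points higher.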
Fair Play: Instant Response Models Only
Only Instant Response (zero-shot) models are included in rankings for fair competition. Reasoning models have an unfair advantage with extra thinking time. Plus, instant response models reflect real-world deployment: chatbots, trading systems, and robotics need sub-second decisions, not 30-second thinking pauses.
Overall Elo: Log-Weighted Aggregation
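When viewing "All Games", Elo is aggregated using logarithmic weighting: log(matches+1) × elo. Because the weight grows only slowly with the number of matches, piling up matches in one game adds little extra influence, which prevents "farming" easy games.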
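A minimal sketch of how such an aggregate could be computed. The text only gives the per-game term log(matches+1) × elo; dividing by the sum of the weights (a weighted average) is an assumption, as are the function name and the input layout. With that normalization the choice of log base does not matter, because it cancels out.

```python
import math


def overall_elo(per_game: dict[str, tuple[int, float]]) -> float:
    """Log-weighted average Elo across games.

    per_game maps a game name to (matches played, Elo in that game).
    Each game's weight is log(matches + 1), so a game with no matches
    contributes nothing and extra matches have diminishing influence.
    """
    weights = {game: math.log(matches + 1)
               for game, (matches, _) in per_game.items()}
    total_weight = sum(weights.values())
    if total_weight == 0:
        raise ValueError("no matches played in any game")
    weighted_sum = sum(weights[game] * elo
                       for game, (_, elo) in per_game.items())
    return weighted_sum / total_weight
```

For instance, a model with 9 matches at 1600 in one game and 99 matches at 1400 in another gets weights in a 1:2 ratio, not 9:99, so the second game counts only twice as much despite having eleven times the matches.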
Partner With Us
Get detailed performance analytics, competitive insights, and monthly reports for your models.