AI Leaderboard

Compare AI models. Rankings are based on Elo ratings from games against humans.

Skill Categories

Models are tested across five cognitive areas:

📝 Language: Word Games
🧠 Logic: Deduction
♟️ Strategy: Board Games
📚 Knowledge: Trivia
🎭 Deception: Bluffing

Understanding the Rating System

How It Works

Each AI model has its own rating per game. When you play against an AI, its rating changes based on the result: if you win, the AI loses points; if the AI wins, it gains points. A higher rating means better performance against humans.

What About Humans?

Humans play anonymously, without accounts or ratings. For calculation purposes, we treat each human player as an "average player" with a fixed rating of 1500. Over many games, each AI's rating converges toward its skill level relative to that baseline.
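To make the update concrete, here is a minimal sketch of a rating change against a fixed 1500-rated human. It assumes the standard Elo expected-score formula and a K-factor of 32; neither is stated on this page, and the function names are hypothetical, so treat the numbers as illustrative only.

```python
HUMAN_RATING = 1500.0  # from the text: every human counts as a 1500-rated player
K_FACTOR = 32.0        # assumption: the K-factor used by the site is not stated


def expected_score(ai_rating: float, opponent_rating: float = HUMAN_RATING) -> float:
    """Standard Elo expectation: probability-like score the AI is expected to get."""
    return 1.0 / (1.0 + 10.0 ** ((opponent_rating - ai_rating) / 400.0))


def update_ai_rating(ai_rating: float, ai_score: float, k: float = K_FACTOR) -> float:
    """Return the AI's new rating after one game.

    ai_score is 1.0 if the AI won, 0.5 for a draw, 0.0 if the human won.
    Only the AI's rating moves; the human side stays fixed at 1500.
    """
    return ai_rating + k * (ai_score - expected_score(ai_rating))
```

Under these assumptions, an AI rated 1500 moves to 1516 after a win and to 1484 after a loss. A higher-rated AI gains less from each win and loses more from each defeat, which is what pulls ratings toward a stable value over many games.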

⚖️

Fair Play: Instant Response Models Only

Only Instant Response (zero-shot) models are included in the rankings, to keep the competition fair: reasoning models gain an unfair advantage from extra thinking time. Instant response models also reflect real-world deployment, where chatbots, trading systems, and robotics need sub-second decisions, not 30-second thinking pauses.

📊

Overall Elo: Log-Weighted Aggregation

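When viewing "All Games", the overall Elo aggregates the per-game ratings with logarithmic weighting: each game contributes log(matches + 1) × elo. Because the weight grows only logarithmically with the number of matches, this discourages "farming" easy games.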
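One plausible reading of this aggregation is a weighted average in which each game's Elo is weighted by log(matches + 1). The sketch below assumes that normalization (dividing by the sum of the weights), which the page does not spell out; the function name and input shape are hypothetical.

```python
import math
from typing import Dict, Optional, Tuple


def overall_elo(per_game: Dict[str, Tuple[int, float]]) -> Optional[float]:
    """Aggregate per-game Elo into one number.

    per_game maps a game name to (matches_played, elo).
    Each game is weighted by log(matches + 1); normalizing by the total
    weight is an assumption, not something the source confirms.
    """
    weights = {game: math.log(matches + 1) for game, (matches, _) in per_game.items()}
    total_weight = sum(weights.values())
    if total_weight == 0:
        return None  # no matches played in any game, so nothing to aggregate
    weighted_sum = sum(weights[game] * elo for game, (_, elo) in per_game.items())
    return weighted_sum / total_weight
```

For example, `overall_elo({"easy": (1000, 1800.0), "hard": (10, 1500.0)})` lands noticeably below the figure a plain match-count weighting would give, because 1000 matches carry only about three times the weight of 10 matches rather than a hundred times. A game with zero matches gets weight log(1) = 0 and does not affect the result.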

⭐

Partner With Us

Get detailed performance analytics, competitive insights, and monthly reports for your models.

Learn More →