FAQ
Everything you need to know about PlayTheAI
Our Philosophy
Why does PlayTheAI exist?
What does this show about AI capabilities?
Why use games instead of traditional benchmarks?
Should AI need 'extended thinking' for simple games?
General
What is PlayTheAI?
Is PlayTheAI free?
Do I need an account?
Playing
How do I choose which AI to play against?
Why does the AI response sometimes take longer?
What happens if I close the browser window?
Can AIs cheat?
Elo & Rating
What is Elo?
How does the rating system work?
Do I also have an Elo rating?
What does the uncertainty (±) mean?
Why are reasoning models excluded from rankings?
Reasoning models (o1, R1, etc.) are excluded for two main reasons:
1. Fairness: It is currently impossible to set a fair limit on reasoning effort. Limiting by time only tests server speed (GPUs), not intelligence. Limiting by tokens is not supported by most providers (only Anthropic).
2. User Experience: Simple games like Tic-Tac-Toe should be fast. Waiting 5 minutes for an AI to 'overthink' a move destroys the game flow.
We want a fair and fun experience.
Want your reasoning model included?
AI providers can supply a free API key to enable all variants (non-reasoning + reasoning). We pay for the non-reasoning variant via OpenRouter; you pay for the reasoning variant via your API key. Win-win!
How is the Overall Elo calculated?
When viewing "All Games", we calculate a weighted Overall Elo using logarithmic weighting. This prevents "farming" - playing many easy games to inflate rankings.
Formula:
Overall = Σ(log(matches+1) × elo) / Σ(log(matches+1))
Why logarithmic? With linear weighting, 1000 games would count 100× more than 10 games. With log weighting, 1000 games count only ~3× more. This ensures:
- Models can't gain unfair advantage by only playing easy games
- Difficult games (fewer matches) still count meaningfully
- True skill across all games is reflected fairly
Example: A model with 1200 Elo in TicTacToe (50 games) and 1100 Elo in Connect4 (10 games) gets Overall ≈ 1162 Elo (not ≈ 1183, as a linear, match-weighted average would give).
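The calculation above can be sketched in a few lines of Python (the function name and input shape are our own, not part of PlayTheAI's code):

```python
import math

def overall_elo(ratings):
    """Weighted Overall Elo using logarithmic match weighting.

    `ratings` is a list of (elo, matches) pairs, one per game type.
    Each game's weight is log(matches + 1), so extra matches raise
    the weight only slowly (1000 games weigh ~3x as much as 10,
    not 100x as linear weighting would).
    """
    weights = [math.log(m + 1) for _, m in ratings]
    total = sum(w * elo for (elo, _), w in zip(ratings, weights))
    return total / sum(weights)

# TicTacToe: 1200 Elo over 50 games; Connect4: 1100 Elo over 10 games
print(round(overall_elo([(1200, 50), (1100, 10)])))  # -> 1162
```

Note that the base of the logarithm cancels out in the ratio, so natural log and log10 give the same Overall Elo.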