Match Predictor & Value Finder

XGBoost model trained on 95k+ ATP matches with custom ELO ratings

Over 66% accuracy. 95,000+ matches. 40 seasons of data.

Here's exactly how it works.

Where the data comes from

The dataset covers ATP professional tennis matches played from 1985 to 2024, over 95,000 in total. The 2024 matches used for validation are kept out of training.

For each match the dataset includes player rankings and seedings, playing surface (hard, clay, grass), first serve win percentage, break points saved and converted, player ages and heights, head-to-head records, and win rates over the last 10, 25, 50, and 100 matches.
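As a rough illustration of the rolling win rates, here is a minimal sketch. The `FormTracker` class and its method names are hypothetical, and the choice to fall back to whatever history exists when a player has fewer matches than the window is an assumption, not necessarily what the model does.

```python
from collections import deque

WINDOWS = (10, 25, 50, 100)


class FormTracker:
    """Keeps one player's most recent results (True = win)."""

    def __init__(self):
        # Only the largest window needs to be stored; newest result is last.
        self.results = deque(maxlen=max(WINDOWS))

    def win_rates(self):
        """Win rate over the last n matches per window, or None with no history."""
        recent_first = list(reversed(self.results))
        rates = {}
        for n in WINDOWS:
            window = recent_first[:n]
            rates[n] = sum(window) / len(window) if window else None
        return rates

    def add(self, won):
        self.results.append(bool(won))


tracker = FormTracker()
for won in [False, False] + [True] * 10:  # made-up results
    tracker.add(won)

rates = tracker.win_rates()
print(rates[10])            # 1.0
print(round(rates[25], 3))  # 0.833
```

To keep a match's own result out of its features, call `win_rates()` before `add()` when walking through matches in date order.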

The data is sourced from Jeff Sackmann's open ATP tennis archive, one of the most comprehensive public tennis datasets available.

The secret weapon: ELO ratings

The single most powerful predictor in the model isn't ranking or recent form — it's a custom ELO rating system adapted from chess.

Every player starts at a rating of 1500. Win a match and your rating goes up. Lose and it goes down. Beat a higher-rated opponent and you gain more points. Lose to a weaker one and you lose more.
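The update rule can be sketched with the standard chess formula. This is a minimal sketch, not the model's actual code: the step size `k = 32` and the 400-point scale are the textbook defaults and are assumptions here, and the function names are made up for illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(winner: float, loser: float, k: float = 32.0):
    """Return (new_winner_rating, new_loser_rating) after one match.

    k = 32 is an assumed step size, not the value used by the model.
    """
    expected_win = expected_score(winner, loser)
    delta = k * (1.0 - expected_win)  # larger when the win was unexpected
    return winner + delta, loser - delta


# Two new players at 1500: the winner gains half of k.
print(update_elo(1500.0, 1500.0))  # (1516.0, 1484.0)

# An upset: a 1400 player beats an 1800 player and gains far more.
print(update_elo(1400.0, 1800.0))
```

The points the winner gains are exactly the points the loser gives up, so the total rating across both players stays constant.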

The model tracks three separate ELO ratings per player — one for hard courts, one for clay, and one for grass. This captures surface specialists like Nadal on clay or Federer on grass far better than a single ranking number ever could.
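One simple way to keep per-surface ratings is a separate table per surface, where a match only touches the table for the surface it was played on. The sketch below assumes the standard update rule with a step size of 32; the player names are placeholders and `record_match` is a hypothetical helper.

```python
from collections import defaultdict

START_RATING = 1500.0
K = 32.0  # assumed step size, not the model's actual value

# ratings[surface][player] -> rating; unseen players start at 1500.
ratings = defaultdict(lambda: defaultdict(lambda: START_RATING))


def record_match(surface: str, winner: str, loser: str) -> None:
    """Update only the ratings for the surface the match was played on."""
    table = ratings[surface]
    r_w, r_l = table[winner], table[loser]
    expected_win = 1.0 / (1.0 + 10 ** ((r_l - r_w) / 400.0))
    delta = K * (1.0 - expected_win)
    table[winner] = r_w + delta
    table[loser] = r_l - delta


record_match("clay", "Player A", "Player B")
record_match("clay", "Player A", "Player B")
record_match("grass", "Player B", "Player A")

print(ratings["clay"]["Player A"] > ratings["grass"]["Player A"])  # True
print(ratings["hard"]["Player A"])  # 1500.0 (no hard-court matches yet)
```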

How the model works

The prediction engine uses XGBoost, a gradient-boosted decision tree algorithm. It takes 20 engineered features as input — ELO ratings, surface-specific ELO, recent form, head-to-head records, serve statistics, age, height, and ranking — and outputs a win probability for each player.
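To make the input side concrete, here is one common way to turn a matchup into a single feature row: take the difference between the two players on each feature. This encoding is an assumption, not a description of the actual pipeline, and the feature names below are a hypothetical subset of the 20. The classifier call itself is left out.

```python
# Hypothetical subset of features; the real model uses 20.
FEATURES = ("elo", "surface_elo", "form_10", "age", "height_cm", "rank")


def feature_row(player_a: dict, player_b: dict) -> list:
    """Encode a matchup as A-minus-B differences, one value per feature."""
    return [player_a[name] - player_b[name] for name in FEATURES]


def both_probabilities(p_a_wins: float) -> dict:
    """A binary classifier scores one side; the other side is the complement."""
    return {"player_a": p_a_wins, "player_b": 1.0 - p_a_wins}


# Made-up player snapshots, taken before the match is played.
a = {"elo": 2100.0, "surface_elo": 2050.0, "form_10": 0.8,
     "age": 27.0, "height_cm": 188.0, "rank": 4}
b = {"elo": 1950.0, "surface_elo": 2000.0, "form_10": 0.6,
     "age": 31.0, "height_cm": 183.0, "rank": 21}

row_ab = feature_row(a, b)
row_ba = feature_row(b, a)
print(row_ab[0], row_ba[0])      # 150.0 -150.0
print(both_probabilities(0.62))  # 0.62 stands in for a model output
```

Swapping the two players flips the sign of every feature, which makes it easy to check that the model treats both sides of the net the same way.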

The model was validated on 3,076 held-out matches from 2024 that it never saw during training. It achieved 66.3% accuracy, 2.4 percentage points above the 63.9% baseline of simply picking the higher-ranked player.
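The comparison against the ranking baseline can be sketched as follows. The four toy matches are made up purely to show the calculation; only the 3,076 match count and the two percentages come from the reported results, and the implied match counts are approximate because the percentages are rounded.

```python
def accuracy(predicted, actual):
    """Share of matches where the predicted winner was the actual winner."""
    if len(predicted) != len(actual):
        raise ValueError("need one prediction per match")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def higher_ranked(rank_a, rank_b):
    """Baseline pick: the player with the smaller (better) ranking number."""
    return "A" if rank_a < rank_b else "B"


# Made-up toy data: (rank of A, rank of B, model pick, actual winner)
matches = [(3, 40, "A", "A"), (55, 12, "A", "A"),
           (9, 8, "B", "B"), (20, 70, "A", "B")]
actual = [m[3] for m in matches]

model_acc = accuracy([m[2] for m in matches], actual)
base_acc = accuracy([higher_ranked(m[0], m[1]) for m in matches], actual)
print(model_acc, base_acc)  # 0.75 0.5

# Rough match counts implied by the reported percentages.
N = 3076
print(round(0.663 * N), round(0.639 * N))  # 2039 1966
```

On the reported numbers, the gap works out to roughly 70 more correct picks than the baseline across the validation set.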

Finding betting value

A prediction alone isn't enough. The value finder compares the model's probability to the probability implied by the bookmaker's odds (1 divided by the decimal odds). If the model gives a player a clearly higher chance of winning than the odds imply, the bet has positive expected value: that's a value bet.
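The comparison boils down to a few lines. This is a minimal sketch assuming decimal odds; the 5-point minimum edge is an arbitrary illustrative threshold, not the one the value finder uses, and the function names are hypothetical.

```python
def implied_probability(decimal_odds: float) -> float:
    """Win probability implied by decimal odds (ignores the bookmaker margin)."""
    if decimal_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1")
    return 1.0 / decimal_odds


def edge(model_prob: float, decimal_odds: float) -> float:
    """How far the model's probability sits above the implied one."""
    return model_prob - implied_probability(decimal_odds)


def is_value_bet(model_prob: float, decimal_odds: float,
                 min_edge: float = 0.05) -> bool:
    """min_edge = 0.05 is an assumed threshold for illustration only."""
    return edge(model_prob, decimal_odds) >= min_edge


# Odds of 2.00 imply a 50% chance.
print(round(edge(0.60, 2.00), 4))  # 0.1
print(is_value_bet(0.60, 2.00))    # True
print(is_value_bet(0.52, 2.00))    # False
```

Bookmakers build a margin into their prices, so the implied probabilities of the two players usually add up to more than 100%. Requiring a minimum edge is one way to leave room for that.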

Bet sizing uses the Kelly Criterion, a formula that stakes a fraction of your bankroll chosen to maximize its expected long-run growth, based on your edge and the odds on offer.
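For a simple win-or-lose bet, the Kelly fraction is (b·p − q) / b, where p is the win probability, q = 1 − p, and b is the net profit per unit staked (decimal odds minus 1). The sketch below assumes decimal odds; the `kelly_multiplier` parameter for staking less than full Kelly is an added assumption, not something the source describes.

```python
def kelly_fraction(win_prob: float, decimal_odds: float) -> float:
    """Fraction of bankroll to stake; 0.0 when there is no edge."""
    b = decimal_odds - 1.0  # net profit per unit staked
    if b <= 0:
        raise ValueError("decimal odds must be greater than 1")
    q = 1.0 - win_prob
    return max(0.0, (b * win_prob - q) / b)


def stake(bankroll: float, win_prob: float, decimal_odds: float,
          kelly_multiplier: float = 1.0) -> float:
    """kelly_multiplier < 1 scales the bet down (assumed option)."""
    return bankroll * kelly_multiplier * kelly_fraction(win_prob, decimal_odds)


# 60% model probability at odds of 2.00: stake 20% of the bankroll.
print(round(kelly_fraction(0.60, 2.00), 4))                       # 0.2
# Half Kelly on a 1,000 bankroll.
print(round(stake(1000.0, 0.60, 2.00, kelly_multiplier=0.5), 2))  # 100.0
# No edge, no bet.
print(kelly_fraction(0.45, 2.00))                                 # 0.0
```

The formula is only as good as the probability fed into it, so an overconfident model leads to oversized stakes.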