Introducing xGAR: expected goals above replacement
By Dan Altman, creator of smarterscout
Though it's not always obvious, many of the metrics we use today in football analytics have been borrowed from other sports. Basketball analysts like Kirk Goldsberry were among the first to gauge the changing value of possessions based on the likelihood of scoring. Our strategy for duel-based skill ratings descends from Arpad Elo's work in chess. The basis for our shooting and saving ratings owes much to ice hockey analysts. And then, there's baseball.
Baseball is very different from football. It's not a "complex invasion sport" where one team tries to push a ball or puck into another team's territory using an infinite variety of attacks. Rather, it's a game made up of a series of discretely defined situations, all stemming from repeated one-on-one encounters between batters and pitchers/catchers. Though basketball, ice hockey, lacrosse, water polo, and even American football can draw parallels with association football, baseball is practically another planet.
Yet baseball, with its long statistical tradition, still offers some metrics to entice analysts from other sports. Foremost among these is the oft-cited, if controversial, wins above replacement (WAR).
WAR is supposed to estimate the number of wins a baseball player will add to his team's total over the course of a season, above and beyond the value of a "replacement player" – essentially a reserve player at the same position. But it throws up so many questions: How good is a reserve player? How many games will the player participate in over the course of the season? Shouldn't the player's value be different for every club? And why are baseball's WAR formulas apparently so arbitrary?
This last question was always the one that bothered me the most. The coefficients in WAR models didn't seem to come from structural models of the game. Rather, they seemed to be estimated ad hoc.
But the pull of an "above replacement" metric for football was still strong. In fact, one of our Data Scientist members asked us for it. So I gave it a try, and what I came up with was more convincing than I would have guessed (or expected).
In keeping with the rest of our platform, I wanted to create a metric that was both league-adjusted and based on models of the game. So I started with our ratings for attacking output and defending quality. The metrics underpinning these ratings come directly from our two models of expected goals. Attacking output is based on contributions to expected goals per minute in possession. Defending quality is based on expected goals conceded per defending opportunity. We also measure defending quantity: the number of defending opportunities per minute out of possession.
To turn these base metrics into our existing ratings, we standardise them and then apply the league adjustments. The key insight was that we could reverse-engineer the ratings to get league-adjusted measures of expected goals with the right denominators.
Here's how I did it. First I picked a benchmark league, just like all of our members do. Then, for each possible value of the three ratings – attacking output, defending quality, and defending quantity, all from 0 to 99 – I measured the average values of the underlying metrics among players in the benchmark league. So if a striker in the Bundesliga had a rating of 60 for attacking output using the Premier League as a benchmark, and Premier League strikers with a 60 rating contributed an average of 0.007 expected goals per minute in possession, then we would expect the same of the Bundesliga striker if he came to the Premier League.
I then compared these values to the expected goal contributions by players in the benchmark league with ratings of 50 – my choice for replacement players – and multiplied by all clubs' average minutes in possession per match in the benchmark league. The result was our new metric xGFAR – expected goals for above replacement per match.
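The two steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the platform's actual code. The function names, the 0.005 figure for a rating of 50, and the 25 minutes in possession per match are all assumptions made for the example. Only the 0.007 figure for a rating of 60 comes from the text.

```python
from statistics import mean

REPLACEMENT_RATING = 50  # the article's choice for a replacement player


def average_metric_by_rating(players):
    """Map each rating to the mean underlying metric among benchmark-league players."""
    by_rating = {}
    for rating, xg_per_minute in players:
        by_rating.setdefault(rating, []).append(xg_per_minute)
    return {rating: mean(values) for rating, values in by_rating.items()}


def xgfar(rating, lookup, avg_minutes_in_possession):
    """Expected goals for above replacement, per match."""
    per_minute_gap = lookup[rating] - lookup[REPLACEMENT_RATING]
    return per_minute_gap * avg_minutes_in_possession


# Hypothetical benchmark-league sample:
# (attacking output rating, xG contributed per minute in possession)
benchmark_players = [(50, 0.0045), (50, 0.0055), (60, 0.0068), (60, 0.0072)]
lookup = average_metric_by_rating(benchmark_players)

# A player rated 60 against this benchmark, whatever league he plays in now.
value = xgfar(60, lookup, avg_minutes_in_possession=25.0)
print(round(value, 3))
```

Because the lookup is built only from players in the benchmark league, the same rating always maps to the same expected goal contribution, whichever league the player came from.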
Defending was a little more difficult, since I had to combine our metrics for defending quality and quantity. If a player has defending quantity above the median, then he's involved in more than his share of defending situations. If the quality of his defending is better than the median, then this clearly helps his club. If the quality is below the median, then he's costing his club expected goals across all those defending opportunities. So the difference versus a replacement player would depend on the product of the differences in quality and quantity.
But what if a player defends at less than the median rate? Then other players on his team will have to pick up the slack. I assessed the cost of this slackness by multiplying the difference between his defending quantity and the median defending quantity by the median defending quality. It's essentially a penalty for failing to engage the opposition.
This time I multiplied the total differences in expected goal contributions by all clubs' average minutes out of possession* per match in the benchmark league. The result was xGAAR – expected goals against above replacement per match. Here "above replacement" means "better than replacement" – in other words, fewer expected goals conceded. And of course, the next logical step was to add the two new metrics together: xGFAR + xGAAR = xGDAR, or expected goal difference above replacement per match.
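The article doesn't spell out the exact defending formula, so the sketch below is only one literal reading of the description. It uses the product of the quality and quantity differences when a player defends at or above the median rate, and the "slackness" penalty alone when he defends below it. The function names and all numbers are hypothetical, and the treatment of the below-median case is an assumption.

```python
def xgaar(quality, quantity, median_quality, median_quantity,
          avg_minutes_out_of_possession):
    """Expected goals against above replacement, per match (positive is better).

    quality:  expected goals conceded per defending opportunity (lower is better)
    quantity: defending opportunities per minute out of possession
    """
    quantity_gap = quantity - median_quantity
    if quantity_gap >= 0:
        # Better quality means fewer xG conceded, so the gap is median minus player.
        per_minute = (median_quality - quality) * quantity_gap
    else:
        # Penalty for failing to engage: the missed opportunities,
        # valued at the median defending quality.
        per_minute = quantity_gap * median_quality
    return per_minute * avg_minutes_out_of_possession


def xgdar(xgfar_value, xgaar_value):
    """Expected goal difference above replacement, per match."""
    return xgfar_value + xgaar_value


# Hypothetical medians and players, for illustration only.
busy_good = xgaar(0.010, 0.5, median_quality=0.012, median_quantity=0.4,
                  avg_minutes_out_of_possession=25.0)
slack = xgaar(0.010, 0.3, median_quality=0.012, median_quantity=0.4,
              avg_minutes_out_of_possession=25.0)
print(round(busy_good, 4), round(slack, 4), round(xgdar(0.05, busy_good), 4))
```

Under this reading, a busy defender with better-than-median quality earns a positive xGAAR, while a player who engages less often than the median is penalised however well he defends.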
So who are the top players in Europe's top five leagues for xGDAR in our database? Here are the top 20 at a Premier League standard. You've probably heard of most of them:
|Player||Position||Season||xGFAR||xGAAR||xGDAR|
|Lionel Andres Messi||RW||2016-17||0.21||0.02||0.23|
|Neymar da Silva Santos Junior||LW||2017-18||0.19||0.04||0.23|
|Mohamed Salah Ghaly||RW||2017-18||0.18||0.04||0.21|
|Alexis Alejandro Sanchez||LW||2016-17||0.20||0.01||0.21|
|Lionel Andres Messi||RW||2018-19||0.19||0.02||0.21|
|Francesc Fabregas i Soler||DM||2016-17||0.18||0.03||0.21|
|Neymar da Silva Santos Junior||LW||2016-17||0.17||0.03||0.20|
|Kevin de Bruyne||CM||2019-20||0.18||0.01||0.19|
|Cristiano Ronaldo dos Santos Aveiro||LW||2016-17||0.18||0.00||0.19|
|Duvan Esteban Zapata Banguera||CF/ST||2020-21||0.17||0.01||0.18|
|Kevin de Bruyne||CM||2020-21||0.12||0.06||0.18|
|Roberto Torres Morales||CM||2016-17||0.13||0.05||0.18|
|Cristiano Ronaldo dos Santos Aveiro||CF/ST||2017-18||0.18||-0.01||0.18|
|David Josue Jimenez Silva||CM||2019-20||0.15||0.02||0.18|
|Luis Fernando Muriel||CF/ST||2020-21||0.18||-0.01||0.17|
|Lionel Andres Messi||CF/ST||2017-18||0.17||0.01||0.17|
|Yerry Fernando Mina Gonzalez||RCB||2019-20||0.10||0.07||0.17|
You can see why Aouar is the cover star for this article; he's performing better in terms of xGDAR than every other player in our database this season. No fullbacks cracked the top 20, but Mina's 2019-20 campaign snuck in to represent central defenders. It's important to note that we're evaluating outfield contributions only, so finishing and shotstopping skill are not included here.
So we're all done, right? We have a new set of metrics that seem to pass the eye (or, if you prefer, smell) test. Drinks all around? Nope, not yet.
To see if these metrics are really useful, we need to validate them statistically. One way to do this is to check whether the xGAR metrics for players in one season can predict teams' performances in terms of expected goals during the next season.
I decided to look at three different leagues: the Premier League, the EFL Championship, and the Bundesliga. I knew that home field advantage was a reasonable predictor of expected goals in pre-pandemic matches. My test was to see how much better the predictions would be if we added our xGAR metrics based on the previous season. Here are the adjusted R-squared figures from regressions of expected goals in 2018-19, both per game and per season (with home and away matches separated), on a binary variable for playing at home and xGAR from 2017-18:
|xGF 2018-19||xGA 2018-19||xGD 2018-19|
|HFA alone||HFA + xGFAR 2017-18||HFA alone||HFA + xGAAR 2017-18||HFA alone||HFA + xGDAR 2017-18|
[Technical note: The xGAR metrics are based on the performances of players with at least 570 minutes in the benchmark league. For prediction, I assumed that we knew how many minutes each player would play during matches in the 2018-19 season, but I used averages for time in and out of possession.]
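For readers who want to try this kind of check themselves, here is a minimal sketch of the comparison: adjusted R-squared from an ordinary least squares regression on home advantage alone, against one that adds the prior season's xGDAR. The data are synthetic and invented purely to make the example run. They are not our results, and the helper names are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def adjusted_r2(y, predictors):
    """Regress y on an intercept plus the predictor columns; return adjusted R-squared."""
    n, k = len(y), len(predictors)
    rows = [[1.0] + [col[i] for col in predictors] for i in range(n)]
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k + 1)]
           for a in range(k + 1)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k + 1)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * rj for bj, rj in zip(beta, r)) for r in rows]
    y_bar = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - (ss_res / (n - k - 1)) / (ss_tot / (n - 1))


# Synthetic example: four teams, home and away rows separated.
home = [1, 0, 1, 0, 1, 0, 1, 0]                              # binary home indicator
xgdar_prev = [0.4, 0.4, 0.1, 0.1, -0.2, -0.2, -0.3, -0.3]    # prior-season team xGDAR
noise = [0.05, -0.05, -0.05, 0.05, 0.05, -0.05, -0.05, 0.05]
xgd = [0.2 + 0.8 * h + 1.5 * x + e for h, x, e in zip(home, xgdar_prev, noise)]

r2_hfa = adjusted_r2(xgd, [home])
r2_both = adjusted_r2(xgd, [home, xgdar_prev])
print(round(r2_hfa, 3), round(r2_both, 3))
```

Adjusted R-squared is the right yardstick here because it penalises the extra regressor, so any improvement from adding xGAR is not just a by-product of having more variables in the model.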
In the Premier League, the xGAR metrics add a substantial amount of predictive power in all cases. In the Championship, the predictive power is lower. This is in part because home field advantage is so powerful in the Championship. But we're also less likely to have data from the previous season on the Championship's players, since many may have been on reserve or youth teams. In the Bundesliga, our predictive power for defending is not as strong, but for overall expected goal difference it's similar to the Premier League.
Overall, we are worse at predicting expected goals against than expected goals for or expected goal difference. One reason is that our defensive ratings are based on estimates that need more minutes to filter the signal from the noise. Another reason, which will be familiar to many analysts, is that goals scored tend to have more predictive power than goals conceded in general. After all, scoring can get you wins, not just draws.
Of course, these results will vary from season to season and league to league. But it's stunning to see how well we can predict expected goals using previous performance expressed as xGDAR. Most clubs will have a very good idea of how they might perform in the next season based on the prior xGDAR of their players.
Lastly, I want to give credit to some of the other analysts who've ventured down the "above replacement" path. xGAR has existed for some time in ice hockey, but I'm not familiar enough with the sport to know who its originators were. In football, the first analyst whom I can remember using "above replacement" is Dave Laidig, whose work you can read here. More recently Konstantinos Pelechrinis and Wayne Winston used video game data to calibrate a model. And there have been several more attempts looking at goalkeeping, defending, and other aspects of the game, showing the variety of potential approaches. Among all of them, I'm pretty sure our approach is the first to offer league-adjusted metrics based on structural models of the game.
Our own xGAR metrics will be rolling out to Data Scientist members this week, and shortly after that they'll become available to Pro members right here on the platform. As always, we hope you enjoy!
* Averaged across all clubs, this is the same as average minutes in possession.