Introducing xGAR: expected goals above replacement
By Dan Altman, creator of smarterscout
Though it's not always obvious, many of the metrics we use today in football analytics have been borrowed from other sports. Basketball analysts like Kirk Goldsberry were among the first to gauge the changing value of possessions based on the likelihood of scoring. Our strategy for duel-based skill ratings descends from Arpad Elo's work in chess. The basis for our shooting and saving ratings owes much to ice hockey analysts. And then, there's baseball.
Baseball is very different from football. It's not a "complex invasion sport" where one team tries to push a ball or puck into another team's territory using an infinite variety of attacks. Rather, it's a game made up of a series of discretely defined situations, all stemming from repeated one-on-one encounters between batters and pitchers/catchers. Though basketball, ice hockey, lacrosse, water polo, and even American football can draw parallels with association football, baseball is practically another planet.
Yet baseball, with its long statistical tradition, still offers some metrics to entice analysts from other sports. Foremost among these is the oft-cited, if controversial, wins above replacement (WAR).
WAR is supposed to estimate the number of wins a baseball player will add to his team's total over the course of a season, above and beyond the value of a "replacement player" – essentially a reserve player at the same position. But it throws up so many questions: How good is a reserve player? How many games will the player participate in over the course of the season? Shouldn't the player's value be different for every club? And why are baseball's WAR formulas apparently so arbitrary?
This last question was always the one that bothered me the most. The coefficients in WAR models didn't seem to come from structural models of the game. Rather, they seemed to be estimated ad hoc.
But the pull of an "above replacement" metric for football was still strong. In fact, one of our Data Scientist members asked us for it. So I gave it a try, and what I came up with was more convincing than I would have guessed (or expected).
In keeping with the rest of our platform, I wanted to create a metric that was both league-adjusted and based on models of the game. So I started with our ratings for attacking output and defending quality. The metrics underpinning these ratings come directly from our two models of expected goals. Attacking output is based on contributions to expected goals per minute in possession. Defending quality is based on expected goals conceded per defending opportunity. We also measure defending quantity: the number of defending opportunities per minute out of possession.
To turn these base metrics into our existing ratings, we standardise them and then apply the league adjustments. The key insight was that we could reverse-engineer the ratings to get league-adjusted measures of expected goals with the right denominators.
Here's how I did it. First I picked a benchmark league, just like all of our members do. Then, for each possible value of the three ratings – attacking output, defending quality, and defending quantity, all from 0 to 99 – I measured the average values of the underlying metrics among players in the benchmark league. So if a striker in the Bundesliga had a rating of 60 for attacking output using the Premier League as a benchmark, and Premier League strikers with a 60 rating contributed an average of 0.007 expected goals per minute in possession, then we would expect the same of the Bundesliga striker if he came to the Premier League.
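As a rough sketch of the lookup step described above: group benchmark-league players by rating and average the underlying metric within each group. The function name `build_rating_lookup` and the sample figures are hypothetical, chosen only so that the 60-rated strikers average the 0.007 expected goals per minute in possession used in the example. This is an illustration, not the platform's actual code.

```python
from collections import defaultdict
from statistics import mean


def build_rating_lookup(benchmark_players):
    """Map each rating to the mean underlying metric among benchmark-league players.

    benchmark_players is an iterable of (rating, metric) pairs.
    """
    by_rating = defaultdict(list)
    for rating, metric in benchmark_players:
        by_rating[rating].append(metric)
    return {rating: mean(values) for rating, values in by_rating.items()}


# Hypothetical benchmark-league strikers:
# (attacking output rating, expected goals contributed per minute in possession)
premier_league_strikers = [
    (60, 0.0068),
    (60, 0.0072),
    (50, 0.0049),
    (50, 0.0051),
]

lookup = build_rating_lookup(premier_league_strikers)

# A Bundesliga striker rated 60 against the Premier League benchmark would be
# expected to contribute the Premier League average for that rating.
expected_contribution = lookup[60]
```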
I then compared these values to the expected goal contributions by players in the benchmark league with ratings of 50 – my choice for replacement players – and multiplied by all clubs' average minutes in possession per match in the benchmark league. The result was our new metric xGFAR – expected goals for above replacement per match.
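To make the arithmetic concrete, here is a minimal sketch of that comparison. The 0.007 figure for a 60-rated striker comes from the example above. The 0.005 figure for a 50-rated striker and the 40 minutes in possession per match are assumptions for illustration only, and `xgfar_per_match` is a hypothetical name.

```python
REPLACEMENT_RATING = 50  # the article's choice for a replacement player


def xgfar_per_match(rating, lookup, avg_minutes_in_possession):
    """Expected goals for above replacement, per match.

    lookup maps a rating to the benchmark-league average of expected goals
    contributed per minute in possession.
    """
    above_replacement = lookup[rating] - lookup[REPLACEMENT_RATING]
    return above_replacement * avg_minutes_in_possession


# 0.007 is from the article's example; 0.005 is an assumed value for rating 50.
lookup = {50: 0.005, 60: 0.007}

# 40 minutes in possession per match is an assumed league average.
value = xgfar_per_match(60, lookup, avg_minutes_in_possession=40.0)
```

Under these assumed inputs, the 60-rated striker would be worth (0.007 - 0.005) x 40 = 0.08 expected goals per match more than a replacement player.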
Defending was a little more difficult, since I had to combine our metrics for defending quality and quantity. If a player has defending quantity above the median, then he's involved in more than his share of defending situations. If the quality of his defending is better than the median, then this clearly helps his club. If the quality is below the median, then he's costing his club expected goals across all those defending opportunities. So the difference versus a replacement player would depend on the product of the differences in quality and quantity.
But what if a player defends at less than the median rate? Then other players on his team will have to pick up the slack. I assessed the cost of this slackness by multiplying the difference between his defending quantity and the median defending quantity by the median defending quality. It's essentially a penalty for failing to engage the opposition.
This time I multiplied the total differences in expected goal contributions by all clubs' average minutes out of possession* per match in the benchmark league. The result was xGAAR – expected goals against above replacement per match. Here "above replacement" means "better than replacement" – in other words, fewer expected goals conceded.
There are actually four cases for computing the xGAAR metric, and it may help to go through them one at a time. So let's say the median player at a given position defends 1.5 times per minute out of possession and concedes 0.005 xG each time, with an average for the league of 40' out of possession per match. Then we have these possibilities for evaluating a hypothetical Player X at the same position:
1. Player X has higher quantity and higher quality – defends 2.0 times per minute out of possession conceding 0.003 xG each time
Here Player X concedes 0.24 xG per match (2.0 x 40 x 0.003 = 0.24). If the median player defended 2.0 times per minute out of possession, he would concede 0.40 xG. So the base for this player's xGAAR is 0.16, because he is replacing the median player's defending with better defending for the median frequency and more. But we're not done yet. Because of Player X's aggressiveness, he is also allowing another player (in expectation, the median player) to defend less often – to be exact, 0.5 times less often per minute out of possession – or to defend elsewhere on the pitch. Thus Player X also receives a reward of 0.5 * 0.005 * 40 = 0.10 xG, and his total xGAAR is 0.26.
2. Player X has higher quantity and lower quality – defends 2.0 times per minute out of possession conceding 0.008 xG each time
Here Player X concedes 0.64 xG per match. That's a base of -0.24 for his xGAAR, but he also receives the same reward for freeing up another player, which is 0.10 xG as above. So his total xGAAR is -0.14.
3. Player X has lower quantity and lower quality – defends 1.0 times per minute out of possession conceding 0.008 xG each time
Here Player X concedes 0.32 xG per match. If the median player defended 1.0 times per minute out of possession, he'd concede 0.20 xG per match, so Player X is costing his club 0.12 xG per match in those events. But because of Player X's low quantity, he also requires another player (in expectation, the median player) to come and defend for him 0.5 times per minute out of possession, pulling the median player away from his own defensive responsibilities. I assess this as a penalty equal to 0.5 * 0.005 * 40 = 0.10 xG conceded. So Player X's true xGAAR in expectation is -0.12 + -0.10 = -0.22.
4. Player X has lower quantity and higher quality – defends 1.0 times per minute out of possession conceding 0.003 xG each time
Here Player X concedes 0.12 xG per match. That's 0.08 xG less than the median player defending 1.0 times per minute out of possession. So using the same penalty as above, Player X's xGAAR would be 0.08 + -0.10 = -0.02.
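Read together, the four cases use the same two terms: a base that compares Player X with the median player at Player X's own defending frequency, and a reward or penalty for defending more or less often than the median, valued at the median quality. The sketch below reproduces the worked numbers under that reading. The function name and signature are hypothetical, and it is an illustration of the examples rather than the platform's implementation.

```python
def xgaar_per_match(quantity, conceded, median_quantity, median_conceded, minutes_out):
    """Expected goals against above replacement, per match.

    quantity: defending opportunities per minute out of possession
    conceded: expected goals conceded per defending opportunity
    minutes_out: league-average minutes out of possession per match
    """
    # Base: Player X versus the median player defending at the same frequency.
    base = quantity * (median_conceded - conceded) * minutes_out
    # Reward (positive) or penalty (negative) for engaging more or less often
    # than the median player, valued at the median quality.
    engagement = (quantity - median_quantity) * median_conceded * minutes_out
    return base + engagement


# The median player from the example: 1.5 defending opportunities per minute,
# 0.005 xG conceded each time, 40 minutes out of possession per match.
MEDIAN = dict(median_quantity=1.5, median_conceded=0.005, minutes_out=40)

case_1 = xgaar_per_match(2.0, 0.003, **MEDIAN)  # higher quantity, higher quality
case_2 = xgaar_per_match(2.0, 0.008, **MEDIAN)  # higher quantity, lower quality
case_3 = xgaar_per_match(1.0, 0.008, **MEDIAN)  # lower quantity, lower quality
case_4 = xgaar_per_match(1.0, 0.003, **MEDIAN)  # lower quantity, higher quality

for label, value in [("1", case_1), ("2", case_2), ("3", case_3), ("4", case_4)]:
    print(f"Case {label}: xGAAR = {value:+.2f}")
```

The four results match the totals worked out above: 0.26, -0.14, -0.22, and -0.02.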
With xGAAR under our belt, the next logical step was to add the two new metrics together: xGFAR + xGAAR = xGDAR, or expected goal difference above replacement per match. So who are the top players in Europe's top five leagues for xGDAR in our database, with at least a third of a season at a single position? Here are the top five at each major positional grouping using a Premier League standard. You've probably heard of most of them:
|Player||Club||Position||Minutes||Season||xGFAR||xGAAR||xGDAR|
|Nico Schlotterbeck||1 FC Union Berlin||LCB||1252||2020-21||0.05||0.21||0.26|
|Dimitrios Siovas||CD Leganes||LCB||2049||2019-20||0.02||0.21||0.23|
|Sergio Ramos Garcia||Real Madrid||LCB||2395||2017-18||0.05||0.15||0.20|
|Diego Roberto Godin Leal||Atletico Madrid||LCB||2561||2016-17||0.07||0.13||0.20|
|Cesar Azpilicueta Tanco||Chelsea||RCB||1297||2020-21||0.01||0.18||0.19|
|Juan Guillermo Cuadrado Bello||Juventus||RB||1317||2020-21||0.11||0.06||0.17|
|Trent Alexander-Arnold||Liverpool FC||RB||3363||2019-20||0.09||0.07||0.16|
|Trent Alexander-Arnold||Liverpool FC||RB||3009||2020-21||0.10||0.04||0.14|
|Alfonso Pedraza Sag||Villarreal||LB||1984||2020-21||0.07||0.06||0.13|
|Joao Cancelo||FC Internazionale Milano||RB||1562||2017-18||0.06||0.07||0.13|
|Gabriel Appelt Pires||CD Leganes||DM||1309||2017-18||0.08||0.07||0.15|
|Tomas Soucek||West Ham United||DM||3190||2020-21||0.08||0.07||0.15|
|Teji Savanier||Nimes Olympique||DM||1819||2018-19||0.08||0.06||0.14|
|Roberto Gagliardini||FC Internazionale Milano||DM||2288||2016-17||0.08||0.05||0.13|
|Riyad Mahrez||Leicester City||RM/RWB||2470||2016-17||0.11||0.06||0.17|
|Emil Forsberg||RB Leipzig||LM/LWB||1781||2016-17||0.09||0.07||0.16|
|Riyad Mahrez||Leicester City||RM/RWB||1372||2017-18||0.11||0.04||0.15|
|Thorgan Hazard||Borussia Monchengladbach||LM/LWB||1277||2017-18||0.11||0.04||0.15|
|Yannick Ferreira Carrasco||Atletico Madrid||LM/LWB||1364||2016-17||0.12||0.02||0.14|
|Houssem Aouar||Olympique Lyonnais||CM||1279||2020-21||0.16||0.04||0.20|
|Rodrigo Javier De Paul||Udinese Calcio||CM||3083||2020-21||0.13||0.04||0.17|
|Kevin De Bruyne||Manchester City||CM||2274||2019-20||0.16||0.01||0.17|
|David Josue Jimenez Silva||Manchester City||CM||2163||2018-19||0.13||0.04||0.17|
|Kevin De Bruyne||Manchester City||CM||1677||2016-17||0.10||0.06||0.16|
|Lionel Andres Messi||Barcelona||RW||1757||2016-17||0.19||0.02||0.21|
|Lionel Andres Messi||Barcelona||RW||1939||2018-19||0.18||0.03||0.21|
|Neymar da Silva Santos Junior||Paris Saint-Germain||LW||1732||2017-18||0.15||0.05||0.20|
|Neymar da Silva Santos Junior||Barcelona||LW||1943||2016-17||0.14||0.05||0.19|
|Mohamed Salah||Liverpool FC||RW||2619||2017-18||0.15||0.03||0.18|
|Duvan Esteban Zapata Banguera||Atalanta||CF/ST||2260||2020-21||0.16||0.01||0.17|
|Lionel Andres Messi||Barcelona||CF/ST||2814||2017-18||0.16||0.01||0.17|
|Edin Dzeko||AS Roma||CF/ST||3193||2016-17||0.14||0.01||0.15|
You can see why Aouar is the cover star for this article; he's performing better in terms of xGDAR this season than every other player except the CBs in our database. The CBs get a bit of an asterisk, though, especially those who have typically played wide in a back three, like Nico Schlotterbeck, Dimitrios Siovas, and Cesar Azpilicueta. They tend to be more active than other CBs, but also to face fewer critical situations, which mechanically boosts their xGAAR ratings; they get rewarded for being aggressive and for conceding less in terms of expected goals than CBs who play in a two.
It's also important to note that we're evaluating outfield contributions only, so finishing and shot-stopping skill are not included here. That's why we've excluded GKs.
So we're all done, right? We have a new set of metrics that seem to pass the eye (or, if you prefer, smell) test. Drinks all around? Nope, not yet.
To see if these metrics are really useful, we need to validate them statistically. One way to do this is to check whether the xGAR metrics for players in one season can predict teams' performances in terms of expected goals during the next season.
I decided to look at three different leagues: the Premier League, the EFL Championship, and the Bundesliga. I knew that home field advantage was a reasonable predictor of expected goals in pre-pandemic matches. My test was to see how much better the predictions would be if we added our xGAR metrics based on the previous season. Here are the adjusted R-squared figures from regressions of expected goals in 2018-19, both per game and per season (with home and away matches separated), on a binary variable for playing at home and xGAR from 2017-18:
|xGF 2018-19||xGA 2018-19||xGD 2018-19|
|HFA alone||HFA + xGFAR 2017-18||HFA alone||HFA + xGAAR 2017-18||HFA alone||HFA + xGDAR 2017-18|
[Technical note: The xGAR metrics are based on the performances of players with 570'+ in the benchmark league. For prediction, I assumed that we knew how many minutes each player would play during matches in the 2018-19 season, but I used averages for time in and out of possession.]
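For readers who want to try this kind of test themselves, here is a minimal sketch of comparing adjusted R-squared with and without an xGAR regressor. Everything in it is an assumption for illustration: the data are synthetic, the coefficients are invented, and the helper names are hypothetical. It does not reproduce the regressions or results reported here.

```python
import random


def solve(a, b):
    """Solve the small linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def adjusted_r2(rows, y):
    """Adjusted R-squared from an OLS fit with an intercept.

    rows holds the regressors for each observation; y holds the outcomes.
    """
    x = [[1.0] + list(r) for r in rows]
    n, k = len(y), len(x[0])  # k counts the intercept
    xtx = [[sum(xi[a] * xi[b] for xi in x) for b in range(k)] for a in range(k)]
    xty = [sum(xi[a] * yi for xi, yi in zip(x, y)) for a in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, xi)) for xi in x]
    y_bar = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot
    return 1 - (1 - r2) * (n - 1) / (n - k)


# Synthetic team-match observations (all values invented for illustration).
random.seed(0)
home, prior_xgdar, xgd = [], [], []
for _ in range(200):
    h = random.choice([0.0, 1.0])   # binary variable for playing at home
    a = random.gauss(0.0, 0.3)      # squad xGDAR from the previous season
    home.append(h)
    prior_xgdar.append(a)
    xgd.append(0.4 * h + 1.0 * a + random.gauss(0.0, 0.5))

hfa_alone = adjusted_r2([[h] for h in home], xgd)
hfa_plus_xgdar = adjusted_r2([[h, a] for h, a in zip(home, prior_xgdar)], xgd)
print(f"HFA alone: {hfa_alone:.3f}   HFA + xGDAR: {hfa_plus_xgdar:.3f}")
```

Adjusted R-squared is used rather than plain R-squared because it penalises the extra regressor, so any improvement reflects more than the mechanical gain from adding a variable.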
In the Premier League, the xGAR metrics add a substantial amount of predictive power in all cases. In the Championship, the predictive power is lower. This is in part because home field advantage is so powerful in the Championship. But we're also less likely to have data from the previous season on the Championship's players, since many may have been on reserve or youth teams. In the Bundesliga, our predictive power for defending is not as strong, but for overall expected goal difference it's similar to the Premier League.
Overall, we are worse at predicting expected goals against than expected goals for or expected goal difference. One reason is that our defensive ratings are based on estimates that need more minutes to filter the signal from the noise. Another reason, which will be familiar to many analysts, is that goals scored tend to have more predictive power than goals conceded in general. After all, scoring can get you wins, not just draws.
Of course, these results will vary from season to season and league to league. But it's stunning to see how well we can predict expected goals using previous performance expressed as xGDAR. Most clubs will have a very good idea of how they might perform in the next season based on the prior xGDAR of their players.
Lastly, I want to give credit to some of the other analysts who've ventured down the "above replacement" path. xGAR has existed for some time in ice hockey, but I'm not familiar enough with the sport to know who its originators were. In football, the first analyst whom I can remember using "above replacement" is Dave Laidig, whose work you can read here. More recently Konstantinos Pelechrinis and Wayne Winston used video game data to calibrate a model. And there have been several more attempts looking at goalkeeping, defending, and other aspects of the game, showing the variety of potential approaches. Among all of them, I'm pretty sure our approach is the first to offer league-adjusted metrics based on structural models of the game.
Our own xGAR metrics will be rolling out to Data Scientist members this week, and shortly after that they'll become available to Pro members right here on the platform. As always, we hope you enjoy!
* Averaged across all clubs in a league, this is the same as average minutes in possession.