Intro: The state of public research into keeper passing
In The Numbers Game, Chris Anderson and David Sally make the point that goalkeeping remains a massively under-researched aspect of the beautiful game. Indeed, within the online community of analytics bloggers, as per Thom Lawrence’s recent and interesting ‘State of the Stats’ questionnaire, it appears that the lion’s share of analysts make no attempt to analyse keepers, with many admitting that they simply wouldn’t know where to start.
This isn’t that surprising given that many of the notable attempts to quantify goalkeeping have resulted in leading bloggers such as Sander Ijtsma and Colin Trainor concluding that repeatable metrics for assessing shot-stopping are essentially nonexistent. The likes of Opta analyst Johannes Harkins, Paul Riley and Lawrence have offered other means by which shot-stopping may be assessed, whilst Jörg Seidel has written on what his GoalImpact model has to say about keepers, and Flavio Fusi has made an effort to introduce goalkeeping radars. And that, as far as I am aware, is about it when it comes to publicly available analytical goalkeeping research.
Apart from Lawrence and Fusi, none of the above made any attempt to assess passing – obviously an important facet of a goalkeeper’s game, with modern-day keepers, to quote Chelsea’s goalkeeping coach Christophe Lollichon, quite clearly “more than just a shot-stopper.” Indeed, many of the aforementioned analysts, whilst not assessing passing, have pointed this out.
Lawrence and Fusi’s efforts to assess goalkeeper passing, in that they only really cover overall pass accuracy and distribution length (though Lawrence does distinguish between long and short passing), have only really skimmed the surface of accurately measuring a goalkeeper’s passing ability. The measures they apply are flawed in two respects: they suppose that every attempted pass has an equal chance of being completed, and they presume that length is a desirable characteristic of a pass.
Firstly, if we divide the pitch into 9 horizontal sectors, it becomes clear that pass accuracy varies hugely depending on the destination location of the pass:
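As a quick aside, bucketing a pass’s end point into one of these sectors is easy to sketch in code. The 105m pitch length and the equal-band split below are assumptions for illustration; the real sectors follow Statszone’s pitch graphic.

```python
# Hypothetical sketch: assign a pass end point to one of 9 equal
# horizontal sectors along the pitch's length. The 105m pitch length
# and equal-width bands are illustrative assumptions only.
def sector(end_y: float, pitch_length: float = 105.0) -> int:
    """Return the sector (1-9) containing a pass end point end_y
    metres from the goalkeeper's own goal line."""
    band = pitch_length / 9
    return min(int(end_y // band) + 1, 9)  # clamp the far edge into sector 9
```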
Secondly, length is certainly not necessarily a desirable aspect of goalkeeper passing. Indeed, as I explained last week in this somewhat rambling piece, the higher a club’s Elo rating, the more regularly their goalkeeper tends to pass short:
So, the question remains, how best to quantify goalkeeper passing? If overall pass accuracy is flawed in that destination massively affects the likelihood of a pass being accurate, how can this be overcome?
This season, using Statszone, I have been manually noting the destination location, by the 9 sectors pictured above, and the success/failure of every single pass played by goalkeepers in the Premier League, the Bundesliga and La Liga (it’s my aim to eventually add Serie A and Ligue 1 too, but noting down a single gameweek for a league takes about an hour, and I’m not in the business of placing myself in a dubious legal position by scraping, so three leagues will have to do for now!). Accordingly, I’ve been building up a picture of the overall pass accuracy, and frequency of passes by destination location, within each of the 9 sectors on a ‘three-league’ (PL+BL+LL) basis, on a league-wide basis, and on an individual keeper basis.
On a ‘three-league’ basis, the raw data by sector (numbered 1-9), with S or F denoting the total number of successful or failed passes to have ended in any particular sector, presently appears as follows (this is the data that underpins the traffic-light pitch dataviz posted above):
Pass accuracies accordingly look as in the dataviz, with pass frequencies like this:
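For concreteness, both tables reduce to a simple calculation over the raw S/F counts. The counts below are invented placeholders, not the real three-league figures:

```python
# Per-sector accuracy and frequency from raw success/failure counts.
# The counts here are made-up placeholders, not the actual data.
raw = {  # sector: (successful, failed) passes ending there
    1: (4000, 50), 2: (9000, 200), 3: (5000, 400),
    4: (3000, 700), 5: (2500, 1500), 6: (2000, 2000),
    7: (1500, 2500), 8: (800, 1800), 9: (300, 900),
}

total = sum(s + f for s, f in raw.values())

# Accuracy: share of passes into a sector that were successful.
accuracy = {sec: s / (s + f) for sec, (s, f) in raw.items()}
# Frequency: share of all passes that ended in that sector.
frequency = {sec: (s + f) / total for sec, (s, f) in raw.items()}
```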
So now let us see how a particular keeper compares. The following is Petr Cech’s raw data:
Cech’s pass accuracies look like this:
And his pass frequencies as follows:
Delving into Cech’s numbers, I’ve created a model by which we may compare the accuracy of his passing to the average. From the raw data and percentages listed above, the next step is to assess how Cech’s accuracy compares to the overall standard. For example, Cech’s accuracy of 97.22% in the 2nd sector is 0.9% below the present ‘three-league’ average. If we repeat this for every sector, we get the following indication of how Cech’s pass accuracy performs relative to average:
We could simply then tally up these numbers as a measure of Cech’s passing ability relative to average, but any final assessment would be horribly skewed by the 9th sector in which Cech compares very well with the average, though only by having attempted a handful of passes that ended up in this sector. To overcome this issue of reliability, then, we may multiply the extent to which Cech is above or below average in a given sector by his pass frequency % for that given sector. For the 9th sector, this would appear as 16.56% * 1.33% = 0.2%.
At this point, if we repeat this calculation for every sector, we gain an indication of how Cech performs relative to the average, with each sector weighted by the frequency with which he passes into it. If we tot up the values for each sector, we gain a single % for Cech, which indicates the extent to which he outperforms or underperforms the average. As we can see, Cech appears to be a 3.2% more accurate passer than the average goalkeeper.
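The whole calculation can be sketched in a few lines. The function below is my reading of the steps just described, with the 9th-sector contribution from above as a sanity check; inputs are proportions rather than percentages.

```python
# Sketch of the calculation described above: for each sector,
# (keeper accuracy - average accuracy), weighted by the keeper's
# pass frequency into that sector, summed across all sectors.
def rpi(keeper_acc, avg_acc, keeper_freq):
    """All arguments are dicts keyed by sector (1-9), with values
    as proportions (0-1). Returns the total as a proportion."""
    return sum(
        (keeper_acc[s] - avg_acc[s]) * keeper_freq[s]
        for s in keeper_acc
    )

# The 9th-sector contribution quoted above: +16.56% relative
# accuracy at a 1.33% pass frequency gives roughly +0.22%.
ninth_sector = 0.1656 * 0.0133
```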
This percentage ‘Total’ I have termed the ‘Relative Passing Index’, or ‘RPI’. And I can calculate RPIs for every single goalkeeper in the Premier League, Bundesliga and La Liga, with Wolfsburg’s Diego Benaglio (6.6% RPI) and FC Bayern’s Manuel Neuer (6.3% RPI) appearing to be the best passers, by this metric, across the three leagues, and Real Betis’ Antonio Adan (-10.9% RPI) and, interestingly, Manchester United’s David de Gea (-11.9% RPI) the worst:
This is all well and good, but is RPI actually functionally useful? What is its predictive value? It’s little use thinking that Werder Bremen’s Felix Wiedwald is a great passer because his RPI is presently high if the model can’t say whether Wiedwald is likely to continue to pass accurately. Equally, are Adan, de Gea and Schmeichel likely to continue to pass poorly? Are the model’s findings repeatable?
Whilst I would love to be in a position to categorically declare YES THEY ARE, there is a reason why this piece is titled ‘Towards a metric…’ Currently I have doubts about RPI. Being, as we are, just 13 or 14 gameweeks into the season, I can’t compare season-to-season RPIs in the manner in which Trainor and Ijtsma showed Sv% to be an unrepeatable metric. The best I can do, then, is to look at keepers to have played all 14 games, and compare their RPI for their first 7 games with their RPI for the next 7. 10 Premier League keepers and 12 Bundesliga keepers have played all 14 games, although sadly Statszone’s data on Tim Howard, Manuel Neuer and Lukas Hradecky is incomplete. This means only 19 keepers are ripe for analysis on a ‘first 7 games v second 7 games’ basis. A scatter plot initially suggests that there is no real relationship between the first and second RPI measurements:
So RPI is fucked, then? Well, not quite. Although the R² of .12 indicates a weak relationship, I don’t think that RPI is derived entirely from randomness. Plainly, there are a couple of quite notable outliers on the bottom-right – Bernd Leno and Boaz Myhill, as it goes – who simply may be yet to regress to the mean. Indeed, if Myhill and Leno are removed from the dataset then the R² rises to .43 – a certainly creditable relationship based on just seven games. It might be that Myhill and Leno are just experiencing random fluctuations and RPI may indeed be useless. I’m cautiously optimistic that this isn’t the case, and I’m very keen to keep measuring RPI to see the extent to which RPI becomes repeatable or not over the course of the season.
In the past, whenever I’ve blogged or tweeted about measuring keeper passing, two particular questions/points recur. The first, with Pete Owen a particularly keen questioner (see above), is the suggestion that whilst short passes might be possible to quantify, long passing could just be an entirely random exercise. To test this, I split keepers’ RPIs into short passes (destination sectors 1, 2, 3, 4) and long passes (sectors 5, 6, 7, 8, 9) and re-did the ‘7-game’ test. At first viewing, it seems that Owen may have a point, with the scatter-plot distribution of long passes apparently far more random than for short passes:
Again, however, Myhill and Leno are outliers as far as long passes are concerned. When they are removed from the dataset, the long-pass R² increases massively to .31, exceeding the repeatability of short passes. Whilst the performances of Myhill and Leno over the remainder of the season will play a large role in deciding the accuracy of these claims – whether the duo regress to the mean or continue to fluctuate randomly – I am again cautiously optimistic that the accuracy of long passes, over the course of the season, won’t be any more random than that of short passes.
Onto question/point two: aren’t successful long passes dependent on the presence, or absence, of target men? My response to this is ‘yes and no’. Intuitively it makes sense that it’s far easier for a keeper to land a pass on, say, Graziano Pelle or Christian Benteke than on Jermain Defoe or Lionel Messi. And yet, the pass still has to be accurate in the first place. Additionally, other metrics such as xG and key passes are similarly reliant on teammates’ off-the-ball runs to create space. Context is hugely important, but some of the credit must lie with the passing skill at the keeper’s end. Sadly I’m unable to test this as well as I would like, the ideal comparison being a scatter plot of long-pass RPI against a team’s offensive header win ratio. This latter stat doesn’t exist publicly, so the best I can do is look at a team’s overall aerial battle % from Whoscored – obviously not ideal, but sufficient to show that questions exist regarding the strength of the relationship, which we intuitively presume to exist, between long-pass accuracy and teams that win lots of headers:
There are many ways in which an analyst with access to raw Opta data could improve this analysis by circumventing the issues of my manual notation-from-Statszone method. Firstly, the 9 sectors are unlikely to be the best fit for accuracy %. I imagine that a whole pitch divided into semi-circles of the kind Dan Altman used in his presentation at the 2015 OptaPro Analytics Forum would give a more accurate depiction:
Additionally, an analyst with access to Opta data could look back at past seasons and test repeatability on a much wider scope than I, performing the analysis as I am in real time, am able to do. By the end of the season, however, I’m hopeful that my RPI metric will prove to be at least a useful and reasonably predictive measure of future goalkeeper pass accuracy. Of course, given that I’m the one investing so much time in it, I’m likely to be a little biased in this respect. But even analysts are allowed to cross their fingers, right?!
Thanks for reading, and, as always, feedback is gratefully received!