Monday, 2 March 2015

Memphis Depay's 0-40 is No Cause for Concern.

It's incredibly easy to use raw data to arrive at simplistic conclusions which portray a player or team in a particular light. "Player A has scored X goals from Y shots" can be used to give substance to a view that the player is either world class or rubbish, depending upon the values of X and Y.

However, football analysis should try to acknowledge the impact of random variation on the recorded outcomes of trials which may be both limited in size and derived from models that are missing many minor variables which may tweak probabilities in one direction or another.

PSV's Memphis Depay may soon be heading to the Dutch revolution that is currently underwhelming the "Theatre of Dreams", although his stock may have fallen if his current goalscoring prowess from outside the box is taken at face value.

Zero returns from 40 attempts appears a poor selling point, but as Simon Gleave points out in this tweet, context is everything in interpretation, and a 0-40, poor as it may intuitively appear, lacks any context.

Goals from outside the box are relatively rare events. The best available round-up can be read at the StatsBomb site in this post from Dan Kennett. Dan's conversion figure of one in 37 for the Premier League immediately adds context to Depay's 0-40, even though it comes from another major European league. A single goal and Depay appears average; two and he doubles the headline rate.

Benchmark information therefore helps when trying to make sense of relatively limited samples. But it is also easy to estimate the likelihood and range of possible outcomes from Depay's 40 shots from distance, using a shot location model and a simple simulation.

Depay's 40 shots varied in location. The most optimistic effort had a likely success rate of less than about 1 in 250, while the chances closest to goal ranged up to 1 in 14. An average shooter, taking open play shots from the positions chosen by Depay in 2014/15, would average about 1 goal from the 40 attempts, in keeping with Dan's findings.

But this average would be distributed such that zero goals in a single run of 40 such trials wouldn't be a major surprise, occurring over 33% of the time.
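The kind of simple simulation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the per-shot probabilities from the shot location model are not listed in this post, so the sketch substitutes a flat 1-in-37 chance (the Premier League benchmark) for every one of the 40 shots, and the function name `simulate_goalless_share` is made up for the example.

```python
import random


def simulate_goalless_share(shot_probs, runs=20_000, seed=1):
    """Share of simulated runs in which none of the shots is scored.

    shot_probs: one scoring probability per shot (hypothetical inputs).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    goalless = 0
    for _ in range(runs):
        # Each shot is an independent trial with its own chance of a goal.
        goals = sum(rng.random() < p for p in shot_probs)
        if goals == 0:
            goalless += 1
    return goalless / runs


# ASSUMPTION: a flat 1-in-37 chance per shot stands in for the real,
# location-specific probabilities, which are not given in the post.
flat_probs = [1 / 37] * 40
share = simulate_goalless_share(flat_probs)
print(f"Share of goalless 40-shot runs: {share:.1%}")
```

Under this flat-rate assumption roughly one simulated run in three ends without a goal. Swapping in location-specific probabilities only requires passing a different list.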

Even if we assume that Depay is an above average striker of the ball from distance in open play and inflate the likelihood of success for each of his 40 attempts by a generous 10%, there still remains a significant 27% chance that he would fail to score in 40 such efforts.
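For the goalless case no simulation is strictly needed: if the shots are treated as independent, the chance of scoring none is the product of each shot's chance of missing. The sketch below shows that calculation and how a skill uplift feeds into it. The flat 1-in-37 probabilities are again a stand-in assumption, so the output will not exactly reproduce the figures quoted above, which come from location-specific values. The function name `prob_no_goals` and the `uplift` parameter are hypothetical.

```python
import math


def prob_no_goals(shot_probs, uplift=1.0):
    """Exact chance of zero goals from independent shots.

    uplift multiplies every shot's scoring probability, e.g. 1.1 for a
    shooter assumed to be 10% better than average.
    """
    # Cap each boosted probability at 1 so the miss chance never goes negative.
    return math.prod(1 - min(1.0, p * uplift) for p in shot_probs)


# ASSUMPTION: flat 1-in-37 chances replace the model's per-shot values.
flat_probs = [1 / 37] * 40
average = prob_no_goals(flat_probs)
boosted = prob_no_goals(flat_probs, uplift=1.1)
print(f"Average shooter: {average:.1%}, 10% better shooter: {boosted:.1%}")
```

The point of the comparison is the direction and size of the change: a 10% uplift in finishing skill lowers the chance of a blank only by a few percentage points.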

Data collection (0-40) is the start of the process and benchmark figures (1 in 37) add initial context, but distributions begin to explain how unusually good or bad a set of data might be compared to the expectation of an average performer. A 0-40 from outside the box, even for an above average shot taker, isn't unusual at all.

Footnote. These types of simple simulations can be done in Excel, but for an excellent primer on using R check out @SteMc74's tutorial here.

Double footnote. (Some progressive European clubs chose Sloan to announce the elimination of random variation or luck from their processes, especially those taking place outside the box, but for now the rest of us will continue to work under the limitations imposed by such forces).

Triple footnote. The risk/reward for shooting from distance was broken down by Colin Trainor in this presentation from the inaugural Opta Pro forum in 2014. Read it here.

