Friday 27 September 2013

Over Performing Strikers or a Lucky Streak?

Shooting models aim to predict the number of goals, or attempts on target, that an average player might expect to achieve once variables such as shot location, defensive pressure, shooting method and the power of the attempt are accounted for. They are steadily becoming the mainstay of player analysis, particularly in the case of strikers and attacking midfielders.

The accumulation of data, much of it self-collected, is increasingly making it possible to arrive at a baseline expectation for any shot or header attempted from a wide variety of locations on the field, which in turn allows comparisons with individual teams or players. A player or team that out-performs the generic average model may be considered better than average, although there is always the possibility that the model lacks the variables needed to fully describe the goal attempt process.

The process of shot modelling is therefore a continually evolving one and, even with the earlier caveats, a player who is out-performing the model is probably one to take note of. However, it is very easy to draw misleading conclusions when interpreting small samples, especially when the results are condensed into a summarized format.

It is still debatable how the credit for an elevated conversion rate should be divided. In the case of van Persie, he may score at a higher rate than is normal once shot location is allowed for, but this may be due to his skilled finishing, to the quality and ease of the chances that Manchester United create for him (which may not be reflected in a general shot model) or, more likely, to a combination of the two.

This season van Persie has made 19 attempts on goal, scoring three times (once from the spot) and hitting the target five times. My generic shot model gives him an expectation of 2.4 goals and seven shots on target, once x,y shot locations are factored in. So, as a statement of "fact" (assuming the veracity of my particular model), van Persie has scored more goals than a shooting model expects and has also been relatively inaccurate.

However, how much confidence does out-performing a shot model, in terms of actual goals scored compared to an expected average, give us when describing van Persie's likely true ability?

Summarizing data-based conclusions often leads to the omission of extremely pertinent pieces of information. If van Persie scores more goals than a shot model predicts an average player should score from the location of his chances, we really need to know how likely it was that his over-performance occurred by chance. We therefore need to look at the range of scoring totals an average player might achieve given van Persie's opportunities, and then see where van Persie's total sits compared to that of a "lucky" average marksman.

Above are the results of simulating van Persie's 19 shots so far in the Premiership, using the likely outcome of each shot as if it were being made by an average player. We've already seen that van Persie's three goals are above the 2.4 expected on average and that his five on-target efforts are below the expected average of almost exactly seven. Rather than summarizing the analysis and looking at general over or under performance, we are now revisiting each attempt.

An average marksman would be most likely to score two goals from these chances, and that would happen around 30% of the time, but the next most likely outcome is the three goals actually scored by van Persie. That happens about 25% of the time. So if this is all the information we have on the Dutchman, even though he has exceeded the goal expectation for a shooting model, there is still a 25% likelihood that an average player would replicate his feats so far.
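The kind of simulation described above can be sketched in a few lines of Python. This is only an illustration, not the model behind the chart: the per-shot probabilities in `SHOT_PROBS` are hypothetical values (one penalty plus 18 identical open-play chances) chosen purely so that they add up to the 2.4 expected goals quoted earlier, and the function name is my own. With real shot-by-shot probabilities, the percentages would differ.

```python
import random
from collections import Counter

# ASSUMPTION: hypothetical scoring probabilities for an average player.
# One penalty at 0.78 plus 18 open-play attempts at 0.09 each, picked only
# so that the total equals the 2.4 expected goals mentioned in the post.
SHOT_PROBS = [0.78] + [0.09] * 18


def simulate_goal_totals(shot_probs, n_trials=50_000, seed=1):
    """Replay the same set of shots n_trials times and tally the goal totals."""
    rng = random.Random(seed)
    totals = Counter()
    for _ in range(n_trials):
        # Each shot is scored independently with its own probability.
        goals = sum(rng.random() < p for p in shot_probs)
        totals[goals] += 1
    # Convert raw counts into the share of trials ending on each total.
    return {k: totals[k] / n_trials for k in sorted(totals)}


if __name__ == "__main__":
    for goals, share in simulate_goal_totals(SHOT_PROBS).items():
        print(f"{goals} goals: {share:.1%}")
```

Each row of the output is the share of simulated "average players" who finished on that goal total, which is the figure to set against the striker's actual tally.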

The same applies to van Persie's shooting accuracy. Yes, he is below the average expectation, hitting the target just five times instead of seven, but random variation could inflict exactly this level of inaccuracy on an average player around 13% of the time, and such a player would record five or fewer shots on target 22% of the time.
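The "exactly five" and "five or fewer" figures are two readings of the same distribution: one is a single bar, the other is the sum of every bar up to and including it. A rough sketch of that calculation follows, computed exactly rather than by simulation. It assumes, purely for illustration, that all 19 attempts share the same on-target probability of 7/19, so the numbers it prints will not match those from a model that rates each shot individually. The function names are hypothetical.

```python
def outcome_distribution(probs):
    """Exact distribution of the number of successes from independent shots."""
    dist = [1.0]  # before any shot is taken, zero successes is certain
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1 - p)  # this shot misses the target
            nxt[k + 1] += mass * p    # this shot hits the target
        dist = nxt
    return dist


def prob_at_most(dist, k):
    """Probability of k or fewer successes."""
    return sum(dist[: k + 1])


# ASSUMPTION: every attempt is given the same on-target chance, 7/19,
# so that the expected total is the seven quoted in the post.
ON_TARGET_PROBS = [7 / 19] * 19

dist = outcome_distribution(ON_TARGET_PROBS)
print(f"P(exactly 5 on target)  = {dist[5]:.3f}")
print(f"P(5 or fewer on target) = {prob_at_most(dist, 5):.3f}")
```

The cumulative figure is usually the more useful one when asking how surprising a poor return is, since it covers every outcome at least as bad as the one observed.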

So, tempting as it may be to describe players as above or below average over a limited run of games, their numbers are going to be at the whim of random variation to a large degree. We really need to always include shot numbers to at least give some context, and we need corroborative evidence, such as a sustained pattern of over or under performing, before we can begin to draw more definitive conclusions about a player's likely true abilities.

In the next post, I'll take a look at teams.

1 comment:

  1. Good analysis Mark. This paper makes similar observations: