It attempts to quantify the importance of pre-shot variables in determining the likelihood that a goal will be scored. In essence it is a measure of chance quality and is largely determined by such things as shot type and location.
The majority of models output the likelihood that an average Premier League player would score from a given position and shot type. By aggregating the individual expected goals for each attempt and comparing the total to a player's actual output, we can broadly gauge the level of under- or over-performance.
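The aggregation step can be sketched in a few lines of Python. The shot records below are made up purely for illustration (they are not the data behind this post), and the `summarise` helper is a hypothetical name.

```python
from collections import defaultdict

# Hypothetical shot records: (player, expected goals for the attempt, 1 if scored else 0).
# These values are invented for illustration only.
shots = [
    ("Player A", 0.10, 0),
    ("Player A", 0.35, 1),
    ("Player A", 0.05, 1),
    ("Player B", 0.50, 0),
    ("Player B", 0.20, 1),
]


def summarise(shots):
    """Return {player: (total_xg, goals, goals_minus_xg)}."""
    xg_total = defaultdict(float)
    goals = defaultdict(int)
    for player, xg, scored in shots:
        xg_total[player] += xg
        goals[player] += scored
    return {p: (xg_total[p], goals[p], goals[p] - xg_total[p]) for p in xg_total}


summary = summarise(shots)
for player, (xg, g, diff) in summary.items():
    # A positive difference means more goals than the model expected.
    print(f"{player}: xG={xg:.2f}, goals={g}, difference={diff:+.2f}")
```

A positive difference is over-performance relative to the average-player model, a negative one under-performance.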
Here's how the two leading non-penalty scorers of 2015/16 fared compared to the aggregated total of their expected goals:
Both over-performed, Aguero more so than Kane, but we can better visualise this disconnect by simulating each of the 111 non-penalty attempts taken by Aguero to see the range of season-long goal totals predicted by the model.
There's around an 8% chance that the average-player model would equal or better Aguero's 20 non-penalty goals from his 111 chances in 2015/16.
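For readers who want to try this kind of simulation, here is a minimal sketch. It treats each attempt as an independent coin flip weighted by its expected goals value, which is an assumption on my part about how such a simulation would typically be set up. The per-shot probabilities in `shot_probs` are hypothetical placeholders, not Aguero's real chances, so the output will not reproduce the 8% figure.

```python
import random


def simulate_goal_totals(shot_probs, n_sims=10_000, seed=0):
    """Simulate season goal totals, treating each shot as an independent Bernoulli trial."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for p in shot_probs) for _ in range(n_sims)]


def prob_at_least(shot_probs, k):
    """Exact P(total goals >= k), built up one shot at a time (Poisson binomial)."""
    dist = [1.0]  # dist[g] = probability of exactly g goals so far
    for p in shot_probs:
        new = [0.0] * (len(dist) + 1)
        for goals, mass in enumerate(dist):
            new[goals] += mass * (1 - p)   # shot missed
            new[goals + 1] += mass * p     # shot scored
        dist = new
    return sum(dist[k:])


# Hypothetical expected goals values for 111 attempts (placeholders, not real data).
shot_probs = [0.05] * 60 + [0.12] * 35 + [0.30] * 16

totals = simulate_goal_totals(shot_probs)
simulated = sum(t >= 20 for t in totals) / len(totals)
exact = prob_at_least(shot_probs, 20)
print(f"P(20+ goals): simulated={simulated:.4f}, exact={exact:.4f}")
```

The exact calculation is a useful check on the simulation; with enough runs the two figures should agree closely.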
Thereafter the interpretation becomes more subjective.
We may assume, presumptuously, that the model is perfect and that Aguero was merely lucky.
281 individual players tried to score in 2015/16, so that's a lot of individual trials, and someone is likely to over-perform to the level that Aguero did.
This suggests that he may subsequently enjoy more normal levels of luck and his performance may be less extreme in the future.
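A rough back-of-the-envelope version of the "lots of trials" argument looks like this. It assumes, purely for illustration, that every one of the 281 players independently has the same 8% chance of over-performing their own expected goals to an equivalent degree. That is a strong simplification, since players take very different numbers of shots.

```python
n_players = 281   # players who attempted to score in 2015/16 (from the text)
tail_prob = 0.08  # chance of an Aguero-sized over-performance (from the text)

# Assumption: identical, independent 8% chance for every player (a simplification).
expected_count = n_players * tail_prob
p_at_least_one = 1 - (1 - tail_prob) ** n_players

print(f"Expected number of players over-performing this much: {expected_count:.1f}")
print(f"Chance that at least one does: {p_at_least_one:.6f}")
```

Under those assumptions we would expect a couple of dozen players to look this "lucky" in any one season, so seeing one of them is unremarkable on its own.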
Or we might prefer the view that Aguero's 20 goals were partly driven by luck, but also contain an element of finishing skill that exceeds that granted to the average player whose out-of-sample data went into producing the model.
As suggested by the title of the above graph, we can produce a second expected goals model that, while not explicitly tailored to Aguero's (potential) finishing prowess, does contain elements that may act as a proxy for elusive finishing ability.
If we now simulate Aguero's 111 chances, but using a model that incorporates statistically significant variables that "may" relate to finishing skill, he becomes less "lucky". His 20 goals are now much less unlikely. The new model predicts he would score 20 or more in nearly 40% of seasons.
Overall, this new set of variables (I can't be more specific, sorry) inflates the individual expected goals values of players such as Aguero and Kane, who possess the new variable, and reduces the figures for those who don't.
More generally, a model that allows for differences in finishing ability across all players who attempt to score in a typical season reduces error indicators such as the RMSE on out-of-sample data.
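As a sketch of what that comparison involves, the RMSE for an expected goals model can be computed by treating each shot's outcome as 1 (goal) or 0 (no goal) and measuring how far the predicted probability was from it. The outcomes and both sets of predictions below are hypothetical numbers chosen for illustration, not output from either of the models discussed here.

```python
import math


def rmse(predicted, outcomes):
    """Root mean squared error between predicted goal probabilities and 0/1 outcomes."""
    if len(predicted) != len(outcomes):
        raise ValueError("predicted and outcomes must be the same length")
    squared = [(p - y) ** 2 for p, y in zip(predicted, outcomes)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical held-out shots: 1 = goal, 0 = no goal.
outcomes = [0, 1, 0, 0, 1, 0]
# Hypothetical predictions from a baseline model and a model with a finishing proxy.
baseline_xg = [0.10, 0.30, 0.05, 0.20, 0.25, 0.10]
skill_xg = [0.08, 0.40, 0.04, 0.15, 0.35, 0.08]

rmse_baseline = rmse(baseline_xg, outcomes)
rmse_skill = rmse(skill_xg, outcomes)
print(f"baseline RMSE={rmse_baseline:.4f}, skill-proxy RMSE={rmse_skill:.4f}")
```

The lower figure indicates the model whose probabilities sat closer to what actually happened on shots it had not seen during fitting.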
Under a model that includes a proxy term for finishing skill, Aguero scores only one more goal than predicted in out-of-sample data from 2015/16, and Kane scores exactly the number predicted by the model.
Perhaps more importantly, Aguero's 2015/16 season shows a substantially better goodness of fit at the individual attempt level under the second model than under the first.
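One common way to measure fit at the level of individual attempts is the mean log loss, which penalises a model heavily when it gives a low probability to a shot that went in (or a high one to a shot that didn't). I'm using it here as an assumed stand-in; it is not necessarily the measure behind the comparison above, and the numbers are hypothetical.

```python
import math


def mean_log_loss(predicted, outcomes, eps=1e-12):
    """Average negative log-likelihood of 0/1 outcomes under predicted probabilities."""
    total = 0.0
    for p, y in zip(predicted, outcomes):
        p = min(max(p, eps), 1 - eps)  # keep log() away from 0 and 1
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(outcomes)


# Hypothetical attempts by one player: 1 = goal, 0 = no goal.
attempt_outcomes = [0, 1, 0, 1]
# Hypothetical per-attempt probabilities from the two models.
first_model = [0.10, 0.20, 0.05, 0.30]
second_model = [0.15, 0.28, 0.07, 0.40]

loss_first = mean_log_loss(first_model, attempt_outcomes)
loss_second = mean_log_loss(second_model, attempt_outcomes)
print(f"first model log loss={loss_first:.4f}, second model log loss={loss_second:.4f}")
```

Lower is better: a smaller log loss means the model's shot-by-shot probabilities were more consistent with the outcomes.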