It's going to contain some useful information about the likelihood that you score (wider = worse), but you'd be surprised if it turned out to be the definitive example of the art of expected goals modelling.
You run your regression and horizontal distance is a statistically significant term. This gives you an equation to apply to out of sample data, and, as in the previous post, we can compare cumulative expected goals with reality.
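For illustration only, here's a minimal sketch of that step in Python, assuming a logistic regression (the usual choice for a goal/no-goal outcome) on shot-level data. The column names, the simulated shots and the 70/30 split are all made up for the example rather than taken from the actual data.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Illustrative shot-level data; a real model would use recorded shot locations.
rng = np.random.default_rng(0)
n = 8000
shots = pd.DataFrame({
    "wide_dist": rng.uniform(0, 30, n),   # horizontal distance from the centre of the goal (the y co-ordinate)
    "goal_dist": rng.uniform(2, 35, n),   # distance out from the goal line (the x co-ordinate)
    "strength": rng.uniform(0, 1, n),     # stand-in for shot strength, used in the later model
    "deflected": rng.integers(0, 2, n),   # stand-in deflection flag, used in the later model
})
# Simulate goals so that wider and longer-range shots score less often.
true_logit = 0.6 - 0.09 * shots["wide_dist"] - 0.08 * shots["goal_dist"]
shots["goal"] = rng.binomial(1, 1 / (1 + np.exp(-true_logit)))

train = shots.sample(frac=0.7, random_state=1)  # fit the regression on these shots
test = shots.drop(train.index).copy()           # keep these shots out of sample

# Regress goal / no goal on horizontal distance alone.
X_train = sm.add_constant(train[["wide_dist"]])
fit = sm.Logit(train["goal"], X_train).fit(disp=0)
print(fit.summary())                            # wide_dist should come out significant

# Apply the fitted equation to the unseen shots.
test["xg"] = fit.predict(sm.add_constant(test[["wide_dist"]]))
```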
Breaking down the out of sample shots into groups of 364 by increasing scoring likelihood from the perspective of your regression, we get some good reality checks and some not so good.
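Continuing the sketch above, one way to build those bins: rank the out of sample shots by their predicted chance of scoring, cut them into equal groups of 364 and total up actual and expected goals in each.

```python
# Rank the out-of-sample shots by predicted xG and split into equal-sized bins.
test = test.sort_values("xg").reset_index(drop=True)
bin_size = 364                          # as in the table below
test["bin"] = test.index // bin_size

bin_table = test.groupby("bin").agg(
    shots=("goal", "size"),
    actual_goals=("goal", "sum"),
    predicted_goals=("xg", "sum"),
)
print(bin_table.round(1))
```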
Actual Goals v Predicted Goals in Bins of Increasing Likelihood of Scoring. 364 shots per Bin.
A curate's egg: good in parts, it starts well, then goes awry, has a bull's eye mid table, then peters off again.
The plot of expected versus actual goals looks half decent, especially considering that we've only used one, slightly obscure independent variable. R^2, another favourite, is also impressively high.
But if we run a few tests on the observed number of goals compared to the predicted number, along with the respective numbers of non-goals for each bin, we find that there's only a 6% chance that the differences between the groups of observations and predictions have arisen by chance alone.
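If you want to put a number on it with the made-up data from above, the R^2 can be read straight off the binned totals:

```python
from scipy.stats import pearsonr

# R^2 between the binned predicted and actual goal totals.
r, _ = pearsonr(bin_table["predicted_goals"], bin_table["actual_goals"])
print(f"R^2 = {r ** 2:.3f}")
```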
This is just above the generally used 5% threshold for statistical significance, but for the model to be a good fit we ideally want a large probability that the differences between the actual results and the model's predictions are down to chance alone.
It'd be great if we had, say, a 50% chance that the differences in the results had arisen by chance, but for this model we have just a 6% level.
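The exact test isn't spelled out here, so treat this as just one plausible reading of it: a chi-square style comparison of the observed goals and non-goals in each bin against the totals the model expects.

```python
from scipy.stats import chi2

# Observed and expected counts of goals and non-goals in each bin.
obs_goals = bin_table["actual_goals"]
exp_goals = bin_table["predicted_goals"]
obs_misses = bin_table["shots"] - obs_goals
exp_misses = bin_table["shots"] - exp_goals

chi_sq = (((obs_goals - exp_goals) ** 2 / exp_goals)
          + ((obs_misses - exp_misses) ** 2 / exp_misses)).sum()
dof = len(bin_table)                # roughly one degree of freedom per bin
p_value = chi2.sf(chi_sq, dof)
print(f"chi-square = {chi_sq:.2f}, p = {p_value:.3f}")
```

A large p-value here means the model's bin-by-bin predictions are consistent with what actually happened; a small one means the gaps are bigger than chance alone would explain.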
So let's add the y co-ordinate's missing twin, the distance from goal.
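In the running sketch, that's just a refit with a second column (the illustrative goal_dist) alongside the width term, re-scoring the out of sample shots before rebuilding the bins:

```python
# Two-variable version: width plus distance from goal.
X_train2 = sm.add_constant(train[["wide_dist", "goal_dist"]])
fit2 = sm.Logit(train["goal"], X_train2).fit(disp=0)
test["xg"] = fit2.predict(sm.add_constant(test[["wide_dist", "goal_dist"]]))
# Rebuild the bins on the new predictions and rerun the same chi-square check.
```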
Scores look a little better: a couple more near misses and fewer bins that are way wide of the mark. R^2 on the plot is up to 0.99.
If we compare the observed results to the predictions, there's now nearly a 10% chance that the differences are just down to chance. In other words, the likelihood that the model is a good predictor of reality and the differences are mere chance has increased by adding some more information to the model.
If we just used the x co-ordinate rather than the y, the 6% crept up to 7%. So we can perhaps conclude that horizontal distance alone builds you a model, vertical distance alone improves slightly on it, and both inputs together improve it still further.
Finally, let's throw even more independent variables into the mix. We'll include x and y as well as an interaction term to see the effect of taking shots from wider and further out or closer and more centrally. This model also has information on the strength of the shot, and, as deflections play such a major role in confounding keepers, I've used that as an input as well.
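A sketch of that fuller specification with the same made-up data, using statsmodels' formula interface so the interaction between the two co-ordinates is easy to write; strength and deflected are the stand-in columns simulated earlier, not the actual shot strength and deflection inputs.

```python
import statsmodels.formula.api as smf

# Both co-ordinates, their interaction, plus shot strength and a deflection flag.
full_fit = smf.logit(
    "goal ~ wide_dist + goal_dist + wide_dist:goal_dist + strength + deflected",
    data=train,
).fit(disp=0)
test["xg"] = full_fit.predict(test)
# Bin and test exactly as before to compare this model with the simpler ones.
```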
This looks the best of the bunch so far. Observed and predicted increase more or less hand in hand, and over half of the bins could be considered virtual twins. Cumulative observed and predicted totals match exactly, and the R^2 is again 0.99.
Perhaps most tellingly, paired comparisons of the differences between the actual and expected goals and non-goals for each bin are highly likely to have arisen just by chance; the p-value is around 0.8. So it is highly likely that the differences we see are just chance rather than a poorly fitting model, especially when compared to the 10% levels and below for the models with fewer inputs.
At the very least, binning your predictions from an expected goals model, comparing them in an out of sample exam and eyeballing the results in the type of tables above might tell you if you've inadvertently "smoothed out" and hidden your model's flaws in the more usually quoted certificates of calibration.