Saturday 9 December 2017

Know Your Limits

All predictions come with the caveat that there is a spread of uncertainty either side of the most likely outcome.

A side may be odds on to win almost all of their matches over a season, as Manchester City have very nearly shown in 2017/18, but there is a finite, if extremely small chance that they will actually lose all 38 matches.

Similarly, there is a bigger chance that they will win all 38, but the most likely scenario sits between these two extremes and for the current best team in the Premier League, winning the title with around 96 points is the most expected final outcome in May.

While single, definitive predictions are more newsworthy, they imply a precision that is never available about the longer term futures, especially about a sporting contest, such as a Premier League season that comprises low scoring matches spread over 380 games.

It's therefore useful to attach the degree of confidence we have in our predictions to any statements we make about a future outcome, particularly as new information about teams feeds into the system and the competition progresses, turning probabilistic encounters into 0,1 or 3 point actual outcomes.

Here's the range of points which a simulated model of the 2016/17 Premier League came up with using xG based ratings for each team and particularly Swansea before a ball was kicked.

Swansea had been in relative decline since their impressive introduction into the top tier, playing much admired possession football, mainly as a defensive tactic, that had seen then finish as high as 8th in 2014/15, 21 points clear of the drop zone.

2015/16 had seen them fall to 12th, just ten points from the drop zone and much of their xG rating for 2016/17 was based around this less impressive performance.

The top end of their points totals over 10,000 simulations resulted in a top 10 finish with 52 points, but the lower end left them relegated with 27 points and their mode of 36 final points suggested a season of struggle.

And this is illustrated by the dial plot showing well into the red zone signifying relegation.

After ten games, we now have more information, both about Swansea and the other 19 Premier league teams and the most likely survival cut off points in the 2016/17 league.

At the time, Swansea were 19th with five points from ten games and while the grey portion of mid table is still achievable, it has shrunk and the Swans' low point has fallen deeper into the red.

After thirty games, so just eight left, the upper and lower limits for Swansea after the full 38 games has narrowed. They are still more likely than not to be relegated, according to the updated xG model, but there is still some chance that they will survive.

In reality, Swansea were in the bottom three with three games left, but a win for them and a defeat for Hull in game week 36 was instrumental in retaining their top flight status, but it was as close as the final plot suggested it might be.

Adding indications of confidence in your model enhances any information you may wish to convey.

It's also essential when using xG simulations to "predict" the past, such as drawing conclusion about a player's individual xG and his actual scoring record.

Adding high and low limits will highlight if any over or under performance against an average model based simulation is noteworthy or not.

One final point. The upper and lower limits can be chosen to illustrate different levels of confidence, typically 95%. But this does not mean that a side's final points total and thus finishing position has a 95% chance of lying within these two limits.

It is more your model that is on trial.

There is a 95% chance that any new prediction made for a team by your model will lie within these upper and lower limits.

Hopefully, your model will have done a decent job of evaluating a side, in this case Swansea from 2016/17. But if it hasn't, Swansea's actual finishing position may lie elsewhere.

No comments:

Post a Comment