Sunday, 21 April 2013

Last Year's Model.

How many games should you look at when you are trying to estimate the likely true worth of a football team? It's a question that is often raised, especially in a betting context, and one that we can try to answer intuitively by reference to a team we already know a great deal about. Manchester United are almost certainly this year's best Premiership side. The manager of their closest rivals has already conceded the title, even if the feeling is not universal within his coaching staff.

If we go back just a couple of days for evidence of United's fitness to be regarded as this year's best side, that evidence is far from compelling. A late and disputed draw at West Ham hardly smacks of greatness. If we quite reasonably increase the sample size to two matches by pushing back to last Sunday, a win at Stoke hauls the points total to more respectable levels for the champions in waiting, but skip back another week and a loss at home to rivals Manchester City returns the win/draw/loss record to mediocre levels.

It is only when we regress towards Christmas and ignore a patchy cup record that United's record begins to appear worthy of their current position. More, it would seem, is better, even if the price we pay is a loss of recency in the numbers.

After just a single game the balance between random variation and recognizable, repeatable talent is heavily weighted in favour of the former. It is only through gathering evidence from more matches that the balance begins to shift and the results we measure begin to better describe a side's ability. How we use these measurements, made over differing timescales, can help to verify the strength of our belief in the validity of each set of figures, and the most obvious way to test our faith in a measurement of a repeatable skill is to use such figures to make predictions about future performance.

This isn't the post for a detailed description of a predictive football model. Suffice to say the path from Poisson to match odds is well worn, and deriving the inputs from goals scored and conceded via rates, rather than simply raw numbers, is only slightly less trodden. In short, workable models to predict football matches through individual team scoring rates are relatively simple to produce.
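As a rough illustration of that well-worn path, the sketch below turns a pair of average scoring rates into home win, away win and draw probabilities by summing over independent Poisson scorelines. The rates and function names are my own inventions for illustration, not figures from this post:

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """Probability of scoring exactly k goals at an average rate of lam."""
    return lam ** k * exp(-lam) / factorial(k)

def match_odds(home_rate, away_rate, max_goals=10):
    """Return (p_home_win, p_away_win, p_draw) by summing the joint
    probability of every scoreline up to max_goals goals per side,
    treating the two teams' goal counts as independent Poissons."""
    home, away, draw = 0.0, 0.0, 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, home_rate) * poisson_pmf(a, away_rate)
            if h > a:
                home += p
            elif h < a:
                away += p
            else:
                draw += p
    return home, away, draw

# Example: a home side averaging 1.6 goals against a visitor averaging 1.0.
h, a, d = match_odds(1.6, 1.0)
```

The three probabilities sum to (very nearly) one; truncating at ten goals per side loses only a negligible tail.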

Below I've generated odds for a randomly selected group of matches (they were actually played on Boxing Day 2007). The averages used for the Poisson have been found using the previous 35 matches for each team, then the previous 20 and, lastly, the previous six matches. In each table the final three columns show typical bookmakers' prices for the relevant matches: home win, away win and draw.

Poisson Generated Odds Using Last 35 Games.

Game. Score. Model H Win. Model A Win. Model Draw. Book H Win. Book A Win. Book Draw.
ManC v B'burn. 2-2 0.44 0.29 0.27 0.45 0.26 0.29
P'mouth v Ars. 0-0 0.28 0.44 0.28 0.21 0.52 0.27
Che v AV. 4-4 0.62 0.11 0.27 0.68 0.11 0.22
Eve v Bol. 2-0 0.77 0.08 0.15 0.56 0.17 0.27
Tot v Ful. 5-1 0.71 0.13 0.16 0.66 0.12 0.23
Wig v New 1-0 0.32 0.41 0.27 0.35 0.36 0.29

Poisson Generated Odds Using Last 20 Games.

Game. Score. Model H Win. Model A Win. Model Draw. Book H Win. Book A Win. Book Draw.
ManC v B'burn. 2-2 0.54 0.22 0.24 0.45 0.26 0.29
P'mouth v Ars. 0-0 0.30 0.42 0.28 0.21 0.52 0.27
Che v AV. 4-4 0.55 0.16 0.28 0.68 0.11 0.22
Eve v Bol. 2-0 0.78 0.08 0.14 0.56 0.17 0.27
Tot v Ful. 5-1 0.66 0.16 0.18 0.66 0.12 0.23
Wig v New 1-0 0.35 0.41 0.25 0.35 0.36 0.29

Poisson Generated Odds Using Last 6 Games.

Game. Score. Model H Win. Model A Win. Model Draw. Book H Win. Book A Win. Book Draw.
ManC v B'burn. 2-2 0.80 0.08 0.12 0.45 0.26 0.29
P'mouth v Ars. 0-0 0.28 0.39 0.34 0.21 0.52 0.27
Che v AV. 4-4 0.58 0.12 0.31 0.68 0.11 0.22
Eve v Bol. 2-0 0.85 0.05 0.10 0.56 0.17 0.27
Tot v Ful. 5-1 0.89 0.02 0.08 0.66 0.12 0.23
Wig v New 1-0 0.43 0.36 0.21 0.35 0.36 0.29

Testing a new model usually requires a couple of obligatory steps. A large number of matches is used to determine the hopefully predictive inputs, and these are then tested on a new set of out-of-sample games so that we can be satisfied we haven't overfit the predictors and modelled noise as much as signal.

However, as a workable shortcut we can use the bookmaker's expertise to arrive at an approximation of our model's validity. Once shorn of the overround, bookmakers' odds are excellent indicators of the true odds of an event occurring. If you gather up all the 4/7 shots, they are going to win just over 60% of the time. So if your odds are close to the quoted odds, that is invariably a good sign. Likewise, if one model is consistently closer to the quoted odds than another, that model is probably more reflective of reality.
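Shearing off the overround amounts to converting each quoted price to a raw probability and then normalising the set so it sums to one. A minimal sketch, assuming decimal odds (the prices here are invented for illustration):

```python
def implied_probabilities(decimal_odds):
    """Convert a market's decimal odds to probabilities, then normalise
    away the bookmaker's margin (the overround)."""
    raw = [1.0 / o for o in decimal_odds]
    overround = sum(raw)          # typically a little over 1.0
    return [p / overround for p in raw]

# Home/draw/away quoted at 2.10, 3.30 and 3.60:
probs = implied_probabilities([2.10, 3.30, 3.60])
```

The raw probabilities here sum to roughly 1.06, a 6% margin, which the final division removes.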

Poisson-generated odds using goal-scoring and conceding data from the previous 35 matches were closer to the bookmaker's raw odds in 11 of the 18 prices, with three further prices tied. Using the 20-game figures, only three prices saw that model beat the other two, with two ties. Finally, relying on data from just the last six matches not only produced a single "win" and one tie, but the odds generated were also extreme.
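That closest-price tally can be sketched as below, using the Manchester City v Blackburn row from the three tables above (the helper names and the handling of ties are my own):

```python
# Bookmaker H/A/D probabilities for ManC v B'burn, then each model's.
book = (0.45, 0.26, 0.29)
models = {
    35: (0.44, 0.29, 0.27),   # last 35 games
    20: (0.54, 0.22, 0.24),   # last 20 games
    6:  (0.80, 0.08, 0.12),   # last 6 games
}

# For each of the three prices, credit the model whose probability
# lands closest to the bookmaker's figure.
wins = {window: 0 for window in models}
for i in range(3):            # 0 = home, 1 = away, 2 = draw
    gaps = {window: abs(p[i] - book[i]) for window, p in models.items()}
    wins[min(gaps, key=gaps.get)] += 1
```

For this fixture the 35-game model is closest on all three prices, consistent with the overall 11-from-18 tally above.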

Bookmakers deal in signal, not noise, and then shorten their prices to allow for any slight miscalculation. A side's six-game record demonstrates much of the latter and less of the former, and so is less likely to model reality. However, there is an interesting postscript. The actual results of these six games are themselves more prone to random variation than they are to be defined by pure skill. Aside from the Manchester City v Blackburn game, the six-game model was the most bullish about each of the remaining actual results. In short, on the day, the six-game model appeared to outperform everyone else.

On that basis it outperformed both of the other models and the bookmakers' odds, and over those six matches the six-game model appeared very impressive. But all this short-term success really demonstrates is the danger of trusting small sample sizes, both in building models and in testing conclusions.
