Wednesday 9 January 2013

How Big A Shock Was Bradford 3 Aston Villa 1?

Seventy places may separate Aston Villa from their deserved League Cup conquerors Bradford City, but few would claim that the Yorkshire side should now be considered superior to Paul Lambert's beleaguered kids. That gap may not accurately reflect the current gulf in class, but Villa remain the better side and the second leg may confirm this view.

What Tuesday night's game does demonstrate is that teams can and do produce performances that are some way above or some way below their usual level. Football is a low-scoring sport, so random chance is a big contributor to any single match result. Over a larger run of matches a side's true level of talent will begin to come to the fore in the win, draw and loss columns, and the best teams will show average levels of performance that more accurately reflect their real ability. That Bradford can defeat Villa merely indicates the levels to which Villa's standards can fall and Bradford's can rise on one particular matchday, in an environment where a couple of outstanding saves can be immediately followed by a second goal for the less accomplished team.

To visualize the peaks and troughs that can be expected in leagues and for particular teams, it is first necessary to establish an expected level of performance for each team based on reliable indicators, ideally recorded over a reasonable length of time. There are numerous ways to forecast the most likely outcome when two sides meet at a particular venue, ranging from methods based on goal scoring and conceding rates corrected for opponent strength to correlating performance indicators with previous match outcomes. Invariably, and most usefully, we can eventually express the likely outcome of a match as the average number of goals by which one team would be superior to its opponents should the game be repeatedly replayed.

Any reasonably proficient prediction system should eventually mirror a side's actual performance, especially if continual under- or over-performance is taken as a sign that the overall strength of a team has shifted. If we are satisfied that we have a good prediction method, we can compare a side's match-by-match goal difference with the pre-game expected goal difference to show the extremes of performance, both good and bad, that typical teams are capable of.

For example, if a team is a one-goal favourite and wins by three clear goals, they could be considered to have over-performed by a margin of two goals, while a draw would indicate under-performance to the tune of one goal. By collecting and plotting the frequency of such outcomes, the spread of a team's actual performance can be charted.
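The arithmetic in that example can be sketched in a few lines of Python. This is only an illustration: the function name performance_residual and its arguments are my own labels, not part of any particular prediction model.

```python
def performance_residual(expected_margin, goals_for, goals_against):
    """Actual goal margin minus the pre-game expected margin.

    Positive values mean the team over-performed, negative values
    mean it under-performed. (Hypothetical helper for illustration.)
    """
    return (goals_for - goals_against) - expected_margin


# A one-goal favourite that wins by three clear goals over-performs by two.
win_by_three = performance_residual(1.0, 3, 0)

# The same favourite drawing 1-1 under-performs by one goal.
draw = performance_residual(1.0, 1, 1)

print(win_by_three, draw)  # prints: 2.0 -1.0
```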

Above I've plotted the frequency with which Tottenham beat or failed to beat a goal-based pre-game estimate of their chances in each match of the 2010-11 EPL season. The residuals (actual margin minus expected margin) are collected in half-goal bands, so the zero column counts each occasion where the actual margin by which Spurs won or lost the game fell within 0.25 of a goal above or below expectation. The column immediately to the right of the nearly central zero column consists of the six occasions during the season when Spurs over-performed by between 0.25 and 0.75 goals compared to their expected pre-game performance.
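One way to build such half-goal bands is sketched below in Python. The residuals in the list are made-up numbers purely to show the mechanics, not Tottenham's actual figures, and the helper names are hypothetical. The sketch also assumes that a residual sitting exactly on a boundary (0.25, say) goes into the higher band, a choice the plot itself does not spell out.

```python
import math
from collections import Counter

BAND_WIDTH = 0.5  # half-goal bands, centred on ..., -0.5, 0.0, 0.5, ...


def band_centre(residual, width=BAND_WIDTH):
    """Centre of the half-goal band containing a residual.

    The zero band covers -0.25 up to (but not including) 0.25; a value
    exactly on a boundary is assigned to the higher band (an assumption).
    """
    return math.floor(residual / width + 0.5) * width


def band_counts(residuals):
    """Count how many residuals fall into each band."""
    return Counter(band_centre(r) for r in residuals)


# Illustrative residuals only, not real match data.
illustrative = [0.1, -0.2, 0.4, 0.6, -2.4, 1.1]
counts = band_counts(illustrative)

for centre in sorted(counts):
    print(f"{centre:+.1f}: {counts[centre]}")
```

Plotting the counts against the band centres as a bar chart gives the kind of histogram described above.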

The majority of Tottenham's performances in 2010-11 cluster around their pre-game estimates, a sign that the model used is tracking reality fairly well. Two matches, a 1-0 defeat at home to Wigan and a 3-1 defeat away at Blackpool, fall into the category at the extreme left, comprising matches where Spurs under-achieved by between 2.25 and 2.75 goals.

Overall the model does a good job of describing Tottenham, and one season's worth of matches produces over 20 results that were predicted with reasonable pre-game accuracy. The half dozen games at either extreme show the frequency with which Spurs turned in atypically good or bad performances, and the distance from the zero line indicates roughly how extreme those outcomes actually were.

If we plot a similar graph for every EPL game played by every home team since the beginning of the 2005-06 season, a similarly well defined curve results. The majority of matches fall very close to the average margins of victory predicted beforehand.

Again the buckets into which each game is placed are half a goal wide, and once again the more extreme actual margins of victory or defeat compared to pre-game estimates appear with increasing scarcity. Aston Villa were considered a shade over 1.2 goals superior to Bradford on most reliable indicators before kick-off at Valley Parade, so a defeat by two goals represented an under-achievement of 3.2 goals on the night, or a similar over-achievement by their hosts. Using the crude sampling bins, that would have placed the Bradford-Villa result in the bin comprising home teams who had played between 2.75 and 3.25 goals better than expected, as indicated by the arrow on the plot above. Such an outcome occurred just 26 times in 2280 EPL matches between August 2005 and May 2011.
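As a rough check on those figures, the short Python sketch below redoes the arithmetic using only the numbers quoted above. The variable names are my own, and the 1.2 goal superiority is treated as exact for simplicity.

```python
# Pre-game estimate: Villa about 1.2 goals better than Bradford.
expected_villa_margin = 1.2
actual_villa_margin = 1 - 3  # Villa lost 3-1

# Villa's residual is roughly -3.2; Bradford's is the mirror image.
villa_residual = actual_villa_margin - expected_villa_margin
bradford_residual = -villa_residual

# Six EPL seasons (2005-06 to 2010-11) of 380 matches each.
total_matches = 6 * 380
matches_in_bin = 26  # home sides 2.75 to 3.25 goals above expectation

share = matches_in_bin / total_matches
one_in = total_matches / matches_in_bin

print(f"Bradford over-performed by about {bradford_residual:.1f} goals")
print(f"{share:.2%} of matches, or about 1 in {one_in:.0f}")
```

On these numbers a home over-performance of this size turns up in a little over 1% of EPL matches, or roughly once in every 88 games.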
