Thursday, 5 December 2013

World Cup Groups. The Numbers Sometimes Lie.

The 2014 FIFA World Cup becomes much more tangible and real on Friday when the draw for the group stages takes place.  Inevitably some sides will appear to be presented with a relatively easy passage to the later stages and others will fall foul of an imperfect seeding system and find themselves competing in the ritual group of death.

The FIFA rankings determine the seeded teams, continuing the influence they had over the prolonged qualifying stages. And while it is easy to pick flaws in both the rankings themselves and the manner of their use in deciding group make-up, they do perform a reasonable job of sorting sides into a recognizable order of merit.

The non-competitive nature of many of the friendly matches that contribute towards a side's FIFA ranking figure, along with the seemingly arbitrary weightings applied to such games, can sometimes undermine the authority of the figures. Additionally, factors that are unique to international football, such as a continental advantage akin to home field advantage in domestic games, also complicate their use as a predictive tool for future matches, as does the lack of meaningful, collateral form lines between the various FIFA confederations outside of World Cup years.

In view of these issues, it is perhaps surprising that the FIFA rankings of two sides can provide a decent indication of the likely match outcome when those sides do meet. Other systems of course exist to provide international team rankings, such as the many Elo-based ratings, and these may be preferred by some.

So whether FIFA is your preferred starting point or not, much of the group analysis that will appear following the draw will rely on the use and interpretation of a rating figure for each of the four teams that will comprise the individual World Cup groups. Simulating the outcomes of the group games by use of a ratings differential based on historical outcomes of similar games is a well recognized way of evaluating the challenges faced by each side before a Brazuca is kicked in anger.
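As a rough illustration of that kind of group simulation, here is a minimal Python sketch. Everything in it is an assumption made for the example: the four teams and their ratings are invented, the Elo-style logistic curve and the draw share stand in for a mapping that would normally be fitted to historical results, and ties on points are broken at random rather than by goal difference.

```python
import itertools
import random

# Hypothetical ratings for four made-up teams; these are not real FIFA figures.
RATINGS = {"A": 1300, "B": 1150, "C": 1000, "D": 900}


def outcome_probs(diff, scale=400.0, draw_share=0.26):
    """Map a ratings differential to (win, draw, loss) probabilities.

    Assumed form: an Elo-style logistic curve gives the expected score,
    and the draw probability shrinks as the mismatch grows. In practice
    both would be estimated from historical results of similar games.
    """
    expected = 1.0 / (1.0 + 10 ** (-diff / scale))
    draw = draw_share * (1.0 - abs(2.0 * expected - 1.0))
    win = expected - draw / 2.0
    loss = 1.0 - win - draw
    return win, draw, loss


def simulate_group(ratings, n_sims=10000, seed=1):
    """Estimate each side's chance of finishing in the top two of the group."""
    rng = random.Random(seed)
    advance = {team: 0 for team in ratings}
    for _ in range(n_sims):
        points = {team: 0 for team in ratings}
        # Each pair of teams meets once, as in a World Cup group.
        for first, second in itertools.combinations(ratings, 2):
            win, draw, _ = outcome_probs(ratings[first] - ratings[second])
            u = rng.random()
            if u < win:
                points[first] += 3
            elif u < win + draw:
                points[first] += 1
                points[second] += 1
            else:
                points[second] += 3
        # Rank by points; a random number stands in for goal difference.
        ranked = sorted(points, key=lambda t: (points[t], rng.random()), reverse=True)
        for team in ranked[:2]:
            advance[team] += 1
    return {team: count / n_sims for team, count in advance.items()}


for team, chance in simulate_group(RATINGS).items():
    print(team, round(chance, 3))
```

The single rating per team is treated here as fixed from match to match, which is the simplification the next paragraphs question.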

Ratings based analysis of individual matches in football bears a similarity to their use in horse racing, where it has long been realised that translating ratings to likely outcomes doesn't always follow a smooth and regular progression. A single rating figure may describe a weighted evaluation of a side's recent performances, but the expectation in a single upcoming match is likely to merely be centered around that figure. A team may perform better than their rating or they may perform worse. This scenario can be accounted for by randomly selecting figures to be used in simulations from values distributed around that mean.
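A minimal sketch of that idea, assuming a symmetric normal spread around each rating. The ratings, the spread values and the function names are all hypothetical and chosen only to show the mechanics.

```python
import random


def match_day_rating(rating, spread, rng):
    """Draw one performance figure from a normal distribution centred on the rating."""
    return rng.gauss(rating, spread)


def upset_rate(strong, weak, spread, n_sims=20000, seed=7):
    """Share of simulations in which the lower rated side draws the higher figure."""
    rng = random.Random(seed)
    upsets = sum(
        match_day_rating(weak, spread, rng) > match_day_rating(strong, spread, rng)
        for _ in range(n_sims)
    )
    return upsets / n_sims


# Same 100 point gap, two assumed levels of match-to-match variation.
print(upset_rate(1200, 1100, spread=50))
print(upset_rate(1200, 1100, spread=150))
```

The wider the assumed spread, the more often the lower rated side comes out on top, even though neither rating has changed.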

This approach still requires assumptions to be made that may not accurately reflect reality. The more opportunity a team (or horse) has to truly demonstrate their ability, the more confident we may be that any future performance, independent of a multitude of other factors, such as venue (or going), will be close to that central number. Mature ratings are more likely to have readily identifiable upsides and downsides. However, in the case of a lightly exposed horse in particular, the upside and downside to its recorded rating may be considerably skewed in one direction or another, especially if that horse has shown itself capable of at least being competitive on a major stage.

"Potential for major improvement" may be a horse racing cliche, but increased opportunity to show their true worth, combined with increasing experience, often throws up cases of underrated talent as measured by traditional ratings.

A relatively small number of runs, combined with good, but not great ratings, often characterizes horses with capabilities that may far outstrip a tight, normally distributed range of recent performances. Mining the extensive, commercially available racing databases can identify such cases where subsequent performance deviates markedly from the more usual progression. The FIFA ratings provide the footballing equivalent of the racing handicap, but it is less easy to define how "exposed" a team may be.

Horse Ratings Rarely Follow the Straight and Narrow.

One way is to look at the average number of caps gained by the current side. Player turnover can be relatively rapid in international sides. Only four of the starting England players who lost on penalties to Italy at Euro 2012 started the final World Cup qualifying win over Poland at Wembley sixteen months later. So a rating forged over multiple seasons has partly been passed to the inexperienced likes of Andros Townsend, Chris Smalling and Daniel Sturridge. This fluctuating lineup doesn't guarantee improvement, but it may make any conclusion we draw from England's current rating prone to a distorted upside or downside compared to a rating belonging to a more mature starting 11.

The number of caps gained by a starting eleven isn't readily available and there is a limit to the amount of data I'm prepared to collect, but below I've listed the average number of caps owned by the starting eleven for all of the European qualifying teams in their most impressive performance during the round of group matches.

Average No. of Caps Owned by Each Starting 11 in their Most Impressive WC Qualifying Game.

Team            Mean Caps of Starting 11    Median Caps of Starting 11
Netherlands     32                          13
England         40                          22
France          33                          26
Switzerland     34                          30
Bosnia & H.     37                          32
Italy           44                          41
Belgium         38                          42
Germany         48                          44
Greece          45                          46
Russia          49                          47
Portugal        58                          57
Spain           71                          63
Croatia         67                          70

In comfortably beating playoff bound Romania, the Netherlands produced an impressive performance with a side that was under exposed by the standards of current international football. If football had a database similar to those available in horse racing, we could perhaps make a more informed prediction about the likely size and shape of any immediate upside for such a side. But in the absence of such data, when simulating the WC chances of the Dutch team, we should consider that their upside may be heavily skewed and inflated compared to their downside, especially if they persist with "proven inexperience".
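One way to build such a skew into a simulation, offered only as a sketch, is to swap the symmetric draw for a two-piece (split) normal distribution, which keeps its peak at the published rating but uses a wider spread above it than below. The rating of 1100 and both spread values are invented for illustration, and nothing here is estimated from real data.

```python
import random


def skewed_rating(rating, downside, upside, rng):
    """Sample from a two-piece normal with its mode at `rating`.

    `downside` and `upside` are the assumed standard deviations below and
    above the rating. The upper half is chosen with probability
    upside / (upside + downside), which keeps the density continuous.
    """
    if rng.random() < upside / (upside + downside):
        return rating + abs(rng.gauss(0.0, upside))
    return rating - abs(rng.gauss(0.0, downside))


def sample_mean(rating, downside, upside, n=50000, seed=11):
    """Average of many simulated match-day figures."""
    rng = random.Random(seed)
    return sum(skewed_rating(rating, downside, upside, rng) for _ in range(n)) / n


# A "mature" side with equal spread either way versus an under exposed side
# whose assumed upside is three times its downside.
print(sample_mean(1100, downside=40, upside=40))
print(sample_mean(1100, downside=40, upside=120))
```

With the larger upside, the average simulated figure sits above the published rating, which is the effect described above for a side that has yet to show its full hand.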

Teams towards the lower end of the table, such as Spain and Portugal, will be fancied to do well as highly rated European teams, but their range of likely outcomes should hold fewer surprises.

Anecdotal evidence of a decidedly non-linear progression for under exposed talent will inevitably be tainted by survivor bias, but the handful of established players I have looked at do show an exponential increase in useful output, such as goals and assists, as their cap count climbed into the 30s and 40s.

Equally non random in selection and therefore probably inadmissible as anything more than an interesting nugget of information, is the average cap count of sides that have shown performances at major tournaments that belied their more modest pre-tournament ratings.

2004 Euro champions and pre-tournament 250/1 outsiders, Greece, started the final with a side which had an average of just 30 caps per player. Senegal kick-started their 2002 World Cup campaign with a win over France at 20 caps a man, and those caps were largely gained in a partly isolated confederation. Republic of Ireland defeated Italy at the 1994 WC with a median of 26 caps and Bulgaria's run to the semi-finals of the same tournament was achieved with a 32 cap average.

Prediction can draw from many pots and while the potential for ratings to progress in a non-linear manner for teams about which our information may be limited (even if that side's name is well known) is unlikely to be an overriding factor, it will add some degree of uncertainty. So an established higher ranked side that welcomes the likes of Bosnia and Herzegovina should probably heed the example of an upwardly skewed Bulgaria from 1994.

For those who can't wait for Friday's draw, a nice primer can be found here at

