
Thursday, 27 November 2014

Why Uttoxeter Probably Isn’t A Hotbed of Swimming Talent.

Occasionally the newspapers publish stats based articles that do not relate to sport, but do serve to highlight some of the dubious assumptions that can be made from such studies.

In the run up to Christmas, a raft of newspapers, including the Daily Telegraph, reported that the drink driving capital of Britain was Llandrindod Wells, a small rural town in mid Wales.

Over the last 12 months Llandrindod Wells (LW) had 1.98 convictions per 1,000 drivers, ahead of second-placed Blackpool with 1.85. After establishing the drink driving hotspot, a couple of reasons were then devised to explain the result: a lack of public transport and a belief that an offender will not be caught in a rural setting, for example.

However, studies comprising very different sample sizes inevitably lead to conclusions that may fail to represent the true picture. Most famously, a study quoted in Daniel Kahneman’s book “Thinking, Fast and Slow” decided that small schools are inherently better than large ones because they appeared in disproportionate numbers at the top of a performance table.

In short, sometimes samples are too small to come to a reliable conclusion.

LW has a population of just over 5,000. If the town follows national trends around 80% of the population will be able to legally hold a driving licence. So, 1.98 convictions per 1,000 drivers implies that 8 cases of drink driving were successfully caught and prosecuted in LW over the previous 12 months.

Imagine that one such case had gone undetected. LW would then have a conviction rate of 1.75 per 1,000 and fall to 4th in the table. Blackpool would be top and it might seem that seaside towns lead to drink driving.

If convictions drop to 6, LW fall to the middle of the roll of shame with an entirely unexceptional conviction rate of 1.5 per 1,000 drivers. However, two extra cases added to the actual total catapult the town to 2.5 cases per 1,000, well above the next worst, Blackpool.

So it is possibly the size of LW's population that has contributed to making the town a headline in the national press. Blackpool, in contrast, has around 118,000 drivers, so its headline rate is much less susceptible to large swings caused by small numerical changes in the number of convictions. At 1.85 convictions per 1,000 drivers, Blackpool has probably prosecuted around 220 drink drivers.
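The what-if arithmetic above can be sketched in a few lines of Python. The driver totals are the rounded figures used in this post, and Blackpool's conviction count is my own back-calculation from its headline rate rather than a published number.

```python
# Rates per 1,000 drivers when the conviction count moves by one or two cases.
# Driver numbers are the rounded figures from the post: roughly 80% of 5,000
# residents for Llandrindod Wells, and about 118,000 drivers for Blackpool.

def rate_per_thousand(convictions, drivers):
    """Convictions per 1,000 drivers."""
    return 1000 * convictions / drivers

lw_drivers = 0.8 * 5000  # about 4,000; the real figure is slightly higher,
                         # which is why 8 cases was reported as 1.98, not 2.00
blackpool_drivers = 118_000
# Assumption: Blackpool's count is back-calculated from its 1.85 headline rate.
blackpool_cases = round(1.85 * blackpool_drivers / 1000)

lw_rates = {cases: rate_per_thousand(cases, lw_drivers) for cases in (6, 7, 8, 10)}
blackpool_rates = {
    cases: rate_per_thousand(cases, blackpool_drivers)
    for cases in (blackpool_cases - 2, blackpool_cases, blackpool_cases + 2)
}

for cases, rate in lw_rates.items():
    print(f"Llandrindod Wells, {cases} cases: {rate:.2f} per 1,000")
for cases, rate in blackpool_rates.items():
    print(f"Blackpool, {cases} cases: {rate:.2f} per 1,000")
```

Two cases either way move the small town from mid table to runaway leader, while the same change barely registers in Blackpool's rate.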

Percentages derived from small sample sizes can bounce around if the raw number of cases alters by just one or two. Just as small schools can be shown to be the best, as in the study quoted in Kahneman’s book, they can also quickly become the worst if just a handful of students produce poor results rather than excellent ones.
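One rough way to put a number on that bounce is to treat each town's yearly conviction count as a Poisson variable, so the count's standard deviation is about the square root of the count. That is my modelling assumption, not something in the newspaper figures, and the counts below are the ones implied earlier in this post.

```python
import math

def rate_and_noise(cases, drivers):
    """Return (rate, standard error), both per 1,000 drivers.

    Assumes the yearly count behaves like a Poisson variable, so its
    standard deviation is roughly sqrt(cases). This is a modelling
    assumption, not part of the published figures.
    """
    rate = 1000 * cases / drivers
    std_error = 1000 * math.sqrt(cases) / drivers
    return rate, std_error

towns = {
    "Llandrindod Wells": (8, 4_000),   # counts implied by the post
    "Blackpool": (218, 118_000),       # 1.85 per 1,000 of about 118,000 drivers
}

for town, (cases, drivers) in towns.items():
    rate, noise = rate_and_noise(cases, drivers)
    print(f"{town}: {rate:.2f} +/- {noise:.2f} per 1,000")
```

Under that assumption the uncertainty around the small town's rate is several times larger than the gap between the top two places in the table.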

To keep the blog sports orientated, let’s use this dubious method to “prove” that Uttoxeter, population 12,000, a small town on the correct side of the Staffordshire/Derbyshire border, is a hotbed of swimming world records.

Around 12% of the population are in the age group that would typically hold a world swimming record. So Uttoxeter has around 1,400 potential champions. They currently have one actual world record holder, Adam Peaty (100/1 to be Sports personality of the Year, but don’t let that put you off voting for Adam).

Therefore, Uttoxeter has 0.7 world record swimmers per 1,000 likely candidates. This of course would double if we made the conditions gender specific, but it is still good enough to give it the best headline rate in the country.

So Uttoxeter can be shown to be the place for swimming excellence, but only by using percentages applied to small sample sizes which obscure, rather than illuminate the less startling reality of the situation.

Sadly, it is a flawed conclusion, based on the exploits of a single outstanding swimmer, especially as the town doesn’t currently have a swimming pool!

Monday, 17 November 2014

Is Wayne Rooney An International Flat Track Bully?

Wayne Rooney reached a landmark 100th cap against Slovenia on Saturday. He joined an illustrious club of England internationals and his equalising goal also cemented his position at the heart of his country's leading scorers.

Sir Bobby Charlton leads the way with 49 goals, followed by Gary Lineker on 48. Rooney then ties with Jimmy Greaves on 44, and Michael Owen completes the list of England strikers to have scored 40 or more goals.

Player               Goals   Caps   Strike Rate per Game
Sir Bobby Charlton   49      106    0.46
Gary Lineker         48      80     0.60
Wayne Rooney         44      100    0.44
Jimmy Greaves        44      57     0.77
Michael Owen         40      89     0.45

The Manchester Evening News questioned the validity of Rooney's achievement by referencing not only the number of games he has played to achieve his 44 goals, but also the strength of the opponents against whom he has played.

The average Elo rating of Gary Lineker's opponents is the highest of the group at 1,701, then Owen, 1,677, Greaves, 1,671, Charlton, 1,653 and finally Rooney, 1,557, implying that he is a flat track bully who feasted on weak opposition.

However, this approach firstly fails to account for the different goal environments in which the five players scored their international goals.

Charlton and Greaves began their international careers in the late 1950s, Lineker first appeared against Scotland in 1984, and Owen debuted against Chile in 1998, overlapping with Rooney, who began his road to 100 caps against Australia in 2003.

The top English league in 1957/58 saw an average of 3.73 goals scored per game. This figure had fallen to 2.71 by the time Lineker was debuting for England and had drifted down to 2.63 when Rooney took on Australia.

In short, the goal environment was vastly different in the fifties compared to today, as illustrated in the plot below.

Therefore, Greaves and Charlton, regardless of the average Elo ratings of their opponents, began playing international games at a time when scoring was much more plentiful, and scoring then declined gradually as the game moved towards another century. This prolific scoring also appeared to spill over into international football.

The 57 international games played by Jimmy Greaves averaged 4.05 goals per game. Charlton's 106 caps contained an average of 3.58 goals per game, possibly indicating that Sir Bobby played in more competitive matches than did Greaves. Lineker participated in the least goal-laden contests, averaging 2.3 gpg, and both Owen's and Rooney's respective caps average 2.7 gpg.



During Rooney's 100 caps, England actually scored an average of 1.9 goals per game, conceding 0.82. If we place these average goal scoring rates into a Poisson model, they are consistent with a side winning 63% of these matches.

Jimmy Greaves' international career saw an average of 2.58 goals scored and 1.47 allowed by England, this time consistent with a winning percentage of just 61% when the same Poisson model is applied.

So a Poisson approach appears to confirm that Greaves' strength of schedule was more difficult than Rooney's. England scored and conceded goals at rates consistent with winning 63% of the matches in which Rooney won a cap, but this fell to 61% in the 1950s and 60s when Greaves played, presumably due to more difficult opponents.
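For anyone wanting to reproduce those percentages, here is a minimal sketch. It assumes the usual independent Poisson set up, where goals for and against are drawn separately from Poisson distributions with the quoted averages, and it treats "winning percentage" as the probability of outscoring the opponent. The function names are mine.

```python
import math

def poisson_pmf(k, mean):
    """Probability of exactly k goals when the average is `mean`."""
    return math.exp(-mean) * mean**k / math.factorial(k)

def result_probabilities(goals_for, goals_against, max_goals=15):
    """Win, draw and loss probabilities for independent Poisson scores.

    Scorelines above max_goals are ignored; at these scoring rates the
    probability lost to that cut off is negligible.
    """
    win = draw = loss = 0.0
    for scored in range(max_goals + 1):
        for conceded in range(max_goals + 1):
            p = poisson_pmf(scored, goals_for) * poisson_pmf(conceded, goals_against)
            if scored > conceded:
                win += p
            elif scored == conceded:
                draw += p
            else:
                loss += p
    return win, draw, loss

# Average goals for and against quoted in the post.
rooney_caps = result_probabilities(1.90, 0.82)
greaves_caps = result_probabilities(2.58, 1.47)

print(f"England in Rooney's caps:  win {rooney_caps[0]:.0%}")
print(f"England in Greaves' caps:  win {greaves_caps[0]:.0%}")
```

The same function can be fed any other pair of scoring rates, which makes it easy to compare eras or subsets of matches on a like for like basis.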

But despite Greaves apparently playing against tougher opponents, he also played at a time when scoring, at both ends, was more plentiful.

Just as importantly, Rooney merely participated in 100 England matches. He didn't play every minute, unlike Charlton and Greaves, who played during a period when substitutions were largely not permitted. Charlton was infamously replaced against West Germany at Mexico 1970 in anticipation of a World Cup semi final that never materialised, and was also subbed on after half an hour of a 10-0 win over the USA in New York in 1964.

But generally both Charlton and Greaves played from first whistle to last.

Rooney has failed to play the whole 90, or sometimes 120 minutes, in over half of his games, either through a red card or being subbed in or out of the game. Therefore, an Elo average of his 100 caps may not reflect the reality of the minutes Rooney has spent in an England shirt.

Wayne could have potentially played over 9,000 minutes of international football in his 100 caps, but he was on the bench for over 1,400 of those minutes. He spent nearly 2 hours of his 100 caps watching his team mates score 10 goals against the likes of San Marino, Liechtenstein and Andorra.

He was absent for a proportion of the time available to be played against the weakest of opponents, who depressed his apparently damning Elo rating, whereas Greaves and Charlton were largely ever present to win their caps.

So the latter two players each played in an era of elevated goal scoring, when teams were winning games by scoring more and also conceding more goals. We also can't be sure that Rooney's apparently low Elo opponent rating accurately reflects his actual playing time.

The average goals scored and allowed by England during the actual minutes Rooney has been on the field in gaining his 100 caps is 1.84 and 0.91 goals per game, indicating a more competitive playing environment than his 100 caps overall. These rates of scoring are consistent with a team winning just 59% of games.

Under these revised strength of schedule estimates that reflect actual playing time, we could conclude that Rooney has played in tougher matches than either Greaves or Charlton. England's goal differential was consistent with a side expected to win 59% of games when Rooney was on the field. The figures for Greaves and Charlton were 61% and 62%, respectively.

This approach elevates the Rooney opponents to at least the third most difficult faced by the five England 40+ scorers and highlights the deficiencies in drawing headline grabbing conclusions about a player merely from a team rating, devoid of wider context.

Player               % of the Total International Goals Scored Whilst on the Pitch
Gary Lineker         41%
Michael Owen         31%
Jimmy Greaves        30%
Wayne Rooney         28%
Sir Bobby Charlton   20%

A more useful method may be to calculate the percentage of the total team goals scored by a player when he is on the field. We saw in the initial table that Jimmy Greaves scored at a phenomenal rate of 0.77 goals per game. But during his international career, England were often scoring eight or nine goals and occasionally conceding four or five in return.

Other England players were also scoring against these profligate defences at such a rate that overall, Greaves was accounting for 30% of the goals scored by England. This figure is comparable with Michael Owen, forty years later and only slightly above Rooney to date.

Sir Bobby only accounted for 20% of the goals scored by England in his playing time whilst winning 106 caps. And most impressively, Gary Lineker scored over 40% of England's goals while he played, in an era when England scored one goal or fewer in over 50 of his matches and reached eight goals just once.
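The share calculation is simple enough to check for Greaves using only figures already quoted in this post. The sketch assumes he was on the pitch for every England goal in his 57 caps, which is reasonable for the pre-substitution era but is still an assumption, and the function name is mine.

```python
def share_of_team_goals(player_goals, team_goals_while_on_pitch):
    """Fraction of the team's goals scored by one player while he was playing."""
    return player_goals / team_goals_while_on_pitch

# Greaves: 44 goals in 57 caps, with England averaging 2.58 goals per game.
# Assumption: he played every minute, so all of those goals count.
greaves_team_goals = 2.58 * 57  # about 147 England goals
greaves_share = share_of_team_goals(44, greaves_team_goals)

print(f"Greaves' share of England's goals: {greaves_share:.0%}")
```

For a modern player the denominator would need to be restricted to goals scored during his actual minutes on the field, which requires match by match data not reproduced here.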

All five have or had exceptional careers, but branding Rooney a flat track bully on spurious evidence is entirely unjustified.

Sunday, 9 November 2014

Age Profiles for A Team's Best and Worst Premiership Seasons.

The optimum age at which players rise to their peak before making the sometimes rapid decline into retirement or punditry is one of the most neglected areas of football analytics. The lack of readily available data is part of the issue, but assessing player development and then their regression is further confounded by the choice of variables.

Shots, goals and assists are obvious key performance indicators for strikers, although even these will have an aspect of team input, but the choice of which data to assess midfielders and defenders, with their diverse team responsibilities, is more problematic.

Some of the game's best defenders rarely made a tackle.

Therefore, using playing time as a percentage of playing time available as a proxy for player worth may still be the best alternative. A player who is not selected by his manager because other squad members are regarded as a better option, or who misses playing time through injury, should perhaps be considered a less valuable asset, either through lack of developed talent or because of age related decline.

This isn't to say that a 30 year old Frank Lampard is inferior to a peak aged midfielder at a lesser club, but generally we might expect that a team that is stocked with youth or near sell by date talent may perform at lower levels than that same team when it operates with more players at their peak.

In these posts, I looked at when players are most likely to dominate playing time in the Premiership and while goalkeepers inevitably defy logical appraisal, the peak for strikers would appear to be in their mid twenties, with midfielders and defenders peaking slightly later.

A logical next step is to see if the results achieved by a team are even causally related to having players at their perceived peak, as denoted by playing time, and whether this performance falls away as less mature and aging performers take more of a centre stage.

I looked at teams which had played at least six seasons in the Premiership and collected the amount of playing time allotted to a range of age groups in the most successful season for that club and then their least successful EPL season.

I then combined the age profiles of these Premiership clubs' best seasons, as well as their worst seasons, to see if a lack of maturity or aging may have played a role in their peak and trough of performance.

For example, Arsenal's best performance in the EPL was unsurprisingly their 2003/04 undefeated season, where their points per game tally was over 2.5 standard deviations above the league average for that season. Their worst performance so far followed soon afterwards in 2005/06, when they finished fourth and were just 0.75 standard deviations above average.
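The standardised measure used to pick each club's best and worst season can be sketched as below. The six team league is made up purely for illustration and is not real EPL data, and I have assumed the population standard deviation, since the post does not say which version was used.

```python
from statistics import mean, pstdev

def standardised_ppg(points_per_game, team):
    """Standard deviations above (or below) the league's mean points per game."""
    values = list(points_per_game.values())
    return (points_per_game[team] - mean(values)) / pstdev(values)

# Made-up six team league, purely for illustration (not real EPL data).
toy_league = {"A": 2.4, "B": 1.8, "C": 1.5, "D": 1.3, "E": 1.1, "F": 0.9}

for team in toy_league:
    print(f"Team {team}: {standardised_ppg(toy_league, team):+.2f} standard deviations")
```

Measuring each season against that season's own league spread is what allows a club's campaigns from different years to be ranked against each other.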

In total I have a group of 22 sides, comparing their best efforts to their worst, profiled by age related playing time.
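As a rough sketch of how such an age profile might be built, the snippet below groups minutes played into two year age bands and converts them to shares of the total. The player records are hypothetical, and starting the bands at 17 is my choice so that 25-26 falls in a single band.

```python
from collections import defaultdict

def playing_time_by_age_band(players, band_width=2, youngest=17):
    """Share of total minutes played by each age band.

    players: iterable of (age, minutes) pairs for one position group.
    """
    minutes_per_band = defaultdict(int)
    for age, minutes in players:
        band_start = youngest + ((age - youngest) // band_width) * band_width
        minutes_per_band[band_start] += minutes
    total = sum(minutes_per_band.values())
    return {
        f"{start}-{start + band_width - 1}": minutes_per_band[start] / total
        for start in sorted(minutes_per_band)
    }

# Hypothetical defenders for one club season: (age, minutes played).
defenders = [(21, 900), (25, 3000), (26, 2700), (28, 1800), (31, 600)]
profile = playing_time_by_age_band(defenders)

for band, share in profile.items():
    print(f"{band}: {share:.0%} of defensive minutes")
```

Running this for each club's best and worst season, then pooling the results across all 22 sides, would give the two profiles being compared.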


The plots have been combined in two year intervals to try to make any conclusions more visible. Defenders perhaps excel as much through experience as raw physical attributes. Defending is as much about organisational skills as it is about speed and stamina. Therefore, generally defenders tend to gain proportionally more playing time later in their career, even if they have peaked physically, compared to strikers or midfielders.

A higher proportion of defenders aged from 25 to 28 played in the combined successful seasons, while more raw youth and 30+ defenders appeared when charting the 22 sides' nadirs.


Midfielders appear to show a similar trend. The physical demands of a midfield position generally result in players in their mid twenties being afforded proportionally more playing time, and the peak at 25-26 years appears to show that a successful season, by the standards of each of the 22 teams, was on average also marked by a higher proportion of midfielders from that age group.

Thereafter, older midfielders account for proportionally more playing time in every age group on the occasions where the sides underperformed most against their usual standards.

  

For strikers, again 25-26 year olds predominate in successful seasons. They then see proportionally even more playing time in the next two years, possibly as a result of favourable recent impressions. But many of these appearances by strikers in their late twenties also coincide with a season of dramatic underperformance by their side, possibly indicating that some strikers can show sudden and precipitous reductions in talent levels as age creeps up on them.

Raw youth and players who retain some ability, but have seen it reduced by aging may be a necessary component of a team's make up at times because of restricted squad sizes and transfer restrictions. Or they may be selected in the belief that they currently possess more helpful ability than they actually do.

Whatever the reasons for the selection of players possibly removed from their peaks, there does seem to be some evidence that these occasions also correspond, on average, with a large degree of under performance by the side over the course of a season.