Saturday, 14 October 2017

Player Projections. It's All About The Distribution Part 15

A couple of football analytics' little obsessions are correlations and extrapolations.

Many player metrics have been deemed flawed because they fail to correlate from one season to the next, but there are probably good reasons why the diminished sample sizes available for individuals lead to poor season on season correlation.

There is simple random variation, players suffer injury or a change in team mates or role within a club, and atypically small sample sizes often lead to see-sawing rate measurements. Inevitably, players also age and so can be on a very different career trajectory to others within the sample.

The problems associated with neglecting the age profile of a group of players when attempting to identify trends for use in future projections are easily demonstrated by looking at the playing time (as a proxy for ability) enjoyed by players who were predominantly aged 20 and 30 when members of a Premier League squad, and how that time altered the following season, when they were 21 and 31.

The 30 year oldies played Premier League minutes equivalent to 15 full matches, falling to 12 matches as 31 year olds. So they were still valued enough to play fairly regularly, but perhaps due to the onset of decline in their abilities they featured, on average, less than they had done.

The reverse, as you might expect, was true for the younger players. They played the equivalent of seven full games as 20 year olds and nine the following season.

It seems clear that if you want to project a player's abilities from one season to the next and playing time provides a decent talent proxy, you should expect improvement from the youngster and decline from the older pro.

However, as with many such problems, we might be guilty of attempting to impose a linear relationship onto a population that is much better defined by a distribution of possible outcomes.

The table above shows the range of minutes played by 21 and 31 year olds who had played 450 minutes or fewer in the previous season as 20 or 30 year old players.

As before, we may describe the change in playing time as an average. In this subset, the older players play very slightly more than they did as 30 year olds, the equivalent of two games, improving to 2.2.

The younger players jump from 1.8 games to 3.6.

However, just as cumulative xG figures can hide very different distributions, particularly of big chances which subtly alter our expectation for different teams, the distribution of playing minutes that comprise the average change of playing time can be both heavily skewed and vary between the two groups.

Over three quarters of the 30 year olds didn't get on the field at all during the next Premier League season, likewise 2/3 of the younger ones.

21% of young players played a similar amount of time to the previous season, between one and 450 minutes, compared to just 14% of the older ones. And 17% of youngsters exceeded the total from the previous season, as did just 10% of the veterans.

So if you use the baseline rate of increased playing time as a flat rate across all players that fall into these two categories in the future, you might be slightly disappointed, because overwhelmingly the experience of such players is one where they fail to play even a minute in the following season.
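To make the point concrete, here is a minimal Python sketch of how an average can look healthy while most of the group never plays. The function name, the 90 minute match length and the ten sample values are my own assumptions for illustration, not the data behind the post.

```python
from statistics import mean

MINUTES_PER_MATCH = 90  # assumption: one full match is 90 minutes


def summarise_next_season(minutes, threshold=450):
    """Summarise next-season minutes for a group of players.

    Returns the mean in match equivalents plus the share of players
    who played zero minutes, between 1 and `threshold` minutes, and
    more than `threshold` minutes.
    """
    n = len(minutes)
    did_not_play = sum(1 for m in minutes if m == 0)
    up_to_threshold = sum(1 for m in minutes if 0 < m <= threshold)
    above_threshold = sum(1 for m in minutes if m > threshold)
    return {
        "mean_matches": mean(minutes) / MINUTES_PER_MATCH,
        "share_zero": did_not_play / n,
        "share_1_to_threshold": up_to_threshold / n,
        "share_above_threshold": above_threshold / n,
    }


# Made-up next-season minutes for ten hypothetical players.
sample = [0, 0, 0, 0, 0, 0, 0, 180, 360, 1800]
summary = summarise_next_season(sample)
```

In this invented sample the mean is the equivalent of 2.6 matches, yet seven of the ten players never get on the pitch. One regular starter is enough to drag the average up.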

Knowing that there is an upside, on average, for these two groups of players, based on historical precedent, is a start. But knowing that 3 out of 4 of the oldies and 2 out of 3 of the youngsters you are considering didn't merit one minute's worth of play in an historical sample is also a fairly important, if not overriding, input.

Wednesday, 11 October 2017

World Cup Qualification So Far.

To save my Twitter feed from viz overload, here's a couple of plots from the completed World Cup qualifiers.

FIFA ratings usually get a good kicking, but if you know their limitations they do a decent job, and have done so in predicting the qualifying teams so far for 2018.

Some higher rated teams will miss out. It's only 10 games in some cases, after all.

But if you want a benchmark FIFA rating at the time qualifying began in 2015, the definite qualifiers had a median rating of 891.

Those still waiting on a playoff were rated 676 and those rooting for other countries were 464.
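If you want to run the same benchmark yourself, a rough Python sketch might look like the one below. The team names, ratings and status labels are placeholders I've invented to show the calculation. They are not the October 2015 figures.

```python
from statistics import median


def median_rating_by_status(teams):
    """Group (name, rating, status) tuples by status and take the median rating."""
    grouped = {}
    for _name, rating, status in teams:
        grouped.setdefault(status, []).append(rating)
    return {status: median(ratings) for status, ratings in grouped.items()}


# Placeholder teams and ratings, purely for illustration.
teams = [
    ("Team A", 900, "qualified"),
    ("Team B", 850, "qualified"),
    ("Team C", 1000, "qualified"),
    ("Team D", 700, "playoff"),
    ("Team E", 650, "playoff"),
    ("Team F", 400, "eliminated"),
]
medians = median_rating_by_status(teams)
```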

Check your country and see if they ended up roughly in the position they deserved based on 2015 FIFA rankings.

FIFA don't seem to want you to find historical ratings, but to the best of my knowledge these were the ratings each side had in October 2015, apart from the three I couldn't find & made up.

Sunday, 8 October 2017

Premier League Age Profiles Through the Ages

I found some data I collected but never got round to analysing for the joint OptaProForum presentation with Simon Gleave a few years ago.

It simply consists of minutes played by each age group in the four highest tiers of English domestic football.

There are a variety of methods to describe the ageing curve in football, where players initially show improvement, peak and then decline with age. I prefer the delta approach, which charts the change of a variety of performance related indicators or their proxies.
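For anyone unfamiliar with it, the sketch below shows one common way the delta approach is formulated, which may differ in detail from what I actually ran. For every player with data in consecutive seasons you take the change in the metric from one age to the next, average those changes at each age, then add them up to trace a curve. The function name, data layout and numbers are all assumptions made for illustration.

```python
from collections import defaultdict


def delta_curve(player_seasons):
    """Build an ageing curve from year-on-year changes.

    player_seasons maps player -> {age: metric value}.
    Returns {age: cumulative average change}, measured relative to
    the youngest age that has a following season in the data.
    """
    deltas = defaultdict(list)
    for by_age in player_seasons.values():
        for age, value in by_age.items():
            if age + 1 in by_age:
                # Change in the metric between consecutive ages.
                deltas[age + 1].append(by_age[age + 1] - value)

    curve = {}
    running = 0.0
    for age in sorted(deltas):
        running += sum(deltas[age]) / len(deltas[age])
        curve[age] = running
    return curve


# Made-up minutes for four hypothetical players.
players = {
    "A": {22: 1000, 23: 1400},
    "B": {23: 1500, 24: 1700},
    "C": {23: 800, 24: 800},
    "D": {24: 2000, 25: 1700},
}
curve = delta_curve(players)
```

Note that only players who appear in both of a pair of seasons contribute a delta, so anyone who drops out of the sample entirely is invisible to the curve.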

We may condense the age profile of a team or league down into three main groups: young players under 24, who are still improving; peak age performers from around 24 to 29; and ageing players of 30 or more, who may still be good enough to command some playing time, but are diminishing compared to their own peak levels.

Using the amount of playing time allowed to each of the three groups as a performance proxy, the peak age group of Premier League players has been increasing its share at the expense of both the younger and older groups since 2004/05. Peak share has risen from 48% of the available playing time at the start of the period to 60% by 2014/15.
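The share calculation itself is straightforward. Here is a minimal Python sketch using the three age bands described above (under 24, 24 to 29, 30 and over). The function names and the minutes are made up for illustration and are not the league data.

```python
def age_group(age):
    """Assign an age to one of the three bands used in the post."""
    if age < 24:
        return "young"
    if age < 30:
        return "peak"
    return "ageing"


def playing_time_shares(minutes_by_age):
    """Turn {age: total minutes} into each group's share of all minutes."""
    totals = {"young": 0, "peak": 0, "ageing": 0}
    for age, minutes in minutes_by_age.items():
        totals[age_group(age)] += minutes
    all_minutes = sum(totals.values())
    return {group: m / all_minutes for group, m in totals.items()}


# Made-up minutes by age for a single hypothetical season.
season = {21: 1000, 23: 1000, 26: 3000, 28: 3000, 31: 2000}
shares = playing_time_shares(season)
```

With these invented numbers the peak group takes 60% of the minutes and the other two groups 20% each. Repeating the calculation season by season gives the lines in the plots.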

The wealth of the Premier League and the limited alternative destinations for the best, prime aged talent would appear to be a reasonable cause for this increase. Perhaps only Spain's Barcelona and Real Madrid (Suarez and Bale) are realistic destinations for peak age Premier League talent.

By contrast, League Two, the fourth tier of English football, appears to have a very different age profile.

Here, youth and peak aged players share playing time, with 30 & over players lagging well below these levels, implying a different market further down the pyramid.

Players are not being recruited from the extreme right hand tail of the talent pool, so more options of similar ability are available and there is also an extensive pool of buyers in the two or three divisions immediately above League Two, ready to take on the cream of the peak age performers.

Finally, here are the plots for the best Premier League teams compared to the remainder of the clubs.


Peak shares are similar for both groups, but the top teams have played a larger share of (talented) younger players, while the remainder of the Premier League have swayed slightly more towards experience (perhaps ageing players from the top teams dropping in grade, but remaining in the Premier League).

Crouch at Stoke, for example.

Liverpool's individual profile appears to illustrate how their age profile has remained similar to the average for top Premier League teams across the 11 seasons.

Over 30s make up the lowest proportion of playing time, followed by younger players and topped off by peak age talent.

30+ contribution falls away, to be replaced by ageing peak age talent, which in turn is refreshed by maturing younger players. Replacement buys can then be made in the 22-24 range to continue the cycle.

By contrast, Everton has chosen to largely swap around the over 30 group and the under 24 group, leading to seasons where older players dominate.