Tuesday 24 May 2016

Expected Goals Shot on Target Profiles, Premier League, 2015/16.

Just a quick data dump on the distribution of expected goals from shots on target allowed and created by all 20 Premier League teams over the 2015/16 season.

Fairly obviously, the better teams create more of the latter and allow fewer of the former, but there are also a couple of noticeable quirks, such as Arsenal's lack of speculative, low-expectation attempts.

There's also the well-known trade-off between concentrating your expected goals in a few big chances and spreading them across many more lower-expectation attempts. The former increases the chances of scoring at least once, but at the price of a smaller likelihood of a goal glut.
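As a rough illustration of that trade-off, the sketch below compares two invented shot profiles with the same 1.5 expected goals in total. It assumes every shot is an independent event, which is a simplification, and the individual shot values are made up for the example rather than taken from any team's data.

```python
def goal_distribution(shot_probs):
    """Exact distribution of goals from independent shots.

    Returns a list where entry k is the probability of exactly k goals.
    """
    dist = [1.0]  # before any shot, zero goals is certain
    for p in shot_probs:
        new = [0.0] * (len(dist) + 1)
        for k, prob in enumerate(dist):
            new[k] += prob * (1 - p)   # this shot is missed or saved
            new[k + 1] += prob * p     # this shot is scored
        dist = new
    return dist


# Two hypothetical profiles, both worth 1.5 expected goals in total.
few_big = [0.5, 0.5, 0.5]    # three big chances
many_small = [0.1] * 15      # fifteen speculative efforts

for label, shots in (("few big", few_big), ("many small", many_small)):
    dist = goal_distribution(shots)
    p_score = 1 - dist[0]      # at least one goal
    p_glut = sum(dist[3:])     # three or more goals
    print(f"{label}: P(1+ goals) = {p_score:.3f}, P(3+ goals) = {p_glut:.3f}")
```

Under these assumptions the three big chances score at least once more often, while the fifteen smaller efforts are more likely to produce three or more goals.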

I've fitted a trend line to try to make the plots easier on the eye.

Attempts made are in blue and those allowed in orange. Teams with strong underlying figures will therefore have blue above orange (see Spurs) and poor teams the reverse (see Aston Villa).

Teams which had the latter but finished in the top ten probably had a very good keeper (see Stoke).

Saturday 21 May 2016

Home Field Advantage, Down, But Still Not Out.

Home advantage, along with Sunderland, rebounded from a disappointingly lacklustre first half of the Premier League season to finish the 2015/16 campaign close to their traditional benchmark figure.

After 170 matches, home teams were winning only a few more times than away teams. 

Long-term trends have been downward, but it was bold of some to declare home field advantage a thing of the past after such a relatively small number of matches.

Unequal match-ups and significant but rare events, such as red cards, may have contributed to producing a subset of games that tended towards a more equitable share of home and away wins, but the noisy nature of a low-scoring sport is always likely to push outcomes temporarily in one direction or another.

We now have an entire season’s worth of shot data, through which we can examine the process of chance creation, rather than the single outcome iteration that briefly led to the exaggerated obituary for HFA after 170 matches.

59% of home teams were favoured to win in the first 170 matches based on an expected goals rating of each team, but this increased to over 61% in the remaining 210 games. 

So scheduling played a small part in pressing home wins downwards up until Christmas. 

But random variation is always going to be a prime contender as the “cause” of bogus “sea changes” in league-wide traits. 

You may measure home field advantage in a variety of ways (and choose the one which best illustrates your particular agenda). For example, home win%, goal difference between home and away teams, points per game, success rate and so on.

Success rate, (wins + half draws)/games played, has the handy attribute that the home and away rates always total one.

A success rate of less than 0.5 for home teams immediately tells you that away teams have been more successful over that period of games.
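As a quick check of that bookkeeping, here is a minimal Python sketch. The win and draw tallies are invented purely for illustration and are not the real 2015/16 figures.

```python
def success_rate(wins, draws, games):
    """(wins + half draws) / games played."""
    return (wins + 0.5 * draws) / games


# Hypothetical tallies for a block of 170 matches (not real data).
games = 170
home_wins, draws, away_wins = 62, 48, 60

home_rate = success_rate(home_wins, draws, games)
away_rate = success_rate(away_wins, draws, games)

print(f"home: {home_rate:.3f}, away: {away_rate:.3f}")
print(f"total: {home_rate + away_rate:.3f}")
```

Because every draw is split evenly and every win for one side is a loss for the other, the two rates are mirror images around 0.5.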

The plot above shows the range of success rates gained by home teams in the first 170 matches of 2015/16, based on their chance creation during each match over 10,000 simulations. 

The actual success rate in late December 2015/16 fell in the bin 0.495-0.51.

Based on expected goals, the relative parity experienced by home teams over just 170 games wasn’t that unexpected. Home sides have also experienced similar runs of results in seasons where home teams were still deemed dominant.

However, with more usual levels of “luck” home teams could have seen their first third of the season reap success rates of around 0.53 or 0.54.

There’s even around a one in 10 chance they could have hit 0.555 or above and sent “home field strikes back” narratives into overdrive.
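A shot-based simulation of this kind can be sketched along the following lines. This is only a minimal illustration, not the exact model behind the plot: it assumes each shot is an independent event scored with probability equal to its expected goals value, and the three match shot lists are invented placeholders repeated to pad out a sample.

```python
import random


def simulate_goals(shot_xg, rng):
    """Goals scored when each shot succeeds with probability equal to its xG."""
    return sum(rng.random() < p for p in shot_xg)


def simulate_home_success_rate(matches, rng):
    """Play every match once and return (wins + half draws) / games for home sides."""
    credit = 0.0
    for home_shots, away_shots in matches:
        home_goals = simulate_goals(home_shots, rng)
        away_goals = simulate_goals(away_shots, rng)
        if home_goals > away_goals:
            credit += 1.0
        elif home_goals == away_goals:
            credit += 0.5
    return credit / len(matches)


rng = random.Random(2016)  # fixed seed so the run is repeatable

# Hypothetical (home shots, away shots) xG lists; not real match data.
matches = [
    ([0.35, 0.10, 0.08, 0.05], [0.20, 0.07, 0.04]),
    ([0.12, 0.06], [0.45, 0.15, 0.09, 0.03]),
    ([0.30, 0.25, 0.05], [0.10, 0.10, 0.05]),
] * 10

rates = [simulate_home_success_rate(matches, rng) for _ in range(10_000)]

threshold = 0.555
tail_share = sum(r >= threshold for r in rates) / len(rates)
print(f"mean simulated home success rate: {sum(rates) / len(rates):.3f}")
print(f"share of simulations at or above {threshold}: {tail_share:.3f}")
```

Binning `rates` gives the kind of histogram described above, and the tail share is the simulated chance of a result at least as extreme as a chosen threshold.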

We may also simulate the entire season from the perspective of the success rate of home teams, mindful of the 210 additional games that were likely to feature proportionally more numerous dominant home team match-ups.

Home teams had an actual success rate of just over 0.55 over the entire season. So the actual outcome lies at the top end of the most likely bin in this plot of shot based simulations. 

There’s also around a 10% chance that the probabilistic shot profile of all 380 games could have produced a home success rate at least as good as the previous three “unexceptional” home seasons.

But any pretence that luck alone could have sent a reasonably normal, if declining season for home field advantage down to levels of near parity all but disappears in the increased sample size and more equitable schedule.

….if home success rate was around 0.5 now, we may have a case to answer re zero HFA, but the level of parity in December was always likely to be sample size related.

Small sample narrative makes “home and away specialists” of individual teams before they subsequently melt away.

And the gradual decline of home field advantage may make “HFA is dead” posts even more frequent and probably erroneous in the future.

Tuesday 10 May 2016

Keeping Up With The Foxes.

The champagne bubbles have long since burst and the clackers have hopefully been dispatched to landfill after Leicester’s incredible title winning season, but the post mortem lingers on.

They returned to the Premier League last term as one of the strongest additions to the top tier and although survival was ultimately achieved with a healthy six-point margin to 18th place, the route taken was far from smooth, with 20th position a regular resting spot during the season.

Nigel Pearson’s selectively biased cut-off point of seven wins from his final nine matches wasn’t the prelude to a personal title run in 2015/16. A record of four wins from the previous 29 is perhaps harder to totally erase from the debit column.

Instead, Claudio Ranieri was appointed to lead the charge, to general apathy, if not mild hilarity, sometimes spilling over into hostility.

The easiest charge to lay at Ranieri’s door was his seeming inability to stick with a settled team. 

However, this particular foible was quickly laid to rest and a settled side, particularly in defence, became in hindsight one of the cornerstones of Leicester’s astonishing rise from bottom to top.

And as the chart above hints, another stage in 2015/16’s perfect storm was falling into place as Chelsea absented themselves from title contention, partly due to Mourinho drafting numerous, increasingly malcontented additions into the starting 11 with increasing regularity.

Leicester have become the 5,000/1 miracle and while bookmakers’ losses are often bathed in PR terms, there are credible sources that appear to verify real overall losses in the title winning market.

Not just at the initial headline price (one “lucky” punter allegedly cashed out his 50p bet on 5,000/1 Leicester for a 45p profit after one game of the season), but more damagingly at various points throughout the season.

An unfashionable team flirting with the upper reaches of the table early in a campaign is hardly news.
Even in 2015/16 Crystal Palace headed Leicester in the table after eight games. After 10 it was West Ham. And Hull and WBA, to choose two equally unlikely Premier League winners, have shone briefly but brightly in recent seasons.

But always the interlopers fail to sustain a title challenge, either through small sample size luck elevating them early beyond their talent, benign early schedules, the regular big hitters trampling them underfoot over the long term or injury and suspension exposing a lack of talent depth.

Until now.

Leicester steadily signed off on the requirements to send a lower mid-table team to the very top. The team was largely injury and suspension free. Jamie Vardy, for example, took 35 games before he managed to get himself suspended. And Huth and Drinkwater followed suit when mere relegation form was needed to clinch the title.

The usual title contenders all partly prioritised elsewhere or coped poorly with the usual rigours of a Premier League season replete with European commitments.

Expected goals can usually discriminate the good sides from the merely lucky.

Title winners often have their fair share of both, but Leicester’s season can perhaps best be illustrated by this metric if based around where it had them finishing in season-long simulations and how these estimations tallied with the bookmaking industry view over the ten-month-long season *.

Points won are forever in the record books, but remaining matches are mere probabilistic events until played. By combining the two, in this case using each team's expected goals records to date as a measure of future performance, Leicester's chances of pulling off the seemingly impossible can be charted over time.
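One way to sketch that combination of banked points and simulated remaining fixtures is shown below. Everything specific here is an assumption for illustration: the three placeholder teams, their points and expected goals rates, the crude averaging of one side's attack with the other's defence, Poisson-distributed goals, the absence of any home advantage term, and the random tie-break at the top of the table.

```python
import math
import random


def poisson(lam, rng):
    """Draw a Poisson-distributed goal count using Knuth's method."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def title_chances(points, xg_for, xg_against, fixtures, n_sims=10_000, seed=1):
    """Share of simulated seasons in which each team finishes top."""
    rng = random.Random(seed)
    titles = {team: 0 for team in points}
    for _ in range(n_sims):
        table = dict(points)  # points already won are fixed
        for home, away in fixtures:
            # Assumed scoring rate: average of attack and opposing defence.
            home_rate = (xg_for[home] + xg_against[away]) / 2
            away_rate = (xg_for[away] + xg_against[home]) / 2
            home_goals = poisson(home_rate, rng)
            away_goals = poisson(away_rate, rng)
            if home_goals > away_goals:
                table[home] += 3
            elif home_goals < away_goals:
                table[away] += 3
            else:
                table[home] += 1
                table[away] += 1
        best = max(table.values())
        leaders = [team for team, pts in table.items() if pts == best]
        titles[rng.choice(leaders)] += 1  # ties split at random (simplification)
    return {team: wins / n_sims for team, wins in titles.items()}


# Hypothetical three-team mini league; none of these numbers are real.
points = {"A": 50, "B": 47, "C": 45}
xg_for = {"A": 1.7, "B": 1.8, "C": 1.5}
xg_against = {"A": 1.3, "B": 1.0, "C": 1.2}
fixtures = [("A", "B"), ("B", "C"), ("C", "A"), ("B", "A")]

for team, chance in title_chances(points, xg_for, xg_against, fixtures).items():
    print(f"{team}: {chance:.1%}")
```

Re-running the function after each round of matches, with updated points, rates and fixture list, would give the kind of week-by-week title probability that can be charted over a season.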

The respective league positions of the teams throughout the season can be easily checked here.

The bookmaking odds constantly lag behind the optimism of the agnostic expected goals modelling and simulation, although prior knowledge of the usual fate of title upstarts may account for the bookmakers' reasonably pessimistic stance. No one has gatecrashed the title party from outside the usual elite since 1995.

But the Foxes were increasingly taken seriously from all sides following respective wins over Chelsea, Spurs, Liverpool and Man City from week 16 onwards.  

Even when WHU briefly headed Leicester in third place after 10 matches, the Hammers didn't quite possess the underlying expected goals stats and number of earned points to muscle into the probabilistic expected goals title reckoning.

Leicester spent their title teens posting on average around 1.7 expected goals per game while conceding just over 1.3 in return. But when chance creation slipped by a couple of tenths, the defence caught the slack to maintain a healthy positive expected goals difference as they moved from their twenties into early middle age.

Lots of things fell right for Leicester during 2015/16, either at their own hand or through the meltdown of their North London rivals. But they did provide a valuable lesson.

Namely, the past, especially a cursory reading of general trends rather than a deeper examination of expected achievements, is only an imperfect indicator of the future for a reasonably young sporting ecosystem, such as the Premier League.

* Many of these simulations have been posted on Twitter over the season, so other than the natural variation within 10,000 sims using the same input, none have been “tweaked”.