Wednesday 29 November 2017

Over Performers Aren't Always Just Lucky.

Firstly, this isn't another post about whether Burnley are good at blocking shots because "yes they are".

Instead, it's about applying some context to a side's levels of over- or under-performance, and attempting to attribute how much is the result of the ever-present random variation in inevitably small samples and how much is perhaps down to a tactical wrinkle and/or differing levels of skill.

Random variation, termed "luck", is probably the reddest of rags to a casual fan or pundit who is uninterested in, or outwardly hostile to, the use of stats to help describe their beautiful game.

It's the equivalent for anyone with a passing interest in football analytics of "clinical" being used ad nauseam, all the way to the mute button by Owen Hargreaves.

Neither of these catch-all, polar-opposite terms is particularly helpful in isolation. Most footballing events are an ever-shifting, complex mixture of the two.

I first started writing about football analytics through being more than mildly annoyed that TSR (or Total Shot Ratio, look it up) and its supporters constantly branded Stoke as being that offensive mix of "rubbish at Premier League football" and constantly lucky enough to survive season after season.

And then choosing the Potters as the trendy stats pick for relegation in the next campaign as their "luck" came deservedly tumbling down.

It never did.

Anyone bothered enough to actually watch some of their games could fairly quickly see that, having accidentally got promoted with a rump of Championship-quality players, Stoke, or more correctly Tony Pulis, were using defensive shapes and long-ball football to subvert both the beautiful game and the conclusions of the helpful, but deeply flawed and data-poor, TSR stat.

There weren't any public xG models around in 2008. To build one meant sacrificing most of Monday collecting the data by hand and Thursday as well when midweek games were played.

But, shot data was readily available, hence TSR.

At its most pernicious, TSR assumed an equality of chance quality.

So getting out-shot, as Stoke's setup virtually guaranteed they would be every single season, was a cast-iron guarantee of relegation once your luck ran out, in this narrow definition of "advanced stats".

Quantifying chance quality in public was a few years down the road, but even with simple shot numbers, luck could be readily assigned another constant bedfellow in something we'll call "skill".

There comes a time when a side's conversion rate on both sides of the ball is so far removed from the league average rates that TSR relied upon that you had to conclude that something (your model) was badly broken when applied to a small number of teams.

We don't need to build an xB model to see Burnley as being quite good at blocking shots, just as we didn't need a laboriously constructed expected goals model to show that Stoke's conversion disconnects were down to them taking fewer, good-quality chances and allowing many more, poorer-quality ones back in 2008.

Last season, the league-average rate at which open-play attempts were blocked was 28%. Burnley faced 482 such attempts and blocked 162, or 34%.

A league-average team would have blocked only 137 attempts under a naive, know-nothing-but-the-league-average model.

Liverpool had the lowest success rate under this assumption that every team has the same inbuilt blocking intent/ability. They successfully blocked just 21% of the 197 opportunities they had to put their bodies on the line.
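The naive model above is just arithmetic: multiply attempts faced by the league rate. A minimal sketch, assuming the rounded 28% figure (the unrounded league average presumably produces the 137 expected blocks quoted above):

```python
# Naive "know nothing but the league average" blocking model:
# every side is assumed to block open-play attempts at the league rate.
LEAGUE_BLOCK_RATE = 0.28  # rounded league-average rate quoted above

def expected_blocks(attempts_faced, rate=LEAGUE_BLOCK_RATE):
    """Blocks a league-average side would be expected to make."""
    return attempts_faced * rate

def observed_rate(blocked, attempts_faced):
    """A side's actual blocking success rate."""
    return blocked / attempts_faced

# Burnley: faced 482 open-play attempts, blocked 162 of them.
print(round(expected_blocks(482)))       # 135 with the rounded 28% rate
print(f"{observed_rate(162, 482):.1%}")  # 33.6%, the ~34% quoted above
```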

You're going to get variation in blocking rate, even if each team has the same inbuilt blocking ability and the likelihood of a chance being blocked evens out over the season.

But you're unlikely to get the extremes of success rates epitomized by Burnley and Liverpool last season.
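You can check how unlikely those extremes are with a quick simulation: give every side the same 28% per-attempt blocking chance and see how often a Burnley-sized sample drifts as high as 34%. A rough sketch, using the attempt volume quoted above (the seed is arbitrary):

```python
import random

random.seed(42)
LEAGUE_BLOCK_RATE = 0.28

def simulated_block_rate(attempts, p=LEAGUE_BLOCK_RATE):
    """One season's blocking rate for a side with exactly league-average ability."""
    return sum(random.random() < p for _ in range(attempts)) / attempts

# 10,000 simulated "Burnley seasons": 482 attempts faced, 28% block chance each.
rates = [simulated_block_rate(482) for _ in range(10_000)]
share_as_extreme = sum(r >= 0.34 for r in rates) / len(rates)
print(f"{share_as_extreme:.1%}")  # only a fraction of a percent of equal-ability seasons
```

Over that many attempts, an equal-ability side almost never blocks at Burnley's observed rate, which is why the extremes look like more than sampling noise.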

You'll improve this cheap and cheerful, TSR-type blocking model for predictive purposes by regressing both Liverpool's and Burnley's observed blocking rates towards the mean.

You'll need to regress Liverpool's more because they faced many fewer attempts, but the Reds will still register as below average and the Claret and Blues above.
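One cheap way to regress both rates is to blend each side's observed record with a chunk of imaginary league-average attempts, so smaller samples get pulled further toward the mean. A sketch under loudly hypothetical assumptions: the prior weight of 300 attempts is invented for illustration, and Liverpool's block count of 41 is merely implied by the quoted 21% of 197 attempts.

```python
LEAGUE_BLOCK_RATE = 0.28
PRIOR_ATTEMPTS = 300  # hypothetical prior strength, chosen for illustration only

def regressed_rate(blocked, attempts, prior=PRIOR_ATTEMPTS, league=LEAGUE_BLOCK_RATE):
    """Shrink an observed blocking rate toward the league mean.

    The prior acts like `prior` extra attempts blocked at the league rate,
    so low-volume sides are regressed harder than high-volume ones.
    """
    return (blocked + league * prior) / (attempts + prior)

burnley = regressed_rate(162, 482)   # stays above 0.28 after regression
liverpool = regressed_rate(41, 197)  # stays below 0.28, pulled in harder
```

With any sensible prior weight the ordering survives: Burnley still land above the league mean and Liverpool below it, with Liverpool's smaller sample regressed proportionally further.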

In short, you can just use counts and success rates to analyse blocking in the same way TSR used raw shot counts, but you can also surmise that the range and difference in blocking ability you observe may be down to a bit of tactical tinkering/skill sets as well as randomness in limited trials.

In the real world, teams will face widely differing volumes, the "blockability" of attempts will vary and perhaps not even out for all sides, and some managers will commit more potential blockers rather than sending attack-minded players to create havoc at the other end of the field.

With more data, and I'm lucky to have access to it in my job, you can easily construct an xB model. And some teams will out perform it (Burnley). But rather than playing the "luck" card you can stress test your model against these outliers.

There's around a 4% chance that a model populated with basic location/shot type/attack type parameters adequately describes Burnley's blocking returns since 2014.
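That 4% figure is the kind of number you can get by asking: if the model's per-shot block probabilities were right, how often would a side have blocked at least as many shots as it actually did? A Monte Carlo sketch of that stress test, with entirely made-up per-shot probabilities standing in for a real xB model's output:

```python
import random

random.seed(7)

def tail_probability(shot_block_probs, observed_blocks, n_sims=20_000):
    """Chance of blocking at least `observed_blocks` shots if the model's
    per-shot probabilities are correct (a Monte Carlo Poisson-binomial tail)."""
    hits = 0
    for _ in range(n_sims):
        blocks = sum(random.random() < p for p in shot_block_probs)
        if blocks >= observed_blocks:
            hits += 1
    return hits / n_sims

# Made-up example: 482 shots the model rates at a 28% block chance each,
# tested against Burnley's 162 actual blocks. A tiny tail probability,
# as with the real model's 4%, hints the model is missing something.
print(tail_probability([0.28] * 482, 162))
```

A real xB model would assign each shot its own probability from location, shot type and attack type; the tail calculation is the same either way.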

That's perhaps a clue that Burnley are a bit different and not just "Stoke" lucky.

The biggest over-performing disconnect is among opponent attempts that Burnley faced that were quite likely to be blocked in the first place. So that's the place to begin looking.

And as blocking ability above and beyond the average inevitably feeds through into Burnley's likelihood of conceding actual goals, you've got a piece of evidence that may make Burnley a more acceptable face of over-performance in the wider realms of xG, one easier for the enlightened analytical crowd to stomach than Stoke were a decade ago.
