Thursday, 1 February 2018

Manchester City and WBA. The Best in Top Tier History.

The importance of league tables is only absolute after the final game has been played and your side has secured that all-important Europa League spot or finished 17th or higher.

For the remainder of the time, but particularly just after mid season, it is your side's position relative to their nearest challengers that is most important.

Watford's current 11th may give the illusion of relative safety, but on closer inspection they are only three points above Huddersfield, who are teetering on the brink of the relegation spots in 17th position.

One way to try to quantify your side's current position is to see how far it sits above or below the relative mediocrity of the average points won by all sides in the season to date, whilst also accounting for the distribution of points, both currently after 25 games and in the past.

Manchester City can rightfully claim to be in the running to become the most dominant title winners in the history of the 20 team top tier.

They are currently 2.56 standard deviations above the current points average per team. Their nearest historical rivals were the Manchester United team of Beckham, Giggs, Keane, Sheringham and the Neville brothers from 2000/01, who were 2.51 SD's above par after 25 games and Chelsea's 2005/06 team (2.50 SD's).

At the bottom, WBA are the "best" 20th placed team ever, being only 1.06 SD's below the average points won by teams so far.

Likewise, Swansea and Southampton are the most impressive 19th and 18th placed teams respectively after 25 games.

The unusually distributed nature of the points won by sides in 2017/18 then begins to catch up with those sides whose position implies relative safety, but the proximity of their rivals suggests otherwise.

Newcastle are the second worst 14th placed side in top tier history by this measure, as are Watford in 11th and Burnley in 7th.

Here are the rest of the teams. We've got the strongest bottom four ever in relation to the average points won by a side after 25 games, along with the weakest and most vulnerable mid table teams, again in top tier history.
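For anyone wanting to reproduce the measure, here's a minimal sketch in Python of the core step: expressing each side's points as a number of standard deviations above or below the league average. The table is made up purely for illustration (it is not real Premier League data), the function name is hypothetical, and using the population standard deviation is an assumption, as the post doesn't say which version was used. It only looks within a single season and doesn't attempt the comparison with past seasons.

```python
from statistics import mean, pstdev


def sds_from_average(points_by_team):
    """Return each team's distance from the league mean, in standard deviations."""
    values = list(points_by_team.values())
    avg = mean(values)
    # Assumption: population SD across all teams in the table.
    sd = pstdev(values)
    return {team: (pts - avg) / sd for team, pts in points_by_team.items()}


# Hypothetical four-team table, NOT real league data.
example_table = {"Team A": 60, "Team B": 40, "Team C": 35, "Team D": 25}

z_scores = sds_from_average(example_table)
for team, z in z_scores.items():
    print(f"{team}: {z:+.2f} SDs from the average")
```

Ranking a side's figure against every team that held the same league position after the same number of games in earlier seasons would then give the "best 20th placed team ever" style of comparison.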

Monday, 22 January 2018

After the Shot xG2

Expected goals has variously been defined by advocates and opponents respectively as a more accurate summary of what "should" have happened on the pitch or a useless appendage to the final scoreline, that is neither useful nor enlightening.

The first description is perhaps too overtly optimistic for a "work in progress" that is evolving into a useful tool for player projection and team prediction.

Whereas the second, less flattering description may also stand up to some scrutiny, particularly if the supporters of the stat ignore the uncertainty intrinsic in its calculation, while the detractors may be blithely ignorant of such limitations.

Both camps are genuinely attempting to quantify the true talent levels of players and teams in a format that allows for more insightful debate and, in the case of the nerds, one that is less prone to cognitive bias.

The strength of model based opinion is that it can examine processes that are necessary for success (or failure), drawing from a huge array of similar scenarios from past competitions.

And it does so without straying too far down the route from chance creation to chance conversion (or not), so that the model avoids becoming too anchored in the specifics of the past, which would render any projections about the future flawed.

Overfitting past events is a model's version of eye test biases, but that shouldn't mean we throw out everything that happens post chance creation for fear of producing an over confident model that sticks immutably to past events and fails to flexibly project the future.

It's no great stretch to model the various stages from final pass to the ball crossing the goal line (or not).

Invariably, the process of chance creation alone has been prioritised as a better predictor of future output and post shot modeling has remained either a neglected sidetrack or merely the niche basis for xG2 keeper shot stopping.

But if used in a less dogmatic way, mindful of the dangers of over fitting, the "full set" of hurdles that a decisive pass must overcome to create a goal (or not) may become a useful component in an integrated approach that utilises both numeric and visual clues to deciphering the beautiful game.

Let's look at chances and goals created from set pieces and corners.

Here's the output from two expected goals models for chances and on target attempts conceded by the current Premier League teams in the top flight since early 2014.

The xG column is a pre shot model, typically used to project a side's attacking or defensive process, that uses accumulated information, but is ignorant of what happened once contact with the ball was made.

The xG2 column is based entirely upon shots or headers that require a save and uses a variety of post shot information, such as placement, power, trajectory and deflections. Typically this model would be the basis for measuring a keeper's shot stopping abilities.

A superficial overview of the difference between the xG allowed from set pieces and actual goals allowed leads to the by now familiar "over or under performing" tag.

Stoke had been transformed into a spineless travesty of their former defensive core at set plays, conceding both chunks of xG and under performing wantonly by allowing 42 actual goals against 37 expected.

There's little disconnect in the Potters' xG2, which examines those attempts that needed a save, but the case of Spurs and Manchester United perhaps shows that deeper descriptive digging may provide more insight or at least add nuance.

Tottenham allowed a cumulative 29.6 xG but conceded just 23 goals.

We know from keeper models that Lloris is generally an excellent shot stopper and the xG2 model confirms that, along with the ever present randomness, the keeper's reactions are likely to have played a significant role in defending set play chances.

In allowing 23 goals, Lloris faced on target attempts that were worth just over 31 goals to an average keeper.

Spurs conceded chances worth 29.6 xG, but looked at in terms of xG2 this value rises to 31.3. So, still mindful of randomness, Spurs' defenders might have been a little below par in suppressing the xG2 attempts that came about from the xG chances they allowed, but Lloris performed outstandingly to reduce the level of actual goals to just 23.

Superficially, Manchester United appear identical.

As a side they allowed 37.6 xG, but just 32 actual goals. We know that De Gea is an excellent shot stopper, so in the absence of xG2 figures we might assume he performed a similar service for his defence as Lloris did for his.

However, United's xG2 is just 33.1 and the difference between this and the actual 32 goals allowed is positive, but relatively small compared to Lloris at Spurs.
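The arithmetic behind the Spurs and United comparison can be laid out as a rough sketch. The figures are the ones quoted above; the function and the labels given to each gap are hypothetical names chosen for illustration, not established terminology.

```python
def set_piece_breakdown(xg, xg2, goals):
    """Split the gap between pre shot xG and actual goals into two steps."""
    return {
        # Change in chance value between the pre shot and post shot models.
        "post_shot_shift": round(xg2 - xg, 1),
        # Gap between the on target attempts faced (xG2) and goals let in.
        "xg2_minus_goals": round(xg2 - goals, 1),
        # The familiar headline over/under performance against xG.
        "xg_minus_goals": round(xg - goals, 1),
    }


# Set piece figures quoted in the post.
spurs = set_piece_breakdown(xg=29.6, xg2=31.3, goals=23)
united = set_piece_breakdown(xg=37.6, xg2=33.1, goals=32)

print("Spurs: ", spurs)
print("United:", united)
```

On these figures the gap between xG2 and goals is 8.3 for Spurs against 1.1 for United, while United's xG2 sits 4.5 below their xG and Spurs' sits 1.7 above.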

By extending the range of modeling away from a simple over/under xG performance we can begin to examine credible explanations for the outputs we've arrived at.

Are United's defenders exerting so much pressure, even when allowing attempts consistent with an xG of 37.6, that the power, placement etc. of those on target efforts are diluted by the time they reach De Gea?

Are the attackers themselves under performing despite decent xG locations? (Every xG model is always a two way interaction between attackers and defenders).

Is it just randomness or is it a combination of all three?

Using under and over performing shorthand is fine. But we do have the data to delve more into the why, and taking this xG and xG2 data driven reasoning over to the video analysis side is the logical, integrated next step.

Monday, 15 January 2018

Arsenal Letting in Penalties Doesn't Defy the Odds.

Arsenal fans have been getting hot under the collar about penalties.

Penalty kicks have either been awarded (against Arsenal) when they shouldn't have been, not awarded (to Arsenal) when they should have been or, when they have been conceded, they've gone in, a lot.

The latter has spawned the inevitable trivia titbit.

There's nothing wrong with such trivia as fuel for the banter engine between fans, but almost inevitably it quickly becomes evidence for an underlying problem that exclusively afflicts Arsenal.

Cue the Daily Mail "why is Arsenal's penalty saving record so poor"

So let's add some context.

We're into familiar selective cutoff territory, where you pick a starting point in a sequence to make a trend appear much more extreme than it actually is.

As you'd probably guess, Arsenal saved a penalty just prior to the start of the run.

They also saved one Premier League penalty in each of the preceding two seasons, two more per season if you go back two more campaigns and, obligingly, opponents' penalty takers also missed the target completely on a handful of other occasions.

If you shun the exclusivity of the Premier League, Arsenal keepers made penalty saves in FA Cup shootouts and induced two misses in Community Shield shootouts, the latter as recently as 2017.

Over the history of the Premier League, 14% of penalties have been saved by the keeper. The remainder have gone wide, hit the post, been scored or been used in an attempt to pass the ball to a team mate (Arsenal, again).

Arsenal's overall Premier League penalty save rate is also 14%.

So you should ask if we're simply seeing a random streak that was likely to happen to someone, not necessarily Arsenal, over the course of Premier League history.

Arsenal has conceded nearly 100 Premier League penalties, not because they have had dirty defenders, but because they have been ever present, respected members of the top flight.

Of the current Premier League sides, 17 have had the opportunity to concede a run of 23 consecutive penalty goals.

If we simulate all the penalties faced by each of these teams using a generic penalty success rate, we find that in just over half of the simulations at least one side will have conceded a run of 23 or more penalty goals during the history of the Premier League to date.

Letting in penalty after penalty, sometimes up to and beyond 23 is something that is going to have happened slightly more often than not in the top flight, based on save rates.
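A simulation along these lines can be sketched in a few lines of Python. This is only an illustration of the approach, not the actual model: the number of penalties faced by each of the 17 sides and the generic scoring rate are assumed placeholder values (the post only tells us that Arsenal have faced nearly 100 and that 14% of penalties are saved, with some of the rest missing the target), and the function names are hypothetical.

```python
import random


def longest_scoring_run(outcomes):
    """Length of the longest unbroken run of scored penalties."""
    best = current = 0
    for scored in outcomes:
        current = current + 1 if scored else 0
        best = max(best, current)
    return best


def prob_any_team_has_run(penalties_faced, p_score, run_length,
                          n_sims=2000, seed=1):
    """Share of simulations in which at least one team concedes a run
    of run_length or more consecutive penalty goals."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        for n in penalties_faced:
            outcomes = (rng.random() < p_score for _ in range(n))
            if longest_scoring_run(outcomes) >= run_length:
                hits += 1
                break  # one team is enough for this simulation
    return hits / n_sims


# ASSUMED inputs, for illustration only: one side facing 100 penalties,
# 16 others facing 60 each, and a generic 80% scoring rate.
assumed_penalties_faced = [100] + [60] * 16
assumed_p_score = 0.80

estimate = prob_any_team_has_run(assumed_penalties_faced, assumed_p_score,
                                 run_length=23)
print(f"Share of simulations with a run of 23 or more: {estimate:.2f}")
```

The result is sensitive to both assumptions, so the real penalty counts and conversion rate would be needed to get anywhere near the "just over half" figure.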

Arsenal just happen to have had both the opportunity and the luck to have been the Premier League's slightly odds on reality star winner.