Wednesday, 30 November 2016

Was Aguero Quite So Lucky in 2015/16?

By now, expected goals needs very little introduction.

It attempts to quantify the importance of pre-shot variables in determining the likelihood that a goal will be scored. In essence it is a measure of chance quality and is largely determined by such things as shot type and location.

The majority of models output the likelihood that an average Premier League player would score from a given position and shot type. By aggregating the individual expected goals for each attempt and comparing the total to a player's actual output, we can broadly suggest the level of under- or over-performance.

Here's how the two 2015/16 leading non penalty scorers fared compared to the aggregated total of their expected goals,

Both over-performed, Aguero more so than Kane, but we can better visualise this disconnect by simulating each of the 111 non-penalty attempts taken by Aguero to see the range of season-long goal totals predicted by the model.

There's around an 8% chance that the average player model would equal or better Aguero's 20 non penalty goals from his 111 chances in 2015/16.
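The simulation described above can be sketched roughly as follows. This is a minimal illustration, not the model used for the post: each attempt is treated as an independent coin flip weighted by its expected goals value, and the 111 per-shot values below are randomly generated placeholders, since Aguero's actual shot data isn't reproduced here. The function names are also just illustrative.

```python
import random

def simulate_season_totals(shot_xg, n_sims=10_000, seed=0):
    """Simulate season goal totals, treating each shot as an independent
    Bernoulli trial with success probability equal to its xG."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_sims):
        goals = sum(1 for p in shot_xg if rng.random() < p)
        totals.append(goals)
    return totals

def prob_at_least(totals, target):
    """Share of simulated seasons with at least `target` goals."""
    return sum(1 for t in totals if t >= target) / len(totals)

# ASSUMPTION: hypothetical per-shot xG values for 111 attempts.
# These are NOT Aguero's real chances, so the printed figures will not
# reproduce the 8% quoted in the text.
placeholder_rng = random.Random(42)
shot_xg = [placeholder_rng.uniform(0.02, 0.30) for _ in range(111)]

totals = simulate_season_totals(shot_xg)
print(f"Expected goals from these shots: {sum(shot_xg):.1f}")
print(f"Share of seasons with 20+ goals: {prob_at_least(totals, 20):.3f}")
```

Feeding in the real expected goals value for each attempt, under either model, would give the kind of distribution discussed in this post.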

Thereafter the interpretation becomes more subjective.

We may presumptuously assume that the model is perfect and that Aguero was merely lucky.

281 individual players tried to score in 2015/16, so that's a lot of individual trials, and someone is likely to over-perform to the level that Aguero did.
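As a rough back-of-the-envelope check on that claim, we can ask how often at least one player out of 281 would land on an outcome that has an 8% chance for any single player. The sketch below leans on a big simplifying assumption that is mine, not the model's: every player is treated as an independent trial with the same 8% tail probability, which ignores the fact that players take very different numbers of shots.

```python
def prob_at_least_one(p_single, n_players):
    """Chance that at least one of n independent players achieves an
    outcome that has probability p_single for each of them."""
    return 1 - (1 - p_single) ** n_players

def expected_count(p_single, n_players):
    """Average number of players expected to achieve the outcome."""
    return p_single * n_players

p_single = 0.08   # from the text: roughly an 8% chance for the average player model
n_players = 281   # from the text: players who attempted to score in 2015/16

print(f"Expected number of such over-performers: {expected_count(p_single, n_players):.1f}")
print(f"Chance of at least one: {prob_at_least_one(p_single, n_players):.6f}")
```

Under those assumptions a handful of Aguero-sized over-performances in any one season is the norm rather than the exception.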

This suggests that he may subsequently enjoy more normal levels of luck and his performance may be less extreme in the future.

Or we might prefer to believe that Aguero's 20 goals are partly driven by luck, but also contain an element of finishing skill that exceeds that granted to the average player whose out of sample data went into producing the model.

As suggested by the title of the above graph, we can produce a second expected goals model that while not explicitly tailored to Aguero's (potential) finishing prowess, does contain elements that may act as a proxy for elusive finishing ability.


If we now simulate Aguero's 111 chances, but using a model that incorporates statistically significant variables that "may" relate to finishing skill, he becomes less "lucky". His 20 goals are now much less unlikely. The new model predicts he would score 20 or more in nearly 40% of seasons.

Overall, this new set of variables (I can't be more specific, sorry) inflates the individual expected goals values of players such as Aguero and Kane, who possess the new variable, and reduces the figures for those who don't.

Overall, a model that allows for differing finishing abilities across all players who attempt to score in a typical season reduces such indicators as the RMSE in out of sample data.
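For anyone unfamiliar with the measure, the comparison can be sketched like this. The numbers are invented purely to show the mechanics: five hypothetical players, their actual goals in a held-out season, and the totals predicted by a baseline model and by a model with a finishing proxy. None of them are real outputs from either model.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between predicted and actual values."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must be the same length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# ASSUMPTION: made-up out of sample season totals for five hypothetical players.
actual_goals = [20, 19, 5, 3, 8]
baseline_xg = [15.2, 16.0, 6.5, 4.1, 8.4]   # average-player model (illustrative)
skill_xg = [19.0, 19.0, 5.4, 3.3, 8.1]      # model with a finishing proxy (illustrative)

print(f"Baseline model RMSE: {rmse(baseline_xg, actual_goals):.2f}")
print(f"Finishing proxy model RMSE: {rmse(skill_xg, actual_goals):.2f}")
```

The lower figure indicates the model whose predictions sit closer to what actually happened in data it hasn't seen.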

Under a model that includes a proxy term for finishing skill, Aguero only scores 1 more goal than predicted in out of sample data from 2015/16 and Kane scores exactly the number predicted by the model.

Perhaps more importantly, the second model is a substantially better fit to Aguero's 2015/16 at the individual attempt level than the first.

Tuesday, 22 November 2016

Burnley's Unsustainable Survival Technique.

Monday night's live game pitted two of the Premier League's more dour sides against each other.

WBA are the magnificent Tony Pulis' current port of call, where they are the recipients of his exclusive brand of pundit-flummoxing survival techniques.

Meanwhile, Burnley are getting by on a meagre 0.8 expected goals per game. They are conceding an average of 2.1 expected goals per game and through the grace of the probabilistic gods, actually allowing just 1.4 real goals.

That's not a Pulis approved survival approach, at least in the long term, but it has given Sean Dyche's side a few notable results.

Top of the tree of upsets was Burnley's 2-0 early season win at home to Liverpool, where Dyche tired out his opponents, not by engaging them in a pressing foot race, but by nicking an early lead and then handing them dozens of goal attempts.

All of which they missed.

The blueprint of being overwhelmed, but showcasing the England credentials of your defence, was wheeled out again at Old Trafford for the approval of Jose. And while Burnley didn't quite manage to nick a goal here, they did keep their goal intact for a welcome point.

Sandwiched in between was another expected goals beating at the hands of a top six contender where the reality better reflected the distribution of the quality and quantity of chances created in the game.

Chelsea's invite left Burnley nursing a 3-0 loss.

On the surface Burnley had made a comfortable start to their renewed acquaintance with the Premier League. "They look far better equipped for survival this time around, sitting comfortably in 9th place" might have been something that was written about the Clarets prior to Monday's game.

But scratch beneath the media soundbites and Burnley's well being is supported by a large helping of unsustainable variance.

Hats off to the 14 Burnley players who withstood the battering from an 11-man and then ten-man Manchester United in late October, but simulate the exercise thousands of times and a United win is by far the most likely outcome of the three possible results.
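A single match can be replayed along these lines. It's a minimal sketch under the assumption that every attempt is an independent trial with its own expected goals value, and the two shot lists are placeholders for a one-sided game, not the actual chances from Old Trafford.

```python
import random

def simulate_match(home_xg, away_xg, n_sims=10_000, seed=0):
    """Replay a match many times from per-shot xG values and return the
    share of home wins, draws and away wins."""
    rng = random.Random(seed)
    counts = {"home": 0, "draw": 0, "away": 0}
    for _ in range(n_sims):
        home_goals = sum(rng.random() < p for p in home_xg)
        away_goals = sum(rng.random() < p for p in away_xg)
        if home_goals > away_goals:
            counts["home"] += 1
        elif home_goals < away_goals:
            counts["away"] += 1
        else:
            counts["draw"] += 1
    return {result: n / n_sims for result, n in counts.items()}

# ASSUMPTION: hypothetical shot lists for a dominant home side and an
# overwhelmed visitor. These are NOT the real attempts from the game.
dominant_home = [0.10] * 25 + [0.35] * 3
overwhelmed_away = [0.05] * 6

probs = simulate_match(dominant_home, overwhelmed_away)
for result, share in probs.items():
    print(f"{result}: {share:.3f}")
```

Repeating the same exercise for every fixture played so far, and adding up the simulated points, is one way to build the range of possible league tables.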

Simulate all 120 matches, along with the multitude of possible tables, thousands of times and Burnley's most likely current position is... bottom, rather than the more comfortable 9th they occupied prior to match week 12.

Of course, points already won are kept, no matter how ill-gotten or deserved, and should Burnley continue their idiosyncratic survival process, coupled with their recent showing in the Championship, they probably won't finish in their current expected position of bottom in May.

They'll most probably finish 19th.

If you want to check out all of Burnley's shot maps, along with all Premier League games for the last three seasons, download the free Infogol app.

Friday, 28 October 2016

The Addenbrooke to Zenga of Wolverhampton Wanderers.

It will be scant consolation to the recently dismissed Wolves manager, Walter Zenga, that managerial tenure has shown a decline over time and not only in the currently trigger happy East and West Midlands.

Wolves' first paid committee manager, Jack Addenbrooke, lasted an impressive 37 years, spanning the Victorian age and one World War.

But even if we begin at the start of the last old Golden Age, with the appointment of Bill McGarry in the late 1960s, time served by the boss has shown a downward trend.

McGarry's first stint at the club, 398 games in charge, was ended by relegation from the top tier after a May Monday night defeat at Molineux by Liverpool.

The maths were simple: a win for the hosts secured their First Division lives, while a win for the visitors won them the title. High drama that TV would die for, but in the 1970s only radio put in an appearance to see a future Wolves captain lift the silverware for Liverpool.

Defeat sent Wolves on a footballing journey that only fleetingly returned them to the top table.

Part of Wolves' Magical Mystery Tour post 1976. Blogger laughing because he's marking a goalkeeper!

At the dawn of footballing time, managers were lasting for around 150 matches on average; now it's down to about 50.

Success rate obviously plays a part in perceived managerial talent, and Zenga's so-so 47% success rate would typically entitle him to at least a season of honest toil, rather than the 17 matches he was actually granted.

His last game in charge perhaps sums up the knee-jerk reactions prevalent today.

In keeping with 10 of the 14 league games contested by Wolves this term, their process created the better chances compared to their opponents.

In Zenga's final game in charge, Wolves created 2.33 expected goals to 1.51 for their opponent. A side with that advantage would typically win slightly more often than they drew or lost.
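One quick way to sanity check that kind of statement is to treat each side's goals as an independent Poisson variable with a mean equal to its expected goals total. That's my shortcut for illustration, not necessarily how the figures in this post were produced, and it throws away the shot by shot detail a full simulation would use.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k goals when the scoring rate is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def outcome_probs(xg_for, xg_against, max_goals=15):
    """Win, draw and loss probabilities under independent Poisson scoring.
    Scorelines above max_goals per side are ignored as negligible."""
    win = draw = loss = 0.0
    for goals_for in range(max_goals + 1):
        for goals_against in range(max_goals + 1):
            p = poisson_pmf(goals_for, xg_for) * poisson_pmf(goals_against, xg_against)
            if goals_for > goals_against:
                win += p
            elif goals_for == goals_against:
                draw += p
            else:
                loss += p
    return win, draw, loss

# Expected goals totals quoted in the text: 2.33 for Wolves, 1.51 for Leeds.
win, draw, loss = outcome_probs(2.33, 1.51)
print(f"Win: {win:.3f}  Draw: {draw:.3f}  Loss: {loss:.3f}")
```

Under these assumptions the win probability comes out a little above one half, which is in line with the description above.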

But not for the first time (7th April 1973), Leeds gained an undeserved 1-0 win.

Over Zenga's 14 league games, Wolves have a positive expected goals differential of 3.5 goals, rather than their actual goal difference of -1.

They've lost in three matches where they have been the superior expected goals team, drawing a further three in similar circumstances.

Their most likely current position based on process, without the sometimes perverse intervention of small sample sized randomness, is inside the Championship top ten rather than the current 18th spot that has played a part in Zenga's dismissal.

Even with their current lowly status, a more neutral division of shot outcomes over the remainder of the season places Wolves' most likely finishing position at mid table... but in keeping with today's "enlightened" footballing age, the current small sample derived pecking order has already had a big say.

Spurious correlation. There is a medium to strong correlation between the first letter of a Wolves manager's surname and his tenure, and hence Big Sam A it is!