Friday 21 July 2017

Shots, Blocks And Game State

In this post I described a way to quantify game state by reference to how well or badly a side was doing in relation to their pregame expectations.

So rather than simply using the current scoreline to define game state, it gave a much more nuanced description of the state of the game, particularly in those frequent phases of a match when the sides were level.

It also incorporates time remaining into the calculation. 

A team level after 10 minutes might be in a very different situation compared to the same score differential, but with ten minutes remaining. How they and their opponents played out the subsequent time may be very different in the two scenarios.

At a simplistic level, those teams in a happy place may be more content to prioritise actions that maintain the status quo, such as defend more, while those who'd wish to alter the state of the game might put more resources into attack than had previously been the case.

It seems logical that a more defensive approach should result in that team accumulating more products of a packed defence, such as blocked shots, while any chances they do create may be met with increasingly fewer defenders.

I took a look at the correlation between blocks, clear cut or so called big chances, and the prevailing state of the game and there was a significant relationship between them.

A side in a poor state of the game had more chance of their goal attempts being blocked and this increased as their game state deteriorated.

Similarly, a side in a positive state of the game was more likely to create a chance that was deemed a big chance.

This appears to fit with the hypothesis of content teams packing their defence more, increasing the likelihood that they block an attempt, and if they do scoot off upfield, they're more likely to be met with a depleted defence.

However, correlation doesn't prove causation etc etc. 

In the case of a side being more likely to create big chances, there may be a confounding factor that is causing both the good state of the game and the big chances. (Think raincoats, wet pavements and weather).

That factor is possibly team quality.

The top six account for 30% of the Premier League, but took 48% of the wins, 43% of the goals scored and 45% of the league points won.

They're a league within a league, more likely to be in a very good game state and they also accounted for 43% of the league's big chances.

Team quality may be the causative agent for a good game state and for creating big chances, which correlates the two without either being causative agents of the other.

So I stripped out all games involving the big six to get a more closely matched initial contest, but the correlation persisted.

Teams in a good place against sides of similar core abilities were more likely to create very good chances and more likely to find defensive bodies to block the anticipated onslaught from their opponents.

As a tentative conclusion, intuitive events that you might expect to be more likely to occur as strategies subtly alter do appear to be identifiable in the data.

Data from InfoGolApp

Saturday 15 July 2017

Lloris, the Best with Room to Improve?

Expected goals, saves or assists are now a common currency with which to evaluate players and teams, with an over achievement often being sufficient to label a player as above average and/or lucky, depending on the required narrative.

By presenting simple expected goals versus actual goals scored, much of the often copious amount of information that has been tortured to arrive at two simple numbers is hidden from the view of the audience.

Really useful additional data is sometimes omitted, even simple shot volume and the distribution in shot quality over the sample.

The latter is particularly salient in attempting to estimate the shot stopping abilities of goal keepers.

Unlike shot takers, it is legitimate to include post shot information when modelling a side's last line of defence.

Extra details, such as shot strength, placement and other significant features, like deflections and swerve on the ball, can hugely impact on the likelihood that a shot will end up in the net.

A strongly hit, swerving shot, that is heading for the top corner of the net is going to have a relatively high chance of scoring compared to a weakly struck effort from distance.

Therefore, the range of probabilistic success rates for a keeper based shot model is going to be wider than for a mere shooter's expected goals model, not least because the former only contains shots that are on target.

We've seen that the distribution of the likely success of chances can have an effect on the range of actual goals that might be scored, even when the cumulative expected goals of those chances is the same.

To demonstrate, a keeper may face two shots, one eminently savable, with a probability of success of say 0.01 and one virtually unstoppable, with a p of 0.99. Compare this scenario to a keeper who also faces two shots, each with a 0.5 probability of success.

Both have a cumulative expectation of conceding one goal, but if you run the sims or do the maths, there's a 50% chance the latter concedes exactly 1 goal and a near 98% chance for the former.

The overall expectation is balanced by the former having a very small chance of allowing exactly 2 goals, compared to 25% for the keeper facing two coin toss attempts.
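The two keeper example can be checked by brute force. The sketch below is only an illustration, assuming each shot is an independent event; the function name goal_distribution is made up for this purpose and isn't part of any model described here.

```python
from itertools import product

def goal_distribution(shot_probs):
    """Return {goals_conceded: probability}, assuming independent shots."""
    dist = {}
    # Enumerate every save/goal combination (0 = saved, 1 = scored).
    for outcome in product([0, 1], repeat=len(shot_probs)):
        p = 1.0
        for scored, prob in zip(outcome, shot_probs):
            p *= prob if scored else 1.0 - prob
        goals = sum(outcome)
        dist[goals] = dist.get(goals, 0.0) + p
    return dist

# One eminently savable shot plus one virtually unstoppable one.
extreme = goal_distribution([0.01, 0.99])
# Two coin toss attempts.
coin_toss = goal_distribution([0.5, 0.5])

for label, dist in (("0.01 and 0.99", extreme), ("0.5 and 0.5", coin_toss)):
    for goals in sorted(dist):
        print(f"{label}: P({goals} goals) = {dist[goals]:.4f}")
```

Both keepers have an expectation of one goal, but the spread around that figure is very different. Full enumeration is only practical for a handful of shots, which is why larger samples are usually simulated instead.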

Much of this information about the shot volume and distribution of shot difficulty faced by a keeper can be retained by simulating numerous iterations of the shots faced to see how the hypothetical average keeper, upon whom these models are initially built, would fare, and then seeing where on that distribution of possible outcomes a particular keeper's actual performance lies.

Hugo Lloris has faced 366 non penalty shots and headers on goal over the last 3 Premier League seasons.

Those attempts range from ones that would result in a score once in 1,400 attempts to near certainties with probabilities of 0.99.

An average keeper might expect to concede a number of goals centred around 120, based on the quality and quantity of chances faced by Lloris.

Spurs' keeper allowed just 96 non penalty, non own goals and no simulation based on the average stopping ability of Premier League keepers did this well. The best the average benchmark achieved begins to peter out around 100 goals.

Therefore, an assessment of the shot stopping qualities of a keeper might better be expressed as the percentage of average keeper simulations that result in as many or fewer goals being scored than the keeper's actual record.

This method incorporates both the volume and quality of attempts faced.
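As a rough sketch of that percentage, the snippet below simulates an average keeper facing a list of post shot goal probabilities and counts how often the simulated goals conceded match or beat an actual total. The shot list and the actual total are invented purely for illustration and are not Lloris' or anyone else's real figures. The function name, the number of simulations and the independence of each shot are likewise assumptions.

```python
import random

def share_of_sims_at_or_below(shot_probs, actual_goals, n_sims=10_000, seed=1):
    """Fraction of average keeper simulations conceding <= actual_goals.

    Assumes every shot is an independent trial with the given probability.
    """
    rng = random.Random(seed)
    at_or_below = 0
    for _ in range(n_sims):
        # A shot is a goal when the random draw falls below its probability.
        conceded = sum(rng.random() < p for p in shot_probs)
        if conceded <= actual_goals:
            at_or_below += 1
    return at_or_below / n_sims

# Invented sample: 65 on target attempts with a cumulative expectation of 12.
shots = [0.05] * 40 + [0.30] * 20 + [0.80] * 5
share = share_of_sims_at_or_below(shots, actual_goals=8)
print(f"Average keeper equals or betters 8 conceded in {share:.1%} of sims")
```

A low percentage says an average keeper rarely matches the record from those particular shots, while a figure near 100% says the average keeper nearly always does at least as well.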

The table above shows the percentage of average keeper simulations of all attempts faced by Premier League keepers since 2014 that equalled or bettered the actual performance of that particular keeper.

For example, there's only a 2.5% chance, assuming a reasonably accurate model, that an average keeper replicates or betters Cech's 2014-17 record and they would be expected to equal or better Bravo's in perpetuity.

Lloris' numbers are extremely unlikely to be replicated by chance by an average keeper and it seems reasonable to surmise that some of his over achievement is because of above average shot stopping talent.

Lloris over performs the average model across the board, saving more easy attempts compared to the model's estimates and repeating this through to the most difficult ones.

Vertical distance from goal is a significant variable of any shot model and Lloris performs to average keeper benchmark save rates, but with the ball moved around 20% closer to the goal.

Intriguingly, this exceptional over performance is partly counter balanced by an apparent less than stellar return when faced with shots across his body.

Modelling Lloris when an opponent attempts to hit the far post produces a variable that has a larger effect on the likelihood of a goal than is the case in the average keeper model.

Raw figures alone hint at an area for improvement in Lloris' already stellar shot stopping.

Players who got an attempt on target while going across Lloris' body converted 35% of the time, compared to the league average of 32%. He goes from the top of the tree overall to around average in these types of shots.

An average keeper gets more than a look in in this subset and the average model equals or beats Lloris' far post, on target actual outcome around 22% of the time. That's still ok, but perhaps suggests that even the very best have room to improve.

Below I've stitched together a handful of Lloris' attempts to keep out far post, cross shots to give some visual context.

For more recent good work, check out Will and Sam's twitter feed and Paul's blog & podcasts.

Data from Infogol

Thursday 13 July 2017

Gylfi, "On me head, son"

Expected assists looks at the process of chance creation from the viewpoint of the potential goal creator.

An assisted goal is a collaboration between the player making the vital final pass and his colleague who tries to beat the keeper, but over a season these sample sizes tend to be small.

Manchester City's Kevin De Bruyne topped the actual assist charts in 2016/17 with 18, but these numbers may have benefited from a statistically noisy bout of hot finishing or suffered from team mates who frequently sliced wildly into the crowd.

Therefore, it makes sense to use the probabilistic likelihood of success in the 85 additional instances when the Belgian carved out a chance that went begging.

Here's the top ten expected chance creators from the 2016/17 Premier League, along with their actual returns, courtesy of the recipients of these key passes.

The list contains the kind of players you'd expect to see when trawling the Premier League for creative talent.

The expected assists are based on a model derived from the historical performance of every assisted goal attempt from previous Premier League seasons.

So De Bruyne's over performance may reflect the above average talent, not just of himself, but also his team mates or it could be that creating and finishing talent is tightly grouped in the top tier of English and Welsh football and randomness accounts for the majority of the disconnect between actual and ExpA over a single season.

Swansea's Gylfi Sigurdsson, a constant topic of transfer speculation, lies 3rd in both expected and actual assists, with 9 ExpA and 13 actual ones. This backs up the Icelander's importance to the Swans, where he was involved in nearly a quarter of Swansea's ExpG in 2016/17.

His relatively large over performance, compared to his ExpA cumulative total of just under 9 may suggest he is particularly adept at presenting chances to his team mates.

However, a simple random hot streak from both or either participant in the goal attempt should not be ruled out.

In 9% of simulations, an average assister/assisted combination would score 13 or more goals from the 77 opportunities crafted by Sigurdsson.

Neither is there anything untoward in the fit of the model to Sigurdsson's 77 potential assists. Lower quality chances are converted at a lower rate than those which had a higher expectation of producing a goal.

So far there's nothing to set off warning bells for any potential purchaser. Sigurdsson appears to be legitimately a top echelon goal creator, albeit one who may have run slightly hot in 2016/17.

But if we make some direct comparisons to say De Bruyne, differences begin to emerge.

De Bruyne's ExpA per key pass is 0.15 compared to 0.11 for Sigurdsson, which suggests that De Bruyne is, on average, creating higher quality opportunities.

The profile of the position of the recipients of Sigurdsson's key passes is also strikingly different from those of the Manchester City player.

De Bruyne is supplying chances for a much larger proportion of attacking minded players, such as out and out strikers, wingers and attacking midfielders.

Whereas, over 50% of Sigurdsson's key passes are picking out defenders, notably central defenders and that usually means headed chances, from set pieces.

This appears to be confirmed by the final column in the first graphic of this post. Only a third of Sigurdsson's assists arrived at the feet of a team mate, well below the figures for the remaining nine assisters in the table.

All of whom check in with at least 67% of their potential assists being finished off with the boot.

Gylfi's penchant for set play deliveries to a defender's head also features in Ted's article on the transfer speculation surrounding Sigurdsson in The Independent as part of Ted's grand tour of the British press.

Despite Sigurdsson's apparent niche assistance role, at least in 2016/17, his ExpA per potential assist does still hold up well.

He's below De Bruyne, as we've seen, but is above the remaining eight players in the top ten, bar Fabregas and an anonymous Stoke player, who we want to keep.

So although he does deliver aerial passes to generally less skilled finishers, his relatively impressive ExpA per key pass does suggest that he can put the ball into extremely dangerous areas and with accuracy to find a team mate.

Also his actual assists from headed chances of 8 compared to an expected total of just over 5 suggests he may be more skilled at such deliveries than is the average case, although such small samples inevitably prevent random chance being eliminated as the main causative agent in any over performance.

Overall, Gylfi Sigurdsson may be worth a great deal to a side that is set up to benefit most from his particular creative skill set.

But those teams may be few in number and principal among them are his current employers.

All data via Infogol

Wednesday 12 July 2017

Stoke Score More August Goals Than Andrew Cole

Hugely amusing tweet* doing the rounds, yesterday.

All great fun in the world of football bants and also an excellent case study in how to use "stats" to purvey a misleading impression that's likely to get picked up, circulated and no doubt recycled in September when the Premier League's fixture computer love affair with the Potters pitches them to the foot of the table.

So let's do a bit of due diligence.

Cole played 44 games to reach his 25 goals, playing as he did in the 42 game Premier League era, when they sometimes managed to cram six games in during the opening month.

Stoke scored their 23 goals in 28 matches.

So even this simple addition of context floors the deliberately provocative tweet.

Cole scored 0.57 Premier League goals/game in August, which is eclipsed by Stoke's 0.82 August goals/game.

The comeback would probably be "one is a team of 11 to 14 players".

But 1 of those 14 is a goalkeeper, and keepers, with the exception of Stoke City ones, generally don't score.

Four or five are defenders who don't score a lot, which limits the fair comparison to Stoke players, from the August months of the Premier League era, who played in a similarly advanced role to Cole's position at Newcastle, Manchester United, Blackburn, Fulham, Manchester City and Portsmouth.

Designated Stoke forwards scored 13 of their 23 goals, so their scoring rate falls below Cole's 0.57 goals per game to 0.46 goals per game.

Stoke played an average of two out and out strikers over their Premier League existence, so we'll halve that rate to 0.23 goals per game.

This puts Cole well back in the lead, allowing the rip to be taken out of the Potters again....?

However, we haven't considered the goal environments.

Stoke played against a batch of sides in August who conceded an average of 1.35 goals per game, as did Cole a decade earlier.

No change, there.

Cole's teams scored an average of 1.80 goals per game, meaning he played for sides who had a lot of attacking intent.

His 0.57 goals per game was around 30% of the baseline figure for his team.

The homogenised Premier League Stoke striker scored 22% of the 1.06 goals per game Stoke have averaged in the Premier League.

Those strikers included Dave Kitson, Mama Sidibe (legend), Ricky Fuller (legend), James Beattie, Kenwyne Jones and Peter Crouch.
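For anyone wanting to retrace the sums, here's the arithmetic as a short sketch. It uses only the totals quoted in this post, and the variable names are made up for the illustration.

```python
# Totals quoted in the post.
cole_goals, cole_games = 25, 44
stoke_goals, stoke_games = 23, 28
stoke_forward_goals = 13
strikers_per_game = 2  # the "average of two out and out strikers"

cole_rate = cole_goals / cole_games                   # Cole's August goals per game
stoke_rate = stoke_goals / stoke_games                # Stoke's August goals per game
forwards_rate = stoke_forward_goals / stoke_games     # designated forwards only
per_striker_rate = forwards_rate / strikers_per_game  # one "homogenised" striker

# Team scoring environments, goals per game, as quoted in the post.
cole_team_rate = 1.80
stoke_team_rate = 1.06

cole_share = cole_rate / cole_team_rate
striker_share = per_striker_rate / stoke_team_rate

print(f"Cole: {cole_rate:.2f} goals/game, {cole_share:.0%} of his teams' output")
print(f"Stoke as a team: {stoke_rate:.2f} goals/game")
print(f"Stoke striker: {per_striker_rate:.2f} goals/game, "
      f"{striker_share:.0%} of Stoke's output")
```

Cole's share works out at a shade under 32%, which the post rounds to "around 30%".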

Bottom line, Andrew Cole scored a higher proportion of goals for his club than this mismatch of ageing, journeymen footballers did for their defensively structured, mid table side in one particular month.

Ha ha.


Thursday 6 July 2017

Game State Outliers

Newcastle's 2011/12 season remains one of the most interesting of recent times.

They scored just four more goals than Norwich but gained 18 more league points, and allowed two fewer goals than Stoke but won 20 more points.

Their meagre +5 goal difference was inferior to the three teams who finished immediately below them in the final table and a 5th place finish was partly down to the hugely efficient way in which they conceded and scored their goals.

The ability to leak goals only when a game was already lost and score at the most advantageous times proved transient and the following season Newcastle's elevation to the top tier of the Premier League stalled as they barely finished above Sunderland and relegation.

In this post I looked to define game state in terms of not simply the current score, but also the equally important factor of time elapsed.

The current state of the game for a side is a combination of the score line, the relative abilities of each side and how long remains for either team to achieve a favourable final outcome.

As an example, take Stoke's home game with Everton.

The matchup was fairly even, Everton the better team being balanced by the Bet365 stadium and after 6 minutes the hosts had around a 37% chance of winning and a 25% chance of drawing.

That equates to 1.4 expected league points.

A minute later Peter Crouch scores to put Stoke 1-0 up and their expected points, with 93% of the game remaining and a goal to the good, rises to 2.1 league points.

The goal's welcome, but mitigated by the large amount of time remaining and the evenly matched teams.

No VAR and the game ends 1-1.
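The expected points figure in the Stoke example is just three points for a win and one for a draw, weighted by their probabilities. Here's a minimal sketch of that sum and of the gain attached to Crouch's goal. The function names are made up, and the 2.1 figure is taken as quoted because the win and draw probabilities behind it aren't given here.

```python
def expected_points(p_win, p_draw):
    """Expected league points: 3 for a win, 1 for a draw, 0 for a defeat."""
    return 3 * p_win + 1 * p_draw

def average_gain_per_goal(gains):
    """Mean expected points increase across a side's goals."""
    return sum(gains) / len(gains)

# Stoke after 6 minutes: roughly a 37% chance of a win and 25% of a draw.
xpts_before_goal = expected_points(0.37, 0.25)

# After Crouch's goal the post quotes 2.1 expected points.
xpts_after_goal = 2.1
gain = xpts_after_goal - xpts_before_goal

print(f"Before the goal: {xpts_before_goal:.2f} expected points")
print(f"Gain from the goal: {gain:.2f} expected points")
```

Averaging those gains over every goal a side scores, as average_gain_per_goal does for a list of such values, gives a per team expected points increase per goal.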

The plot above has averaged the increase in expected points per goal scored in an attempt to see which sides were scoring goals that most advanced their potential expected league points, either by design or raw chance, combined with their core ability.

It shouldn't be surprising to see the better teams having the lowest average expected points improvement per goal in the Premier League.

They are more likely to win matches by large margins and the 4th goal in a rout will add little to the team's expected league points, which will already be close to 3.

However, even among the top teams there are variations.

Spurs have the lowest expected points increase per goal scored, partly due to wide margin wins against the lesser sides, while Chelsea, with a similar number of goals, found themselves celebrating a score with, on average, a more tangible game state reward.

Hull appeared to occasionally put themselves into relatively decent positions, despite meagre scoring, while Sunderland, not only scored fewer goals, but also frequently only netted when the spoils had largely been won by their opponents.

The same point may be better illustrated by plotting the success rate (a combination of wins and draws for each team) against their expected points increase per goal scored.

Chelsea are apparent outliers from the line of best fit, scoring goals that advance their game state, on average, by more than their fellow top sides.

Again this might suggest that they are employing slightly different in game tactics compared to others.

Perhaps one that deserts further attacking intent for a more defensive outlook once they find themselves in a favourable match position, as do Manchester United. Or perhaps there is an element of random good fortune in when they are scoring their goals, a la Newcastle 2011/12.

Both Championship enigmas, promoted Huddersfield and their beaten playoff rivals, Reading, show anomalies from the seasonal norm when we examine their change of expected points based on goals and time elapsed.

Huddersfield fly high above the general line of best fit for a side of their scoring capacity, fed by a glut of goals where the time factor had nearly ebbed away. Again, tactically and skill driven or transient good fortune or a bit of both?

Reading showed an uncanny ability to know instantly when they were beaten by "selectively" leaking many of their 64 goals allowed in a handful of games, "allowing" themselves to spread the remainder of their concessions more thinly and remain competitive in a large number of their matches.

A "trait" that will be eagerly anticipated for their 2017/18 season.