Sunday, 19 August 2012

Quantifying Which Team Is Happier With The Current Scoreline.

One vital ingredient you need to add to your analysis of a football match is the current score context. Teams adjust their playing style depending upon whether they are comfortably ahead or well behind, generally adopting a less attacking and more defensive stance in the former case and vice versa in the latter. These micro shifts in emphasis can in turn affect in-game match events such as shots and saves. For example, a team chasing a game may produce more goal attempts, but they are often from further out, instigated by a wider pool of players and subject to more defensive pressure. This kind of subtle difference in shot or save quality is often lost in aggregated data, but visible in more granular batches.

The general game state for each team in a match is fairly easy to qualify if one team has an advantage on the scoreboard, but actually quantifying these positions, as well as the numerous occasions when the match is stalemated, requires more effort. Arsenal and Sunderland were level after an hour at the Emirates on Saturday and the visitors from the North East would have been much more comfortable with the scoreline than their hosts. The real question is how much happier were Sunderland?

One way to add context to such games is to calculate the combined win and draw expectancies for each team in running and track the change in these values compared to where they stood at kick off.

Full Time Score. 0-0.

Arsenal were unsurprisingly large pregame favourites to beat Sunderland. They were in the region of 70% likely to win the game and a shade under 20% to draw it. Combined, these two figures suggest that Arsenal's long term success rate (wins plus half of draws, divided by games played) from such a match up would average just under 0.8, making Sunderland's long term success rate just over 0.2.
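The arithmetic behind those figures can be sketched in a few lines of Python. This is only an illustration: the function name `success_rate` is my own, and the draw probability is rounded up to 0.20 from the "shade under 20%" quoted above, with the remaining 0.10 assumed to be Sunderland's win probability.

```python
def success_rate(p_win: float, p_draw: float) -> float:
    """Expected success rate for one match: a win counts 1, a draw counts a half."""
    return p_win + 0.5 * p_draw


# Approximate pre-game probabilities (rounded, so treat them as assumptions).
arsenal = success_rate(p_win=0.70, p_draw=0.20)
sunderland = success_rate(p_win=0.10, p_draw=0.20)

print(f"Arsenal {arsenal:.2f}, Sunderland {sunderland:.2f}")
```

Because the draw is shared equally, the two success rates always add up to one, however the three outcome probabilities are split.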

As the game progresses, the remaining goal expectancies for each team decline, and at the still stalemated hour mark Arsenal's win probability would have been in the region of 0.5, while their predicted long term success rate for this game position would have declined from just under 0.8 at kick off to just under 0.7. In raw terms The Gunners had lost 0.1 of their pre match predicted success rate by their failure to score. The in running success rates of the two rivals are intimately entwined: they must always total one, so Sunderland had seen their pre game success rate climb by the same amount.
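For anyone wanting to reproduce this kind of in running figure, here is a minimal sketch of one common approach. It is an assumption on my part rather than a description of the model used for the numbers above: goals for each side are treated as independent Poisson variables, and each team's goal expectancy is scaled down linearly with the time remaining. The function names and the goal expectancies in the usage lines are hypothetical, chosen only to resemble a strong home favourite.

```python
from math import exp, factorial


def poisson_pmf(k: int, mu: float) -> float:
    """Probability of exactly k goals when the expected number is mu."""
    return exp(-mu) * mu ** k / factorial(k)


def outcome_probs(mu_home: float, mu_away: float, lead: int = 0, max_goals: int = 15):
    """(home win, draw, away win) from the remaining goal expectancies.

    lead is the home side's current goal difference. Independence of the two
    goal counts is an assumption of this sketch.
    """
    win = draw = loss = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, mu_home) * poisson_pmf(a, mu_away)
            margin = lead + h - a
            if margin > 0:
                win += p
            elif margin == 0:
                draw += p
            else:
                loss += p
    total = win + draw + loss  # renormalise the truncated score grid
    return win / total, draw / total, loss / total


def in_running_success_rate(mu_home, mu_away, minute, lead=0, match_length=90):
    """Home side's success rate (win + half draw) at a given minute and lead."""
    remaining = max(match_length - minute, 0) / match_length  # assumed linear decay
    win, draw, _ = outcome_probs(mu_home * remaining, mu_away * remaining, lead)
    return win + 0.5 * draw


# Hypothetical full-match goal expectancies for a heavy home favourite.
kick_off = in_running_success_rate(2.2, 0.7, minute=0)
hour_mark = in_running_success_rate(2.2, 0.7, minute=60)
change_for_home = hour_mark - kick_off   # negative while the favourite fails to score
change_for_away = -change_for_home       # the two success rates always total one
```

The change from the kick off value, rather than the raw success rate, is the quantity that tells you which side is happier with the scoreline.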

In this situation, using win and draw probabilities allows the 60th minute to be contextualized for both Arsenal and Sunderland. In non numerical terms, Sunderland are very happy and Arsenal aren't, and this will partly dictate how each team approaches the final third of the match.

A further example from yesterday illustrates the effect of goals in a more evenly matched game such as Newcastle's entertaining of Tottenham.

1-0, D Ba, 55'
1-1, Defoe, 76'
2-1, H B Arfa (pen), 81'

Newcastle are probably inferior to Tottenham at the moment, but home advantage gave them a very slight match day edge. In contrast to the Arsenal/Sunderland game, while the game remained or became level, both teams were probably fairly happy with the scoreline. This is denoted by the closeness of each team's plots to the neutral zero line, and tactical approaches from both sides are likely to mirror those used league wide in evenly matched contests. Tottenham twice found themselves in losing positions, one of which they managed to salvage. League wide, teams attempt this rescue operation by committing more to attack than defence.

Incorporating time and score specific success rate movements such as these into more granular shot data will prevent erroneous conclusions about team shot conversion rates from being formed and carried into aggregated totals. Teams, even good ones, may appear to have declining shot conversion rates compared with previous years or previous months, but often this is because they have found themselves trailing or drawing more often and have therefore more frequently faced overtly defensively minded opponents. This variation in game position is to be expected for all teams across and during seasons. One year Arsenal will find themselves trailing or drawing more often than previously through random variation rather than a sea change in team quality.
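As a rough sketch of how that might look in practice, each shot can be tagged with the shooting team's change in success rate since kick off and conversion rates then calculated within each bucket instead of across the whole sample. The bucket names, the 0.05 threshold and the handful of shot records below are all invented purely for illustration; they are not real match data.

```python
from collections import defaultdict


def game_state_bucket(delta: float, threshold: float = 0.05) -> str:
    """Label a shot by the change in success rate since kick off.

    The 0.05 threshold is an arbitrary, hypothetical cut-off.
    """
    if delta > threshold:
        return "better off"
    if delta < -threshold:
        return "worse off"
    return "about par"


def conversion_by_game_state(shots):
    """shots: iterable of (success_rate_change, scored) pairs for one team."""
    tallies = defaultdict(lambda: [0, 0])  # bucket -> [goals, shots]
    for delta, scored in shots:
        bucket = game_state_bucket(delta)
        tallies[bucket][0] += int(scored)
        tallies[bucket][1] += 1
    return {bucket: goals / n for bucket, (goals, n) in tallies.items()}


# Made-up records: (change in success rate at the time of the shot, goal scored?)
example_shots = [(-0.10, False), (-0.10, True), (0.02, True), (0.20, False)]
rates = conversion_by_game_state(example_shots)
```

Comparing a team's conversion rate within the same bucket from one season to the next is a fairer test than comparing the raw totals.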

Aggregated data can be very useful, but it can also mislead. 
