Tuesday, 26 February 2013

Using the MCFC Data to Define Successful Playing Style.

Last year Manchester City released a season's worth of game-based stats, along with a granular play-by-play breakdown of their Premiership match at home to Bolton from the 2011/12 season. Together they have provided a rich vein of data for the football analytics community. Of the two, the former is easier to work with: it was released in Excel format, is sorted by player and contains details on nearly 200 individual in-match actions. Once these player actions are sorted by game and summed, it is possible to produce an extensive record of the events that took place in every Premiership game from last season. Websites such as Joe B's football data had previously extended the data available from mere goals to include shots and cards, and the City release has raised the bar considerably.
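The "sort by game and sum" step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`match_id`, `team`, `passes`, `shots`) and the sample rows are hypothetical and are not the spreadsheet's actual column headers or values.

```python
from collections import defaultdict

# Hypothetical player-level rows: one row per player per match.
# Field names and numbers are illustrative, not taken from the MCFC file.
player_rows = [
    {"match_id": 1, "team": "Manchester City", "player": "A", "passes": 60, "shots": 2},
    {"match_id": 1, "team": "Manchester City", "player": "B", "passes": 45, "shots": 1},
    {"match_id": 1, "team": "Bolton", "player": "C", "passes": 20, "shots": 3},
]


def sum_by_team_match(rows, stat_fields):
    """Sum the chosen per-player stats into one record per (match, team)."""
    totals = defaultdict(lambda: dict.fromkeys(stat_fields, 0))
    for row in rows:
        key = (row["match_id"], row["team"])
        for field in stat_fields:
            totals[key][field] += row[field]
    return dict(totals)


team_totals = sum_by_team_match(player_rows, ["passes", "shots"])
```

Keying on the match and team together keeps both sides of a fixture separate, which is what a per-game record of events needs.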

The second release, comprising every event from a single match, appeared in XML format. It required a degree of expertise to reconfigure but yielded exceptional information, including x,y co-ordinates and time stamps for each event, which allow such things as passing sequences to be easily recorded. In short, the latter release is considerably more detailed but restricted to a single game, while the former holds more general, averaged information but casts its net over an entire season.

The choice of City and Bolton was well made, as it highlighted the almost polar opposite approaches currently seen in top-flight football. City's method relies on passing and possession, whereas Bolton typically played with much shorter passing chains. The release of such comprehensive data is understandably a one-off event, but we can use the initial, more general release to approximate some of what the more detailed release offers, such as the length of passing sequences.

The general release contains a column for each side's total number of passes and also for events which typically end a passing sequence, such as tackles and fouls, or on a more positive note, goals and shots at goal. By dividing total passes by the total number of sequence-ending events we can obtain a figure which should indicate which sides enjoyed longer passing chains, both over the season and in each individual match of the 2011/12 campaign.
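As a rough sketch, the calculation described above is a single division. The event categories and the numbers below are made up for illustration; exactly which columns count as sequence-ending events, and whether goals are already included in the shots column, are assumptions a reader would need to check against the spreadsheet.

```python
def passes_per_sequence(total_passes, sequence_enders):
    """Approximate average chain length: passes / sequence-ending events.

    `sequence_enders` maps an event name to its count for one team,
    either in a single match or summed over the season.
    """
    ending_events = sum(sequence_enders.values())
    if ending_events == 0:
        return None  # no sequence-ending events recorded, ratio undefined
    return total_passes / ending_events


# Illustrative figures only, not taken from the MCFC data.
example = passes_per_sequence(
    450,
    {"tackles": 40, "fouls": 25, "shots": 20, "goals": 5},
)
```

The same function works on season totals or on a single game, which is how one set of columns can give both a league ranking and a match-by-match series.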

Average Number of Passes per Passing Sequence Sorted from Longest To Shortest. 2011/12.

Rank. Team.
1 Swansea.
2 Manchester City.
3 Manchester United.
4 Arsenal.
5 Spurs.
6 Chelsea.
7 Fulham.
8 Liverpool.
9 Wigan.
10 Norwich.
11 Everton.
12 Wolves.
13 WBA.
14 Newcastle.
15 Sunderland.
16 Aston Villa.
17 Bolton.
18 QPR.
19 Blackburn.
20 Stoke City.

Conventional wisdom is confirmed. Swansea pass the ball a lot, while Stoke are a long ball side whose sequences end very quickly. The league's top four finishers all occupy places in the top six of this list, confirming their ability and desire to complete long passing sequences. What we are seeing is a representation of how each team predominantly played their football last season, with passing teams at the head of the table and those teams which were less able, or not tactically required, to retain the ball at the bottom.

Using the MCFC data I have what is hopefully an accurate approximation of the average length of passing chains made by each side in all 380 matches from last season. If we leave game states for a later post, there is a strong positive correlation between the number of consecutive passes made by Arsenal and the likelihood that they won the match. Arsenal probably took the lead through sustained passing sequences and then, particularly against weaker opponents, kept that lead with similar ball control tactics. Similarly significant individual-match correlations hold for Manchester United, Chelsea and Spurs, but interestingly not for City, in a season where set play goals played a notable part in them lifting the title.
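The post does not say which statistic was used to measure these correlations. One plausible choice, shown here purely as an assumption, is a Pearson correlation between a team's per-match chain length and a 0/1 indicator of whether they won (the point-biserial case). The sample values are invented.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Invented per-match values for one team: average passes per sequence,
# and whether the match was won (1) or not (0).
chain_lengths = [3.0, 4.0, 5.0, 6.0]
won = [0, 0, 1, 1]

r = pearson_r(chain_lengths, won)
```

Swapping the `won` indicator for a "did not lose" indicator (1 for a win or draw) gives the alternative outcome discussed for Swansea and Stoke. A significance test would still be needed before calling any such correlation meaningful.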

Another omission is Swansea, who top the passing table: there was no significant correlation between increased passing possession and an increased likelihood that they won a particular game. However, there is a significant connection between longer passing sequences and Swansea not losing a game.

United, Chelsea, Spurs and Arsenal were the teams who possibly used extensive passing as their primary match winning approach. City were less clear cut, relying also on set pieces as a source of goals, a route that may have dried up this term. Swansea's style, by contrast, appears to be used primarily as a defensive tactic, as they attempt to keep possession in less threatening areas of the pitch. They used passing in the EPL in 2011/12 as a means to not lose.

At the "bottom" of the passing table, Bolton are the only side with a significant correlation between their passing tendency and match results. The shorter their passing sequences, the more likely it was that they won the match. It is tempting to think that, when the preferred route one approach left Bolton trailing rather than leading, their opponents allowed the relegated team a more leisurely passing game.

Stoke and Swansea. Two peas in a pod.

Stoke were the "route one" team most similar to Swansea in how their preferred style related to match result. The Potters' preferred approach, where prolonged passing sequences were rare, correlated strongly with them not losing. Swansea tried to defend by keeping the ball, and it was a bonus if a scoring opportunity arose. Stoke used the long ball to keep play as far away from their own goal as possible, and were likewise happy to take a goal if the chance arose. In short, they had polar opposite approaches but near identical tactical philosophies, whereby their on-the-ball style appeared to be used first and foremost as an extension of their defensive ambitions. The final win/loss/draw records for both teams were almost identical, with draws proliferating.

The MCFC data dump is a great resource that can be made even more productive with a small amount of effort, but care must be taken not to assume that every team is trying to achieve the same goals via the same methods. Excellent ball retention may be essential to some teams, but of little importance to others.

1 comment:

  1. Hi!

    Can you please tell me where I can download free, reliable data on soccer matches?

    I know that there's a website with very detailed data....

    However, I stumbled upon an article....


    Shots on target are not the same when compared with Opta stats... =/