The data as presented in the CSV file largely describes the actions made by players in a game: for example, the number of forward passes made by Salif Diao for Stoke at the Emirates. One, as it happens. It is therefore a fairly simple task to accumulate match data comprising the total number of forward passes made by Stoke on that day against Arsenal. Once again, Ravi's suggestion regarding the use of pivot tables in Excel or DataPilot in OpenOffice is an excellent one.
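For anyone who prefers a script to a pivot table, the same accumulation can be sketched in a few lines of Python. This is only an illustration: the column names (`match_id`, `team`, `player`, `forward_passes`) are my own assumptions and not the real MCFC headers, and every row apart from Diao's single forward pass is invented.

```python
import csv
import io
from collections import defaultdict

# Hypothetical layout: the real MCFC csv headers will differ.
# Only Diao's single forward pass comes from the text; other rows are invented.
SAMPLE_CSV = """match_id,team,player,forward_passes
1,Stoke,Salif Diao,1
1,Stoke,Player B,7
1,Arsenal,Player C,12
1,Arsenal,Player D,9
"""


def team_totals(csv_text, stat):
    """Sum a per-player stat into one total per (match, team) pair."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[(row["match_id"], row["team"])] += int(row[stat])
    return dict(totals)


totals = team_totals(SAMPLE_CSV, "forward_passes")
for (match_id, team), passes in sorted(totals.items()):
    print(match_id, team, passes)
```

Swapping `"forward_passes"` for another column name gives the team total for any other recorded event, which is all a pivot table is doing behind the scenes.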
We can therefore begin to build up a profile of the actions made by teams during games and try to marry these actions to game results to build up a picture of how teams achieve the results they do. This aim can be best achieved by looking at the stats differential between the two teams in the match. Goals are strongly correlated to game success, but as Blackpool discovered, you must also be proficient at preventing goals. The defensive and preventative side of the game can often be overlooked, even though it has a similar level of importance in determining match outcome. In short, goal difference is a stronger indicator of success or failure than goals scored alone.
Below I've listed the strength of correlation between success over a season, as measured by wins plus half of draws divided by games played, and various recorded events from the MCFC data. I've then listed the correlation between success and event differential. The closer the figure is to 1.0, the stronger the correlation.
How Match Events And Their Differentials Correlate With Seasonal Success.
|Match Event.||Correlation With Seasonal Success.|
|Goals Scored/Allowed Differential||0.94|
|Shots On Target.||0.60|
|Goals From Corners.||0.27|
|Successful Passes ex Crosses.||0.54|
|Successful Final 3rd Passes||0.62|
|Touches In Opponents Box.||0.56|
|Shots On Target Inside Box.||0.57|
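The success measure and the correlations in the table can be sketched as follows. This is a minimal illustration and not the exact calculation behind the figures above: I'm assuming a plain Pearson correlation, and the three season records are invented purely to show the mechanics.

```python
from math import sqrt


def success_rate(wins, draws, played):
    """Wins plus half of draws, divided by games played."""
    return (wins + 0.5 * draws) / played


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented season records, NOT real data:
# (wins, draws, played, goals scored, goals allowed)
seasons = [
    (25, 8, 38, 80, 30),
    (15, 10, 38, 50, 48),
    (8, 9, 38, 38, 70),
]

success = [success_rate(w, d, p) for w, d, p, _, _ in seasons]
goal_diff = [scored - allowed for _, _, _, scored, allowed in seasons]

print(round(pearson(goal_diff, success), 2))
```

The same `pearson` call works for any event: pass in the raw totals to get the plain correlation, or the scored-minus-allowed figures to get the differential version.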
At its simplest level, the differential column now includes the defensive contribution to winning instead of merely the offensive output. Scoring goals on its own is a major factor for success for a lot of the Premiership teams, but a stronger correlation can be found if we include goals allowed as well. By presenting the wider picture of events we can begin to understand how free scoring Blackpool spent just one season in the top flight and barely scoring Stoke have survived since 2008/09. Concentrating on having a strong defence can be both cost effective and successful, and a partial antidote to a lacklustre attack.
The figures, which are far from exhaustive, outline the kind of things the majority of the successful teams excel at over a season. However, care must be taken to avoid making broad statements that do not apply to all teams. Those teams which adopt tactical approaches that are at odds with the majority of other teams will inevitably be flagged up as outliers who have been incredibly fortunate to survive, when in reality they have exploited a niche market that has allowed them to prosper.
|Headed Goals...A vital contribution for some teams.|
One particularly striking result is the apparent zero correlation between headed goals and success, followed by only a slight improvement if we look at the differential between headed goals scored and headed goals allowed. However, if we dig a little deeper, rather than being a worthless artifact of a bygone age, headed goals are actually vital to a minority of teams.
Scoring headed goals is a much cheaper, if less efficient, method of moving the scoreboard than taking the ground route. You can create headed chances with little more than tall attackers or defenders and a delivery system (long throws, set pieces or crosses). Creating Barca style goals from intricate passes usually requires expensive skill throughout the midfield and attacking areas. So headed goals are vital to the prospects of Stoke and Norwich, and previously Bolton and Blackburn. These teams make the best of their meagre resources, but they are in the minority in prioritizing headed goals both scored and conceded, and therefore cannot greatly influence the league-wide correlation.
Aggregated game stats can shed some light on the type of things some teams are doing, and are allowing to be done to themselves, over the course of a season. But the picture is broad and sweeping, and much fine detail is lost because the totals carry no context about game position or the tactical approach of individual teams.
By looking at differentials we can strengthen the correlations with season long success, so lastly I'll look at how success, as measured by individual wins on a game by game basis, relates to positive on pitch actions for each team. Are these broad correlations observed on match day?
Final 3rd completions are widely regarded as the preferred tactical approach for the majority of teams, but not all. So it seems reasonable that, if this approach is effective, teams will win more often when they complete passes in their opponents' final 3rd and limit their opponents in this area. Regressing differentials for final 3rd passing, we do find a clear and strong game by game correlation for the EPL as a whole last season. Below I've plotted the line of best fit for final 3rd differentials and the likelihood that the home team won the match.
For example, if the home side out-passed their opponents by 50 passes in a game, there was a 50% chance that they also won the game, and the greater the differential, the greater the likelihood of a win. Correlation doesn't imply causation, but it does strengthen the case for final 3rd passes being an important component of some teams' armoury. There will of course be exceptions in this game by game analysis, with Stoke certainly, and Newcastle possibly, plotting a different route through a tactical independence from the majority of the league, where the importance of final 3rd completions is diminished.
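A line of best fit of this kind can be sketched with ordinary least squares, treating a home win as 1 and anything else as 0. This is a rough illustration under my own assumptions and not a reconstruction of the plot: the eight matches below are invented, so the fitted numbers will not match the EPL figures quoted above.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def win_likelihood(differential, slope, intercept):
    """Read a likelihood off the fitted line, clamped to the 0..1 range."""
    return min(1.0, max(0.0, intercept + slope * differential))


# Invented matches, NOT real data:
# (home final 3rd pass differential, 1 if the home side won, else 0)
matches = [
    (-80, 0), (-40, 0), (-10, 0), (0, 1),
    (20, 0), (50, 1), (90, 1), (120, 1),
]

differentials = [d for d, _ in matches]
home_wins = [w for _, w in matches]
slope, intercept = fit_line(differentials, home_wins)

for diff in (-50, 0, 50):
    print(diff, round(win_likelihood(diff, slope, intercept), 2))
```

A straight line can stray below 0 or above 1 at extreme differentials, which is why the sketch clamps the result. Fitting the same function to a single team's matches would be one way to see whether the likes of Stoke really do sit apart from the league-wide trend.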