I had some feedback, so I've put together this follow-up post to hopefully reinforce my initial conclusions.
The main thrust of the initial quote was that clean sheets correlate better than goals scored with final league position. This may well be true generally, but as has been pointed out countless times, there is a great divide in the EPL between the top 4 (of which Chelsea are a permanent member) and the rest. Upwards of 80% of the division bear very little resemblance to the big four, and the sheer weight of numbers of those teams will tend to overwhelm any general conclusions drawn from the league as a whole. Southampton's 10 clean sheets in 1997 may well be a better indicator of where they finished than their 50 goals scored, but this doesn't naturally translate into the same predictor for an elite set of outliers.
If we restrict the analysis to teams who finished in the top four over the lifetime of the EPL, we find that there is indeed a correlation between clean sheets and finishing position. However, the outcome from increasing the tally of clean sheets may surprise some. Fewer clean sheets increase the chance that a team will finish off the pace in 3rd or 4th, but when we get to the cream of the crop at the very top, we find that more clean sheets for the very best increase the likelihood that the team will actually finish second rather than first.
I've run regressions of finishing position against clean sheets for the elite teams over the lifetime of the EPL and the conclusions seem clear. If you follow a strategy that increases your tally of clean sheets and you achieve your aim, then somewhere along the course of the season you lose points and ground to top teams whose strategy wasn't predicated on not conceding goals. The EPL runners-up averaged almost half a clean sheet a season more than the more adventurous Champions.
Finishing Position in the EPL for the Elite as a Function of Clean Sheets.
We can now move on to the second thrust of Mike Forde's premise, namely that goals scored doesn't correlate as well as clean sheets with finishing position. Again, that may be true for the league as a whole, but if we restrict the study to the rarefied environment in which Chelsea operate, we find that there is a much stronger correlation between goals scored and final league position for the league's best.
Finishing Position in the EPL for the Elite as a Function of Goals Scored.
Not only is the correlation between goals scored and league position strong but, unlike clean sheets, the benefit of increasing the tally is also clear. The more goals a team scored, the more likely it was to finish higher up the top 4 over the lifetime of the EPL. The Champions averaged 77 goals scored, compared to 70 for the runners-up, 66 for 3rd placed finishers and 62 for 4th placed teams, indicating a correlation between increased goalscoring by teams of top stature and the increased likelihood of those teams carrying off the top prize. If you are at the high extreme for goal scoring, then the EPL years indicate you have an excellent chance of lifting the trophy, but if you are at the high extreme for clean sheets, then you have a good chance of watching someone else take the honours. The situation may be reversed over the league as a whole, but at the summit goals trump clean sheets.
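As a rough check on the direction of that relationship, the four averages quoted above can be fed through a simple least-squares fit. This is only a sketch in Python: it uses the group averages from the text rather than the team-by-team data behind the chart, so the correlation it produces is bound to look stronger than the real one, and the function and variable names are my own for illustration.

```python
from math import sqrt

# Average goals scored by finishing position, as quoted in the post.
avg_goals = {1: 77, 2: 70, 3: 66, 4: 62}

def fit_line(xs, ys):
    """Return (slope, intercept, pearson_r) for a least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept, sxy / sqrt(sxx * syy)

positions = sorted(avg_goals)
goals = [avg_goals[p] for p in positions]
slope, intercept, r = fit_line(positions, goals)

print(f"change in goals per place lower: {slope:.1f}")
print(f"correlation of the four averages: {r:.2f}")
```

On these four averages, each place lower in the top four goes with roughly five fewer goals scored. Because only four averaged points are involved, the fit says nothing about how much individual seasons scatter around that line.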
For an explanation as to why this is the case, we need to look at the types of games where EPL teams are gathering most of their points. If we continue on the theme of clean sheets and goals, we would generally expect games where one or both teams in the matchup go out with a clean sheet as a prominent part of the game plan to be less goal laden than normal. So if we chart the proportion of league points teams are gaining from games containing an increasing number of total goals, we may see patterns emerging.
Percentage of Total EPL Points Gained by Teams in Games of Differing Total Goals.
Goals per Game | Man Utd (%) | Arsenal (%) | Wigan (%) | Sunderland (%) |
0 | 3 | 4 | 6 | 7 |
1 | 19 | 15 | 35 | 25 |
2 | 19 | 19 | 24 | 23 |
3 | 25 | 26 | 15 | 22 |
4 | 17 | 21 | 8 | 15 |
5+ | 18 | 16 | 11 | 8 |
I've selected four teams representing the polar extremes of EPL attainment and calculated the proportion of points they have gained in games of different goal scoring environments spread over six EPL seasons. Goalless games are obviously ones where both teams kept a clean sheet, but equally, the fewer goals that were scored in a match, the more likely it is that those games saw a defensive strategy being employed by the participants. Also, a strategy aimed at securing clean sheets may not actually result in the ambition being achieved; it may simply result in a lower scoring game. From the choice of teams, it appears that the perennial star teams garner proportionally more of their league points from matches where more goals are scored, whilst the reverse is true for Wigan and Sunderland. If we combine the records of the teams, the propensity for under-resourced, poorer sides to rely on gaining their points in lower scoring affairs is further illustrated.
Where EPL Teams Get their Points.
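The table above can be collapsed into low-scoring and high-scoring bands to make the contrast easier to see. The sketch below does this in Python under two assumptions of mine: "low scoring" means 0 to 2 total goals and "high scoring" means 4 or more, and the elite and poorer pairs are combined by a plain average of the two teams' percentages rather than by pooling their actual points, which the table does not give.

```python
# Percentage of each team's league points by total goals in the game,
# copied from the table in the post. Buckets: 0, 1, 2, 3, 4, 5+.
points_share = {
    "Man Utd":    [3, 19, 19, 25, 17, 18],
    "Arsenal":    [4, 15, 19, 26, 21, 16],
    "Wigan":      [6, 35, 24, 15, 8, 11],
    "Sunderland": [7, 25, 23, 22, 15, 8],
}

def bands(shares):
    """Split one team's row into (low, high) shares.

    Assumed cut-offs: low = 0-2 total goals, high = 4+ total goals.
    """
    return sum(shares[:3]), sum(shares[4:])

def group_average(teams):
    """Unweighted mean of the (low, high) shares across the given teams."""
    rows = [bands(points_share[t]) for t in teams]
    low = sum(r[0] for r in rows) / len(rows)
    high = sum(r[1] for r in rows) / len(rows)
    return low, high

by_team = {team: bands(row) for team, row in points_share.items()}
elite = group_average(["Man Utd", "Arsenal"])
poorer = group_average(["Wigan", "Sunderland"])

for team, (low, high) in by_team.items():
    print(f"{team:<10} low: {low}%  high: {high}%")
print(f"elite pair  low: {elite[0]}%  high: {elite[1]}%")
print(f"poorer pair low: {poorer[0]}%  high: {poorer[1]}%")
```

On the table's figures, Wigan and Sunderland take 65% and 55% of their points from games with two goals or fewer, against 41% and 38% for Man Utd and Arsenal.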
The Wigans and Sunderlands of the EPL are finding it profitable to keep games relatively goal scarce, because over the year these are the games where they are gaining the majority of their league points. By playing with a defensive outlook, and by implication demoting the importance of raw goal scoring, a top team with the clout to buy great strikers is choosing to play in the very environment where the poorer teams appear to do best, and is potentially shunning the higher scoring arena where the best do better. Of course, we could have cherry-picked Arsenal, Man Utd, Sunderland and Wigan to provide the kind of points per game distribution that supports our argument, so the final step is to plot the points per game and finishing positions for every EPL team in differing goalscoring environments to see if the relationships hold. Both Wigan and Sunderland gained the majority of their league points in low scoring games, and if we plot the record and finishing positions of all the EPL teams from the last six seasons, we can confirm a relatively strong correlation, indicating the relationship is general.
Where Teams Claimed their League Points, EPL 2005-10. Goal Shy Edition.
Similarly, if we flood the games with goals, the direction of the correlation is reversed, and in games where lots of goals were scored it's the better teams who start to harvest a larger share of their points for the season. The top four teams over the last six seasons, for example, have claimed over 30% of their total league points from games where 4 or more goals were scored, compared to less than 20% for the very worst.
Where Teams Claimed their League Points, EPL 2005-10. Goal Feast Edition.
So, to summarise the indicators spread over six seasons of EPL action: clean sheets are relatively strong predictors of finishing position for the very best teams. However, the teams who top the table consistently record fewer clean sheets than those who finish second, so the correlation is strong, but more clean sheets don't indicate primacy among the very best. Only three EPL Champions have also been that season's clean sheet champs, but by contrast 10 Champions have led the league in goals scored. Goals, not clean sheets, win you the ultimate prize. Secondly, games set up pre-game to be clean sheet friendly, where fewer goals are actually scored, favour the pre-game underdog in terms of gathering points. So if you have the ability to buy expensive goal scorers and the overall quality of your team marks you down as a member of the EPL elite, then that's the route you should avoid. You want to open the game up by scoring, or pack your team with scoring ability, and make your generally inferior opponents try to keep pace. Unless they have a one-off wildcard up their sleeve (such as a Mourinho), Chelsea are destined to fall short of their full potential by following flawed conclusions about the importance of clean sheets in the EPL.
I found this clean sheet trend a few weeks ago and it really is pretty baffling.
Quite simply, if you regress the win rate (% of wins in total season) on goals scored, goals conceded, and clean sheets, you find that a clean sheet increases win rate by 0.87% whilst a goal scored increases win rate by only 0.54% (for completeness, conceding a goal is worth -0.28%). The p value for all is well below 1%. The R^2 for this model is 0.943. I presume it was this test that interested Chelsea.
If you regress the win rate on only goals scored and goals conceded, a goal scored is as valuable but a goal conceded becomes worth -0.53%. The R^2 for this model drops to 0.928. Make of this what you will.