How Goals Are Scored in the EPL.

Back in the day, when football analysis was still finding its feet, there was a real dearth of data to actually analyse. Goals, either scored or conceded, were about the limit of most readily available information. This was probably a good thing in the long run, as it meant much more effort was put into how the sparse data was analysed, and issues such as sample size and the recency of the figures were well to the fore from the very start. A reasonable understanding of a team's current well-being probably requires reference to about three dozen of that team's most recent results, a figure arrived at largely through laborious trial and error using actual results and out-of-sample data. At a time when using a team's last 5 results as an indicator of future performance was almost considered an extravagance, this was revolutionary stuff.

Another early "discovery" related to the general fundamentals of the leagues compared to the apparently idiosyncratic behaviour of some members of those leagues. Every season one or more teams would be hailed as an away specialist, a draw specialist or a home specialist simply because they had played out a series of games in which they had outperformed the general expectation for the league as a whole by some considerable amount. And in almost every case this strange and remarkable ability failed to continue to anything like the same extent in future games. The "specialist" phenomenon almost always arose through random chance and was usually "discovered" by selectively culling comparatively small samples from a much larger dataset. (Home specialists, for example, always seemed to start their "run" immediately after a home defeat.) Teams do vary, but almost never by enough to grant them specialist status, and a much better appreciation of their future course is gained either by increasing the sample size or by lumping in a generous helping of league average figures. So the early watchwords were more data and/or a hefty tug back to the average for that particular league.
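
The "hefty tug back to the average" can be sketched as a simple weighted blend. This is only a minimal illustration and not the method that was actually used at the time: the function name `shrink_to_league` and the weight `prior_games` are assumptions, and the figures in the example are made up purely to show the arithmetic.

```python
def shrink_to_league(team_rate, games_played, league_rate, prior_games=20):
    """Blend a team's observed rate with the league average.

    prior_games is a hypothetical weight: how many games' worth of
    league-average evidence to mix in. The more real games a team has
    played, the less the league average matters.
    """
    weight = games_played / (games_played + prior_games)
    return weight * team_rate + (1 - weight) * league_rate


# Made-up example: a side that has won 80% of its last 5 home games,
# in a league where home sides win 46% of the time.
estimate = shrink_to_league(team_rate=0.80, games_played=5, league_rate=0.46)
print(f"Blended estimate: {estimate:.3f}")
```

With only five games in the bank the blended estimate sits much closer to the league figure than to the eye-catching 80%, which is the whole point of the exercise.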

The tricky nature of interpreting football statistics also requires an appreciation of the difference between descriptive and predictive stats. If you're merely using statistics to illustrate how a team came to be holed up in 8th in the EPL after 20 games, then using each and every game incident is fine. It's once these partly random events are used to predict how teams will perform next week, or to characterise a particular team, that problems can arise and we need to take slightly more care.

All of which brings us to the plethora of new and exciting data that is becoming available to anyone with the patience to collect it. Where there were once merely goals, we now have masses of extra information surrounding the staple diet of football. Goal times were the first necessary addition, but they've been joined by an expanded description, so that goals can now be broken down as open play goals, goals from corners, set pieces, counter attacks, free kicks, headers, right footed, left footed and other body parts. (No prizes for guessing that Stoke were the top side for goalscoring with "other body part" in 2010/11.)

This is great information to have, but initially it's really only of use in beginning to understand the scoring dynamics of the league as a whole. Most teams score very few goals a season from direct free kicks, so it's very dangerous to assume that a relatively large raw number or proportion of goals scored by one team by that particular method will carry over from one season to the next. Even the 30 or so direct free kicks scored by the league as a whole, out of the 1000+ total goals, is unlikely to be a definitive statistic for the EPL in subsequent seasons. So, in short, take the league wide conclusions with a pinch of salt and the team specific ones with a healthy degree of scepticism. The team numbers are descriptive but, with the possible exception of goals scored from open play, probably not in the least bit predictive when taken in isolation.
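
To put a rough number on how far a league wide figure like that could drift through chance alone, here is a minimal sketch using the textbook normal approximation for a proportion. The inputs are the rounded "30 or so" and "1000+" quoted above rather than exact counts, and the function name `proportion_interval` is just an illustrative choice.

```python
from math import sqrt


def proportion_interval(successes, total, z=1.96):
    """Approximate 95% interval for a proportion (normal approximation)."""
    p = successes / total
    margin = z * sqrt(p * (1 - p) / total)
    return p - margin, p + margin


# Rounded figures from the text: about 30 direct free kick goals
# out of about 1000 goals in total.
low, high = proportion_interval(30, 1000)
print(f"Plausible range: {100 * low:.1f}% to {100 * high:.1f}%")
```

On those rounded inputs the range runs from roughly 2% to 4% of all goals, so even a full season's worth of league wide data leaves plenty of room for next year's figure to look different.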

Proportion of Goal Type scored in the EPL 2010/11.

Type of Goal Scored    % of Total Goals
Open Play              62.6
Corner Situation       13.0
Penalty Kick           7.8
Set Piece              7.8
Counter Attack         6.2
Direct Free Kick       2.6

There's little to surprise here. The majority of goals you'll see follow a protracted build up and arise from open play. Penalties are perhaps more common than some would expect. Their award has the power to elicit extreme emotions in both sets of fans, but that's down to the expected payoff rather than their rarity value. Most regular fans will see a spot kick sooner rather than later.

The overall figures produce a likely general narrative for a season, but aside from creating a baseline against which individual teams can be compared, there's little to be gleaned from these dry numbers. If we look at how these figures change by half, with the usual caveats that come with decreased sample size, we can perhaps gently speculate as to what they may tell us about how an individual football game develops. It's well known that goals are more plentiful in the second half than in the first, so the proportions in the following table are based on 400+ goals scored in the first half and 500+ after the break.

Proportion of Goal Type scored in the EPL 2010/11 Sorted by Half.

Type of Goal Scored    % of Total Goals in Each Half
                       1st Half    2nd Half
Open Play              62.9        62.5
Corner                 13.6        12.6
Penalty                7.0         8.4
Set Piece              7.8         7.9
Counter                5.3         7.0
D. Free Kick           3.4         1.9

The largest discrepancy can be found between goals scored directly from free kicks in each half. However, we're talking about sample sizes in the teens for each half, so until there are more seasons in the bank, a watching brief should be kept. The corner situation is more interesting. We are now talking about sample sizes in excess of 150 in total, and it's curious to see goals scored from corners declining in the second period, especially as there is actually an increase in opportunity post half time. Around 53% of corners come in the second half, but the share of goals scored from these situations falls compared to the first half. If analysis of future or previous seasons confirms this bias, it will certainly spawn a raft of imaginative explanations.
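
One way to see why the free kick split deserves no more than a watching brief is a rough two-proportion z-test. The sketch below is an illustration under stated assumptions: the exact goals per half aren't given here, so the 450 and 550 totals are placeholders chosen to fit the "400+" and "500+" quoted earlier, and the free kick counts are simply backed out of the table percentages.

```python
from math import sqrt


def two_proportion_z(x1, n1, x2, n2):
    """z statistic for the difference between two proportions (pooled)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


# ASSUMED totals, not actual counts: placeholders consistent with
# "400+" first-half goals and "500+" second-half goals.
first_half_goals = 450
second_half_goals = 550

# Direct free kick goals implied by the 3.4% and 1.9% shares in the table.
fk_first = round(0.034 * first_half_goals)
fk_second = round(0.019 * second_half_goals)

z = two_proportion_z(fk_first, first_half_goals, fk_second, second_half_goals)
print(f"{fk_first} v {fk_second} free kick goals, z = {z:.2f}")
```

With these placeholder totals the z value falls short of the conventional 1.96 threshold, so a gap of this size could comfortably be noise. Swapping in the real counts for each half would be needed before drawing any firmer conclusion.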

The proportionally larger number of second half penalties is potentially easier to explain. Managers are more reluctant to substitute defenders than they are to introduce fresh strikers, so tired defender versus lively attacker is a match up you're more likely to see later in the game. Also, referees are less likely to want to put pressure on themselves by awarding borderline spot kicks too early in a match. So they may well hold fire on awarding early penalties, and this may have become an ingrained and universally accepted attitude amongst officialdom.

A similar explanation can be offered if the greater share of counter attack goals in the second period proves to be enduring over multiple seasons. The dynamics of attack and defence become more pronounced as teams trail and the clock ticks down, with trailing teams becoming more vulnerable to counter attacks as they chase a goal. Also, when we get down to team specific levels, there's a weak to moderate correlation implying that better teams are more likely to grab a larger than usual proportion of their goals from counter attacks, and these are the kind of teams who are likely to be leading late in matches. There's also an interesting trend for fewer goals to be scored from distance as the game progresses, which could result from leading teams being happy to pack the final third of the field, making shooting from distance more difficult, and relying more on counter attacks themselves.

Finally, I've tabulated the goalscoring preferences of the top flight sides during the 2010/11 season and, to make the table manageable, I've shown individual types of goals as a proportion of a team's goal total. It's vital, however, that with the possible exception of goals from open play, these proportions are treated as being merely descriptive of each team's season. Aside from goals from open play, hardly any team reaches double figures in raw terms for any other category of goal. For example, Fulham scored 21% of their goals from corners, but in terms of raw goals that only amounted to 10. Score 4 fewer next season from a similar goals tally and they are right back down to the league average instead of being big outliers.
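
The Fulham arithmetic can be checked in a few lines. This is just a worked sketch: the goal total is not quoted directly, so it is backed out from the 10 corner goals and the 20.8% corner share listed for Fulham, and the variable names are illustrative.

```python
fulham_corner_goals = 10
fulham_corner_share = 20.8  # per cent of Fulham's goals, from the table
league_corner_share = 13.0  # per cent, league average

# Goal total implied by 10 corner goals making up 20.8% of the tally.
implied_total = round(fulham_corner_goals / (fulham_corner_share / 100))

# Same tally next season, but 4 fewer goals from corners.
next_share = 100 * (fulham_corner_goals - 4) / implied_total

print(f"Implied goal total: {implied_total}")
print(f"Corner share with 4 fewer corner goals: {next_share:.1f}%")
print(f"League average corner share: {league_corner_share:.1f}%")
```

The implied tally is 48 goals, and 6 corner goals out of 48 is 12.5%, a shade under the 13.0% league average. A swing of four goals is all it takes to turn an apparent outlier into a perfectly ordinary side.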

Where the Goals Came from by Proportion,EPL Teams 2010/11.

Team          Open Play  Corner  D. Free Kick  Counter  Penalty  Set Piece  Success Rate
Arsenal       76.8       8.7     1.4           4.3      5.8      2.9        0.64
Aston Villa   68.9       6.7     4.4           8.9      6.7      4.4        0.47
Birmingham    51.4       13.5    0.0           2.7      5.4      27.0       0.41
Blackburn     60.0       17.8    4.4           2.2      4.4      11.1       0.42
Blackpool     58.6       19.0    5.2           3.4      12.1     1.7        0.38
Bolton        61.2       10.2    8.2           0.0      6.1      14.3       0.45
Chelsea       55.1       14.5    2.9           10.1     8.7      8.7        0.66
Everton       60.0       12.0    4.0           10.0     6.0      8.0        0.54
Fulham        60.4       20.8    2.1           4.2      4.2      8.3        0.50
Liverpool     67.2       8.6     1.7           5.2      10.3     6.9        0.54
Man City      59.3       15.3    3.4           5.1      13.6     3.4        0.66
Man Utd       70.7       10.7    1.3           10.7     4.0      2.7        0.75
Newcastle     58.2       21.8    0.0           1.8      9.1      9.1        0.46
Stoke         54.5       13.6    4.5           9.1      6.8      11.4       0.43
Sunderland    66.7       6.7     2.2           6.7      11.1     6.7        0.46
Spurs         69.1       10.9    0.0           3.6      9.1      7.3        0.61
West Brom     64.9       14.0    1.8           8.8      7.0      3.5        0.46
West Ham      55.8       16.3    0.0           4.7      11.6     11.6       0.34
Wigan         62.5       2.5     12.5          12.5     5.0      5.0        0.43
Wolves        60.0       15.6    0.0           2.2      6.7      15.6       0.38
League Avg.   62.7       13.0    2.6           6.2      7.8      7.8        0.50

Success rate has been added to the table to gauge how well each team performed over the course of the season, and league average figures for each goal type are also included as a baseline. As previously mentioned, the majority of the numbers in the table are meant merely to describe how the season panned out. Teams at the extremes, either above or below the league average, may owe their numbers partly to random factors that do not repeat between seasons, and these numbers are likely to trend towards the league average in subsequent outings.

Open play goals comprise the lion's share of goals scored and are worth plotting against success rate to see if the better teams are able to carve out more goals from the open field, perhaps befitting their higher levels of skill. And the expected correlation does appear to be moderately confirmed, although, to keep beating the drum, we are sailing increasingly into speculative waters.
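
For anyone who would rather compute the relationship than eyeball a plot, here is a minimal sketch of a Pearson correlation between open play share and success rate, using the twenty team rows from the table above. The `pearson` helper is written out by hand for illustration and is not necessarily how the original comparison was made.

```python
from math import sqrt

# (open play share of goals in %, success rate), copied from the table.
teams = {
    "Arsenal": (76.8, 0.64), "Aston Villa": (68.9, 0.47),
    "Birmingham": (51.4, 0.41), "Blackburn": (60.0, 0.42),
    "Blackpool": (58.6, 0.38), "Bolton": (61.2, 0.45),
    "Chelsea": (55.1, 0.66), "Everton": (60.0, 0.54),
    "Fulham": (60.4, 0.50), "Liverpool": (67.2, 0.54),
    "Man City": (59.3, 0.66), "Man Utd": (70.7, 0.75),
    "Newcastle": (58.2, 0.46), "Stoke": (54.5, 0.43),
    "Sunderland": (66.7, 0.46), "Spurs": (69.1, 0.61),
    "West Brom": (64.9, 0.46), "West Ham": (55.8, 0.34),
    "Wigan": (62.5, 0.43), "Wolves": (60.0, 0.38),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


open_play = [share for share, _ in teams.values()]
success = [rate for _, rate in teams.values()]
r = pearson(open_play, success)
print(f"Open play share v success rate: r = {r:.2f}")
```

The coefficient comes out positive and in moderate territory, but with only twenty data points from a single season it should be read as descriptive rather than as proof of anything.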

Arsenal, amongst the better teams, scored over 75% of their goals from open play, perhaps indicating that another title challenge would materialise if they concentrated on increasing their haul from set pieces or corners. Manchester United achieved numbers that were more in keeping with the profile for the league as a whole and achieved a significantly higher success rate than the Gunners. The table also illustrates the vulnerability of teams at the other end of the scale, who rely on set piece type play to amass a larger than normal proportion of their goals. West Ham, Blackpool and Birmingham all lacked the quality to score open play goals at anywhere near the league average, and all three ultimately lacked the quality to survive.

The bottom line is that one season's worth of data doesn't really hack it if we're looking to make bold statements at a team level. The phenomenon of home advantage can be readily identified from one season if you amalgamate results from all 380 games, but as we've seen here, the system breaks down at a team level, where there is too much random variation when just one season is used. And the same is almost certainly true when we try to analyse goalscoring method from a single season's worth of games from a predictive, as opposed to merely descriptive, standpoint.

A tantalising initial glimpse for the moment,but nothing more.

