Tuesday 28 August 2012

How Fouls Turn Into Cards.

A largely uneventful Premiership clash between Arsene Wenger's Arsenal and Tony Pulis' Stoke City at the Britannia Stadium on Sunday was partly enlivened by the pregame war of words between the two managers. Their antipathy towards each other isn't well disguised, and the current spat predictably centred on the disciplinary records of both sides. Pulis mused that Arsenal were hardly strangers to red or yellow cards during Wenger's tenure at the club, while Wenger wondered aloud where Stoke had finished in last season's Barclays Fair Play League.

At first glance Wenger's point was the stronger, as Stoke finished 20th and bottom of the Fair Play League while Arsenal finished 7th. However, Barclays and the Premier League appear to have opted for a wide definition of Fair Play, and red or yellow cards make up only a minor portion of the total points used to decide the table order. An identical proportion of points is awarded for "positive play": adopting an attacking outlook and continuing to press for goals even when a team is already in the lead.

It is rather apt that a competition designed to reward risk taking, even when this approach may endanger wider objectives such as Premiership survival, should be sponsored by a global bank. Unsurprisingly, Stoke and other less successful sides suffer particularly in this category. Taking only cards accrued, Stoke would have finished 8th compared to 17th for The Gunners. Few people would immediately recognize attacking intent as an obvious component of fair play, so perhaps we need a more focused method based on fouls and cards.

The recent data dump of player actions from the 2011/12 season allows a much more granular approach, and while there are obviously flaws in much of the analysis, we can begin to explore cards and discipline in much more detail. The data contains every match appearance for every player over last season, with fouls and cards recorded on a game by game basis, although the reason for each caution is omitted. Around 80% of the cards issued in the EPL were for foul play, with offences such as dissent accounting for the rest. We can therefore delete from the list players who were cautioned without committing a foul, leaving a fairly homogeneous sample that relates cards issued largely to fouls.

I first set a baseline: the number of fouls a player needs to give away before a booking becomes the most likely outcome. Pitch position and the recklessness of the contact aren't recorded, but we can use player position as a reasonable proxy for the former. Defenders will be making the majority of their challenges and potential fouls in and around their own box, and they will also be more likely to be illegally preventing chances from being created or taken. It therefore seems reasonable to assume that they will be given less leeway than a habitually "clumsy" striker. Regression analysis was used on last year's individual player data to calculate the chances of defenders, midfielders and strikers ending a match with at least a yellow card to their name, given various levels of fouling.

The chart demonstrates the handicap under which defenders have to operate. They are over twice as likely to receive a card for committing the same number of foul challenges as strikers, the players they are most often challenging for the ball. Strikers only become more likely than not to leave the pitch with a caution when they have conceded 8 or more fouls, compared to just 4 for defenders. Midfielders are allowed 5 challenges before a caution becomes odds on and are also treated much less leniently than out and out attackers. A striker who infringes frequently near the opponents' goal rarely seems to be cautioned for persistent fouling, whereas a defender risks this penalty much earlier in the cycle.

To take an obvious example, Bolton's Kevin Davies in 2011/12 committed 1 foul in each of seven games, 2 fouls three times, 3 fouls four times, 4 fouls twice and 5 fouls three times. Using the regression lines for strikers, he would have been expected to receive three yellow cards last term, which is precisely what happened. Had he been judged as a defender, he could have expected an average of just over seven cautions.
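
As a rough sketch of the arithmetic behind the Davies example, the expected number of cards is simply the sum of the booking probabilities for each game he played. The fitted regression is not reproduced in this post, so the logistic shape and the (slope, midpoint) pairs below are assumptions, picked by hand only to sit roughly in line with the figures quoted above. The names `CURVES`, `card_probability`, `expected_cards` and `odds_on_threshold` are hypothetical.

```python
import math

# Games played by Kevin Davies in 2011/12, keyed by fouls committed in
# the game (the counts quoted in the post).
DAVIES_GAMES = {1: 7, 2: 3, 3: 4, 4: 2, 5: 3}

# ASSUMPTION: hand-picked logistic curves, NOT the fitted regression.
# Each entry is (slope, midpoint); the midpoint is the foul count at
# which a card is a 50/50 proposition.
CURVES = {
    "striker": (0.35, 7.5),
    "defender": (0.50, 3.5),
}


def card_probability(fouls, position):
    """Chance of at least a yellow card after `fouls` fouls in one game."""
    slope, midpoint = CURVES[position]
    return 1.0 / (1.0 + math.exp(-slope * (fouls - midpoint)))


def expected_cards(games_by_fouls, position):
    """Sum the per-game booking probabilities over a season."""
    return sum(
        n_games * card_probability(fouls, position)
        for fouls, n_games in games_by_fouls.items()
    )


def odds_on_threshold(position, max_fouls=20):
    """Smallest foul count at which a booking is more likely than not."""
    for fouls in range(1, max_fouls + 1):
        if card_probability(fouls, position) > 0.5:
            return fouls
    return None


if __name__ == "__main__":
    for position in CURVES:
        print(
            position,
            round(expected_cards(DAVIES_GAMES, position), 1),
            odds_on_threshold(position),
        )
```

With these illustrative curves the striker total comes out close to three and the defender total a little over seven, but any curve with similar thresholds would tell much the same story.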

Davies pulls Huth's shirt, Huth kicks Davies, the ball is just a passing bystander.

Few will be surprised that defenders have to be especially careful about testing a referee's patience through persistent fouling. Another preconceived notion we can test is the general leniency allowed to home players compared to travelling guests. Defenders require fewer indiscretions before they enter carding territory, so I've similarly calculated the likelihood of a booking for increasing numbers of fouls committed by defenders both at home and on the road. The effect is much less pronounced in this case, but it appears that two typical fouls away from home give a defender a 1 in 3 chance of seeing yellow, compared to the slightly more lenient 2 in 7 if the offences come in front of his own fans.
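
To put the two quoted chances on the same footing, a short sketch with exact fractions shows how small the home and away gap is for a defender on two fouls. The constant and function names are hypothetical; the two probabilities are the ones quoted above.

```python
from fractions import Fraction

# Chance of a yellow card for a defender after two fouls, as quoted.
AWAY_TWO_FOULS = Fraction(1, 3)
HOME_TWO_FOULS = Fraction(2, 7)


def compare(away, home):
    """Return the absolute gap and the relative risk of away versus home."""
    return {"difference": away - home, "ratio": away / home}


if __name__ == "__main__":
    result = compare(AWAY_TWO_FOULS, HOME_TWO_FOULS)
    print(result["difference"], float(result["difference"]))
    print(result["ratio"], float(result["ratio"]))
```

The gap works out at 1/21, a little under five percentage points, which is the visiting defender carrying about one sixth more risk than his hometown counterpart.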

How Defenders Who Foul Fare At Home and On The Road.

At every step along the road to transgression, visitors are more at risk. However, we are only dealing with raw counts here. The typical pitch position of a foul may differ away from home because of a more adventurous approach from the hosts, and the referee may quite rightly judge a foul on the edge of the box more worthy of a caution than two in more benign areas of the pitch. That would raise the overall likelihood of a caution for a visiting defender compared to a hometown player committing the same number of fouls. We certainly have evidence for visitors being more harshly treated in terms of foul numbers leading to cards, but that harshness may be fully justified.

These two examples begin to show the depth of analysis that is possible with more granular data. Bolton's apparently lenient treatment from referees last season, when they committed a near league-high average number of fouls per booking, is fully explained by the likely area of the pitch in which the fouls were made. Bolton's forwards were responsible for 34% of their team's fouls compared to a league average of just 22%.

Similar broad conclusions can be teased from this extensive data set, and they hint at the slight variations in refereeing stance that exist across different match ups and at how players of differing styles are dealt with. Isolating games between the Top Four and the rest of the league, there appears to be good evidence that referees are partly protective towards the bigger teams, especially when they face inferior opposition. Players from inferior teams are more likely to be booked after just one foul than their more illustrious opponents in the same game. However, the referees appear to realize that they have taken this stance, because the balance then switches with increasing fouls and the players from the Big Four are treated slightly more harshly as they become multiple offenders.

How A Player's Chances Of Being Carded Changes With The Matchup.

[Chart: card probability against number of fouls, with one line each for All Matchups, All Players; Big 4 (vs Rest); Rest (vs Big 4); Rest (vs Rest); Big 4 (vs Big 4).]

There also appears to be good news for combative players who make many tackles and are involved in lots of one on one duels. Refs appear to appreciate that the risk of fouling increases with increased involvement and players who make a large number of legal challenges are given slightly more leeway when they do foul compared to teammates who make far fewer challenges, but are quickly prone to illegality.

Examples Of How Previous Good Behaviour During A Match Can Help.

Challenges By Player During The Game.   Fouls.   Chance Of Being Booked.
4                                       1        0.214
34                                      1        0.107
6                                       2        0.329
21                                      2        0.245
4                                       3        0.495
18                                      3        0.400
9                                       4        0.618
21                                      4        0.539
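
The leeway can be read straight off the table by pairing the two rows for each foul count and taking the difference in booking chance. The sketch below does only that; the row values are the ones listed above, and the names `ROWS` and `leeway_by_fouls` are hypothetical.

```python
# (challenges made, fouls committed, chance of being booked), as listed.
ROWS = [
    (4, 1, 0.214), (34, 1, 0.107),
    (6, 2, 0.329), (21, 2, 0.245),
    (4, 3, 0.495), (18, 3, 0.400),
    (9, 4, 0.618), (21, 4, 0.539),
]


def leeway_by_fouls(rows):
    """For each foul count, return how much lower the booking chance is
    for the busier player than for the less involved one."""
    by_fouls = {}
    for challenges, fouls, chance in rows:
        by_fouls.setdefault(fouls, []).append((challenges, chance))

    leeway = {}
    for fouls, pairs in sorted(by_fouls.items()):
        pairs.sort()  # fewest challenges first
        low_involvement = pairs[0][1]
        high_involvement = pairs[-1][1]
        leeway[fouls] = round(low_involvement - high_involvement, 3)
    return leeway


if __name__ == "__main__":
    for fouls, gap in leeway_by_fouls(ROWS).items():
        print(f"{fouls} foul(s): busier player is {gap:.3f} less likely to be booked")
```

On these example rows the gap is between roughly 8 and 11 percentage points at every foul count, and for a single foul the busier player's chance of a booking is half that of the less involved one.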

Bookings, it would seem, are more complex than a simple ratio of cards to fouls. Even without pitch position data, and without considering possible conflicts of cause and effect, we can start to scratch below the surface. We can begin to see whether teams use fouls as a tactical ploy, whether they might be aware of which areas of the field and which players are less likely to draw a card, and whether their card count is merited by their foul count.

Where Teams Do Their Fouling.

Proportion of Fouls By Position.

Team.      Defenders.   Midfielders.   Strikers.
Arsenal.   34%          56%            10%
Stoke.     42%          30%            27%

We can begin to use all of the above observations to look at the yellow card records of Stoke and Arsenal to get a better comparison than the one provided by the Fair Play League. The table above highlights who committed the fouls for each team and by extension we can conclude where on the pitch they were likely to have occurred and furthermore what their cumulative expected card total would be.

Arsenal appear more likely to disrupt opponents in the midfield region, whereas Stoke operate in the more risky final third of the pitch. Obviously there isn't a clear demarcation line past which a defender cannot make a challenge, but from the available data it makes sense to assume that midfielders make their challenges, on average, further up the field than defenders do. If we now isolate the number of fouls committed by each Stoke and Arsenal player, sorted by position and allowing for such factors as type of opponent and venue, we can calculate the expected number of cards each side would receive under the current refereeing climate.

Did Stoke and Arsenal Receive The Cards They Deserved In 2011/12?

Team.      Expected Cards From Fouls.   Actual Cards From Fouls.
Stoke.     54                           51
Arsenal.   52                           51

In light of the necessary approximations, the agreement between expectation and reality is good. Stoke committed 450 fouls compared to 400 for Arsenal, but both teams ended the season with 51 yellows by way of foul. Stoke benefited from a larger proportion of fouls by strikers. These tend to occur much higher up the pitch, and officials have become accustomed to treating such offenders more leniently. The Potters also reduced their card count relative to raw foul numbers by virtue of the larger number of entirely fair challenges made by their players, the defenders especially.
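
For anyone who wants to check the sums, the short sketch below turns the quoted season totals into fouls per card and into the gap between actual and expected cards. The figures are the ones given in this post; the names `SEASON` and `summarise` are hypothetical.

```python
# Season totals as quoted: fouls committed, expected and actual yellow
# cards arising from fouls.
SEASON = {
    "Stoke": {"fouls": 450, "expected_cards": 54, "actual_cards": 51},
    "Arsenal": {"fouls": 400, "expected_cards": 52, "actual_cards": 51},
}


def summarise(team):
    """Fouls needed per card, and actual minus expected cards."""
    row = SEASON[team]
    return {
        "fouls_per_card": round(row["fouls"] / row["actual_cards"], 2),
        "cards_vs_expected": row["actual_cards"] - row["expected_cards"],
    }


if __name__ == "__main__":
    for team in SEASON:
        print(team, summarise(team))
```

Stoke needed nearly nine fouls to earn each card against just under eight for Arsenal, and both sides finished within three cards of their expected totals.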

In short, the respective records of each team, from raw foul numbers to yellow cards last season, were, like Sunday's result, stalemated.


  1. BEST analysis!!! i love all your articles. i follow all ur tweets. you r simple the best.~

  2. This is an interesting article.

    I would like to add a very important thing: how many mistakes did refs make when making decisions. Also about the cards and even about the fouls given.

    On two (debatable decisions + untold arsenal) sites you can find statistical evidence that Stoke has been treated very positive by the refs. And that Arsenal has been hit hard by mistakes by refs.

    Refs in general let a lot of fouls from Stoke pass and don't blow the whistle so they don't get named in the statistics.

    More information for the current season can be found on a new website that gives a complete analysis of all the games in the PL

    To give an example in the first two games from Stoke the refs made a total of 20 mistakes in favour of Stoke. And those 20 are missing in your article.

  3. 20 mistakes in two games in Stoke's favour.....that's bad. Players modify their behaviour depending upon how close they are to a caution, even teams such as Stoke who have a flagrant disregard for fair play. Therefore there is no guarantee that if one "incorrect" call was correctly made against a Stoke player, he would continue to play as unfairly as has been detected by these groups of impartial refereeing experts. In short, you cannot simply apply miscarriages of justice additively to real match data.

    I had assumed that the sites you linked to existed to allow Arsenal fans to feel better about their inability to beat Stoke away from the Emirates. I wasn't aware that they had a serious point to make.

    thanks for your comment.