March 23, 2017

TSR, PDO, and the Bundesliga – Oh My!

It’s another international break week. Again. At least we can all catch up on yard work, homework, our bread baking, or festival-going. Whatever. Admit it, you need these breaks. In the meantime, let’s check in with our favorite league. What’s the pecking order looking like? This early in the season, table position is probably less important than other factors in determining what’s going on. To find out, I’m going to use a couple of emerging analytical tools that can help us cut through some of the random statistical “noise” and identify meaningful statistical “signals” in the world’s beautiful game.

I promise it’ll be interesting! Really.

First, some background. Look, whether or not we like to admit it, much of what happens on the fußball pitch is probably just random chance. However, given our human bias to hunt for patterns and craft narratives (you all know that I love my narratives) from what is really a bunch of statistical “noise,” we commonly draw fallacious conclusions about the beautiful game – whether it’s post-hoc reasoning, the availability heuristic, small sample size errors, context-less observation, or simply conflating correlation with causation. It’s easy to go astray. Statistical thinking is usually counter-intuitive and hard. Constant vigilance is needed! Look, I’m no different: I’m sure thinking errors are lurking in stuff I’ve written here and elsewhere.

Rather than lament our cognitive blindness, throw up my palms, babble on like Piers Morgan, and give up on the hunt for meaning, I think it’s important to start somewhere as you begin to examine your cognitive blindspots. Sure, the blindspots will inevitably creep back into your thinking, but at least you’ve begun the lengthy process of noticing and reflecting on them, then trying to correct for them.

With this spirit leading the way, let’s dive in with two analytical tools that – in a macro, league-wide sense – can begin to help us pick out clubs that are over- and under-performing, or simply, that are lucky or unlucky. Fortunately, our sample size (eight matchdays) is just large enough for us to use these two tools for examining the Bundesliga league table.

TSR

“Total Shots Ratio,” or TSR for short, is a ratio that quantifies a club’s on-pitch dominance. Basically, TSR is the share of all shots in a club’s matches that the club itself takes: shots taken divided by shots taken plus shots conceded. The idea is that more dominant clubs take more shots than they concede. Hence, a more dominant club will usually have a higher TSR number. To paraphrase James Greyson, who pioneered this statistic in football, the higher a club’s TSR, the more this club is controlling the ball in matches. As Greyson also explains, TSR’s value is its predictive ability – especially the relationship between TSR and final table points. Basically, a high TSR correlates pretty strongly with a high final table point tally. It seems obvious, but that’s the point: there’s value in exploring whether the obvious really is meaningful or is a product of a cognitive delusion.

So what’s TSR composed of? Let’s take a look. Putting the elements of the ratio together produces the following formula for calculating TSR:

TSR = Total Shots Taken / (Total Shots Taken + Total Shots Conceded)

Next, let’s take a look at a sample club with real numbers to give you a feel for how TSR works. I’ll select Bayern, the current Bundesliga table leader:

Bayern’s TSR = 191 (total shots taken) / [191 (total shots taken) + 65 (total shots conceded)]

Putting these numbers together gives us a TSR of 0.705 for Bayern. As you’ll see in a second, only one other club in the Bundesliga has a TSR greater than 0.7: Dortmund. Through Matchday 8, the average Bundesliga TSR is 0.499, while the statistical median of the 18 clubs is 0.503.
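If you’d like to check the arithmetic yourself, here is a minimal Python sketch of the formula. The function name `total_shots_ratio` is my own invention for illustration, and the inputs are the Augsburg and Gladbach shot tallies quoted later in this piece. (Run through the same function, the 191 and 65 quoted for Bayern come out nearer 0.746 than 0.705, so those two inputs may belong to a different row of the table.)

```python
def total_shots_ratio(shots_for: int, shots_against: int) -> float:
    """Share of all shots in a club's matches that the club itself took."""
    total = shots_for + shots_against
    if total == 0:
        raise ValueError("no shots recorded, TSR is undefined")
    return shots_for / total


# Shot tallies through Matchday 8, as quoted in this article.
augsburg = total_shots_ratio(129, 77)   # 129 taken, 77 conceded
gladbach = total_shots_ratio(102, 141)  # 102 taken, 141 conceded

print(f"Augsburg TSR: {augsburg:.3f}")
print(f"Gladbach TSR: {gladbach:.3f}")
```

A club that takes exactly as many shots as it concedes lands on 0.5, which is why the league clusters around that number.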

Getting the picture?

If so, here’s the TSR of all 18 Bundesliga clubs, listed by table position (1-18), with each club’s point total in parentheses:

  1. Bayern – 0.705 (20)
  2. Dortmund – 0.746 (19)
  3. Leverkusen – 0.505 (19)
  4. Gladbach – 0.419 (13)
  5. Hannover – 0.504 (13)
  6. Hertha Berlin – 0.516 (12)
  7. VfB Stuttgart – 0.514 (11)
  8. Schalke – 0.441 (11)
  9. Werder Bremen – 0.347 (11)
  10. Hoffenheim – 0.518 (10)
  11. Mainz – 0.498 (10)
  12. FC Augsburg – 0.626 (10)
  13. E. Frankfurt – 0.502 (9)
  14. VfL Wolfsburg – 0.528 (9)
  15. Hamburg SV – 0.458 (8)
  16. 1.FC Nürnberg – 0.336 (5)
  17. SC Freiburg – 0.334 (4)
  18. E. Braunschweig – 0.423 (4)

So if TSR is a symptom of dominance, then Bayern and BVB are obviously way ahead of the pack. Notice the 0.2+ difference between Bayern/BVB and Leverkusen, their closest table rival. Otherwise, FCA has the next-highest TSR at 0.626.

Here’s the scatter plot for the TSRs and table positions for the Bundesliga clubs:

TSR and table positions for Bundesliga clubs through Matchday 8.

I didn’t want to clutter the chart with data labels, so if you don’t see your club, simply match its table position (from the list above) with the corresponding blue marker on the chart.

As you can see, most clubs are scattered close to the 0.5 line (recall that 0.499 is the mean), including 3rd place Leverkusen and 5th place Hannover. So by TSR figures alone, not a picture of dominance for these two clubs! However, a picture of dominance does emerge from Bayern and BVB’s position on the chart. Simply put, these two powerhouses take many more shots than they concede. Indeed, in case you were wondering, BVB leads both the Bundesliga and Europe (yes!) in shots per match.

For my purposes, I’ve also highlighted some other clubs on our chart. First, take a look at poor Gladbach’s below average TSR (0.419). Even so, the Foals are sitting at 4th in the league, despite their mediocre number. What gives? Well, Gladbach has taken 102 shots (12.8 per match), while conceding 141 (17.6 per match). Not pretty. These tallies already raise the ugly specter of “luck” for the Foals, and lead me to predict a possible downward drift for them, if their TSR stays like this. Uh-oh. Put another way, conceding more shots than you create doesn’t bode well over the larger sample size of a whole season.

Next, look at Bremen. Like Gladbach, Bremen concedes more shots than it creates, and has the 3rd worst TSR in the Bundesliga (that’s 1.FCN and SCF you see next to Eintracht Braunschweig on our chart). However, Bremen is sitting squat in the damn middle of the league table. This data spectacle should confirm your suspicion (sorry Nik W.!) – a suspicion I’ve heard voiced on various Bundesliga media outlets – that Bremen are clearly not a mid-table side. More likely, they really are a relegation candidate, as some people have feared. After all, only 1.FCN and SCF have (slightly) worse TSR numbers than Bremen.

Okay, FC Augsburg is next. Talk about dominance! This club has the 3rd best TSR in the Bundesliga. Delightful. (By the way, notice that VfL Wolfsburg – at 14th in the table – isn’t too far behind, TSR-wise.) It appears that FCA has been dominant, but unlucky in terms of results. Indeed, FCA has taken 129 shots so far (16.1 per match), while conceding only 77 shots (9.6 per match) – good for 3rd place in the Bundesliga in both categories. For those of you cheering FCA on this season, you should take heart. The club seems to be underperforming, given its shooting dominance and control of matches. The tea leaves are looking good for Augsburg.

Finally, lil Eintracht Braunschweig. Sure, the club’s TSR (0.423) isn’t great – it’s below average – yet it’s not as bad as other clubs, like Gladbach (!), Bremen, or SC Freiburg. Faint hope, folks. Faint hope. You have to wonder to what extent Braunschweig has simply been unlucky. Hopefully for Braunschweig, as the season’s sample size grows, the newly promoted club will find some results that a bit more closely match its TSR number.
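One rough way to put a number on these mismatches is to compare each club’s table position with where it would rank on TSR alone. The sketch below does that with the figures from the list above. The “table position minus TSR rank” gap is my own back-of-the-envelope device, not an established metric, and the names `TABLE` and `rank_gaps` are invented for illustration.

```python
# (club, TSR) in table order 1-18, copied from the list in this article.
TABLE = [
    ("Bayern", 0.705), ("Dortmund", 0.746), ("Leverkusen", 0.505),
    ("Gladbach", 0.419), ("Hannover", 0.504), ("Hertha Berlin", 0.516),
    ("VfB Stuttgart", 0.514), ("Schalke", 0.441), ("Werder Bremen", 0.347),
    ("Hoffenheim", 0.518), ("Mainz", 0.498), ("FC Augsburg", 0.626),
    ("E. Frankfurt", 0.502), ("VfL Wolfsburg", 0.528), ("Hamburg SV", 0.458),
    ("1.FC Nürnberg", 0.336), ("SC Freiburg", 0.334), ("E. Braunschweig", 0.423),
]


def rank_gaps(table):
    """Map each club to (table position - TSR rank).

    Negative: the club sits higher in the table than its shots suggest.
    Positive: the club sits lower in the table than its shots suggest.
    """
    by_tsr = sorted(table, key=lambda row: row[1], reverse=True)
    tsr_rank = {club: rank for rank, (club, _) in enumerate(by_tsr, start=1)}
    return {
        club: position - tsr_rank[club]
        for position, (club, _) in enumerate(table, start=1)
    }


gaps = rank_gaps(TABLE)
for club, gap in sorted(gaps.items(), key=lambda item: item[1]):
    print(f"{club:<16} {gap:+d}")
```

The two ends of the printed list are the clubs whose table position and shot numbers disagree the most, which is where I’d look first for luck, good or bad.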

A word of caution: while there’s a strong correlation between TSR and final table position, you can’t simply assume this correlation operates like a fatalistic force in which “fairness” (i.e. a table position that is “supposed” to correlate with a club’s TSR) wins the day and Eintracht Braunschweig is not the worst team in the league. That’s not how things work. Instead, there’s a likelihood – not a guarantee! – that Gladbach and Bremen will drift down the table, while FCA drifts up the table. A likelihood. Why? Because random stuff happens all the time in football (“Football, bloody hell”). Balls deflect off legs, heads, and woodwork, corners do/don’t connect, Hummels lashes out and earns a red card, etc. Stuff happens. And I’d argue that without this randomness, football would not be nearly as fun and compelling to watch.

Otherwise, just calculate your TSRs. And don’t even bother to watch any matches.

However, at the very least, we can form some predictions (another fun thing about sport, especially during “break” weeks like this) based on what TSR tells us about dominance and controlling matches.

Put your betting money on FCA, folks! (But don’t blame me, because “football, bloody hell.”)

PDO

Our next statistical tool was invented by a hockey analyst, Brian King, as a way to flush out which clubs are lucky or unlucky. James Greyson (his name again!) is the football analyst who applied the concept to football matches. Richard Whittall at the Counter Attack blog has popularized the concept and even includes PDO in the EPL league table. What’s fascinating about this statistic is that it heavily regresses to the mean, even after a relatively small number of matches. Unlike TSR, PDO isn’t a ratio but a sum: it adds a club’s shooting% (SH%) to its save% (SV%). The formula is simple, and is expressed like this: PDO = (SH% + SV%) * 1000. In case you’re wondering, SH% is simply goals/shots on target, while SV% is saves/shots on target conceded. One more thing: since every shot on target across the league ends up as either a goal for one club or a save for another, the league-wide SH% and SV% sum to 1, so the league average PDO is necessarily 1 (or 1000, once multiplied by 1000). So keep 1000 in your mind as the mean.

So let’s take an example to illustrate how these numbers work. Again, I’ll select the table-leader, Bayern. Ready?

Remember that PDO = (SH% + SV%) * 1000.

First, Bayern’s SH% (goals/shots on target) is actually below the league average: 24%. Next, Bayern’s SV% (saves/shots on target conceded) is a league-best 84%.

Okay, adding SH% and SV% together and multiplying the sum by 1000 produces a PDO of 1084 for Bayern (the rounded percentages above give roughly 1080). A bit above average.
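For anyone who wants to plug in their own club’s numbers, here is a small Python sketch of the PDO formula. The function name `pdo` and the choice to pass percentages as fractions are my own assumptions. The inputs are the rounded league-average and Gladbach percentages quoted later in this piece, so the results only approximate the PDO figures in the list, which were presumably calculated from unrounded values.

```python
def pdo(shooting_pct: float, save_pct: float) -> float:
    """PDO from SH% and SV% given as fractions (0.34 means 34%)."""
    return (shooting_pct + save_pct) * 1000


# Rounded percentages as quoted in this article.
league_average = pdo(0.34, 0.66)  # league-average SH% and SV%
gladbach = pdo(0.48, 0.72)        # Gladbach's SH% and SV%

print(f"League average PDO: {league_average:.0f}")
print(f"Gladbach PDO (from rounded inputs): {gladbach:.0f}")
```

Gladbach comes out a few points away from the listed 1192, which is about the size of error you’d expect from rounding both percentages to whole numbers.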

Now that you’ve got the concept, here’s a list of PDO numbers for all 18 Bundesliga clubs (again, listed according to table position):

  1. Bayern – 1084
  2. Dortmund – 920
  3. Leverkusen – 1097
  4. Gladbach – 1192
  5. Hannover – 917
  6. Hertha Berlin – 1078
  7. VfB Stuttgart – 1099
  8. Schalke – 952
  9. Werder Bremen – 1021
  10. Hoffenheim – 1035
  11. Mainz – 915
  12. FC Augsburg – 819
  13. E. Frankfurt – 981
  14. VfL Wolfsburg – 808
  15. Hamburg SV – 1094
  16. 1.FC Nürnberg – 1065
  17. SC Freiburg – 1071
  18. E. Braunschweig – 826

You’ll quickly notice that PDO doesn’t correlate with table position (e.g. BVB has a well below average PDO of 920, yet is 2nd on the table, thanks, in part, to a dominating TSR). Also, it’s worth noting that Gladbach has the highest PDO (1192), while VfL has the lowest (808).
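To go beyond eyeballing, you can compute a plain Pearson correlation between each metric and points using the two lists above. This is only a sketch on an eight-matchday snapshot of 18 clubs, not a test of the season-long relationship, and the names `ROWS` and `pearson` are mine.

```python
from math import sqrt

# (TSR, PDO, points) in table order 1-18, copied from the two lists in this article.
ROWS = [
    (0.705, 1084, 20), (0.746, 920, 19), (0.505, 1097, 19),
    (0.419, 1192, 13), (0.504, 917, 13), (0.516, 1078, 12),
    (0.514, 1099, 11), (0.441, 952, 11), (0.347, 1021, 11),
    (0.518, 1035, 10), (0.498, 915, 10), (0.626, 819, 10),
    (0.502, 981, 9), (0.528, 808, 9), (0.458, 1094, 8),
    (0.336, 1065, 5), (0.334, 1071, 4), (0.423, 826, 4),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


tsr, pdo_values, points = zip(*ROWS)
r_tsr = pearson(tsr, points)
r_pdo = pearson(pdo_values, points)

print(f"TSR vs points: r = {r_tsr:.2f}")
print(f"PDO vs points: r = {r_pdo:.2f}")
```

On these 18 rows the TSR figure comes out clearly stronger than the PDO one, which fits the idea that shot dominance tracks results while PDO mostly tracks luck.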

Here’s a scatter plot of all 18 clubs:

PDO and table positions for Bundesliga clubs through Matchday 8.

This time, I highlighted the largest (Gladbach) and smallest (VfL) PDO numbers in red. Inside these bookends you have everything else. I also highlighted BVB and FCA, since their PDOs are notable, especially given their TSR numbers. In case you’re wondering, the league average SH% is 34% and the league average SV% is 66%.

Let’s start with BVB. The league’s 2nd place team has a woeful PDO. What gives? Well, BVB’s SV% (63%) is slightly below the league average and their SH% (28%) is a fair bit below the league average. The SH% is interesting, because BVB often has the reputation – whether deserved or not – as a club that is “wasteful” with its scoring chances. Perhaps. But maybe some of the waste is also bad luck (i.e. hitting the woodwork, fluky saves, etc.) in front of goal. With a bit more luck, BVB would probably – and perhaps easily – be the Bundesliga table leaders, especially given their dominant TSR. Nonetheless, BVB is still dominant enough to currently stake out 2nd place in the league table.

Next is Gladbach. Remember the Foals’ suspiciously low TSR? (That is, they concede more shots than they create.) Well, the club has a suspiciously high PDO – the highest in the league. For one thing, Gladbach has an above average SV% (72%) and the 3rd highest SH% (48%!) in the Bundesliga. In this context, a high SH% is interpreted as an indicator of luck, although other metrics are being developed that evaluate the quality of shots in relation to game states, for example. Anyhow, Gladbach’s PDO number is another reason a bit of a slide down the table seems likely (not certain!) for the Foals. In summary, Gladbach isn’t very dominant in controlling matches and creating shots, and also seems lucky to be scoring on nearly half of their “on target” shots.

Third, we have FC Augsburg’s below average PDO (819, one of the worst in the league). Also recall that FCA had the league’s 3rd best TSR. What should we make of these numbers? FCA seems to be unlucky. Indeed, the club is far below average on SH% (24%) and SV% (58%). Do they need a keeper? (Only Hoffenheim has a worse SV%.) I dunno. PDO, by itself, doesn’t provide enough information or context for making individual personnel decisions. However, since PDO numbers tend to heavily regress toward the mean, it’s reasonable to expect that FCA’s pitiable SV% will improve. After all, the club has shown it’s pretty dominant on the pitch (i.e. its TSR number), so movement up the Bundesliga table seems likely for FCA.

Finally, poor VfL. Rotten luck. The league’s worst PDO. (But also one of the highest TSR numbers!) It breaks down like this: a below average SH% (only 22%!) and a below average SV% (59%). Like FCA, Wolfsburg smells of bad luck and it’s likely you’ll see some (positive!) regression to the mean with VfL’s SH% and SV%. Furthermore, VfL has been one of the more dominant Bundesliga clubs, according to its TSR. Expect some movement up the league table from Wolfsburg. It’s reasonable. Don’t be tricked by the club’s current lowly 14th place in the table!

Conclusion

Hopefully, these two analytical tools (TSR and PDO) can help you peel back the surface layers of results, points, GD, and table position. Of course, the final table is all that matters in a teleological sense for any football league. However, how the final table emerges is no simple matter. TSR and PDO offer some explanatory power in accounting for both over- and under-performing clubs. For example, we’ve learned that a club can be dominant, but unlucky (FCA and VfL), or that a club isn’t very dominant, but lucky (Gladbach).
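If you want to sort clubs into those buckets yourself, a rough sketch might look like the one below. The cut-offs (a TSR of 0.5 and a PDO of 1000, i.e. roughly the league averages) are my own simplifying assumption, and the name `classify` is invented for illustration. Clubs sitting close to either line shouldn’t be read too strongly.

```python
# (TSR, PDO) for a few clubs, copied from the lists in this article.
CLUBS = {
    "Bayern": (0.705, 1084),
    "Dortmund": (0.746, 920),
    "Gladbach": (0.419, 1192),
    "FC Augsburg": (0.626, 819),
    "VfL Wolfsburg": (0.528, 808),
}


def classify(tsr: float, pdo: float, tsr_cut: float = 0.5, pdo_cut: float = 1000) -> str:
    """Label a club by shot dominance (TSR) and apparent luck (PDO).

    The default cut-offs are assumed league averages, not official thresholds.
    """
    dominance = "dominant" if tsr > tsr_cut else "not dominant"
    luck = "lucky" if pdo > pdo_cut else "unlucky"
    return f"{dominance}, {luck}"


labels = {club: classify(tsr, pdo) for club, (tsr, pdo) in CLUBS.items()}
for club, label in labels.items():
    print(f"{club:<14} {label}")
```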

I’ll keep tabs on these numbers for the Bundesliga season – I’m curious to see how the final results shake out for FCA, VfL, and Gladbach. To me, TSR and PDO are engines catalyzing the narratives of football clubs in a broad sense – outside the craziness and emotional investment of weekly football matches.

Thanks to Jimmy Coverdale and BSports Football for providing some of the data used to calculate TSR and PDO.

Travis serves as an editor and regular columnist here. Born and groomed in Santa Fe, New Mexico, Travis is a college English instructor in Pittsburgh. Coffee, books, and coaching the U6s are his passions. His writing has also appeared in Bloomberg Sports, the Good Man Project, and his former blog, Sportisourstory.tumblr.com, and elsewhere. He tweets at @tptimmons. Heja BVB!
