Following on from yesterday's post, here's another take on TdF speed trends post WWII:
It should be pretty self-explanatory. Each year's speed and distance are shown, colour coded by decade so it's easy to see the general trends. The tour has been getting progressively shorter since it recommenced after WWII, and speeds have in general been rising.
So when someone points out that speeds are increasing and wants to assign a cause, there are of course a myriad of possible reasons, but one of them is clearly an overall reduction in distance ridden. Even so, one needs to be careful when seeking to assign any single causal factor, e.g. doping, to this relationship.
The idea for this chart was stolen from a post Robert Chung presented on Stack Exchange examining the TdF speed trends. Robert's original post and charts can be found here:
It's a good read and goes into a bit more depth as well as examining the trend line and residuals and why it's not so smart to immediately jump to conclusions about causal relationships.
Year to year variation, and possibly "era" to era variation, is influenced by many things. The parcours is the most obvious example, with some tours being more mountainous than others. Other influences include better/lighter/more aero equipment, doping, better training and preparation, more dedicated focus on the tour, better pay attracting better athletes overall, general weather/environmental conditions (e.g. warm and dry vs cold and wet), changes in race strategy and tactics, and so on.
The data I used comes from the Tour de France online archive.
Here's another way to view the same data, which plots the same average speed trend line as in yesterday's chart, overlaid with the trend in race distance:
And for the sake of completeness (of stealing Robert Chung's plots that is), here are the residuals of speed on distance by year:
This plots how far above or below the speed v distance trend line the actual race speed is for that year. Also shown is a 5-year moving average of the residuals, so a general drift above or below the trend line can be seen. In other words, if there were some causal factor (e.g. doping) in the 1990s and 2000s that resulted in above trend speeds, then we'd also need to explain the above trend speeds in the late 1950s and early 1960s as well.
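For anyone wanting to reproduce that residual chart, here's a minimal sketch of the arithmetic: fit a straight line of speed on distance, take each year's actual speed minus the fitted speed, then smooth with a 5-year moving average. The function names are my own, the numbers in the example are made up purely to show the mechanics (they are not Tour data), and I've assumed a trailing window for the moving average, which may not match the chart exactly.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def residuals(distances, speeds):
    """Actual speed minus the speed predicted from distance, per year."""
    intercept, slope = fit_line(distances, speeds)
    return [s - (intercept + slope * d) for d, s in zip(distances, speeds)]


def moving_average(values, window=5):
    """Trailing moving average (an assumption; a centred window is equally
    plausible). Entries without a full window behind them are None."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(mean(values[i + 1 - window : i + 1]))
    return out


if __name__ == "__main__":
    # Made-up illustrative values, NOT real Tour de France figures.
    distances_km = [4800, 4600, 4500, 4200, 4000, 3900, 3600, 3500]
    speeds_kmh = [32.0, 33.5, 33.0, 35.5, 36.0, 35.0, 38.5, 39.0]
    res = residuals(distances_km, speeds_kmh)
    print([round(r, 2) for r in res])
    print(moving_average(res))
```

A positive residual means that year was faster than the distance alone would predict, a negative one slower. The residuals say nothing about why.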
When I looked at this yesterday, it was to point out some logical fallacies presented in a Facebook post I saw, i.e. that the 2016 tour was faster than Armstrong's 2000 tour, and of course the (fallacious) logic that this implied doping was as bad as back then.
Well, I thought it useful to examine that non-sequitur and example of cherry picking data to suit a narrative.
For a start, yes, the 2016 tour was faster than the one in 2000. Just, by 0.05 km/h. But it was also the fourth slowest tour since 1998, and only the 16th fastest since WWII.
So as a case of cherry picking, it was a poor effort. Once you look at all the data, the comparison is placed in better context.
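Putting one year in context like this is just a ranking exercise. Here's a rough sketch of how one might do it, assuming the speeds are held in a year-to-speed dictionary. The function name `speed_rank` and the sample numbers are hypothetical (the numbers are invented, not taken from the archive), and ties are not handled.

```python
def speed_rank(speeds_by_year, year, since=None, fastest_first=True):
    """Rank of `year` by average speed among all years >= `since`.

    Rank 1 is the fastest year (or the slowest if fastest_first=False).
    Ties are not treated specially in this sketch.
    """
    pool = {y: s for y, s in speeds_by_year.items()
            if since is None or y >= since}
    ordered = sorted(pool, key=pool.get, reverse=fastest_first)
    return ordered.index(year) + 1


if __name__ == "__main__":
    # Invented example values, NOT real Tour de France speeds.
    speeds = {2000: 39.0, 2001: 40.0, 2002: 38.0, 2003: 41.0}
    print(speed_rank(speeds, 2000))                       # nth fastest overall
    print(speed_rank(speeds, 2000, fastest_first=False))  # nth slowest overall
    print(speed_rank(speeds, 2003, since=2002))           # nth fastest since 2002
```

Changing `since` is exactly how the same year can look fast against one window and slow against another, which is why picking the window after the fact is cherry picking.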
Cherry picking is bad enough, but the non-sequitur was that the average speed tells you something about the doping status of the winner. It doesn't. In other words we really can't infer much either way about doping of riders in general, let alone an individual, from such data.
And while I'm at it, here's a chart plotting the trend in average stage length, which has been steadily dropping. It's similar to the trend in total stage distance but there are slight variations as the number of stages varies between 20 (on many occasions) and 25 (in 1987).
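The average stage length is simply total distance divided by the number of stages, which is why it can drift away from the total distance trend when the stage count changes. A tiny sketch, with made-up distances (not archive figures) and a hypothetical function name:

```python
def average_stage_length(total_km, stages):
    """Average stage length in km: total race distance / number of stages."""
    if stages <= 0:
        raise ValueError("stages must be a positive number")
    return total_km / stages


if __name__ == "__main__":
    # Same invented total distance, different stage counts.
    print(average_stage_length(3500, 20))
    print(average_stage_length(3500, 25))
```

Two tours of identical total distance end up with quite different average stage lengths if one has 20 stages and the other 25.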