Statistics are a big part of my business as a journalist. I love being able to show that railways are far safer than other land-based modes of travel by citing widely accepted metrics such as deaths per billion kilometres. I also enjoy using the rather remarkable fact that, despite all the hype about privatisation being more efficient, over the past twenty years the cost of running the railway per passenger mile has remained virtually unchanged since the days of British Rail.

I have in the past enjoyed debunking statistics. In my book on the Public Private Partnership on the London Underground, for example, I discovered that some of the formulae for paying the company were based on multiplying numbers to half a dozen decimal places. That was the point at which I realised the whole process was completely insane, as such complex formulae cannot possibly reflect a business relationship accurately.

There are two kinds of statistics. There are some, like the railway safety figures mentioned above, which are firmly rooted in clear facts. Then, there are those which use estimates or assumptions that can be questioned. The projected cost of HS2 is a classic example, with estimates varying widely and then being used by either side of the debate to back their prejudices. Mixing these two types of figure can lead to completely misleading statements.

This is by way of saying that I do not agree with the old adage, popularised by Mark Twain, that there are three kinds of lies: ‘lies, damned lies and statistics’. Statistics are an important way of getting to the nub of political debate and no public discourse is possible without them, but we live in a world of fake news and, indeed, of blatant lying by politicians. It is not statistics themselves but rather their misuse, and the analysis that goes into them, that can justify Twain’s famous statement.

This is to provide some context for the release by the Rail Safety and Standards Board (RSSB) of research which purported to show that ‘the risk of Covid-19 infection [was] less than 0.01 per cent on an average journey’, and that even taking into account the risk of infection, rail remained safer than travelling by car. This received considerable publicity. It was presented as good news for the rail industry and it seemed to back up what I had said numerous times in this column – that the risk of rail travel had been exaggerated. I gleefully tweeted out references to the research.

However, alerted by *Rail* reader Chris Barker, I started taking a more careful look at what the report was claiming and I immediately became rather more concerned about precisely what it proved. The researchers started from the point mentioned above that rail travel is far safer than other modes: ‘On safety alone, for an individual traveller per kilometre travelled, the car is 25 times less safe than rail. Cycling is 403 times, walking is 456 times, and travelling by motorcycle is 1,620 times less safe’. No problem there.

Then, however, it gets much less clear. In order to assess the risk of catching Covid from a train journey, a wide variety of assumptions had to be made. Indeed, the formula is worth reproducing just to demonstrate the breadth of assumptions that have gone into it:

‘The modified formula we use is:

$$\text{Infection risk per contact} = s \times \beta \times I \times \bigl(\sigma + (1 - \sigma) \times \delta + (1 - \sigma) \times (1 - \delta) \times \mu\bigr)$$

Where:

s = The proportion of the population susceptible to Covid-19.

β = The chance of infection per contact given that one person in the contact has the disease.

I = The proportion of people in the population infected with Covid-19.

σ = Proportion of cases that are asymptomatic.

δ = Proportion of time that an infection which eventually shows symptoms is presymptomatic.

μ = Proportion of persons showing symptoms and not self-isolating.’

There are lots of assumptions required here. The RSSB’s explanatory paper on the methodology, which is a stunning 18 pages long, provides various sources for choosing particular values for these variables. For example, the value given to μ is 30 per cent because one source suggests 20 per cent and another 40 per cent. Indeed, every one of these assumptions relies on previous research whose accuracy is more an estimate than a hard fact.
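To see how the pieces multiply together, here is a minimal Python sketch of the formula as quoted. Only the 30 per cent for μ (and the 20 and 40 per cent either side of it) comes from the RSSB paper as described above. Every other input value is a hypothetical placeholder chosen purely for illustration, not a figure from the research, and the function name is my own.

```python
def infection_risk_per_contact(s, beta, I, sigma, delta, mu):
    """Infection risk per contact, following the formula quoted above."""
    # One reading of the bracketed term: the share of infected people who are
    # still out and about. That is asymptomatic cases, plus the presymptomatic
    # time of cases that later show symptoms, plus symptomatic time spent
    # not self-isolating.
    circulating = sigma + (1 - sigma) * delta + (1 - sigma) * (1 - delta) * mu
    return s * beta * I * circulating


# HYPOTHETICAL inputs, except mu = 0.30 which is the value cited in the text.
inputs = dict(s=0.9, beta=0.05, I=0.01, sigma=0.3, delta=0.4, mu=0.30)
base = infection_risk_per_contact(**inputs)

# Vary mu across the two source values mentioned (20 and 40 per cent).
for mu in (0.20, 0.30, 0.40):
    risk = infection_risk_per_contact(**{**inputs, "mu": mu})
    print(f"mu = {mu:.2f}: risk per contact = {risk:.6f} "
          f"({risk / base:.0%} of the mu = 0.30 result)")
```

Because the formula is a straight product, any error in s, β or I feeds through to the result in exact proportion, while errors in σ, δ and μ feed through via the bracketed term.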

Then the researchers had to assess the time that people were at risk in different situations with a further ten variables such as: station layout; platform depth; platform access points relative to train stopping points; train type; seating configuration; and so on. The researchers realised that it was impossible to create simulations for all types of journeys and therefore ‘Our simulations use simplified assumptions concerning types of train services and different station layouts’. Fair enough but that does mean, essentially, that there is some guesswork around what constitutes the average time at risk on a particular journey.

Further assumptions were made, such as assuming that all passengers wear face masks and that this reduces the risk of transmission by 56 per cent, a weirdly precise figure. Three train types were chosen and the train was assumed to move for thirty minutes, after which half the passengers would alight and be replaced by the same number.

OK that’s enough, dear reader, as space precludes further detail but suffice to say that the researchers admit that there are quite a lot of unknown unknowns which may affect the calculation, such as not knowing how the difference between an ordinary room in a house and a large railway carriage may precisely affect the likelihood of catching the disease.

After all these calculations and references to numerous other studies, the result was that the risk of car travel was still greater than travelling on a train during the pandemic despite the fact that, according to the research, there is a 1 in 11,000 chance of catching Covid on an hour’s train journey. However, the risk of car travel was now found to be just 1.14 times greater than the risk of rail travel for the same distance. Excellent, one might think, until it is explained that this means merely a 14 per cent increased risk of using a car, rather than a 2,400 per cent difference pre-Covid. That does not make such happy reading. Or frankly make for a convincing argument that justifies the media headlines. Or justify my naive retweets.

The key finding, in fact, is that train travel is far more risky now than it was previously, about 22 times more so since it is now virtually the same as travelling by car. Change the metrics slightly, such as assuming there are a few more people on every train, and the numbers can shift dramatically. Indeed, if any of the variables listed above are changed to make the risk slightly higher, the 14 per cent would quickly become 0 per cent or even a negative number, implying that rail travel is more dangerous than travelling by car. Then the headlines would have been very different.
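The fragility of that 14 per cent can be checked with a few lines of arithmetic. The sketch below uses only the two ratios quoted from the report (25 and 1.14). It rests on a simplification of my own: that the risk of car travel is unchanged by Covid, so the whole shift comes from the Covid risk added to rail. The function and variable names are hypothetical.

```python
PRE_COVID_CAR_VS_RAIL = 25.0   # report: car is 25 times less safe than rail
WITH_COVID_CAR_VS_RAIL = 1.14  # report: car is 1.14 times riskier with Covid


def percent_more_risky(ratio):
    """Convert 'x times as risky' into 'per cent more risky'."""
    return (ratio - 1.0) * 100.0


def car_vs_rail(covid_scale, rail_covid_risk):
    """Car-to-rail risk ratio, with pre-Covid rail accident risk set to 1.

    ASSUMPTION (mine, not the report's): car risk is unaffected by Covid.
    covid_scale multiplies the estimated Covid risk on rail.
    """
    return PRE_COVID_CAR_VS_RAIL / (1.0 + covid_scale * rail_covid_risk)


# Total rail risk with Covid, in units of the pre-Covid rail accident risk.
rail_total = PRE_COVID_CAR_VS_RAIL / WITH_COVID_CAR_VS_RAIL
# The part of that total attributable to Covid.
rail_covid_risk = rail_total - 1.0
# How much the Covid estimate must grow before car and rail are level.
break_even_scale = (PRE_COVID_CAR_VS_RAIL - 1.0) / rail_covid_risk

print(f"Car advantage lost: {percent_more_risky(WITH_COVID_CAR_VS_RAIL):.0f} "
      "per cent more risky than rail")
print(f"Rail risk is {rail_total:.1f} times its pre-Covid level")
print(f"Break-even if the Covid estimate rises by a factor of "
      f"{break_even_scale:.3f}")
```

On these assumptions, raising the estimated Covid risk on rail by roughly 15 per cent is enough to wipe out rail's advantage altogether, which is well within the spread of the input assumptions described earlier.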

I accept that risk assessments have to be made and I greatly support their use. However – and this is a major counter-argument – this is the sort of research that gives fuel to those who subscribe to Twain’s adage. This has been an attempt to find a number for something that is really unquantifiable.

Like everyone in the industry, I am desperate to find evidence that train travel is safe during the pandemic. Indeed, I reckon the estimate of 1 in 11,000 is far too high but that is merely a gut feeling. Research from other countries has indeed suggested that few, if any, clusters have their source in public transport use. I do not blame the RSSB for trying but perhaps it would have been better to accept that there were just too many assumptions to provide a reliable figure.

This highlights two wider issues. First, there should be a much greater reluctance to accept figures that result from complicated research that relies on many assumptions. The other clear example of this is when consultants say that HS2 will bring £60 billion of benefits to the economy over the next 30 years. This sort of assessment is little better than guesswork – you only have to consider how the arrival of Covid has changed projections.

Secondly, for the past 40 years, since I have been a journalist, there has been an increased use of consultants and business analysts to try to back up particular ideas or schemes using similarly complex models. This attempt at quantification should be reversed. Again, using HS2 as an example, it is a megaproject that will undoubtedly have widespread impacts. Looking at those holistically, rather than trying to work out spuriously precise figures, is the way forward. The same applies to the response to Covid. Let’s get people back on rail, but not through unconvincing research.