Monday, 13 July 2020

Staring Statistics in the Face

By Thomas Scarborough

George W. Buck’s dictum has it, ‘Statistics don’t lie.’ Yet the present pandemic should give us pause. The statistics have been grossly at variance with one another.

According to a paper in The Lancet, statistics ‘in the initial period’ estimated a case fatality rate, or CFR, of 15%. Then, on 3 March, the World Health Organisation announced, ‘Globally, about 3.4% of reported COVID-19 cases have died.’ By 16 June, however, an epidemiologist was quoted in Nature as saying, ‘Studies ... are tending to converge around 0.5–1%’ (now estimating the infection fatality rate, or IFR).

Indeed, it is not as simple as all this—but the purpose here is not to side with any particular figures. The purpose is to ask how our statistics could be so wrong. Wrong, rather than, shall we say, slanted. The statistical errors have been of a magnitude that is hard to believe. A two-fold error would be an enormity, let alone a ten-fold or twenty-fold one, or more.

The statistics, in turn, have had major consequences. The Lancet rightly observes, ‘Hard outcomes such as the CFR have a crucial part in forming strategies at national and international levels.’ This was borne out in March, when the World Health Organisation added to its announcement of a 3.4% CFR, ‘It can be contained—which is why we must do everything we can to contain it’. And so we did. At that point, human activity across the globe—sometimes vital human activity—came to a halt.

Over the months, the figures have been adjusted, updated, modified, revised, corrected, and in some cases, deleted. We are at risk of forgetting now. The discrepancies over time could easily escape our attention, when we should be staring them in the face.

The statistical errors are a philosophical problem. Cambridge philosopher Simon Blackburn points out two problems with regard to fact. Fact, he writes, 'may itself involve value judgements, as may the selection of particular facts as the essential ones'. The first of these problems is fairly obvious. For example, ‘Beethoven is overrated’ might seem at first to represent a statement of fact, where it really does not. The second problem is critical. We select facts, yet do so on a doubtful basis.

Facts do not exist in isolation. We typically insert them into equations, algorithms, models (and so on). In fact, we need to form an opinion about the relevance of the facts before we even seek them out—learning algorithms not excepted. In the case of the present pandemic, we began with deaths ÷ cases × 100 = CFR. We may reduce this to the equation a ÷ b × 100 = c. Yet notice now that we have selected variables a, b, and c, to the exclusion of all others: say, x, y, or z.
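The point may be made concrete in a few lines of code. The sketch below computes the same equation, a ÷ b × 100 = c, twice, changing only the selected denominator b. The figures are purely illustrative (they are assumptions for the sake of the example, not real data), yet they show how the very same death count yields a headline rate that differs five-fold depending on which facts we select:

```python
def fatality_rate(deaths: float, denominator: float) -> float:
    """The simple equation from the text: deaths / denominator * 100."""
    return deaths / denominator * 100

# Illustrative numbers only, chosen for the example:
deaths = 1_000
confirmed_cases = 30_000        # b = cases confirmed by testing (CFR)
estimated_infections = 150_000  # b = estimated total infections (IFR)

cfr = fatality_rate(deaths, confirmed_cases)
ifr = fatality_rate(deaths, estimated_infections)

print(f"CFR: {cfr:.2f}%")  # prints "CFR: 3.33%"
print(f"IFR: {ifr:.2f}%")  # prints "IFR: 0.67%"
```

Nothing in the arithmetic is wrong in either case; what differs is the prior, subjective decision about which denominator counts as the ‘essential’ fact.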

What then gave us the authority to select a, b, and c? In fact, before we make any such selection, we need to 'scope the system'. We need to demarcate our enterprise, or we shall easily lose control of it. One cannot introduce any and every variable into the mix. Again, in the words of Simon Blackburn, it is the ‘essential’ facts we need. This in fact requires wisdom—a wisdom we cannot do without. In the words of the statistician William Briggs, we need ‘slow, maturing thought’.

Swiss Policy Research comments on the early phase of the pandemic, ‘Many people with only mild or no symptoms were not taken into account.’ This goes to the selection of facts, and reveals why statistics may be so deceptive. They are facts, indeed, but they are selected facts. For this reason, we have witnessed a sequence of events over recent months, something like this:
At first, we focused on the case fatality rate, or CFR.
Then we took into account the infection fatality rate, or IFR.
Then we took social values into account (which led to some crisis of thought).
Now we take non-viral fatalities into account (which begins to look catastrophic).
This is too simple, yet it illustrates the point. Statistics require the wisdom to tell how we should delineate relevance. Statistics do not select themselves. Subjective humans do it. In fact, I would contend that the selection of facts in the case of the pandemic was largely subconscious and cultural. It stands to reason that, if we have dominant social values, these will tend to come first in our selection process.

In our early response to the pandemic, we quickly developed a mindset—a mental inertia which prevented us from taking the most productive steps and following the most adaptive reasoning. Every tragic death reinforced this mindset, and distracted us. Time will tell, but today we generally project that far more people will die through our response to the pandemic than died from the pandemic itself—to say nothing of the suffering.

The biggest lesson we should be taking away from it is that we humans are not rational. Knowledge, wrote Confucius, is to know both what one knows, and what one does not know. We do not know how to handle statistics.


Keith said...

I think to your point, Thomas, models of projected infections, deaths, and other factors related to the Covid-19 pandemic have ranged far and wide over the months, subject to repeated major redialing.

Here, in my opinion, is why:

Initial conditions (unknown, poorly known, unknowable assumptions) + Myriad, repeatedly bifurcating paths along which initial conditions might unfold + Uncertainty and nonlinear behavior + Complexity and chaos theories = Inaccurate, shifting forecasts.

“Turtles all the way down.”

Modeling sounds impressive, but often rests on quicksand.

Thomas O. Scarborough said...

I think you see the issue, and the need for humility in dealing with facts. At the beginning of the pandemic, though, the errors seem to have entered at a simpler level than modelling, with CFR being a simple equation. At the time, someone wrote that there was no simpler equation than one which involved deaths.

I would be concerned, though, that to say it was initial conditions, bifurcating paths, and so on, may set us at risk of disguising the problem. We (humanity) were not in a position, on the basis of the facts, to reach a realistic or even true assessment, and the consequences of this may be fairly catastrophic.

Facts are deceptive. The selection of facts often rests on no firm basis. We scope the system, yet we fail to differentiate between subjective scoping and (in a best case) objective facts.

I remember as a boy, in the Central Pacific, when they tarred the first road on Tarawa. Immediately it began to melt and to bubble. The suppliers had omitted an 'essential fact', namely that the capital was hotter than other capitals they had supplied. Speaking of which, the recent buckling of roads in Ohio is likely a case of overlooking essential facts.

Martin Cohen said...

Yes, I do agree with our blogger here. And the virus does put into sharp relief these big issues about how much a fact is a fact, and how much it is a reflection of value judgements. In the specific example Thomas uses, 'a', or the number of deaths, is far from being straightforward. In the UK, and I think in many other countries, guidance for recording cause of death is novel for an illness, because it states that if someone has been tested and found to have the virus (they may have mild or no symptoms), and they then go on to die (they may have another long-term illness, or just be 'run over by a bus'), they MUST be recorded as dying with the virus. In a sense, of course, they have, but it means statistics of people dying OF the virus are blurred.

Other government advice even says that where people are not known to have the virus but have met someone who has it, or where they die of linked conditions like pneumonia, it is permitted and advised to record the death as being due to the COVID-19 virus.

So, yes, the facts look very shocking, and everyone is conditioned from school and college to treat, in the words of the Guardian newspaper slogan, "facts as sacred". Facts are NOT sacred, though, as we know here; they are provisional, deceptive, slippery. As a fine book on science put it, they have a half-life, decaying rapidly. Today's facts are tomorrow's fish-and-chip wrappings!
