Monday, 24 June 2019

The world in crisis: it’s not what we think

Posted by Thomas Scarborough

The real danger is an explosion - of Big Data

We lived once with the dream of a better world: more comfortable, more secure, and more advanced.  Political commentator Dinesh D’Souza called it ‘the notion that things are getting better, and will continue to get better in the future’.  We call it progress.  Yet while our world has in many ways advanced and improved, we seem unsure today whether the payoff matches the investment.  In fact, we all feel sure that something has gone peculiarly wrong—but what?  Why has the climate turned on us?  Why is the world still unsafe?  Why do we still suffer vast injustices and inequalities?  Why do we still struggle, if not materially, then with our sense of well-being and quality of life?  Is there anything in our travails which is common to all, and lies at the root of them all?

It will be helpful to consider what it is that has brought us progress, which in itself may lead us to the problem.  There have been various proposals:  that progress is of the inexorable kind; that it is illusory and rooted in the hubristic belief that earlier civilisations were always backward; or that it results from our escape from blind authority and appeal to tradition.  Yet above all, progress is associated with the liberating power of knowledge, which now expands at an exhilarating pace on all fronts.  ‘The idea of progress,’ wrote the philosopher Charles Frankel, ‘is peculiarly a response to ... organized scientific inquiry’.

Further, science, within our own generation, has quietly entered a major new phase, which began around the start of the 21st Century.  We now have big data: extremely large data sets which may be analysed computationally.

Now when we graph the explosion of big data, we interestingly find that this (roughly) coincides on two axes with various global trends—among them increased greenhouse gas emissions, sea level rise, economic growth, resource use, air travel—even increased substance abuse, and increased terrorism.  There is something, too, which seems more felt than it is demonstrable.  A great many people sense that modern society burdens us—more so than it did in former times.

Why should an explosion of big data roughly coincide—even correlate—with an explosion of global travails?

On the one hand, big data has proved beyond doubt that it has many benefits.  Through the analysis of extremely large data sets, we have found new correlations to spot business trends, prevent diseases, and combat crime—among other things.  At the same time, big data presents us with a raft of problems: privacy concerns, interoperability challenges, the problem of imperfect algorithms, and the law of diminishing returns.  A major difficulty lies in the interpretation of big data.  Researchers Danah Boyd and Kate Crawford observe, ‘Working with Big Data is still subjective, and what it quantifies does not necessarily have a closer claim on objective truth.’  Not least, big data depends on social sorting and segmentation—mostly invisible—which may have various unfair effects.

Yet apart from the familiar problems, we find a bigger one.  The goal of big data, to put it very simply, is to make things fit.  Production must fit consumption; foodstuffs must fit our dietary requirements and tastes; goods and services must fit our wants and inclinations; and so on.  As the demands for a better fit increase, so the demand for greater detail increases.  Advertisements are now tailored to our smallest, most fleeting interests, popping up at every turn.  The print on our foodstuffs has multiplied, even to become unreadable.  Farming now includes the elaborate testing and evaluation of seeds, pesticides, nutrients, and so much more.  There is no end to this tendency towards a better fit.

The more big data we have, the more we can tailor any number of things to our need:  insurances, medicines, regulations, news feeds, transport, and so on.  However, there is a problem.  As we increase the detail, so we require great energy to do it.  There are increased demands on our faculties, and on our world—not merely on us as individuals, but on all that surrounds us.  To find a can of baked beans on a shop shelf is one thing.  To have a can of French navy beans delivered to my door in quick time is quite another.  This is crucial.  The goal of a better fit involves enormous activity, and stresses our society and environment.  Media academic Lloyd Spencer writes, ‘Reason itself appears insane as the world acquires systematic totality.’  Big data is a form of totalitarianism, in that it requires complete obedience to the need for a better fit.

Therefore the crisis of our world is not primarily that of production or consumption, of emissions, pollution, or even, in the final analysis, over-population.  It goes deeper than this.  It is a problem of knowledge—which now includes big data.  This in turn rests on another, fundamental problem of science: it progresses by screening things out.  Science must minimise unwanted influences on independent variables to succeed—and the biggest of these variables is the world itself.

Typically, we view the problems of big data from the inside, as it were—the familiar issues of privacy, the limits of big data, its interpretation, and so on.  Yet all these represent an enclosed view.  When we consider big data in the context of the open system which is the world, its danger becomes clear.  We have screened out its effects on the world—on a grand scale.  Through big data, we have over-stressed the system which is planet Earth.  The crisis which besets us is not what we think.  It is big data.

The top ten firms leveraging Big Data in January 2018: Alphabet, Amazon, Microsoft, Facebook, Chevron, Acxiom, National Security Agency, General Electric, Tencent, Wikimedia (Source: Data Science Graduate Programs).

Sample graphs. Red shade superimposed on statistics from 2000.


Keith said...

The essay indicates, Thomas, that the two charts show the correlation between the upswing of ‘big data’ and the upswing of diverse, disquieting ‘global trends’. The two curves are very alike, as if they might precisely match if one curve were to overlay the other. (By the way, did you consider adding labels along the y-axis of each graph, to indicate what’s actually being measured in each case: values for the surge in big data and the surge in global trends?) Were the two curves created by charting actual data? Or are the curves meant only to be loosely graphically depictive of the correlations you discuss? I assume the latter, but was just curious, for clarification in how I should interpret the supporting graphs.

docmartincohen said...

It's an interesting hypothesis, Thomas! But I can't quite see the problem as 'big data'. Take global warming or sea level rise. The claim is that temperatures and sea levels are rising and are about to rise a lot more. But this claim is based on a theory about the workings of the atmosphere - not produced 'ex nihilo' as it were. Big data only comes in to SUPPORT the theorising.

Secondly, the claim that knowledge is exponentially increasing on all fronts I think is also very debatable. The number of scientific papers increases steadily, I think not exponentially, but more careful reports note that most of what is published is reversing (or at least revising) previous publications and claims.

Keith said...

This is an interestingly original take on big data, Thomas. Truth be told, however, I’m an advocate of big data. Rather than worsening the kinds of global conditions you list in the top chart, I personally see big data as indispensable to their fix. Issues like longer-term artificial intelligence, editing of human genes, and climate change, to name just a few looming challenges, present so much complexity that I don’t see them being understood and able to lead us to informed decision-making without the aid of big data. I suggest that the bigger the data sets, along with better analytical methods and tools, the better we can analyze the complex initial conditions of such challenges, the many alternative paths along which these issues may travel, how our decisions and actions may influence initial conditions and outcomes, and probability-based forecasting.

What would make all those growing data sets all the more productive is breakthroughs in, for example, quantum computing and its readiness for prime time, massively increasing processing power beyond just what supercomputers can dish up. I don’t think we should shrink from things like ‘interoperability challenges’ and ‘imperfect algorithms’, as I see them as fixable. Also, I see big data as being at least one of the solutions to, as you say, science ‘progressing by screening things out’; the bigger the data sets, the less that scientists will have to do that. Besides, by my way of thinking, the greater danger in decisions foiling solutions to the global conditions you list, such as sea rise, may lie more in science denialism and specious science found among some policymakers than in science being driven by big data.

Thomas O. Scarborough said...

With regard to your comment, Martin, I have dropped a graph of scientific papers into the samples (above). This roughly coincides with the explosion of data.

One can say that data is ephemeral, yet not if its purpose is to make things fit: production to consumption, and so on. The daily calculations to make that happen are staggering. Take eBay alone, a drop in the ocean, which uses 87.5 petabytes (1 petabyte = 10^15 bytes) for search, consumer recommendations, and merchandising. This is not mere computation. It is calculating better fits, and making these happen involves enormous activity, which stresses our society and environment. By comparison, all this virtually didn't exist in 2000.

Thomas O. Scarborough said...

With regard to your first comment, Keith, I have added nine graphs (above) as samples, all from reputable sources, e.g. Nature and Science. I have superimposed red shade on statistics from 2000. Some graphs are limited in scope, and there will surely be counter-examples. So my original graphs are, as you say, 'loosely depictive'. But I think there is a good case to be made. I briefly surveyed a few hundred graphs in writing the post.

While information storage data is not synonymous with big data, this exploded, too, from about 2000. 2002 marks the beginning of the 'digital age'. From that year, more than 50% of data was stored digitally. Or take, as an example, the world's effective capacity to exchange information through telecommunication networks: 2.2 exabytes in 2000, and an estimated 667 exabytes in 2014.

Thomas O. Scarborough said...

Thank you, Keith, for your second comment. Seen as an enclosed system, big data has great benefits. However, seen as an open system -- open to the world -- it is monumentally dangerous. The problems of the enclosed system will be solved. My focus is: 'apart from the familiar problems, we find a bigger one'. This is the vast scale on which big data seeks to find better fits -- between goods and services, wants and inclinations, and so on. Lyotard called it efficiency. Yet Lyotard was unable to see what this would look like, taken to extremes. This is what we are beginning to see today.

Thomas O. Scarborough said...

Externalities. This is something I feel I would add to my post. Big data is not interested in those things which lie beyond its algorithms. If it is not in the algorithms, it tends to be ignored, or merely incidental.
