Information overload calls for curative maintenance
Businesses and government agencies collect large volumes of data to extract the information they need. Whatever the subject under discussion, data is the central underlying theme.
Investments in technology are rife with promise and low on yield. Organisations feel this squeeze. I even observe a regression: new technology and more data are deteriorating the level of organisational intelligence. Confusion grows instead of clarity on how to act on the information gathered.
This article investigates how unrealistic expectations of what technology can deliver lead to a misjudgement of the value of data. It also suggests what can be done to curate a valuable set of information.
A flood of information
We collect more information than ever before. Sensors are installed because it is easy to do, every customer contact triggers an automated survey, every click and every movement on a website is analysed, and cameras go up in stores. We know people do not behave rationally, so we try to influence their decision-making process.
This flood of information has set emotions free. Social media have become an echo chamber of discontent and plausible-sounding claims. The filtering of those emotions, whether by editorial desks, journalists, the analyst desks of asset managers and banks, or scientists, has been displaced by direct access to unfiltered information for anyone.
Information gathered from a business perspective, to support decision-making, is influenced by the same human emotions. We perceive those emotions in a muted fashion because we stand at a greater distance from them.
It is time to evaluate the value of all this information. Collecting huge quantities of data does not deliver more information; it delivers less. The picture derived from the data is distorted by conflicting signals.
More data equates to more noise, which has made filtering information a daunting task. It is a flood of emotional stimuli in which the noise drowns out the information we are looking for.
Let’s just automate it
Automation offers only a limited solution. IT has made huge volumes of data available, but the technological means to assess the value of those volumes are lagging behind. The hype surrounding AI is rooted in the conviction, or the hope, that filtering out emotions can be automated.
Unfortunately, that is not what is happening. In most cases AI amplifies emotions: bias in the data is reflected in bias in the outcomes of the algorithms.
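To make that concrete, here is a deliberately naive sketch, with fabricated data, of how a skew in historical decisions is faithfully reproduced by an algorithm trained on them:

```python
# A deliberately naive "model": approve at the approval rate observed per
# neighbourhood in past decisions. All data is fabricated for illustration.
from collections import defaultdict

historical = [
    ("north", True), ("north", True), ("north", True), ("north", False),
    ("south", True), ("south", False), ("south", False), ("south", False),
]

# "Training": learn the historical approval rate per neighbourhood.
counts = defaultdict(lambda: [0, 0])  # neighbourhood -> [approved, total]
for neighbourhood, approved in historical:
    counts[neighbourhood][0] += approved
    counts[neighbourhood][1] += 1

def approval_score(neighbourhood: str) -> float:
    approved, total = counts[neighbourhood]
    return approved / total

# The model does not correct the skew in its training data; it repeats it.
print(approval_score("north"))  # 0.75
print(approval_score("south"))  # 0.25
```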
An algorithm delivers a result, but a human has to do something with that result. When humans write a report, they deliver an interpretation along with the analysis. A computer does not interpret. AI is getting better at dealing with context, but that is limited to discovering patterns in data and classifying data against a context enclosed in the data itself. AI cannot interpret data in the context of the question being asked.
The challenge of processing context
Different types of context can be distinguished when using information.
First, there is the context of how the information was gathered. What is its source? For what purpose was it collected? How trustworthy is the source? Is reference information available against which to judge the data?
Second, there is the context of use. What question do we seek an answer to? How did we derive the information? Which assumptions and interpretations did we make when projecting the information onto our question?
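As an illustration only, the sketch below shows how both kinds of context could travel with a dataset as explicit metadata; the field names are my own invention, not an established standard:

```python
from dataclasses import dataclass, field

@dataclass
class GatheringContext:
    """Context on how the information was gathered."""
    source: str               # where the data comes from
    purpose: str              # why it was originally collected
    trustworthiness: str      # e.g. "audited", "self-reported", "scraped"
    reference_data: str = ""  # dataset to judge this data against, if any

@dataclass
class UseContext:
    """Context on how the information is used."""
    question: str             # the question we seek an answer to
    derivation: str           # how the information was derived
    assumptions: list[str] = field(default_factory=list)

@dataclass
class CuratedDataset:
    name: str
    gathering: GatheringContext
    use: UseContext
```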
The difficulty lies in the interpretation of information. We try to deploy computers and algorithms to interpret data for us, but this is where algorithms are very blunt instruments. Our current level of AI is rudimentary, and progress is slow. AI-based algorithms are masterful at pattern recognition and at shuffling the building blocks of patterns into new results. Deepfake video is impressive technology, but it is a world away from interpreting data in the context of a question being asked. Why does AI fall short here?
Answering the question 'what action do we take based on this information?' is never unambiguous and requires consensus. Computers have no concept of consensus. Consensus is a human trait, and one that is extremely difficult to capture in software.
The moment we reach consensus, and we want to repeat the outcome and distribute that consensus along with the information, we try to enclose it in our software: in data models, in algorithms, and in data transformations that we program in ETL tools or in languages like Python or Scala.
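A trivial sketch of what such enclosed consensus looks like in practice: an agreed-upon definition, with invented thresholds, hard-coded into a Python transformation step, where later readers no longer see the debate behind it:

```python
# Fabricated example: the organisation has agreed, after long debate, on
# what counts as an "active" customer. That consensus now lives as a
# hard-coded rule inside the transformation.
ACTIVE_THRESHOLD_ORDERS = 3  # the negotiated cut-off

def classify_customer(orders_last_year: int) -> str:
    """Encodes the agreed definition of an active customer."""
    return "active" if orders_last_year >= ACTIVE_THRESHOLD_ORDERS else "dormant"

customers = [("Acme", 5), ("Bolt", 1), ("Cygnus", 3)]
print([(name, classify_customer(n)) for name, n in customers])
# [('Acme', 'active'), ('Bolt', 'dormant'), ('Cygnus', 'active')]
```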
Context and the art of prediction
Predictive models are extrapolations of observed past behaviour. That behaviour is never captured in all its aspects in the data; choices are made beforehand about which part of it will be captured. Leaving aside the ethical question of whether 100% capture of people's behaviour is even desirable, 100% capture would not deliver a better prediction.
The factors that drive decision-making within a behaviour pattern are what matter; the rest is noise. To determine which factors those are, you need to perform a great deal of analysis of the context in which the behaviour occurs. The work of a data scientist is to determine which distinct contexts exist, which behavioural factors are important, and which can be ignored.
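A minimal sketch of that filtering step, with fabricated data and an arbitrary cut-off: score candidate factors by their correlation with the outcome and discard the ones that are mostly noise. Real feature selection is far more involved than this.

```python
# Score candidate factors by |correlation| with the outcome and keep only
# the ones above a threshold. All numbers are fabricated for illustration.
from statistics import correlation  # Python 3.10+

outcome = [0, 1, 1, 0, 1, 0, 1, 1]  # e.g. "did the customer buy?"
factors = {
    "visits":       [1, 4, 5, 2, 6, 1, 5, 4],            # tracks the outcome
    "time_on_page": [30, 45, 50, 20, 60, 25, 55, 40],    # also informative
    "shoe_size":    [42, 38, 44, 41, 39, 43, 40, 42],    # pure noise
}

RELEVANCE_THRESHOLD = 0.5  # arbitrary cut-off for this illustration

for name, values in factors.items():
    r = abs(correlation(values, outcome))
    verdict = "keep" if r >= RELEVANCE_THRESHOLD else "ignore (noise)"
    print(f"{name:12s} |r| = {r:.2f} -> {verdict}")
```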
How do you determine what is important? This, again, is a matter of human interpretation and of reaching consensus on the outcomes of (statistics-based) analysis. Within companies you can observe the same phenomenon as on social media: the visibility of individual opinions on the outcomes of the multitude of analyses performed drives us either to respect every opinion and keep everyone on board, or to splinter into islands of opinion that launch conflicting actions within the organisation.
Toward curative maintenance of information
I advocate scarcity in the information provisioned. Scarcity stimulates human creativity and makes us inventive; scarcity leads to the resolution of problems. With an abundance of information we become indifferent and weary. It is a bit like pruning the fruit trees in your garden to keep them productive.
Conscious filtering of information focuses you on eliminating noise. This implies that the main attention shifts to interpreting information, and that this attention has to lie with the users of that information. Trust is the antidote to the illusion of control through hoarding data. Trust has been championed for a long time, but we act in the opposite way.
The data industry will be unable to keep bearing the weight of the monstrosity it has created. I am not being dystopian here; you won't hear me claim that computers will otherwise come to rule people. It is quite simple: information overload makes us unhappy.
Conclusion
Data is important to every organisation, and the attention given to the subject is justified. Connectivity and cheap computing power in the cloud have caused structural changes in how we organise business processes and have blurred the boundaries of the company.
Today, the focus of most organisations is still on acquiring information sources they previously had no access to. This is driven by the technological impact of the last two decades and the resulting market value of the FAANG companies.
We are on the brink of the next phase: the curative maintenance of an organisation's information household, to regain the effectiveness that is slowly crumbling under the wave of data. Curative maintenance means that an organisation:
- Refrains from holding on to every kind of data collected, to avoid the veil of confusion that now obscures decision-making processes;
- Pays attention to the context of use, and less to often fruitless data-definition efforts;
- Clarifies how decisions come about, which makes the requirements placed on information transparent.