A Blog by Jonathan Low


Jan 15, 2018

Why Improving Data Quality Has Become a Strategic Imperative

Data is only as good as the people and systems that create and enter it. And the data on that suggests organizations have a lot of work to do: in a data-driven economy, the financial and operational costs of bad data are significant. JL

Thomas Redman reports in MIT Sloan Management Review:

Bad data is the norm. Nearly 50% of newly created data records have critical errors. Knowledge workers waste up to 50% of their time dealing with data quality issues; for data scientists, this may go as high as 80%. The consequences are mistakes, errors in operations, bad decisions, bad analytics, and bad algorithms. Indeed, “big garbage in, big garbage out” is the new “garbage in, garbage out.” Only 16% of managers fully trust the data they use. The cost of bad data is 15% to 25% of revenue for most companies.
Getting in front on data quality presents a terrific opportunity to improve business performance. Better data means fewer mistakes, lower costs, better decisions, and better products. Further, I predict that many companies that don’t give data quality its due will struggle to survive in the business environment of the future.
Bad data is the norm. Every day, businesses send packages to customers, managers decide which candidate to hire, and executives make long-term plans based on data provided by others. When that data is incomplete, poorly defined, or wrong, there are immediate consequences: angry customers, wasted time, and added difficulties in the execution of strategy. You know the sound bites — “decisions are no better than the data on which they’re based” and “garbage in, garbage out.” But do you know the price tag to your organization?
Based on recent research by Experian plc, as well as by consultants James Price of Experience Matters and Martin Spratt of Clear Strategic IT Partners Pty. Ltd., we estimate the cost of bad data to be 15% to 25% of revenue for most companies (more on this research later). These costs come as people accommodate bad data by correcting errors, seeking confirmation in other sources, and dealing with the inevitable mistakes that follow.
Fewer errors mean lower costs, and the key to fewer errors lies in finding and eliminating their root causes. Fortunately, this is not too difficult in most cases. All told, we estimate that two-thirds of these costs can be identified and eliminated — permanently.
In the past, I could understand a company’s lack of attention to data quality because the business case seemed complex, disjointed, and incomplete. But recent work fills important gaps.
The case builds on four interrelated components: the current state of data quality, the immediate consequences of bad data, the associated costs, and the benefits of getting in front on data quality. Let’s consider each in turn.

Four Reasons to Pay Attention to Data Quality Now

The Current Level of Data Quality Is Extremely Low

A new study that I recently completed with Tadhg Nagle and Dave Sammon (both of Cork University Business School) looked at data quality levels in actual practice and shows just how terrible the situation is.
We had 75 executives identify the last 100 units of work their departments had done — essentially 100 data records — and then review that work’s quality. Only 3% of the collections fell within the “acceptable” range of error. Nearly 50% of newly created data records had critical errors.
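To make that review concrete, here is a minimal sketch in Python of the kind of check described above, run on made-up records. The two-defect "acceptable" cutoff and the error counts are illustrative assumptions, not figures from the study.

from dataclasses import dataclass
from typing import List

@dataclass
class Record:
    record_id: str
    critical_errors: int  # critical errors found when the work was reviewed

def review_collection(records: List[Record], acceptable_defects: int = 2) -> dict:
    # Flag every record with at least one critical error, then summarize
    # the collection as a whole. The acceptable_defects cutoff is assumed
    # for illustration; the study does not spell out its threshold here.
    defective = [r for r in records if r.critical_errors > 0]
    return {
        "records_reviewed": len(records),
        "records_with_critical_errors": len(defective),
        "defect_rate": len(defective) / len(records),
        "acceptable": len(defective) <= acceptable_defects,
    }

# Hypothetical sample of the last 100 units of work, with roughly half
# carrying an error, mirroring the "nearly 50%" finding above.
sample = [Record(f"R{i:03d}", critical_errors=1 if i % 2 == 0 else 0) for i in range(100)]
print(review_collection(sample))

Run against a sample like that, the collection fails the acceptability check by a wide margin, which is the fate of 97% of the collections the executives reviewed.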
Said differently, the vast majority of data is simply unacceptable, and much of it is atrocious. Unless you have hard evidence to the contrary, you must assume that your data is in similar shape.

Bad Data Has Immediate Consequences

Virtually everyone, at every level, agrees that high-quality data is critical to their work. Many people go to great lengths to check data, seeking confirmation from secondary sources and making corrections. These efforts constitute what I call “hidden data factories” and reflect a reactive approach to data quality. Accommodating bad data this way wastes time, is expensive, and doesn’t work well. Even worse, the underlying problems that created the bad data never go away.
One consequence is that knowledge workers waste up to 50% of their time dealing with mundane data quality issues. For data scientists, this number may go as high as 80%.
A second consequence is mistakes, errors in operations, bad decisions, bad analytics, and bad algorithms. Indeed, “big garbage in, big garbage out” is the new “garbage in, garbage out.”
Finally, bad data erodes trust. In fact, only 16% of managers fully trust the data they use to make important decisions.
Frankly, given the quality levels noted above, it is a wonder that anyone trusts any data.

When Totaled, the Business Costs Are Enormous

Obviously, the errors, wasted time, and lack of trust that are bred by bad data come at high costs.
Companies throw away 20% of their revenue dealing with data quality issues. This figure synthesizes estimates provided by Experian (worldwide, bad data costs companies 23% of revenue), Price of Experience Matters ($20,000 per employee lost to bad data), and Spratt of Clear Strategic IT Partners (16% to 32% of effort wasted dealing with data). The total cost to the U.S. economy: an estimated $3.1 trillion per year, according to IBM.
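For a sense of how those three estimates combine into the roughly 20% figure, here is a back-of-the-envelope sketch in Python. The company revenue and headcount are invented for illustration, and treating Spratt's wasted-effort range as a share of revenue is a simplifying assumption.

# Hypothetical company; these numbers are made up for illustration only.
revenue = 50_000_000   # annual revenue, USD
employees = 250        # headcount

# Estimates cited above.
experian_share = 0.23               # Experian: ~23% of revenue
spratt_share = (0.16 + 0.32) / 2    # Spratt: midpoint of the 16%-32% wasted-effort range
price_per_employee = 20_000         # Price: ~$20,000 per employee

estimates = {
    "Experian (23% of revenue)": experian_share * revenue,
    "Spratt (midpoint of 16%-32%, applied to revenue)": spratt_share * revenue,
    "Price ($20,000 per employee)": price_per_employee * employees,
}

for label, cost in estimates.items():
    print(f"{label}: ${cost:,.0f} ({cost / revenue:.0%} of revenue)")

blended = sum(estimates.values()) / len(estimates)
print(f"Blended estimate: ${blended:,.0f} ({blended / revenue:.0%} of revenue)")

With these invented inputs, the blended figure lands near 19% of revenue, consistent with the 15% to 25% range cited earlier.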
The costs to businesses of angry customers and bad decisions resulting from bad data are immeasurable — but enormous.
