By Janelle Sheetz
Data can be really useful for a business, opening up new possibilities and providing insight. But what happens when data that’s supposed to be helpful becomes a problem instead? The more data you collect, the more likely you are to pick up some bad data along the way, which can not only lead to unhappy customers and damage a business’s reputation but also hurt the bottom line. And business owners know how damaging errors can be in fields like health care.
A study by Experian Data Quality found that bad data affects 88 percent of organizations, leading to a revenue loss of up to 12 percent. Data analysis can get complicated, but fortunately, bad data can be both fixed and prevented.
Determine your needs
Data clean-up can be a one-time effort or an ongoing project. Decide which approach works best for your business, and take advantage of cleansing tools that can run automatically. Ideally, though, a one-time clean-up should be reserved for things like a migration to a new system or a specific campaign. Otherwise, failing to manage your data consistently can create a cycle of bad data followed by necessary clean-up.
To prevent accumulating even more bad data, make sure you not only have a set of controls in place, but that you’re actually using them consistently.
Sometimes ETL (extract, transform, load) systems have trouble processing data, especially when terms are similar, and the result is bad data. Fixes range from forcing a transformation to changing the data model so that bad or confusing data can’t be entered to begin with.
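As a rough illustration of "forcing a transformation," here is a minimal Python sketch of a transform step that maps similar-looking terms to a single canonical value and sets aside anything it doesn't recognize. The field name `state`, the mapping table, and both function names are assumptions made up for this example, not part of any particular ETL product.

```python
# Hypothetical mapping from variants seen in source data to one canonical term.
CANONICAL_STATES = {
    "pa": "PA",
    "penn": "PA",
    "pennsylvania": "PA",
    "ny": "NY",
    "new york": "NY",
}


def transform_state(raw):
    """Return the canonical code for a state value, or None if unrecognized."""
    key = (raw or "").strip().lower().rstrip(".")
    return CANONICAL_STATES.get(key)


def transform_rows(rows):
    """Split rows into clean rows and rejects that need human review."""
    clean, rejected = [], []
    for row in rows:
        state = transform_state(row.get("state"))
        if state is None:
            # Unknown term: keep it out of the clean data instead of guessing.
            rejected.append(row)
        else:
            clean.append({**row, "state": state})
    return clean, rejected
```

Rejected rows are returned rather than silently dropped, so someone can decide whether the mapping table needs a new entry.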
If bad data does arrive in the system, a newer technique called streaming analytics can clean it up as it comes in, turning it into usable data.
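The idea of cleaning records as they arrive, rather than in a later batch job, can be sketched with a plain Python generator. This is only a toy stand-in for a real streaming analytics platform, and the record fields (`name`, `email`) and the function name `clean_stream` are assumptions chosen for illustration.

```python
def clean_stream(records):
    """Yield cleaned records one at a time, as each one arrives."""
    for record in records:
        # Repair what can be repaired: stray whitespace and inconsistent case.
        email = (record.get("email") or "").strip().lower()
        name = " ".join((record.get("name") or "").split())
        # Skip what cannot be repaired: a missing name, or an email
        # that does not contain exactly one "@".
        if not name or email.count("@") != 1:
            continue
        yield {"name": name, "email": email}
```

Because the function is a generator, each record is cleaned the moment it is read, so downstream code only ever sees usable data.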
Determine what causes bad data
Finding and eliminating bad data is important, but so is stopping bad data before it starts, especially if you’ve already devoted time to fixing your existing data. Bad data can result from both human and technological error, such as typos in free-form text boxes that let users type anything. Instead, improve applications so no additional bad data can be entered. For example, change an interface so that it only accepts valid data.
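One way to picture an interface that only allows valid data is a validation check that runs before anything is saved. The Python sketch below assumes a hypothetical form with a country chosen from a fixed list and a U.S. ZIP code field; the allowed values, the pattern, and the function name are illustrative assumptions only.

```python
import re

# Hypothetical fixed choices, as a dropdown would offer instead of a text box.
ALLOWED_COUNTRIES = {"US", "CA", "MX"}
# Five digits, optionally followed by a hyphen and four more digits.
ZIP_PATTERN = re.compile(r"\d{5}(-\d{4})?")


def validate_entry(country, zip_code):
    """Return a list of problems; an empty list means the entry is valid."""
    errors = []
    if country not in ALLOWED_COUNTRIES:
        errors.append(
            "country must be one of: " + ", ".join(sorted(ALLOWED_COUNTRIES))
        )
    if country == "US" and not ZIP_PATTERN.fullmatch(zip_code.strip()):
        errors.append("zip_code must look like 12345 or 12345-6789")
    return errors
```

An application would refuse to store the entry until the returned list is empty, which keeps typos from ever reaching the database.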