Don’t Let Dirty Data Derail You

November 1, 1998
The problem with today’s computer systems is that they’re repositories for fool’s gold. There’s as much bad information as there is good lurking in all those databases, spreadsheets and contact lists. The 1985 movie Brazil illustrated the logical extreme of all this: Some poor slob (Jonathan Pryce) has his life ruined because a drop of water lands on a government printout and alters his name inside the computer. After his identity is confused with that of a known criminal, played by Robert De Niro, he is summarily assassinated.

As far as I know, I’ve never been on anyone’s hit list. But like most people, I’ve had my run-ins with inaccurate data. Several years ago, after I applied for a business-related credit card, the bank issued the piece of plastic with the words "business expense" imprinted directly on it. No biggie, I thought. Heck, let them print "Have a nice day" and a smiley face on it if that makes them happy. What do I care?

Unfortunately, it didn’t take long before this seemingly innocuous string of 15 characters began mutating. After only a few months, junk mail began piling into my mailbox with the name "Bus E. Greengard." At first glance, I thought it was simply mail sent to the wrong address. Then it hit me: somebody or some system had transformed the words into a new name in the database. As the credit card company brokered my name to other companies, the mail proliferated like bacteria in a tropical rainforest.

And that was just the beginning. Soon, my new moniker began further evolving into a half-dozen or more variants, including "Buse Spense" and "Bus E. Gree." Offers for computer software, business conferences and an array of other products and services began landing on my desktop. One mailing offered a pre-approved credit card from a well-known bank. Remarkably, they couldn’t get my name right, but they wanted to hand me a $10,000 credit line.

Bad data can create huge setbacks for a company.
Bad data are endemic in our society—worse than the common cold and tired political clichés. And don’t think for a second that human resources isn’t part of the problem. These days, dirty data are everywhere: embedded in employee records, recruiting systems, pension and benefits records, payroll systems, you name it.

The Hackett Group, a consulting firm in Hudson, Ohio, estimates that 2 to 4 percent of all data is wrong when it first enters a computer. And that’s just on the front end. At many firms, various databases are chronically out of sync and data are altered as they’re transferred from one system to another. What’s more, employees can access the wrong database and obtain old or incorrect information.

"Bad data can put a company at a competitive disadvantage," says Greg Hackett, president of the Hackett Group. "A company pays a huge penalty for not having the right information. It can affect growth, undermine strategies, increase costs and lead to mistakes, such as hiring or promoting the wrong person. Human resources is one of the most vulnerable of all departments, especially in today’s tight labor market."

Hackett isn’t just another guru mouthing off about HR’s shortcomings. He’s got the numbers to back up his argument. The Hackett Group found that the average company has 9.1 HR systems per 1,000 employees, while the top 25 percent of companies have only 1.9 systems per 1,000 employees. While the typical organization shells out $114 per employee per year for HR systems, top-tier companies spend only $47 per employee.

What drives the cost equation is the "fundamental complexity" of the systems, Hackett says. The more difficult it is to manage systems, and the more systems there are to manage, the greater the odds that data errors creep into the picture, and systems become more expensive to operate.

Multiple databases require special attention.
It’s not difficult to imagine all the ways data can become mangled and misused. An HR representative might type the wrong employee address into the HRMS. Or disparate human resources systems might display different data because updates between systems don’t occur regularly enough. A scanner might misread a résumé and fail to capture an applicant’s specific skills. Then when a hiring manager searches for a person with specific qualifications, the most qualified candidate falls through the electronic cracks. "It happens all the time," warns Hackett.

Not surprisingly, the biggest headaches occur at companies that rely on multiple HR databases. In some cases, informal "shadow" systems—surveys or records stored in a spreadsheet or unofficial database—wind up providing a higher level of accuracy than "official" systems. The HR department might conduct a study and amass accurate data about an issue, such as 401(k) participation. But nobody bothers to check whether the official HRMS data is correct.

This problem is common at firms that organize around specific business units and use separate IT systems for each of those units. Managers often tap into these local systems to obtain the data they need, not realizing more current and accurate data exist somewhere else within the organization.
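For readers who want to see what checking the official HRMS against a shadow system might look like, here is a minimal sketch. Everything in it is an assumption made for illustration: the employee IDs, the field names and the sample records are hypothetical, and a real comparison would read from the actual systems rather than from hand-typed dictionaries.

```python
def find_mismatches(official, shadow, fields):
    """Compare two record sets keyed by employee ID.

    Returns (employee_id, field, official_value, shadow_value) tuples
    for every listed field on which the two systems disagree.
    """
    mismatches = []
    # Only employees present in both systems can be compared field by field.
    for emp_id in sorted(official.keys() & shadow.keys()):
        for field in fields:
            official_value = official[emp_id].get(field)
            shadow_value = shadow[emp_id].get(field)
            if official_value != shadow_value:
                mismatches.append((emp_id, field, official_value, shadow_value))
    return mismatches


def missing_from_official(official, shadow):
    """Return employee IDs that appear in the shadow system only."""
    return sorted(shadow.keys() - official.keys())


# Hypothetical sample data: an official HRMS and a 401(k) survey spreadsheet.
official_hrms = {
    "1001": {"address": "12 Elm St", "enrolled_401k": False},
    "1002": {"address": "9 Oak Ave", "enrolled_401k": True},
}
benefits_survey = {
    "1001": {"address": "12 Elm St", "enrolled_401k": True},
    "1002": {"address": "9 Oak Ave", "enrolled_401k": True},
    "1003": {"address": "4 Pine Rd", "enrolled_401k": True},
}

if __name__ == "__main__":
    for row in find_mismatches(official_hrms, benefits_survey,
                               ["address", "enrolled_401k"]):
        print("Disagreement:", row)
    print("Not in official HRMS:",
          missing_from_official(official_hrms, benefits_survey))
```

The sketch only reports where the two sources disagree. Deciding which one is right still takes a person who knows the data.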

Whether the problem is due to human error or a system meltdown, the end result can be devastating. The corporate world is rife with examples of companies that have hired or laid off too many people, built factories in the wrong places and miscalculated production. Within the HR department, errors can also translate into delayed paychecks, medical insurance nightmares and hiring fiascos. Although it’s impossible to ignore the fact that people, not computers, make bad decisions, it’s also clear that corporate processes and decision-making are inextricably tied to data.

Says Richard W. Lewis, vice president of business performance management at Stamford, Connecticut-based consulting firm META Group: "Without ensuring the quality of the data, it's impossible to achieve any significant results. All the technology becomes a huge waste of time and resources."

Use human intervention to flag bad data.
Solving the problem isn’t an easy task. The first step toward a solution is to understand that data problems almost always center on underlying processes rather than technology. As Hackett puts it: "Technology on top of bad processes only lets you do all the wrong things at the speed of light."

That’s not to suggest technology isn’t important. It’s the facilitating tool. The most successful companies have tightly integrated databases, and work off common platforms and standards. They use both software and human intervention to flag bad data, and they usually force various departments and divisions to adopt standard data entry and reporting tools.
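As a rough illustration of pairing software with human intervention, the sketch below applies a few simple rules to each record and sets aside anything suspicious for a person to review. The rules, the field names and the sample records are assumptions invented for this example, not a description of any particular company's system.

```python
import re

# Assumed rule: a U.S. ZIP code is 5 digits, optionally followed by -4 digits.
ZIP_PATTERN = re.compile(r"^\d{5}(-\d{4})?$")


def flag_record(record):
    """Return a list of human-readable problems found in one record."""
    problems = []

    name = str(record.get("name") or "").strip()
    if not name:
        problems.append("name is missing")
    elif len(name.split()) < 2:
        problems.append("name has only one part")

    if not ZIP_PATTERN.match(str(record.get("zip") or "")):
        problems.append("ZIP code is not 5 or 9 digits")

    salary = record.get("salary")
    if not isinstance(salary, (int, float)) or salary <= 0:
        problems.append("salary is missing or not positive")

    return problems


def review_queue(records):
    """Pair each suspicious record with its problems for a human reviewer."""
    queue = []
    for record in records:
        problems = flag_record(record)
        if problems:
            queue.append((record, problems))
    return queue


if __name__ == "__main__":
    # Hypothetical records; the first one passes every automated rule.
    sample = [
        {"name": "Bus E. Greengard", "zip": "90210", "salary": 50000},
        {"name": "Buse", "zip": "9021", "salary": 0},
    ]
    for record, problems in review_queue(sample):
        print(record["name"], "->", "; ".join(problems))
```

Note that "Bus E. Greengard" sails through these checks, because the name is well formed even though it is wrong. Rules like these catch malformed entries; a plausible-looking mistake still needs a human to spot it.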

Even if it’s impossible to link the entire corporation, it’s often possible to link systems within HR and reduce the complexity of data flow within the department. The intranet explosion of the last few years is certainly no fluke. Along with employee self-service, intranets have put data management in the hands of employees and managers who are far more likely to accept responsibility for accuracy than a clerk pounding a keyboard hour after hour, day after day. A common set of standards for inputting, storing and viewing data leaves far less room for errors, problems and incompatibilities.

It’s easy to see how a messed up mailing list can hit the marketing department squarely in the pocketbook, but it’s not always as obvious how HR is affected—especially since it’s difficult to measure opportunity costs. Today, a company’s most important asset isn’t raw materials or finished goods. And despite what human resources likes to think, it’s probably not people, either. It’s accurate information. An organization’s ability to grow, compete and embrace change lies in how it uses data, information and knowledge. It’s impossible to make the right decisions when you’re up to your eyeballs in dirty data.

Workforce, November 1998, Vol. 77, No. 11, pp. 107-108.