

Handling Data I

On September 25, 2013, in Big Data, HR Technology, HRExaminer, Industry Analysis, by John Sumser



Over the next several weeks, we’re going to look at the problems and opportunities for using data in HR. The topic is related to the industry’s concern about big data. It’s the grungy part of the topic.

In many cases, the hardest part of making data give up its secrets is getting various chunks of it to talk to each other. Tens, maybe even hundreds, of billions of dollars go to the teams that perform this alchemy. (That’s hundreds of billions across all enterprise environments in the world, not just HR.) On-site teams are responsible for integration and for making it work. They juggle data.

It’s easier to see this on a molecular level.

Consider the recruiting department. Data flows in from all over the internet: individual resumes, references, background data, photos, job board output and input, interview notes, pictures, samples. Sometimes the data is structured (segmented into recognizable fields), sometimes not. To meet the requirements of various reviewers and interviewers, one of the recruiting department’s jobs is to figure out how to make the data look the same. (That sameness speeds up the various evaluation processes that make up recruiting.)

So how do you make the data more consistent?

If you’ve ever moved a file into or out of Excel, you’re familiar with the basics. When the input data is consistent, you make a map, one field at a time. It shows which field in the source document matches which field in the destination. If you’ve done much of this, you know that the work is granular, boring, repetitive and unforgiving.
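A field map of this kind can be sketched in a few lines of Python. This is only an illustration: the field names on both sides are hypothetical, and a real export will have its own.

```python
# Hypothetical source and destination field names, for illustration only.
FIELD_MAP = {
    "First Name": "first_name",
    "Last Name": "last_name",
    "E-mail": "email",
    "Phone #": "phone",
}

def map_record(source_row, field_map):
    """Rename source fields to destination fields and list anything unmapped."""
    mapped = {}
    unmapped = []
    for source_field, value in source_row.items():
        if source_field in field_map:
            mapped[field_map[source_field]] = value
        else:
            unmapped.append(source_field)
    return mapped, unmapped

row = {
    "First Name": "John",
    "Last Name": "Sumser",
    "E-mail": "john@example.com",
    "Fax": "",
}
mapped, unmapped = map_record(row, FIELD_MAP)
print(mapped)    # fields renamed for the destination
print(unmapped)  # source fields the map does not cover
```

Returning the unmapped fields is a deliberate choice here. A field that shows up without a mapping is usually the first sign that the source has changed.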

It would be nice if you could do the mapping once and forget about it.

Several things get in the way. In addition to the fact that the start and end points of the mapping can change (without warning), data is always inconsistent if it is entered more than once. Further, different systems make note of certain things in different ways.
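Dates are one common case of systems noting the same thing in different ways. The sketch below assumes three date formats that two hypothetical systems might use and converts them to one form. The format list is an assumption, not a survey of what real systems emit.

```python
from datetime import datetime

# Assumed formats; extend the list as new sources appear.
KNOWN_FORMATS = ["%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y"]

def normalize_date(text):
    """Return an ISO date string, or None if no known format matches."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None

for raw in ["2013-09-25", "09/25/2013", "25.09.2013", "Sept. 25th"]:
    print(raw, "->", normalize_date(raw))
```

Anything that comes back as `None` still needs a person to look at it. A value like 03/04/2013 parses without complaint but could mean March or April depending on the source, so the order of the format list is itself a decision someone has to make.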

Sticking with recruiting, let’s say you want to check whether a new set of candidate data is already in the applicant tracking system (ATS). Since regulatory compliance is brokered by the ATS, you probably don’t want to put any more data into it than you have to. You’re probably thinking that the problem is easy to solve. You just take the data from the two sources, merge it in an Excel sheet and then check for dupes.
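The "easy" version of that check might look like the following sketch. It assumes both sources have already been exported and mapped to the same fields; the field names and records are made up for illustration.

```python
def exact_key(record):
    """The fields that must match exactly for two records to count as the same."""
    return (record["first_name"], record["last_name"], record["email"])

def find_exact_duplicates(ats_records, new_records):
    """Return the new records whose key already exists in the ATS data."""
    existing = {exact_key(r) for r in ats_records}
    return [r for r in new_records if exact_key(r) in existing]

# Hypothetical records standing in for an ATS export and a sourcing file.
ats_records = [
    {"first_name": "John", "last_name": "Sumser", "email": "john@example.com"},
]
new_records = [
    {"first_name": "John", "last_name": "Sumser", "email": "john@example.com"},
    {"first_name": "Mary", "last_name": "Jones", "email": "mary@example.com"},
]

dupes = find_exact_duplicates(ats_records, new_records)
print(len(dupes), "duplicate(s) found")
```

The comparison is exact, character for character, which is what makes it simple. It also means everything depends on the two sources agreeing on how each value is written.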

It turns out that many popular ATS providers don’t make it easy to export data for an ad hoc analysis. So, just to get the data out of the ATS, you might need some esoteric data extraction tools. It may well be that one look at the challenge of getting the ATS data is enough to stop the project.

If you’re lucky enough to have a team that can scrape the data, or you happen to have a more permissive ATS, you end up with data in some form of rows and columns.

Sadly, not very many sourcing providers generate data in an easy-to-use format. So, just as there is a problem with the ATS data, there is a challenge with the sourcing data. Ultimately, with some amount of massaging, the data can be consolidated in one place.

Then, you map the two piles so that their merger is consistent. Then you merge. All is well in the world.

Not so fast.

Every time a piece of data is entered, there is some probability that it will be corrupted. Take the recruiting example. My name might be entered as John, J, John R, JR or John Raymond. And that’s if there aren’t misspellings.

The merged data is just the starting point. For the data to be ultimately useful, someone (really, someone, a person at some labor rate) has to review all of the data to find the places where the data is duplicated but doesn’t meet the very precise match requirements of a computer.
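Software can shorten that review without replacing it. As a rough sketch, Python's standard `difflib` module can score how similar two names are and pull out the pairs that are close but not identical, so a person only looks at those. The 0.7 threshold and the sample names are assumptions chosen for illustration, not recommended values.

```python
from difflib import SequenceMatcher

def normalize(name):
    """Lowercase the name and drop punctuation and extra spaces."""
    cleaned = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(cleaned.split())

def similarity(a, b):
    """Score from 0.0 (nothing in common) to 1.0 (identical after cleanup)."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()

def flag_for_review(existing_names, new_names, threshold=0.7):
    """Pair up names that are similar but not identical, for a person to check."""
    flagged = []
    for new in new_names:
        for old in existing_names:
            if normalize(new) == normalize(old):
                continue  # an exact comparison already catches these
            score = similarity(new, old)
            if score >= threshold:
                flagged.append((new, old, score))
    return flagged

existing = ["John Raymond Sumser"]
incoming = ["John R. Sumser", "Mary Jones", "John Raymond Sumser"]
for new, old, score in flag_for_review(existing, incoming):
    print(f"review: {new!r} vs {old!r} (similarity {score:.2f})")
```

The output is a list of questions, not answers. Whether "John R. Sumser" and "John Raymond Sumser" are the same person is still a judgment call, and the threshold only decides how many of those calls land on someone's desk.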

Given all of this detailed work, you might imagine that the exercise is executed as infrequently as possible.


