As I said in my previous blog, the next stage in this Data Migration odyssey is cleaning the data. Unfortunately, in our experience, this is the stage that is most often overlooked, and skipping it can have serious consequences. We are not alone in this view – various industry analysts feel the same, so we are in good company:

Fosway HR Critical Realities part 4 “…most vendors don’t appear to be paying much attention to providing tools to resolve data quality issues.”  

Raven Intel The top 7 regrets of Enterprise Software Implementation Number 2: “I wish we’d cleaned up our data prior to implementation…”

Raytheon – Professional Services ‘In our experience companies have found that spending $1 on data clean-up prior to the migration will save $16-$25 after the migration in an operating system.’

Let’s face it, data cleansing is hard, and it’s not just the immense volume of data that makes finding and correcting these issues such a daunting challenge. Some data quality issues can be seen with the human eye; others are difficult, if not impossible, to spot. Generally, data quality issues fall into the following categories:

Data Duplication – Where the same person appears twice because they left and came back or started as a contractor and then were made permanent.

Data Error – Where a person’s email does not conform to the correct email format e.g. one of the dots or the @ is missing.

Data Sense – When the wrong hire date has mistakenly been entered e.g. the year should be 2011 but was entered as 1911 – believe me it happens!

Data Logic – Where the time zone applied to an individual does not match the country code of the country where they are located.

Special Characters – Or as I call them, ‘data gremlins’ – you know, the ones that look as if they are from another planet or, worse still, can’t be seen at all.
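To make those five categories a little more concrete, here is a minimal Python sketch of how each one might be checked automatically. It is an illustration, not a description of any particular tool: the record fields (employee_id, name, date_of_birth, email, hire_date, country, timezone), the earliest plausible hire date and the tiny country-to-time-zone table are all assumptions I have made up for the example, and the email pattern is deliberately loose.

```python
import re
import unicodedata
from collections import Counter
from datetime import date

# Deliberately loose pattern: something@something.something, no spaces.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s.]+(\.[^@\s.]+)+$")

# Illustrative subset only; a real check needs a full country-to-time-zone table.
COUNTRY_TIMEZONES = {
    "GB": {"Europe/London"},
    "FR": {"Europe/Paris"},
    "US": {"America/New_York", "America/Chicago",
           "America/Denver", "America/Los_Angeles"},
}

# Assumption: nobody on the books was hired before this date.
EARLIEST_PLAUSIBLE_HIRE = date(1950, 1, 1)


def has_gremlins(text):
    """True if text holds control, invisible format (e.g. zero-width),
    private-use or unassigned characters, or the U+FFFD replacement mark."""
    return any(
        ch == "\ufffd" or unicodedata.category(ch) in {"Cc", "Cf", "Cn", "Co"}
        for ch in text
    )


def person_key(record):
    # Assumption: same name (ignoring case and padding) plus same date of
    # birth means the same person, whatever their employee ID says.
    return (record["name"].strip().casefold(), record["date_of_birth"])


def check_records(records, today=None):
    """Return a list of (employee_id, category) pairs, one per issue found."""
    today = today or date.today()
    seen = Counter(person_key(r) for r in records)
    issues = []
    for r in records:
        rid = r["employee_id"]
        # Data Duplication: the same person held under more than one record.
        if seen[person_key(r)] > 1:
            issues.append((rid, "duplication"))
        # Data Error: email missing its @ or a dot in the domain.
        if not EMAIL_RE.match(r["email"]):
            issues.append((rid, "error"))
        # Data Sense: hire date implausibly old, or in the future.
        if not EARLIEST_PLAUSIBLE_HIRE <= r["hire_date"] <= today:
            issues.append((rid, "sense"))
        # Data Logic: time zone not valid for the record's country code.
        allowed = COUNTRY_TIMEZONES.get(r["country"])
        if allowed is not None and r["timezone"] not in allowed:
            issues.append((rid, "logic"))
        # Special Characters: gremlins hiding in any text field.
        if any(has_gremlins(v) for v in r.values() if isinstance(v, str)):
            issues.append((rid, "special characters"))
    return issues
```

Counting the results by category, for example with Counter(category for _, category in issues), gives a rough feel for how much of each kind of clean-up is waiting for you. Treating a future hire date as suspect is a judgment call; if you load new starters ahead of their first day, relax that check.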

Warning: ‘Be afraid, be very, very afraid…’ If your data contains any of the above, they’ll probably frustrate the transformation or load process; and if you are unlucky enough to actually migrate them, they’ll definitely impact your shiny new LMS’s performance and reporting. Not what you want when all eyes are on you at the launch!

It’s imperative to understand the scale of the problem and how much work is involved in correcting the data issues. The task is made easier and quicker if you have access to dedicated software tools and a services company to help you identify how much work is required to get your data match fit, and then to help actually rectify all those issues ready for the next stage – Transformation. But then I would say that!
