Data Migration is one of the most challenging initiatives in IT Industry. These projects yield high business benefits (such as cost savings, increased productivity and improved data manageability) but at the same time they tend to involve a high level of risk due to the volume and critical/complexity of the data.
In order to reduce risk and ensure that the data has been migrated and transformed, we need to implement a sound validation and QA strategy with companies like datalynx.net
There are various ETL tools available in the market for Data Migration load and validation. Organizations based on their specific needs either go for the specific ETL tool or in house framework/tool. Organizations do prefer in-house tools as this would provide more flexibility in enhancement, maintenance and are cost beneficial also.
There are projects where same resource is coding the transformation logic and performing validation on migrated data. However the best approach in my mind would be to have two different resources/teams responsible for coding and validation. Both these teams would independently prepare the artifacts after business mapping has been handed off by business. Having a second set of eyes to check the mapping code and make sure that nothing was misinterpreted by developer helps address that pitfall.
There are two aspects of Data Migration testing: – Internal Data validation and External data validation.
The Internal data validation would be to make sure that we have done data migration correctly as per the mapping requirement from business. This still does not guarantee that target environment will flawlessly process it.
This is why running the critical business processes in target environment is absolutely important. Often the outcome of this is that there are still some issues to be resolved before we successfully signed off on Data Migration. As with all issues, the earlier in the cycle these issues are identified the less the overall costs of resolution – to that end it is better to aim to get at least some of this kind of testing in as early as possible in project life cycle.
Validation Approach Recommended:-
Here are the best validation approaches in my view based on the experience of number of client implementation projects. This is used irrespective of whether we use ETL tools or in-house framework/tools. This will also help us in answering the first five question mentioned in problem statement section.
Internal Database Testing:-
- Load Balancing Validations – Load Balancing among Source system and Target system to ensure that the count of data element records are in sync.
- Business/Scenarios based Validations (MINUS Query approach) – Business/Scenarios based Validations to ensure that the data has been correctly migrated as per mapping requirements from business. These are based on scenarios being handled in the migration which are primarily driven by the Source to Target mapping and Functional Specifications. Here we should focus on testing every rule specified in mapping for individual fields which includes Move As-Is, Transform, Look Up, Default, Calculate etc.
- Integrity/Sanity Validations – Integrity Validations on source data to ensure that the source data are in sync with our expectations and data integrity is ensured. This ensures that mapping rules which are based on certain assumptions/constraints in source data hold true. This eventually helps in cleansing bad source data as per the need.
- Data plus List Reconciliations – This type of validations is needed to ensure whether the count of data element available in source tables that should be migrating after excluding all filtered rows as per business requirements and count of data element actually migrated into the specific table are in sync. This is a high level dashboard indicating the separate reasons for dropping various data elements and list of dropped records. Once this report is available, the dropped data elements can then be reconciled with the load balancing reports.
- Data Reports – These reports are needed as per the request from Operation team/Business team. This is used to summarize the post migrated data element records.
External Functional Testing:-
Functional testing is required to validate that the new system meets the needs of the business but invariably functional testing is conducted on simulated data. It’s entirely possible to have migrated data perfectly and to have passed all functional testing only to find that key functions on migrated data don’t work ‘well enough’. The result is a successful operation and a dead patient – this is not acceptable. So it’s essential in my view that some level of functional testing is performed on migrated data.