Historical Data


This section addresses one of the recurring issues facing most data migration projects: how far back do we need to go in order to populate a target?

The data associated with this question is often activity-based (or, in the case of banking, transactional) and frequently runs to millions of records. The sheer volume involved has implications for the migration technical team: stretched-out run times may conflict with a single-weekend migration event, and a staged variation becomes harder to design and problematic to test and dress-rehearse, particularly with environment provision.

The default requirement from the business area is usually “take everything”, and combined with the absence of reliable metrics to prove it cannot be done, this often becomes a point of contention between the various parties involved.

So how does one resolve the issue of historical data?

There is no single answer to this, but the following considerations should be weighed:

 

  1. Business Requirement. A “take-all” requirement can and should be challenged; it is often an opening shot intended to avoid time-consuming analysis.
  2. Legal Requirement. The seven-year regulatory retention rule is often cited as justification for taking data over.
  3. Functional Relevancy. If the target system requires data going back to a certain point in time, this should be clearly understood, along with the impact of not taking it over.

 

There are also a number of other factors that will influence decisions on historical data, with cost and time also important contributors. These include:

  • Degree of Difficulty. If the extra data does not require much more effort and carries minimal run-time risk, it may as well be brought over.
  • Historical Data Quality. A common inhibitor to bringing over historical data is that the further back you go, the more likely the data is to have problems associated with it. The project is unlikely to extend data analysis and cleansing effort to old data.
  • Migrate to an Archive Option. For functionally dead data where the requirement is only to view it in line with regulatory rules, this option appeals. It often represents a cheap solution and can be taken off the critical path to the live event.
  • Leave Available on Source. This do-nothing approach also appeals. However, it often conflicts with the desire to decommission incumbent systems, and any licensing costs incurred may influence whether it is adopted.
  • Project Lead Times. With time to develop an elegant method of bringing large volumes of data over, technical challenges can eventually be overcome.
  • Where Is the Bottleneck? This often influences the outcome: if the load is the problem area, it is likely to be under vendor control, particularly where a take-on-type load mechanism is used.

 

The debate surrounding historical data is often not concluded until demonstrable metrics have been taken. This can conflict with the desire for early clarity over what is and is not being taken over, or what an alternative solution would look like.
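As an illustration of why such metrics matter, a timed sample extract can be scaled up to project the full-volume run time. The sketch below uses purely hypothetical figures and assumes throughput scales linearly with row count:

```python
from datetime import timedelta

def project_runtime(sample_rows: int, sample_seconds: float, total_rows: int) -> timedelta:
    """Project the full-volume run time from a timed sample extract,
    assuming throughput scales linearly with row count."""
    rows_per_second = sample_rows / sample_seconds
    return timedelta(seconds=total_rows / rows_per_second)

# Hypothetical figures: a 100,000-row sample extract took 80 seconds;
# scale to 250 million rows of transactional history.
estimate = project_runtime(100_000, 80.0, 250_000_000)
print(estimate)  # over 2 days of extract time - clearly at odds with a single-weekend event
```

Even this crude projection gives the parties something demonstrable to debate, rather than an unqualified assertion that “it cannot be done”.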

The approach often adopted is to understand, and design around, the mandatory data the system functionally needs. If the available migration window is still insufficient, the overall design must provide the necessary sophistication and may impact the entire migration strategy. Staging the migration so that active data and dead data are handled separately, whilst not welcome, is less of an issue.
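That staging split can be sketched as a simple classification of records into an active wave for the live event and a dead wave that can be archived or migrated off the critical path. The cut-off date and field names below are assumptions for illustration only:

```python
from datetime import date

def split_by_activity(records: list[dict], cutoff: date) -> tuple[list[dict], list[dict]]:
    """Partition records into an active wave (migrated at the live event)
    and a dead wave (archived or migrated off the critical path)."""
    active = [r for r in records if r["last_activity"] >= cutoff]
    dead = [r for r in records if r["last_activity"] < cutoff]
    return active, dead

# Illustrative records with an assumed 'last_activity' field:
accounts = [
    {"id": "A1", "last_activity": date(2024, 11, 3)},
    {"id": "A2", "last_activity": date(2016, 2, 9)},
]
active, dead = split_by_activity(accounts, cutoff=date(2020, 1, 1))
```

The choice of cut-off date is exactly the business/legal/functional question discussed above; the code merely applies whatever answer the project settles on.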

 

Where the migration involves a package implementation, the question often asked of the vendor is: what did other clients do? Whilst what other clients did will undoubtedly provide insight into possible solutions, it is dangerous to read too much into it. No two implementations are exactly the same, and the differences can be exaggerated when one considers that packages often run on different DBMSs and technical platforms.

 

Wherever there is a requirement to take historical data, extract routines should be parameterised so that smaller sets of data can be extracted and put forward for transformation. As well as providing flexibility over options, this facilitates functional testing of the migration suite where full volume is not required.
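A minimal sketch of such a parameterised extract, assuming a SQL source with a date column (the table and column names are illustrative, and real code should use bind variables rather than string interpolation):

```python
def build_extract_sql(table: str, date_column: str, start: str, end: str) -> str:
    """Build an extract query bounded by a date window, so the same routine
    can pull a small slice for functional testing or a full historical range."""
    # Illustrative only: production code should pass dates as bind parameters.
    return (
        f"SELECT * FROM {table} "
        f"WHERE {date_column} >= '{start}' AND {date_column} < '{end}'"
    )

# A one-month slice for functional testing of the transformation suite:
test_sql = build_extract_sql("transactions", "posting_date", "2024-01-01", "2024-02-01")
# Widening the window to the full retention period reuses the same routine.
```

Because the window is a parameter rather than hard-coded, the same suite serves dress rehearsals, cut-down test runs, and the live event.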

 
 
