The Golden Rule of Data Migrations

The fastest and most reliable way to run a migration is to repeat it many times. I call this the "Repetition Paradox."

The Golden Rule of Data Migrations

The Golden Rule of Data Migrations

Script everything.

I have performed many data migrations in my time as a developer.  I've handled many different scenarios: converting from Excel spreadsheets to Jet/ACE database files; upsizing from Jet/ACE file databases to SQL Server; migrating data from one table layout to another within the same database technology; and converting from an existing third-party software solution to a custom solution that we were developing.  While each scenario has its own set of challenges, there are universal truths and techniques that help ensure success.

The Repetition Paradox

The most important lesson--and the one I had to learn the hard way many times over before I accepted it--is that the fastest and most reliable way to run a migration is to repeat it many times.  On the surface, this seems like a paradox.  How could it be faster to do the same thing many times, dozens even, than to do it just once?

The reason is that every significant data migration is a multi-step process.  And the decisions you make in the early steps will impact the entire process.  If you commit to doing the migration exactly once, then you have no choice but to live with any poor decisions that you make early on.  No amount of care will prevent making those kinds of mistakes, either.  At the outset of the migration, there are simply things that you don't know you don't know.  The "unknown unknowns," as Don Rumsfeld would say.

This is all fine and good in theory, but can you really save that much time by starting over whenever you realize you've made one of these early mistakes?  The answer, if you are performing the migration manually, is a big No.  

I suppose I should clarify what I mean by a manual migration.  I'm not talking about copying and pasting individual records; only a Zen-level masochist would do that.  I'm talking about building and executing queries in the Access interface to move the data.  I'm talking about anything that requires more than a single click to run the entire migration.  That's a manual migration.  

Automated migrations

At the beginning of this article, I posited my Golden Rule of Data Migration: Script everything.

I don't mean you need to write your SQL statements in code (though I often like to do it that way).  By all means, link the source and destination tables to a single Access front-end and use the visual Access query designer to build every query if you want.  The important part is that you are not opening each query one at a time and running it.  Instead, you should execute the queries from code to meet the one-click standard.

The key is to remove as much friction as possible from the process of starting over.  That way you can make small improvements iteratively, which is easier than trying to get everything right the first time through.

Additional tips

Here are some additional tips to make your migration go smoother:

  1. Swallow no errors.  
  2. Verify all record counts.  (Just ask Public Health England.)
  3. Use development versions of data.
  4. Consider a "soft" migration.
  5. Run automated tests against the results.
  6. Commit often to document the process.
  7. Automate deletion and rebuild of the destination for faster dev cycles.

Image by Steve Bidmead from Pixabay

All original code samples by Mike Wolfe are licensed under CC BY 4.0