Data Pruning in Data Migration

by Katherine Vasilega, December 7, 2010

The easiest way to save money and speed up your data migration project is to move only the data that is essential to your company. As a gardener trims away dry wood, so must the data migration expert look for opportunities to prune data that offers little value to the business.

Improving data quality requires a clear data pruning strategy for your data migration project. Here is how you can approach it:

    • Exclude all data from the target warehouse unless you can justify a valid business or technical reason for including it in the migration.

    • Data profiling and scoping should be done as part of pre-migration activities.

    • Use advanced data quality tools combined with professional expertise to understand whether data is acceptable for migration.

    • Leverage data migration tools that do more than just move data. Matching, standardization, transformation, and cleaning are very important features.

    • Assign a value to datasets and explain to business users why duplicated records increase costs and reduce quality.

    • Appoint a data steward who is familiar with all the key business information in your company. This person will supervise the data migration project in general, and will be responsible for data pruning in particular.
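
As an illustration of the profiling step in the list above, here is a minimal Python sketch that flags duplicated records and sparsely populated fields before migration. The function name profile_records, the sample customer records, and the 50% emptiness threshold are all assumptions made for this example; a real project would more likely rely on a dedicated data quality tool and on thresholds agreed with the data steward.

```python
from collections import Counter


def profile_records(records, key_fields, max_null_ratio=0.5):
    """Summarize duplicate keys and mostly-empty fields in a list of dict records.

    Hypothetical helper for illustration only, not part of any migration tool.
    """
    total = len(records)

    # Count how often each business key occurs; a count above 1 means duplicates.
    key_counts = Counter(tuple(r.get(f) for f in key_fields) for r in records)
    duplicates = {key: n for key, n in key_counts.items() if n > 1}

    # Measure how often each field is empty (None or an empty string).
    fields = {f for r in records for f in r}
    sparse = []
    for f in fields:
        empty = sum(1 for r in records if r.get(f) in (None, ""))
        ratio = empty / total if total else 0.0
        if ratio > max_null_ratio:
            sparse.append(f)

    return {"total": total, "duplicates": duplicates, "sparse_fields": sorted(sparse)}


# Made-up sample data: two customers share an email, and "fax" is mostly empty.
customers = [
    {"id": 1, "email": "a@example.com", "fax": ""},
    {"id": 2, "email": "b@example.com", "fax": None},
    {"id": 3, "email": "a@example.com", "fax": "555-0100"},
]

report = profile_records(customers, key_fields=["email"])
print(report)
```

A report like this gives business users something concrete to review: duplicated keys are candidates for matching and merging, and mostly-empty fields are candidates for exclusion unless someone can justify keeping them.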

The importance of accurate data pruning in data migration projects should not be underestimated. For example, it takes about two hours to migrate a single attribute of data. If you have 100,000,000 records against this attribute, it will take much longer to analyze, test, and migrate them than if you have 1,000.