Guide To Data Migrations

Geoffrey Keating on October 28th 2021

Welcome to the great digital acceleration. Organizations are transitioning to the cloud and moving away from legacy systems to offer sleeker online services, more robust security, agility, and continuity of critical services—like payroll—even amid disasters.

Because the pace of business is moving faster than ever, complex data migration projects are inevitable. But getting data infrastructure to adapt to these changes can be the hardest part of digital transformation. In part, that’s because the subject of data migration is so complex that it’s sometimes hard for non-developers to even conceptualize how it works or why it’s necessary.

To simplify the process, we’ll break down:

  • What is data migration?

  • Types of data migration

  • Important considerations when migrating data

  • Basic data migration process explained

  • Consolidate your data into a single centralized platform with Segment

What is data migration?

Data migration is the process of moving information from one place to another. These days, the term usually refers to digital information, though it can also involve digitizing analog records. Data migration is a complex process that can involve transferring data to different storage locations, file formats, software applications, or all three.

Types of data migration

There are a number of reasons why organizations need to migrate their data, usually as part of a larger digital transformation. Often, organizations want to move away from legacy systems that aren’t capable of scaling to their needs or which have high security risks or maintenance costs. Here are the common types of data migration and why they might be necessary.

Migrate to or from databases

Database migration is when you completely move datasets from one or more source databases to one or more target databases. Then eventually, you delete the source databases. (Otherwise, it’s called replication, not migration.)

Migration can happen between databases that use different data models (e.g., relational to key-value) and between different technologies (e.g., MySQL to Oracle). That’s called a heterogeneous database migration.
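As a rough illustration of a relational to key-value move, here is a minimal sketch. It assumes an in-memory SQLite table as the source and a plain dictionary as the target; the `users` table, the `user:<id>` key format, and the `migrate_users` helper are all hypothetical stand-ins, not part of any specific product.

```python
import json
import sqlite3

# Source: a relational table (in-memory SQLite stands in for the real source).
source = sqlite3.connect(":memory:")
source.row_factory = sqlite3.Row
source.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, plan TEXT)")
source.executemany(
    "INSERT INTO users (id, name, plan) VALUES (?, ?, ?)",
    [(1, "Ada", "pro"), (2, "Grace", "free")],
)

# Target: a key-value store (a plain dict stands in for the real target).
target = {}

def migrate_users(conn, store):
    """Copy each row into the store as a JSON document keyed by 'user:<id>'."""
    migrated = 0
    for row in conn.execute("SELECT id, name, plan FROM users ORDER BY id"):
        key = f"user:{row['id']}"
        store[key] = json.dumps({"name": row["name"], "plan": row["plan"]})
        migrated += 1
    return migrated

count = migrate_users(source, target)
```

The interesting design decision in a move like this is usually the key format, since a key-value store can only look records up by key and cannot join or filter the way the relational source could.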

A company may need to migrate to a different database if it runs into performance issues. For example, when Netflix migrated users’ queue data from one NoSQL storage system (SimpleDB) to another (Cassandra), they cited the need for higher data consistency and a more scalable solution.

Database migrations can also be version upgrades. For example, Facebook migrated from MySQL 5.6 to 8.0. The earlier upgrade to 5.6 had taken over a year, and the move to 8.0 has already taken a few years. They did this to take advantage of new features like Document Store and a transactional data dictionary.

Migrate to a new storage destination

Storage migration is when you move data to a different data storage destination like a hard disk drive, solid-state drive, or cloud-based storage. Usually, it refers to moving data from an on-site data center to a cloud platform like AWS, Azure, or Google Cloud. Or it could mean migrating between storage systems like Amazon S3 and Apache Hive.

You may also need to change storage destinations if you’ve changed the type of data you store. Relational databases used to be the standard. But now, real-time data streams, IoT data (e.g., data from smart home devices), and graph data (e.g., social network relationships) are becoming more prominent. If you change database types, you may want (or need) to change cloud platforms to accommodate your choice.

Moving to a new data warehouse

Data warehouses collect and store data from multiple sources (e.g., CRM data, IoT devices). A data warehouse forms a central, stable foundation for data analysis, user reports, and dashboards. This enables business leaders to look at the entire organization’s data as it changes over time and use it to make informed decisions.

Data warehouses can exist on-premises or in the cloud. Cloud data warehouses (like Snowflake or Amazon Redshift) can offer lower costs, more flexibility, and better scalability. They also tend to use ELT (Extract, Load, Transform) processes, which load raw data first and transform it inside the warehouse, and which are often faster than the traditional ETL (Extract, Transform, Load) processes of on-premises data warehouses.

Moving data from an on-premises data warehouse to the cloud or establishing a completely new data warehouse are both examples of a data warehouse migration. You can also move between cloud data warehouses as Apisero did when they moved from Redshift to Snowflake.

This type of migration may require you to rethink your data model, embedded logic, and custom data applications. You might have to change SQL syntax when migrating code since the two warehouses may use different variants. Also, one may not support the same commands and functionalities as the other.

Migrate to or from applications

Application migration is when you move a software application to a new environment (e.g., from an on-premises server to the cloud or between public and private clouds). You migrate the application itself and all of its accompanying data. An application migration can take the form of rehosting, refactoring, replatforming, or replacing the application. This type of migration is challenging because it could lead to unexpected application downtime or unplanned costs (like new licensing fees or training tools).

One example of an application migration is when a company moves its desktop application into a cloud environment. It can also be when you change application vendors: for example, migrating user mailboxes from Outlook to Gmail, as many schools have done to decrease costs and increase collaboration between students and teachers.

Upgrade or change operating systems

Another type of migration is changing operating systems. Right now, the most common type of operating system (OS) is still Windows. But that is shifting as more and more companies adopt open-source products that rely on the Linux OS.

Some companies are choosing to completely switch over to Linux due to lower license costs and fewer disruptions from policy changes. And many companies—roughly half—are using both Windows and Linux.

Another reason for an operating system migration is to upgrade to the next version or to upgrade a user’s hardware due to an expiring lease. This means that IT managers have to perform an OS migration every three to five years.

Important considerations when migrating data

Before migrating your information, you need a strategy—why you’re moving your data and what you hope to achieve. A data migration strategy helps to set expectations and minimize unwelcome surprises like going over budget or missing deadlines. Before moving forward, make a data migration strategy keeping these things in mind:

Identify the type of data migration you’ll perform

Look at the types of data migration above and scope out your project. Different data formats, operating systems, and storage systems may necessitate a specific approach. For example, an OS migration means you’ll first need to complete hardware, software, and license compliance audits to minimize post-migration user complaints.

And if you’re migrating petabytes of unstructured or semi-structured data to the cloud, consider storage infrastructure that is known to support storing and querying this type of data, like Snowflake.

Estimate the amount of data to migrate

Will you be migrating terabytes or petabytes of data? Huge data migration projects (like Sabre’s cloud migration project) may take several years, especially if you’re also doing some significant processing or translating. Keep in mind that when migrating a lot of data, you may need to keep both the source and destination live throughout the migration process.
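To get a feel for how data volume translates into time, a back-of-the-envelope estimate can help. The sketch below only accounts for raw network transfer; the `estimate_transfer_days` function, the 500 TB volume, the 1 Gbps link, and the 70% utilization figure are all illustrative assumptions, and real projects add time for processing, validation, and retries.

```python
def estimate_transfer_days(data_terabytes, bandwidth_gbps, utilization=0.7):
    """Rough number of days needed to copy data over a network link.

    utilization is the assumed fraction of the link actually available.
    """
    data_bits = data_terabytes * 1e12 * 8               # decimal terabytes to bits
    effective_bps = bandwidth_gbps * 1e9 * utilization  # usable bits per second
    seconds = data_bits / effective_bps
    return seconds / 86_400                             # seconds per day

days = estimate_transfer_days(data_terabytes=500, bandwidth_gbps=1)
print(f"{days:.1f} days")  # prints "66.1 days"
```

An estimate like this is one reason large migrations often keep the source and destination live side by side: two months of copying is too long for most systems to be offline.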

Define the necessary tools

Identify any new applications, missing drivers, or migration software you’ll need. Look at source data and identify if compatibility exists between the source system and target system. If migrating to a new cloud platform like AWS, consider what integrations or APIs you need to gain real-time insights on streaming data.
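One simple way to check compatibility between source and target is to compare their schemas before moving anything. The following is a sketch under stated assumptions: the two schemas, the `COMPATIBLE` type pairs, and the `compare_schemas` helper are hypothetical, and a real project would read this information from each system's catalog.

```python
# Hypothetical schemas: column name -> declared type.
source_schema = {"id": "INTEGER", "email": "VARCHAR", "signup_date": "DATETIME"}
target_schema = {"id": "INTEGER", "email": "TEXT", "created_at": "TIMESTAMP"}

# Assumed pairs of (source type, target type) treated as safe to copy directly.
COMPATIBLE = {("VARCHAR", "TEXT"), ("DATETIME", "TIMESTAMP")}

def compare_schemas(source, target):
    """Return columns missing from the target and columns whose types clash."""
    missing = sorted(set(source) - set(target))
    mismatched = sorted(
        col
        for col in set(source) & set(target)
        if source[col] != target[col]
        and (source[col], target[col]) not in COMPATIBLE
    )
    return missing, mismatched

missing, mismatched = compare_schemas(source_schema, target_schema)
print("Missing in target:", missing)
print("Type mismatches:", mismatched)
```

Here `signup_date` is flagged as missing because the target calls the column `created_at`, which is the kind of gap a field mapping has to resolve before the migration runs.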

Ensure the integrity & consistency of the data

Before migrating, make sure your data is rock solid. Data quality and data security go hand in hand as you form your data migration plan. Fix quality issues before the migration, archive data you no longer need, and avoid making code changes while the migration is underway. If you find data errors, identify their causes and fix them before they impact customer experience or create compliance issues.

Determine how quickly the data must be migrated

Will you fully migrate the data within a limited window of time, or will you complete it in phases? If the timeframe is too long, you can’t just be offline to do it—you’ll have to get more creative. For example, in a “lazy” migration of user data, data migrates only when a user logs in. At that point, user data is automatically transferred to the new system.
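The “lazy” approach described above can be sketched in a few lines. This is a minimal illustration, assuming plain dictionaries stand in for the old and new user stores; the `login` function and the sample users are hypothetical.

```python
# Plain dicts stand in for the old and new user stores (illustrative only).
old_store = {
    "ada": {"email": "ada@example.com"},
    "grace": {"email": "grace@example.com"},
}
new_store = {}

def login(username):
    """Return the user's record, migrating it on first login."""
    if username in new_store:
        return new_store[username]          # already migrated
    record = old_store.get(username)
    if record is None:
        raise KeyError(f"unknown user: {username}")
    new_store[username] = dict(record)      # copy into the new system
    return new_store[username]

login("ada")
print(sorted(new_store))  # only users who have logged in are migrated
```

One trade-off of this design is that users who never log in are never migrated, so a lazy migration is usually paired with a final batch job that moves the remaining records before the old system is retired.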

Basic data migration process explained

Migrating data is not as simple as flipping a light switch—it’s more like a whole-house rewire. Duplicate or corrupted data, software integration problems, and security risks when configuring data on the cloud are all major concerns. Below are the most crucial steps for success:

1. Define the scope

Before you move your data, you need to know it inside and out. Analyze your business needs, source and destination systems, and necessary tools to make sure your data mapping and logic make sense with existing business systems. Involve business stakeholders early in the process to help define the data migration tools you need for a successful data transfer.

2. Check for incomplete or inconsistent data

You’ll want your migrated data to be accurate. To do that, check for missing data fields. Will you need to pull data from another source to fill gaps? Are some fields poorly populated? Identify any data that can be left behind and resolve any other issues before migration.
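A quick profile of the source data can show which fields are poorly populated. The sketch below assumes records are available as dictionaries; the sample records, the `REQUIRED_FIELDS` tuple, and the `count_missing` helper are hypothetical.

```python
from collections import Counter

# Hypothetical sample of source records.
records = [
    {"id": 1, "email": "ada@example.com", "phone": "555-0100"},
    {"id": 2, "email": "", "phone": None},
    {"id": 3, "email": "grace@example.com"},
]
REQUIRED_FIELDS = ("id", "email", "phone")

def count_missing(rows, fields):
    """Count, per field, how many rows lack a usable value."""
    missing = Counter()
    for row in rows:
        for field in fields:
            # Treat absent keys, None, and empty strings as missing.
            if row.get(field) in (None, ""):
                missing[field] += 1
    return {field: missing[field] for field in fields}

report = count_missing(records, REQUIRED_FIELDS)
print(report)  # {'id': 0, 'email': 1, 'phone': 2}
```

A report like this makes it easier to decide which gaps need to be filled from another source and which fields can simply be left behind.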

3. Build & test migration logic

Before pulling the trigger on any migration, be sure you test, test, test. Once you transition to the new system, it’s much harder to deal with issues. And throughout the implementation and maintenance phases, continue testing to make sure the migration occurred successfully.
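Testing is easier when the migration logic is a small function that can be run against sample records before any real data moves. The following sketch assumes a simple field-renaming migration; `FIELD_MAP`, `transform`, and the sample record are hypothetical.

```python
# Hypothetical mapping from source field names to target field names.
FIELD_MAP = {"fname": "first_name", "lname": "last_name", "mail": "email"}

def transform(record):
    """Rename fields per FIELD_MAP, drop unmapped fields, normalize the email."""
    out = {FIELD_MAP[key]: value for key, value in record.items() if key in FIELD_MAP}
    if out.get("email"):
        out["email"] = out["email"].strip().lower()
    return out

def test_transform():
    source = {
        "fname": "Ada",
        "lname": "Lovelace",
        "mail": " Ada@Example.com ",
        "legacy_flag": 1,  # not in FIELD_MAP, so it should be dropped
    }
    assert transform(source) == {
        "first_name": "Ada",
        "last_name": "Lovelace",
        "email": "ada@example.com",
    }

test_transform()
```

Keeping the transformation separate from the code that reads and writes data means the same tests can run again after every change to the mapping.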

4. Validate data

Back up your data before migrating in case of any data loss during migration. Continuously run diagnostics throughout the process. And once the migration is complete, you’ll need to continue to maintain the data’s quality.
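One common diagnostic is to compare row counts and checksums between the source and the target. The sketch below is one possible approach, not a prescribed method: the `fingerprint` helper and the sample rows are hypothetical, and it assumes rows can be loaded as JSON-serializable dictionaries.

```python
import hashlib
import json

def fingerprint(rows):
    """Return (row count, order-independent checksum) for a list of dict rows."""
    digests = sorted(
        hashlib.sha256(json.dumps(row, sort_keys=True).encode("utf-8")).hexdigest()
        for row in rows
    )
    combined = hashlib.sha256("".join(digests).encode("utf-8")).hexdigest()
    return len(rows), combined

# Same data, different row order and key order, as often happens after a migration.
source_rows = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}]
target_rows = [{"name": "Grace", "id": 2}, {"id": 1, "name": "Ada"}]

matches = fingerprint(source_rows) == fingerprint(target_rows)
print("Source and target match:", matches)
```

Sorting the per-row digests makes the check independent of row order, so it only fails when a row is missing, duplicated, or changed.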

5. Decommission old systems

Continue to run the old system as a backup for a couple of months (a quarter at most). If you’ve done your job right and you trust your validation step, you can then decommission the old system and move forward with the new one. After that, keep monitoring the new system for data integrity.

Consolidate your data into a single centralized platform with Segment

Data migration is complicated. If you’re worried about constantly migrating data back and forth between different systems and tools, consider getting a centralized data hub like Segment Connections, so all your data is collected in one place. No more babysitting fragile systems. With Segment, leverage smart architecture so your business can achieve true agility.
