
How to migrate data to a distributed database with ShardingSphere

Apache ShardingSphere's new elastic migration feature lets you move data from a single database to a distributed database in an SQL-like manner.

Apache ShardingSphere, a powerful distributed database, recently released a major update, version 5.2.0, that optimizes and enhances its features, performance, testing, documentation, and examples.


Data migration has always been a strong focus for the ShardingSphere community. But migrating data from one structure to another is complex. In previous versions of ShardingSphere, you had to add an external table as a single sharding table, then modify the sharding rules to trigger migration. This process tended to be a little too complex for general users.

ShardingSphere 5.2.0 introduces an elastic migration feature, driven by DistSQL, that makes data migration easier. You can now migrate data from an existing single database to a distributed database built on ShardingSphere and MySQL or PostgreSQL, and you can do it in an SQL-like manner. It's a natural transition from a single database to a distributed one.


Commands to migrate data

The new feature can also migrate Oracle data to PostgreSQL. First, use DistSQL to create the sharding rules and sharding tables that define the new distributed database. Then run MIGRATE TABLE ds.schema.table INTO table to trigger data migration. The statements that support the process are:

  • Migrate from source to target: MIGRATE TABLE ds.schema.table INTO table 
    • For example: MIGRATE TABLE ds_0.public.t_order INTO t_order
  • Query migration list: SHOW MIGRATION LIST
  • Query job status: SHOW MIGRATION STATUS jobId
    • For example: SHOW MIGRATION STATUS 1234
  • Stop migration job: STOP MIGRATION jobId
    • For example: STOP MIGRATION 1234
  • Continue the job you just stopped: START MIGRATION jobId
    • For example: START MIGRATION 1234
  • Verify data consistency: CHECK MIGRATION jobId
    • For example: CHECK MIGRATION 1234
  • Show the available algorithms for checking consistency: SHOW MIGRATION CHECK ALGORITHMS
  • Use a specified algorithm to check data consistency: CHECK MIGRATION jobId (by type(name=algorithmTypeName))?
    • For example: CHECK MIGRATION 1234 by type(name="DATA_MATCH")
  • Undo the job (Note: This statement will clean the target table): ROLLBACK MIGRATION jobId
    • For example: ROLLBACK MIGRATION 1234
  • Complete the migration job: COMMIT MIGRATION jobId
    • For example: COMMIT MIGRATION 1234
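
The statements above can be strung together into a job lifecycle. The following is a minimal sketch, not an official client: the helper names (migrate_table, monitoring_statements, check_migration, finish_migration) are hypothetical, and the functions only build the DistSQL text from the list above. Sending each statement to ShardingSphere through your usual MySQL or PostgreSQL client, and deciding whether the check result counts as consistent, is left to you.

```python
from typing import List, Optional


def _job(job_id: str) -> str:
    """Accept only alphanumeric job IDs before placing them in a statement."""
    if not job_id.isalnum():
        raise ValueError(f"unexpected job ID: {job_id!r}")
    return job_id


def migrate_table(source: str, target: str) -> str:
    """Build the statement that triggers a migration job."""
    return f"MIGRATE TABLE {source} INTO {target}"


def check_migration(job_id: str, algorithm: Optional[str] = None) -> str:
    """Build a consistency check, optionally naming the algorithm type."""
    stmt = f"CHECK MIGRATION {_job(job_id)}"
    if algorithm is not None:
        stmt += f' by type(name="{algorithm}")'
    return stmt


def monitoring_statements(job_id: str, algorithm: Optional[str] = None) -> List[str]:
    """Statements for watching a running job, in the order you might run them."""
    return [
        "SHOW MIGRATION LIST",
        f"SHOW MIGRATION STATUS {_job(job_id)}",
        check_migration(job_id, algorithm),
    ]


def finish_migration(job_id: str, consistent: bool) -> str:
    """Assumed policy: COMMIT after a passing check, otherwise ROLLBACK.

    ROLLBACK cleans the target table, so it is the destructive branch.
    """
    verb = "COMMIT" if consistent else "ROLLBACK"
    return f"{verb} MIGRATION {_job(job_id)}"


if __name__ == "__main__":
    print(migrate_table("ds_0.public.t_order", "t_order"))
    for statement in monitoring_statements("1234", "DATA_MATCH"):
        print(statement)
    print(finish_migration("1234", consistent=True))
```

Committing only after a consistency check is a design choice assumed here, not a rule stated by ShardingSphere. It reflects the fact that ROLLBACK MIGRATION cleans the target table, so it is worth confirming the data before choosing either ending.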

During the migration process, you can use the dedicated DistSQL statements listed above to manage the migration job's status and verify data consistency. Please refer to the official documentation for more information about this new feature.

This article is excerpted from Apache ShardingSphere 5.2.0 is released! on Medium and is republished with permission.


Yacine Si Tayeb

I am passionate about technology and innovation. I moved to Beijing to pursue my PhD in Management and fell in awe of the local startup and tech scene. My career path has so far been shaped by opportunities at the intersection of technology and business.
