How to migrate data to a distributed database with ShardingSphere

October 6, 2022 | Yacine Si Tayeb | 1-minute read

Apache ShardingSphere, a powerful distributed database, recently released version 5.2.0, a major update that optimizes and enhances its features, performance, testing, documentation, and examples.

Data migration has always been a strong focus for the ShardingSphere community. But migrating data from one structure to another is complex. In previous versions of ShardingSphere, you had to add an external table as a single sharding table, then modify the sharding rules to trigger migration. This process tended to be a little too complex for general users.

ShardingSphere 5.2.0 provides a new feature coupled with DistSQL for elastic migration to improve the ease of data migration. You can now migrate data from an existing single database to a distributed database built on ShardingSphere and MySQL or PostgreSQL, and you can do it in an SQL-like manner. It's a natural transformation from a single database to a distributed one.

Commands to migrate data

The new feature can also migrate Oracle data to PostgreSQL. First, create the new distributed database and its tables by defining sharding rules and sharding tables through DistSQL. Then run MIGRATE TABLE ds.schema.table INTO table to trigger data migration. It's easy, and there's SQL to support the process.

Migrate from source to target: MIGRATE TABLE ds.schema.table INTO table
- For example: MIGRATE TABLE ds_0.public.t_order INTO t_order
Query migration list: SHOW MIGRATION LIST
Query job status: SHOW MIGRATION STATUS jobId
- For example: SHOW MIGRATION STATUS 1234
Stop migration job: STOP MIGRATION jobId
- For example: STOP MIGRATION 1234
Continue the job you just stopped: START MIGRATION jobId
- For example: START MIGRATION 1234
Verify data consistency: CHECK MIGRATION jobId
- For example: CHECK MIGRATION 1234
Show the available algorithms for checking consistency: SHOW MIGRATION CHECK ALGORITHMS
Use a specified algorithm to check data consistency: CHECK MIGRATION jobId (by type(name=algorithmTypeName))?
- For example: CHECK MIGRATION 1234 by type(name="DATA_MATCH")
Undo the job (Note: This statement will clean the target table): ROLLBACK MIGRATION jobId
- For example: ROLLBACK MIGRATION 1234
Complete the migration job: COMMIT MIGRATION jobId
- For example: COMMIT MIGRATION 1234
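
The commands above are typically issued in sequence: start the migration, find the job ID, watch its status, verify consistency, then commit or roll back. The following is a minimal Python sketch of that order. It only assembles the statement strings listed above and does not connect to ShardingSphere; the function names start_statements and follow_up_statements are hypothetical helpers introduced here for illustration, and the job ID 1234 is the placeholder used in the examples above.

```python
from typing import List, Optional


def start_statements(source: str, target: str) -> List[str]:
    """Statements that trigger a migration and list jobs so you can find its ID."""
    return [
        f"MIGRATE TABLE {source} INTO {target}",
        "SHOW MIGRATION LIST",
    ]


def follow_up_statements(
    job_id: str,
    algorithm: Optional[str] = None,
    commit: bool = True,
) -> List[str]:
    """Statements to monitor, verify, and then finish (or undo) a migration job."""
    check = f"CHECK MIGRATION {job_id}"
    if algorithm is not None:
        # Optional clause, written the same way as in the article's example.
        check += f' by type(name="{algorithm}")'
    # ROLLBACK cleans the target table; COMMIT completes the job.
    final = "COMMIT" if commit else "ROLLBACK"
    return [
        f"SHOW MIGRATION STATUS {job_id}",
        check,
        f"{final} MIGRATION {job_id}",
    ]


if __name__ == "__main__":
    for stmt in start_statements("ds_0.public.t_order", "t_order"):
        print(stmt)
    for stmt in follow_up_statements("1234", algorithm="DATA_MATCH"):
        print(stmt)
```

In practice you would run each statement through whatever SQL client you use to talk to ShardingSphere, and only commit after the consistency check passes.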

During the migration process, you can use the dedicated DistSQL statements listed above to manage the migration job's status and check data consistency. Please refer to the official documentation for more information about this new feature.

This article is excerpted from Apache ShardingSphere 5.2.0 is released! on Medium and is republished with permission.

About the author

Yacine Si Tayeb

I am passionate about technology and innovation. I moved to Beijing to pursue my PhD in Management and fell in awe of the local startup and tech scene. My career path has so far been shaped by opportunities at the intersection of technology and business.

I took a keen interest in the development of the ShardingSphere big data ecosystem and open source community building and have since become a Committer in this community.

ShardingSphere is an Apache Top-Level project, and an open source ecosystem to transform any database into a distributed database system and enhance it with sharding, elastic scaling, encryption features, and more.
