Typically, data replication in PostgreSQL is done between an active (primary) database and one or more standby databases. While this is usually enough for many applications and to enable high availability, sometimes you need to replicate your data across more than one active database. With a multiactive database cluster, you can distribute not only your read queries but also your inserts and updates to multiple databases in a cluster. This enables parallel workloads and the possibility of bringing the data closer to the end users, leading to lower latency and modernized, evenly distributed architectures.

For PostgreSQL 9.6, released in 2016, a community extension called BDR offered some initial bidirectional replication support. The BDR extension was not updated or maintained for subsequent versions of PostgreSQL. Other databases support multiactive clusters, and some products add that support to PostgreSQL, but there has not been a community-licensed, Postgres-native solution for multiactive replication. That changed with the recent launch of pgEdge Distributed PostgreSQL, a fully distributed database optimized for the network edge and based on the standard, popular open source PostgreSQL database.

Technical background

Physical replication uses exact block addresses and byte-by-byte replication. This has been commonly used in PostgreSQL for creating a read replica that can be used as a hot standby or an additional read-only database for the application.

By contrast, logical replication replicates data objects and their changes, identifying rows by their primary key. Rather than shipping write-ahead log (WAL) files that capture every change to every object in the database to an exactly matching database in recovery mode, logical replication uses publishers and subscribers to replicate inserts, updates, and deletes on specified objects. As a result, logical replication can be configured at a much finer grain, making it a powerful tool for modern databases.
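To make the publisher and subscriber idea concrete, here is a minimal in-memory sketch in Python. It is a toy model, not PostgreSQL's implementation: the `Change` record, the `Publisher` class, and the `apply_changes` function are hypothetical names invented for illustration. It shows only that changes are shipped per published table and applied on the other side by primary key.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Change:
    """One replicated row change (hypothetical record format)."""
    op: str                      # "insert", "update", or "delete"
    table: str
    key: int                     # primary-key value identifying the row
    row: Optional[dict] = None   # changed column values; None for a delete


class Publisher:
    """Toy publisher: every write is applied locally, but only writes to
    published tables are added to the change stream."""

    def __init__(self, published_tables):
        self.published_tables = set(published_tables)
        self.tables = {}   # table name -> {primary key -> row dict}
        self.stream = []   # changes waiting to be shipped to subscribers

    def write(self, op, table, key, row=None):
        rows = self.tables.setdefault(table, {})
        if op == "delete":
            rows.pop(key, None)
        else:
            rows[key] = {**rows.get(key, {}), **(row or {})}
        if table in self.published_tables:
            self.stream.append(Change(op, table, key, row))


def apply_changes(tables, changes):
    """Toy subscriber: replay each change, locating the row by primary key."""
    for change in changes:
        rows = tables.setdefault(change.table, {})
        if change.op == "delete":
            rows.pop(change.key, None)
        else:
            rows[change.key] = {**rows.get(change.key, {}), **(change.row or {})}


pub = Publisher(published_tables=["orders"])
pub.write("insert", "orders", 1, {"status": "new"})
pub.write("insert", "audit_log", 1, {"msg": "local only"})  # not published
pub.write("update", "orders", 1, {"status": "shipped"})

subscriber_tables = {}
apply_changes(subscriber_tables, pub.stream)
print(subscriber_tables)
```

Only the `orders` changes reach the subscriber. The `audit_log` row stays on the publisher, which is the finer-grained control that physical replication does not offer.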

Why logical replication enables multiactive replication

Logical replication allows you to limit replication to a specific database and provides options for row-level filtering. Logical replication therefore can be configured to replicate from database a to database b, and back from database b to database a. This multidirectional logical replication means that neither database has to be in recovery mode, and writes can happen to each with bidirectional replication between them to keep them in sync.

This means the application has multiple write endpoints. Beyond providing a multiactive cluster, logical replication makes the database version less important: you could have a version 14 database replicating with a version 15 database while writing to both, which can reduce downtime, for example during a major-version upgrade.

What Spock brings to the table

pgEdge's Spock extension introduces asynchronous multiactive replication with enhanced conflict resolution and conflict avoidance. It also provides better management, monitoring statistics, and integration.

You need conflict resolution when updates happen on multiple databases at the same time. Updating a row in database a and performing a different update to the same row in database b creates a conflict. With Spock, the last update wins: the row ends up with the value from the latest commit, and neither update fails. Spock also provides a resolutions table where conflict resolutions are recorded and can be monitored and analyzed.
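The last-update-wins rule can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not Spock's code: `RowVersion`, `apply_incoming`, the integer commit timestamps, and the list standing in for the resolutions table are all hypothetical, and keeping the local row on a timestamp tie is an arbitrary choice made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RowVersion:
    """One node's version of a row (hypothetical structure)."""
    value: str
    commit_ts: int   # commit timestamp; later means newer
    origin: str      # node where this version was committed


def apply_incoming(local, incoming, resolutions):
    """Keep whichever version committed last and log the decision.

    Assumption: on a timestamp tie the local version is kept.
    """
    if incoming.commit_ts > local.commit_ts:
        winner, loser = incoming, local
    else:
        winner, loser = local, incoming
    resolutions.append({"kept": winner.origin, "discarded": loser.origin})
    return winner


# The same row is updated differently on each node at about the same time.
on_a = RowVersion("alice@a.example", commit_ts=100, origin="a")
on_b = RowVersion("alice@b.example", commit_ts=105, origin="b")

# Each node then receives the other node's update and resolves the conflict.
resolutions_a, resolutions_b = [], []
final_a = apply_incoming(on_a, on_b, resolutions_a)
final_b = apply_incoming(on_b, on_a, resolutions_b)

print(final_a.value, final_b.value)
print(resolutions_a, resolutions_b)
```

Both nodes settle on the version committed on node b, and each records which version it kept and which it discarded.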

Another conflict can arise from updates to an incrementing or sum field. For example, if 5 is added to a field on database a and 10 is added to that same field on database b, the last-update-wins option would leave a net change of plus 5 or plus 10, rather than the expected plus 15. Spock accounts for this with conflict-free delta-apply columns, which are updated with the delta of the change rather than its final value. Logical replication ships the delta to the other database, so the final value of the field in the example above is the correct plus 15.
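The arithmetic in this example can be checked with a short Python sketch. It models only the two strategies described above. The starting value of 100, the integer timestamps, and the function names `last_update_wins` and `delta_apply` are assumptions chosen for illustration, not Spock's API.

```python
START = 100  # assumed starting value of the field on both nodes

# Each node applies its own update locally first.
a_old, a_new, a_ts = START, START + 5, 1    # database a adds 5
b_old, b_new, b_ts = START, START + 10, 2   # database b adds 10, slightly later


def last_update_wins(local_value, local_ts, remote_value, remote_ts):
    """Keep the value from whichever update committed last."""
    return remote_value if remote_ts > local_ts else local_value


def delta_apply(local_value, remote_old, remote_new):
    """Add the remote change (new minus old) on top of the local value."""
    return local_value + (remote_new - remote_old)


# Strategy 1: last update wins. Both nodes converge, but one increment is lost.
luw_a = last_update_wins(a_new, a_ts, b_new, b_ts)
luw_b = last_update_wins(b_new, b_ts, a_new, a_ts)

# Strategy 2: delta apply. Each node adds the other's delta to its own value.
delta_a = delta_apply(a_new, b_old, b_new)
delta_b = delta_apply(b_new, a_old, a_new)

print("last update wins:", luw_a, luw_b)
print("delta apply:     ", delta_a, delta_b)
```

With last update wins, both nodes end at 110 and the plus 5 is lost. With delta apply, both end at 115, the expected plus 15.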

Spock also provides support for partitioned tables. You can add either the parent table or specific partition tables to replication. This allows for geosharding, where certain partitions are replicated between countries while other partitions remain only in the original country.
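As a rough illustration of geosharding, the Python sketch below routes rows to per-country partitions and ships only the rows that land in partitions chosen for replication. The partition naming scheme, the `REPLICATED_PARTITIONS` set, and both helper functions are hypothetical. In Spock the choice is made by adding tables or partitions to replication, not by application code like this.

```python
def partition_for(row):
    """Route a row to a per-country partition (hypothetical naming scheme)."""
    return f"customers_{row['country'].lower()}"


# Assumption: only these partitions were added to replication; every other
# partition stays local to the node in its country of origin.
REPLICATED_PARTITIONS = {"customers_us", "customers_ca"}


def changes_to_ship(rows, replicated_partitions):
    """Return only the rows whose partition is part of replication."""
    return [r for r in rows if partition_for(r) in replicated_partitions]


rows = [
    {"id": 1, "country": "US"},
    {"id": 2, "country": "DE"},
    {"id": 3, "country": "CA"},
]

shipped = changes_to_ship(rows, REPLICATED_PARTITIONS)
local_only = [r for r in rows if r not in shipped]

print("replicated:", [r["id"] for r in shipped])
print("local only:", [r["id"] for r in local_only])
```

Adding the parent table instead would correspond to putting every partition in the replicated set, so all rows would be shipped.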

What's next?

Spock's source is open, and it is released under the pgEdge Community License, which is similar to the Confluent Community License. This license allows unlimited end-user usage, including in production, but prevents third parties from packaging and selling a competitive cloud product.

Spock has many more features on the way. Right now, Spock can recover from intermittent outages: the streaming replication persists, and the database catches up and synchronizes again. Planned improvements will make it easy to spin up full replacement nodes after a catastrophic node failure with near-zero downtime.

Spock is a part of pgEdge Distributed PostgreSQL, available as either a managed database as a service called pgEdge Cloud or the self-hosted pgEdge Platform software.

The code and documentation for Spock can be found in the pgEdge Git repository.


About the authors

Cady is a Software Engineer with pgEdge who has spent the past ten years working with PostgreSQL and listening to podcasts. 

Denis is a serial Postgres entrepreneur and the co-founder and CTO of pgEdge.
