Typically, data replication in PostgreSQL is done between an active (primary) database and one or more standby databases. While this is usually enough for many applications and to enable high availability, sometimes you need to replicate your data across more than one active database. With a multiactive database cluster, you can distribute not only your read queries but also your inserts and updates to multiple databases in a cluster. This enables parallel workloads and the possibility of bringing the data closer to the end users, leading to lower latency and modernized, evenly distributed architectures.
[ Learn best practices for implementing automation across your organization. Download The automation architect's handbook. ]
PostgreSQL version 9.6, released in 2016, included a community extension called BDR that had some initial bidirectional replication support. The BDR extension was not updated or maintained in subsequent versions of PostgreSQL. Other databases provide support for multiactive clusters, and some products provide support for PostgreSQL, but there has not been a community-licensed, Postgres-native solution for multiactive replication. That changed following the recent launch of pgEdge Distributed PostgreSQL, a fully distributed database optimized for the network edge based on the standard and popular open source PostgreSQL database.
Technical background
Physical replication uses exact block addresses and byte-by-byte replication. This has been commonly used in PostgreSQL for creating a read replica that can be used as a hot standby or an additional read-only database for the application.
By contrast, logical replication involves replicating data objects and their changes by using their primary key. Rather than shipping the write-ahead log (WAL) files for all current states of all objects in the database to an exact matching database in recovery mode, logical replication uses publishers and subscribers to replicate inserts, updates, and deletes on specified objects. As a result, logical replication can be configured to be more finely grained, making it a powerful tool for modern databases.
Why logical replication enables multiactive replication
Logical replication allows you to limit replication to a specific database and provides options for row-level filtering. Logical replication therefore can be configured to replicate from database a to database b, and back from database b to database a. This multidirectional logical replication means that neither database has to be in recovery mode, and writes can happen to each with bidirectional replication between them to keep them in sync.
This means having multiple write endpoints for the application. In addition to providing a multiactive cluster, the version of the database becomes less important, meaning you could have a version 14 database replication and a version 15 database while being able to write to both, reducing downtime.
[ Learn about upcoming webinars, in-person events, and more opportunities to increase your knowledge at Red Hat events. ]
What Spock brings to the table
pgEdge's Spock extension introduces asynchronous multiactive replication with enhanced conflict resolution and conflict avoidance. It also provides better management, monitoring statistics, and integration.
You need conflict resolution when updates are happening on multiple databases at the same time. Updating a row in database a and performing a different update to the same row on database b creates conflict. With Spock, the last update wins, and the row will contain the value of the update from the latest commit without any failures. Spock also provides a resolutions table where conflict resolutions are recorded and can be monitored and analyzed.
Another conflict can arise from updates to an incrementing or sum field. For example, if 5 is added to a field on database a and 10 is added to that same field on database b, using the last-update-wins option would leave a total of plus 5 or 10, rather than the expected plus 15. Spock accounts for this with conflict-free delta-apply columns, altering this column with the delta of the update. The logical replication will ship the delta to the other database, so that the final value of the field in the above example will be the correct plus 15.
Spock also provides support for partitioned tables. Spock allows you to add either the parent table or specific partition tables to replication. This allows for geosharding, where certain partitions can be replicated between countries while other partitions remain only on the original country.
What's next?
Spock is open and pgEdge Community Licensed, which is similar to the Confluent Community License. This license allows unlimited end-user usage, including in production, but prevents third parties from packaging and selling a competitive cloud product.
Spock has many more features on the way. Right now, Spock can recover from intermittent outages: The streaming replication will persist, and the database will catch up and synchronize again. Planned improvements will make it easy to spin up full replacement nodes after a catastrophic node failure with near zero downtime.
Spock is a part of pgEdge Distributed PostgreSQL, available as either a managed database as a service called pgEdge Cloud or the self-hosted pgEdge Platform software.
The code and documentation for Spock can be found in the pgEdge Git repository.
[ Try OpenShift Data Science in our Developer sandbox or in your own cluster. ]
執筆者紹介
Cady is a Software Engineer with pgEdge who has spent the past ten years working with PostgreSQL and listening to podcasts.
Denis is a serial postgres entrepreneur and the co-founder and CTO of pgEdge.
チャンネル別に見る
自動化
テクノロジー、チームおよび環境に関する IT 自動化の最新情報
AI (人工知能)
お客様が AI ワークロードをどこでも自由に実行することを可能にするプラットフォームについてのアップデート
オープン・ハイブリッドクラウド
ハイブリッドクラウドで柔軟に未来を築く方法をご確認ください。
セキュリティ
環境やテクノロジー全体に及ぶリスクを軽減する方法に関する最新情報
エッジコンピューティング
エッジでの運用を単純化するプラットフォームのアップデート
インフラストラクチャ
世界有数のエンタープライズ向け Linux プラットフォームの最新情報
アプリケーション
アプリケーションの最も困難な課題に対する Red Hat ソリューションの詳細
オリジナル番組
エンタープライズ向けテクノロジーのメーカーやリーダーによるストーリー
製品
ツール
試用、購入、販売
コミュニケーション
Red Hat について
エンタープライズ・オープンソース・ソリューションのプロバイダーとして世界をリードする Red Hat は、Linux、クラウド、コンテナ、Kubernetes などのテクノロジーを提供しています。Red Hat は強化されたソリューションを提供し、コアデータセンターからネットワークエッジまで、企業が複数のプラットフォームおよび環境間で容易に運用できるようにしています。
言語を選択してください
Red Hat legal and privacy links
- Red Hat について
- 採用情報
- イベント
- 各国のオフィス
- Red Hat へのお問い合わせ
- Red Hat ブログ
- ダイバーシティ、エクイティ、およびインクルージョン
- Cool Stuff Store
- Red Hat Summit