Typically, data replication in PostgreSQL is done between an active (primary) database and one or more standby databases. While this is enough for many applications and for high availability, sometimes you need to replicate your data across more than one active database. With a multiactive database cluster, you can distribute not only your read queries but also your inserts and updates to multiple databases in a cluster. This enables parallel workloads and makes it possible to bring the data closer to end users, leading to lower latency and more evenly distributed architectures.
PostgreSQL version 9.6, released in 2016, was supported by a community extension called BDR that offered some initial bidirectional replication support. The BDR extension was not updated or maintained for subsequent versions of PostgreSQL. Other databases support multiactive clusters, and some products provide that support for PostgreSQL, but there has not been a community-licensed, Postgres-native solution for multiactive replication. That changed with the recent launch of pgEdge Distributed PostgreSQL, a fully distributed database optimized for the network edge and based on the standard, popular open source PostgreSQL database.
Technical background
Physical replication copies changes byte for byte at exact block addresses. It has commonly been used in PostgreSQL to create a read replica that serves as a hot standby or as an additional read-only database for the application.
By contrast, logical replication replicates data objects and their changes, identifying rows by their primary key. Rather than shipping write-ahead log (WAL) files that cover every change to every object to an exactly matching database in recovery mode, logical replication uses publishers and subscribers to replicate inserts, updates, and deletes on specified objects. As a result, logical replication can be configured at a finer grain, making it a powerful tool for modern databases.
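To make the publisher and subscriber idea concrete, here is a minimal Python sketch of a subscriber applying a stream of row changes by primary key. It is only a toy model of the concept, not PostgreSQL's implementation: the `Change` and `Subscriber` classes and the `orders` and `audit_log` table names are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Change:
    table: str
    op: str                      # "insert", "update", or "delete"
    key: int                     # primary key of the affected row
    row: Optional[dict] = None   # new column values; None for a delete


@dataclass
class Subscriber:
    subscribed_tables: set
    tables: dict = field(default_factory=dict)   # table name -> {key: row}

    def apply(self, change: Change) -> bool:
        """Apply one change; return False if the table is not subscribed."""
        if change.table not in self.subscribed_tables:
            return False
        rows = self.tables.setdefault(change.table, {})
        if change.op == "delete":
            rows.pop(change.key, None)
        else:
            # Inserts and updates both store the latest row under its key.
            rows[change.key] = dict(change.row)
        return True


# Hypothetical change stream from a publisher.
stream = [
    Change("orders", "insert", 1, {"status": "new"}),
    Change("orders", "update", 1, {"status": "shipped"}),
    Change("audit_log", "insert", 7, {"msg": "not replicated"}),
    Change("orders", "insert", 2, {"status": "new"}),
    Change("orders", "delete", 2),
]

subscriber = Subscriber(subscribed_tables={"orders"})
applied = [subscriber.apply(change) for change in stream]
print(subscriber.tables)
```

Only the `orders` changes reach the subscriber, which is the fine-grained control that physical replication cannot offer.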
Why logical replication enables multiactive replication
Logical replication allows you to limit replication to a specific database and provides options for row-level filtering. It can therefore be configured to replicate from database a to database b, and back from database b to database a. With this multidirectional logical replication, neither database has to be in recovery mode, so each can accept writes while bidirectional replication keeps the two in sync.
The application therefore has multiple write endpoints. Logical replication also makes the database version less important: a version 14 database and a version 15 database could replicate to each other while both accept writes, which reduces downtime.
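The two-way setup can be pictured with a small Python sketch in which two nodes both accept writes and publish them to each other. The `Node` class is hypothetical, and tagging each change with its origin so that it is applied but not re-published is an assumption of this sketch rather than something described above.

```python
class Node:
    """A writable database node in a toy two-node cluster."""

    def __init__(self, name):
        self.name = name
        self.rows = {}    # primary key -> (value, origin node)
        self.peers = []   # nodes that subscribe to this node's changes

    def write(self, key, value):
        """Handle a write from the application, then publish it to peers."""
        self.rows[key] = (value, self.name)
        for peer in self.peers:
            peer.receive(key, value, origin=self.name)

    def receive(self, key, value, origin):
        """Apply a replicated change. It is stored but never re-published,
        so a change cannot bounce back and forth between the nodes."""
        self.rows[key] = (value, origin)


a, b = Node("a"), Node("b")
a.peers.append(b)   # a publishes to b
b.peers.append(a)   # b publishes to a

a.write(1, "written on a")
b.write(2, "written on b")

print(a.rows == b.rows)
```

Both nodes end up holding both rows even though each row was written on a different node.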
What Spock brings to the table
pgEdge's Spock extension introduces asynchronous multiactive replication with enhanced conflict resolution and conflict avoidance. It also provides better management, monitoring statistics, and integration.
You need conflict resolution when updates happen on multiple databases at the same time. Updating a row in database a and performing a different update to the same row in database b creates a conflict. With Spock, the last update wins: the row ends up with the value from the latest commit, and neither update fails. Spock also provides a resolutions table where conflict resolutions are recorded and can be monitored and analyzed.
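The last-update-wins rule can be sketched in a few lines of Python. This is an illustration of the idea only, not Spock's code: the `Version` class, the `resolve` function, and the shape of the logged entries are hypothetical, and breaking timestamp ties by node name is an assumption added so that both nodes always pick the same winner.

```python
from dataclasses import dataclass


@dataclass
class Version:
    value: str
    commit_ts: float   # commit timestamp of the write
    origin: str        # node where the write was committed


def resolve(local: Version, incoming: Version, resolutions: list) -> Version:
    """Keep the version with the latest commit and record the decision."""
    # Assumption: ties on commit_ts are broken by origin name.
    winner = max(local, incoming, key=lambda v: (v.commit_ts, v.origin))
    resolutions.append({
        "local_origin": local.origin,
        "incoming_origin": incoming.origin,
        "kept": winner.origin,
    })
    return winner


resolutions = []   # stands in for a table of recorded resolutions
on_a = Version("blue", commit_ts=100.0, origin="a")
on_b = Version("green", commit_ts=100.5, origin="b")

# Each node applies the other's change; both must reach the same answer.
final_on_a = resolve(local=on_a, incoming=on_b, resolutions=resolutions)
final_on_b = resolve(local=on_b, incoming=on_a, resolutions=resolutions)
print(final_on_a.value, final_on_b.value)
```

Because the rule depends only on the two versions and not on which node is evaluating it, both nodes converge on the later commit.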
Another conflict can arise from updates to an incrementing or sum field. For example, if 5 is added to a field on database a and 10 is added to that same field on database b, the last-update-wins option would leave a total of plus 5 or plus 10, rather than the expected plus 15. Spock accounts for this with conflict-free delta-apply columns, which are altered by the delta of the update rather than overwritten. The logical replication ships the delta to the other database, so the final value of the field in the above example is the correct plus 15.
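The arithmetic in this example can be checked with a short Python sketch that compares the two strategies. The starting value of 100, the dictionary layout, and the function names are assumptions made for illustration; they do not reflect how Spock stores or ships changes.

```python
def last_update_wins(updates):
    """Each update carries an absolute value; the latest commit overwrites."""
    latest = max(updates, key=lambda u: u["commit_ts"])
    return latest["new_value"]


def delta_apply(start, updates):
    """Each update contributes only its delta (new value minus old value)."""
    total = start
    for update in updates:
        total += update["new_value"] - update["old_value"]
    return total


start = 100   # hypothetical starting value of the column on both nodes
updates = [
    # Node a adds 5 and node b adds 10, both starting from the same value.
    {"node": "a", "old_value": 100, "new_value": 105, "commit_ts": 1.0},
    {"node": "b", "old_value": 100, "new_value": 110, "commit_ts": 2.0},
]

lww_result = last_update_wins(updates)
delta_result = delta_apply(start, updates)
print(lww_result - start, delta_result - start)
```

With last update wins, node a's plus 5 is lost. With delta apply, both increments survive regardless of the order in which they arrive, because addition is commutative.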
Spock also supports partitioned tables. You can add either the parent table or specific partition tables to replication. This allows for geosharding, where certain partitions are replicated between countries while other partitions remain only in the original country.
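As a rough picture of geosharding, the following Python sketch maps partitions to the nodes that receive them. The partition names, node names, and the `visible_rows` helper are all hypothetical, and the mapping is a plain dictionary rather than any real Spock configuration.

```python
# Hypothetical partitions of an orders table, one per country.
partitions = {
    "orders_us": [{"id": 1, "country": "US"}],
    "orders_de": [{"id": 2, "country": "DE"}],
    "orders_fr": [{"id": 3, "country": "FR"}],
}

# Which partitions each node holds. The US partition stays in the US,
# while the German and French partitions are replicated between the
# two European nodes.
replicated_to = {
    "node_us": {"orders_us"},
    "node_de": {"orders_de", "orders_fr"},
    "node_fr": {"orders_de", "orders_fr"},
}


def visible_rows(node):
    """Return the ids of the rows present on a node, in sorted order."""
    return sorted(
        row["id"]
        for name in replicated_to[node]
        for row in partitions[name]
    )


for node in replicated_to:
    print(node, visible_rows(node))
```

Choosing replication per partition rather than per parent table is what lets some data cross borders while other data stays put.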
What's next?
Spock is open and released under the pgEdge Community License, which is similar to the Confluent Community License. This license allows unlimited end-user usage, including in production, but prevents third parties from packaging and selling a competitive cloud product.
Spock has many more features on the way. Right now, Spock can recover from intermittent outages: The streaming replication will persist, and the database will catch up and synchronize again. Planned improvements will make it easy to spin up full replacement nodes after a catastrophic node failure with near zero downtime.
Spock is a part of pgEdge Distributed PostgreSQL, available as either a managed database as a service called pgEdge Cloud or the self-hosted pgEdge Platform software.
The code and documentation for Spock can be found in the pgEdge Git repository.
About the authors
Cady is a Software Engineer with pgEdge who has spent the past ten years working with PostgreSQL and listening to podcasts.
Denis is a serial Postgres entrepreneur and the co-founder and CTO of pgEdge.