PostgreSQL10:Quorum Commit for Synchronous Replication

10 版本同步复制支持 Quorum Commit,如下:

关于 Quorum Commit 说明

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
commit:  3901fd70cc7ccacef1b0549a6835bb7d8dcaae43
author: Fujii Masao <fujii@postgresql.org>
date: Mon, 19 Dec 2016 21:15:30 +0900
Support quorum-based synchronous replication.

This feature is also known as "quorum commit" especially in discussion
on pgsql-hackers.

This commit adds the following new syntaxes into synchronous_standby_names
GUC. By using FIRST and ANY keywords, users can specify the method to
choose synchronous standbys from the listed servers.

FIRST num_sync (standby_name [, ...])
ANY num_sync (standby_name [, ...])

The keyword FIRST specifies a priority-based synchronous replication
which was available also in 9.6 or before. This method makes transaction
commits wait until their WAL records are replicated to num_sync
synchronous standbys chosen based on their priorities.

The keyword ANY specifies a quorum-based synchronous replication
and makes transaction commits wait until their WAL records are
replicated to *at least* num_sync listed standbys. In this method,
the values of sync_state.pg_stat_replication for the listed standbys
are reported as "quorum". The priority is still assigned to each standby,
but not used in this method.

The existing syntaxes having neither FIRST nor ANY keyword are still
supported. They are the same as new syntax with FIRST keyword, i.e.,
a priority-based synchronous replication.

Author: Masahiko Sawada
Reviewed-By: Michael Paquier, Amit Kapila and me
Discussion: <CAD21AoAACi9NeC_ecm+Vahm+MMA6nYh=Kqs3KB3np+MBOS_gZg@mail.gmail.com>

Many thanks to the various individuals who were involved in
discussing and developing this feature.

备注:具体是说 synchronous_standby_names 参数支持 FIRST 和 ANY 两种模式指定同步复制备节点,语法如下:

1
2
FIRST num_sync (standby_name [,  ...])
ANY num_sync (standby_name [, ...])

  • num_sync 是指需要同步复制的备节点个数;
  • standby_name 是指同步复制备节点的名称,这个名称在备节点 recovery.conf 中的 primary_conninfo 参数 application_name 选项指定;
  • FIRST 表示列表中的同步节点优先按前后顺序排序,列表中越往前的节点优先级越高,同步节点为num_sync个;FIRST 1(node2,node3),表示 node2 为同步备节点,并且同步节点数为1个;
  • ANY 表示 quorum-based 同步复制,同步备节点为任意 num_sync 个。

搭建一主两从流复制

具体搭建步骤略,这里列出部分配置项
环境信息

1
2
3
xx.xx.xx.74 node1PRIMARY
xx.xx.xx.75 node2STADNBY
xx.xx.xx.76 node3 STANDBY

三节点 postgresql.conf 主要参数

1
2
3
wal_level = logical
synchronous_commit = on
synchronous_standby_names = 按需配置,详见文章后面的的演示

三节点 pg_hba.conf 添加 以下

1
2
3
host  replication   repuser   xx.xx.xx.74/32 md5
host replication repuser xx.xx.xx.75/32 md5
host replication repuser xx.xx.xx.76/32 md5

三节点 .pgpass 添加 以下

1
2
3
4
$ cat .pgpass
xx.xx.xx.74:1921:replication:repuser:repuser

[pg93@redhat6 ~]$ chmod 0600 .pgpass

pg_basebackup 命令参考

1
2
3
pg_start_backup('bak1');
pg_basebackup -D /database/pg10/pg_root -Fp -Xs -v -P -h xx.xx.xx.74 -p 1921 -U repuser
pg_stop_backup();

node2 节点 recovery.conf 配置

1
2
3
recovery_target_timeline =  'latest'
standby_mode = on
primary_conninfo = 'host=xx.xx.xx.74 port=1921 user=repuser application_name=node2'

node3 节点 recovery.conf 配置

1
2
3
recovery_target_timeline =  'latest'
standby_mode = on
primary_conninfo = 'host=xx.xx.xx.74 port=1921 user=repuser application_name=node3'

FIRST 同步复制策略

node1设置 synchronous_standby_names 参数

1
2
3
4
5
6
7
8
postgres=# ALTER SYSTEM SET synchronous_standby_names = 'first 1(node2,node3)';
ALTER SYSTEM

postgres=# select pg_reload_conf();
pg_reload_conf
----------------
t
(1 row)

node1上查看

1
2
3
4
5
6
postgres=# select pid,usename,application_name,client_addr,state,sync_priority,sync_state from pg_stat_replication where application_name in ('node2','node3') order by application_name;
pid | usename | application_name | client_addr | state | sync_priority | sync_state
------+---------+------------------+-------------+-----------+---------------+------------
3313 | repuser | node2 | xx.xx.xx.75 | streaming | 1 | sync
3322 | repuser | node3 | xx.xx.xx.76 | streaming | 2 | potential
(2 rows)

备注: 注意sync_state值 ,sync 表示 同步备节点;potential 表示目前是异步备节点,当同步备节点宕机时potential节点有可能升级为同步备节点。

关闭 node2,再次在node1节点上查看

1
2
3
4
5
6
7
8
9
[pg10@db-tfcs02 ~]$ pg_ctl stop -m fast
waiting for server to shut down.... done
server stopped

postgres=# select pid,usename,application_name,client_addr,state,sync_priority,sync_state from pg_stat_replication where application_name in ('node2','node3') order by application_name;
pid | usename | application_name | client_addr | state | sync_priority | sync_state
------+---------+------------------+-------------+-----------+---------------+------------
3322 | repuser | node3 | xx.xx.xx.76 | streaming | 2 | sync
(1 row)

备注:node2关闭后, node3升级为同步备节点。

node1执行

1
2
postgres=# insert into test_sr(id) values(10);
INSERT 0 1

备注:node1 节点上的操作不受影响。

关闭 node3,再次在node1节点上查看

1
2
3
[pg10@db-tfcs03 ~]$ pg_ctl stop -m fast
waiting for server to shut down.... done
server stopped

node1执行

1
2
postgres=# insert into test_sr(id) values(12);
等待状态,INSERT 命令下不去,主库操作将处于等待状态

备注:synchronous_standby_names = ‘first 1(node2,node3)’时,当一个同步节点宕机时主库操作不受影响,当两个同步节点宕机时主库操作将处于等待状态。

ANY 同步复制策略

设备列表中任意两个节点为同步节点,按照预想,列表中如果宕掉一个同步备节点,主库上的操作将被等待,测试下
node1 设置 synchronous_standby_names

1
2
3
4
5
6
7
8
postgres=# ALTER SYSTEM SET synchronous_standby_names = 'ANY 2(node2,node3)';
ALTER SYSTEM

postgres=# SELECT pg_reload_conf();
pg_reload_conf
----------------
t
(1 row)

node1 查看

1
2
3
4
5
postgres=# select pid,usename,application_name,client_addr,state,sync_priority,sync_state from pg_stat_replication where application_name in ('node2','node3') order by application_name;
pid | usename | application_name | client_addr | state | sync_priority | sync_state
------+---------+------------------+-------------+-----------+---------------+------------
3525 | repuser | node2 | xx.xx.xx.75 | streaming | 1 | quorum
3322 | repuser | node3 | xx.xx.xx.76 | streaming | 1 | quorum

关闭nod2

1
2
3
[pg10@db-tfcs02 ~]$ pg_ctl stop -m fast
waiting for server to shut down.... done
server stopped

node1 上操作

1
2
3
4
5
6
7
postgres=# select pid,usename,application_name,client_addr,state,sync_priority,sync_state from pg_stat_replication where application_name in ('node2','node3') order by application_name;
pid | usename | application_name | client_addr | state | sync_priority | sync_state
------+---------+------------------+-------------+-----------+---------------+------------
3322 | repuser | node3 | xx.xx.xx.76 | streaming | 1 | quorum
(1 row)

postgres=# insert into test_sr(id) values(8);

等待状态,INSERT 命令下不去,因为有一个同步节点宕掉了,验证了实验前的预想。

附 pg_stat_replication.sync_state

字段详细解释如下:

1
2
3
4
5
6
7
8
Synchronous state of this standby server.  Possible values are:
async: This standby server is asynchronous.

potential: This standby server is now asynchronous, but can potentially become synchronous if one of current synchronous ones fails.

sync: This standby server is synchronous.

quorum: This standby server is considered as a candidate for quorum standbys.

参考

最后推荐和张文升共同编写的《PostgreSQL实战》,本书基于PostgreSQL 10 编写,共18章,重点介绍SQL高级特性、并行查询、分区表、物理复制、逻辑复制、备份恢复、高可用、性能优化、PostGIS等,涵盖大量实战用例!

购买链接:https://item.jd.com/12405774.html

PostgreSQL实战
感谢支持!
0%