PostgreSQL 13: 多源逻辑复制实践

上篇博客介绍了PostgreSQL 13: 普通表数据逻辑复制到分区表,那么是否支持多源逻辑复制到同一个目标表呢?

生产库上有这种需求场景,假如一个业务系统的数据库水平拆分成四个库,对于四个库的某些表有逻辑复制汇总的需求,详见以下:

上图业务系统数据库根据地域拆分成mydb1、mydb2、mydb3、mydb4四个库,四个库表结构相同,只是存储的数据按地域水平拆分,四个库中的tbl_log表(分片字段city)都需要将数据同步到汇总库 mydb。

本文基于上图部署架构进行多源逻辑复制实践。

环境规划

环境规划,如下:

节点 数据库版本 IP 端口 数据库
源库 PostgreSQL 13beta1 192.168.2.11 1922 mydb1
源库 PostgreSQL 13beta1 192.168.2.11 1922 mydb2
源库 PostgreSQL 13beta1 192.168.2.11 1922 mydb3
源库 PostgreSQL 13beta1 192.168.2.11 1922 mydb4
目标库 PostgreSQL 13beta1 192.168.2.13 1924 mydb

环境准备

源库和目标库安装 PostgreSQL 13beta1软件,本文略。

创建数据库并赋权

1
2
3
4
5
6
7
8
9
10
11
12
--CREATE DATABASES
CREATE DATABASE mydb1 WITH OWNER = postgres TEMPLATE = template0 ENCODING = 'UTF8' TABLESPACE = tbs_mydb;
CREATE DATABASE mydb2 WITH OWNER = postgres TEMPLATE = template0 ENCODING = 'UTF8' TABLESPACE = tbs_mydb;
CREATE DATABASE mydb3 WITH OWNER = postgres TEMPLATE = template0 ENCODING = 'UTF8' TABLESPACE = tbs_mydb;
CREATE DATABASE mydb4 WITH OWNER = postgres TEMPLATE = template0 ENCODING = 'UTF8' TABLESPACE = tbs_mydb;


--GRANT
grant all on database mydb1 to pguser with grant option;
grant all on database mydb2 to pguser with grant option;
grant all on database mydb3 to pguser with grant option;
grant all on database mydb4 to pguser with grant option;

同时,四个库都创建 pguser 模式。

创建表

在四个库上分别创建以下表,并建立约束,如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
--CREATE TABLE
CREATE TABLE tbl_log (
id serial,
user_id int4,
city character varying(32),
create_time timestamp(0) without time zone
);

\c mydb1 pguser
ALTER TABLE tbl_log ADD CONSTRAINT ck_city CHECK( city='city1');

\c mydb2 pguser
ALTER TABLE tbl_log ADD CONSTRAINT ck_city CHECK( city='city2');

\c mydb3 pguser
ALTER TABLE tbl_log ADD CONSTRAINT ck_city CHECK( city='city3');

\c mydb4 pguser
ALTER TABLE tbl_log ADD CONSTRAINT ck_city CHECK( city='city4');

创建发布

四个库依次创建发布,如下:

1
2
3
4
5
6
7
8
9
10
11
\c mydb1 pguser
CREATE PUBLICATION pub1 FOR TABLE tbl_log ;

\c mydb2 pguser
CREATE PUBLICATION pub2 FOR TABLE tbl_log ;

\c mydb3 pguser
CREATE PUBLICATION pub3 FOR TABLE tbl_log ;

\c mydb4 pguser
CREATE PUBLICATION pub4 FOR TABLE tbl_log ;

四个库依次赋权,如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
\c mydb1 pguser
GRANT CONNECT ON DATABASE mydb1 TO repuser;
GRANT USAGE ON SCHEMA pguser TO repuser;
GRANT SELECT ON ALL TABLES IN SCHEMA pguser TO repuser;

\c mydb2 pguser
GRANT CONNECT ON DATABASE mydb2 TO repuser;
GRANT USAGE ON SCHEMA pguser TO repuser;
GRANT SELECT ON ALL TABLES IN SCHEMA pguser TO repuser;

\c mydb3 pguser
GRANT CONNECT ON DATABASE mydb3 TO repuser;
GRANT USAGE ON SCHEMA pguser TO repuser;
GRANT SELECT ON ALL TABLES IN SCHEMA pguser TO repuser;

\c mydb4 pguser
GRANT CONNECT ON DATABASE mydb4 TO repuser;
GRANT USAGE ON SCHEMA pguser TO repuser;
GRANT SELECT ON ALL TABLES IN SCHEMA pguser TO repuser;

创建订阅

目标库上创建订阅,如下:

1
2
3
4
CREATE SUBSCRIPTION sub_mydb1 CONNECTION 'host=192.168.2.11 port=1922 dbname=mydb1 user=repuser' PUBLICATION pub1;
CREATE SUBSCRIPTION sub_mydb2 CONNECTION 'host=192.168.2.11 port=1922 dbname=mydb2 user=repuser' PUBLICATION pub2;
CREATE SUBSCRIPTION sub_mydb3 CONNECTION 'host=192.168.2.11 port=1922 dbname=mydb3 user=repuser' PUBLICATION pub3;
CREATE SUBSCRIPTION sub_mydb4 CONNECTION 'host=192.168.2.11 port=1922 dbname=mydb4 user=repuser' PUBLICATION pub4;

目标库创建分区表

在目标库创建分区表,如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
--创建父表
CREATE TABLE tbl_log (
id serial,
user_id int4,
city character varying(32),
create_time timestamp(0) without time zone
) PARTITION BY RANGE(create_time);

--创建子表
CREATE TABLE tbl_log_his PARTITION OF tbl_log FOR VALUES FROM (minvalue) TO ('2020-01-01');
CREATE TABLE tbl_log_202001 PARTITION OF tbl_log FOR VALUES FROM ('2020-01-01') TO ('2020-02-01');
CREATE TABLE tbl_log_202002 PARTITION OF tbl_log FOR VALUES FROM ('2020-02-01') TO ('2020-03-01');
CREATE TABLE tbl_log_202003 PARTITION OF tbl_log FOR VALUES FROM ('2020-03-01') TO ('2020-04-01');
CREATE TABLE tbl_log_202004 PARTITION OF tbl_log FOR VALUES FROM ('2020-04-01') TO ('2020-05-01');
CREATE TABLE tbl_log_202005 PARTITION OF tbl_log FOR VALUES FROM ('2020-05-01') TO ('2020-06-01');
CREATE TABLE tbl_log_202006 PARTITION OF tbl_log FOR VALUES FROM ('2020-06-01') TO ('2020-07-01');
CREATE TABLE tbl_log_202007 PARTITION OF tbl_log FOR VALUES FROM ('2020-07-01') TO ('2020-08-01');

--创建索引
CREATE INDEX idx_tbl_log_ctime ON tbl_log USING BTREE (create_time);

数据验证

源库 mydb1、mydb2、mydb3、mydb4 依次插入以下数据,如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
[pg13@ydtf01 ~]$ psql mydb1 pguser -p 1922
psql (13beta1)
Type "help" for help.

mydb1=> INSERT INTO tbl_log(user_id,city,create_time) SELECT round(100000000*random()),'city1',generate_series('2019-10-01'::date, '2020-06-20'::date, '1 minute');
INSERT 0 378721

mydb1=> \c mydb2 pguser
You are now connected to database "mydb2" as user "pguser".
mydb2=> INSERT INTO tbl_log(user_id,city,create_time) SELECT round(100000000*random()),'city2',generate_series('2019-10-01'::date, '2020-06-20'::date, '1 minute');
INSERT 0 378721

mydb2=> \c mydb3 pguser
You are now connected to database "mydb3" as user "pguser".
mydb3=> INSERT INTO tbl_log(user_id,city,create_time) SELECT round(100000000*random()),'city3',generate_series('2019-10-01'::date, '2020-06-20'::date, '1 minute');
INSERT 0 378721

mydb3=> \c mydb4 pguser
You are now connected to database "mydb4" as user "pguser".
mydb4=> INSERT INTO tbl_log(user_id,city,create_time) SELECT round(100000000*random()),'city4',generate_series('2019-10-01'::date, '2020-06-20'::date, '1 minute');
INSERT 0 378721

目标库验证数据,如下:

1
2
3
4
5
6
7
8
9
10
[pg13@ydtf03 ~]$ psql mydb pguser -p 1924
psql (13beta1)
Type "help" for help.


mydb=> select count(*) from pguser.tbl_log;
count
---------
1514884
(1 row)

四个库的数据都已同步。

参考

最后推荐和张文升共同编写的《PostgreSQL实战》,本书基于PostgreSQL 10 编写,共18章,重点介绍SQL高级特性、并行查询、分区表、物理复制、逻辑复制、备份恢复、高可用、性能优化、PostGIS等,涵盖大量实战用例!

购买链接:https://item.jd.com/12405774.html

PostgreSQL实战
感谢支持!
0%