PostgreSQL10: Additional FDW Push-Down

在外部表上做 aggregates 操作时,10版本之前的处理过程是先将远程库的数据全部取到本地库,之后在本地库上aggregates操作,10版本后部分 aggregates 操作可下推到远程库,这样将大幅减少远程库传到本地库的数据量,性能会有较大提升,关于这个特性,文档中有以下描述

Committed Patch 的内容

postgres_fdw: Push down aggregates to remote servers.Now that the upper planner uses paths, and now that we have proper hooks to inject paths into the upper planning process, its possible for foreign data wrappers to arrange to push aggregates to the remote side instead of fetching all of the rows and aggregating them locally. This figures to be a massive win for performance, so teach postgres_fdw to do it.
Jeevan Chalke and Ashutosh Bapat. Reviewed by Ashutosh Bapat with
additional testing by Prabhat Sahu. Various mostly cosmetic changes
by me.

PostgreSQL10 Release Note 中的描述

Push aggregates to foreign data wrapper servers, where possible (Jeevan Chalke, Ashutosh Bapat)
This reduces the amount of data that must be passed from the foreign data wrapper server, and offloads aggregate computation from the requesting server. The postgres_fdw is able to perform this optimization. There are also improvements in pushing down joins involving extensions.

接下来分别在 9.6 版本和 10 版本测试。

PostgreSQL9.6 版本测试

创建 postgres_fdw

1
2
3
4
5
6
[pg96@db1 pg_root]$ psql
psql (9.6beta1)
Type "help" for help.

francs=# create extension postgres_fdw;
CREATE EXTENSION

创建远程PG库SERVER和MAPPING USER

1
2
3
4
5
6
7
francs=# grant usage on foreign data wrapper postgres_fdw to francs;
GRANT

francs=> CREATE SERVER pgsql_srv FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host '127.0.0.1', port '1921', dbname 'francs');
CREATE SERVER
francs=> CREATE USER MAPPING FOR public SERVER pgsql_srv OPTIONS (user 'francs', password 'francs');
CREATE USER MAPPING

创建测试表并插入数据

1
2
3
4
5
6
create table test_fdw3 (
id int4,
flag int4
);

insert into test_fdw3(id,flag) select n, mod(n,3) from generate_series(1,100000) n;

创建外部表

1
2
3
4
CREATE FOREIGN TABLE ft_test_fdw3 (
id int4 ,
flag int4
) SERVER pgsql_srv OPTIONS (schema_name 'francs', table_name 'test_fdw3');

查看执行计划

1
2
3
4
5
6
7
8
9
10
11
12
francs=> explain (analyze on,verbose on)  select flag,count(*)  from ft_test_fdw3 group  by flag order by flag;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=100.00..222.22 rows=200 width=12) (actual time=202.422..440.217 rows=3 loops=1)
Output: flag, count(*)
Group Key: ft_test_fdw3.flag
-> Foreign Scan on francs.ft_test_fdw3 (cost=100.00..205.60 rows=2925 width=4) (actual time=59.773..395.126 rows=100000 loops=1)
Output: id, flag
Remote SQL: SELECT flag FROM francs.test_fdw3 ORDER BY flag ASC NULLS LAST
Planning time: 0.185 ms
Execution time: 441.758 ms
(8 rows)

PostgreSQL10 版本测试

重复创建外部表、测试表等以上操作,最后查看以下执行计划

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
[pg10@csv-tfcs01 ~]$ psql francs francs  
Password for user francs:
psql (10beta1)
Type "help" for help.

francs=> explain (analyze on,verbose on) select flag,count(*) from ft_test_fdw3 group by flag order by flag;
QUERY PLAN
---------------------------------------------------------------------------------------------------------
Sort (cost=211.41..211.91 rows=200 width=12) (actual time=19.662..19.662 rows=3 loops=1)
Output: flag, (count(*))
Sort Key: ft_test_fdw3.flag
Sort Method: quicksortMemory: 25kB
-> Foreign Scan (cost=129.25..203.76 rows=200 width=12) (actual time=19.648..19.649 rows=3 loops=1)
Output: flag, (count(*))
Relations: Aggregate on (francs.ft_test_fdw3)
Remote SQL: SELECT flag, count(*) FROM francs.test_fdw3 GROUP BY flag
Planning time: 0.212 ms
Execution time: 19.928

备注:仔细观察这份执行计划,发现变化有三

  1. 多了Relations: Aggregate on 步骤,Aggregate 操作被下推到远程库上执行。
  2. Foreign Scan 的 actual rows=3,而9.6版本这步操作的 Foreign Scan 的 actual rows=100000
  3. 执行时间快很多,10 版本这个查询仅需 19.9ms,9.6版本这个查询需要 441 ms,这里说明下9.6、10所在的的硬件环境不对等,尽管如此,可以推测即使是在同样的硬件环境下,由于 Aggregate 操作的下推,10 版本这类 SQL 的性能比 9.6 版本要好很多。

说明

今天仅做了一个Aggregate操作下推到远程库的测试案例,是否所有Aggregate操作都能下推,值得思考与测试,今天先测试到这里,有兴趣的朋友可测试其它Aggregate案例。

参考

最后推荐和张文升共同编写的《PostgreSQL实战》,本书基于PostgreSQL 10 编写,共18章,重点介绍SQL高级特性、并行查询、分区表、物理复制、逻辑复制、备份恢复、高可用、性能优化、PostGIS等,涵盖大量实战用例!

购买链接:https://item.jd.com/12405774.html

PostgreSQL实战
感谢支持!
0%