PostgreSQL10:Parallel Queries 增强

PostgreSQL9.6 版本时并行查询仅支持并行 sequential scans、并行 aggregates 操作,PostgreSQL10版本对 Parallel Queries 功能有了较大增强,如下:

1、Support parallel btree index scans (Rahila Syed, Amit Kapila, Robert Haas, Rafia Sabih)
Allows btree index pages to be checked by separate parallel workers.

2、Support parallel bitmap heap scans (Dilip Kumar)
This allows a single index scan to dispatch parallel workers to process different areas of the heap.

3、Allow merge joins to be performed in parallel (Dilip Kumar)

4、Allow non-correlated subqueries to be run in parallel (Amit Kapila)

5、Improve ability of parallel workers to return pre-sorted data (Rushabh Lathia)

6、Increase parallel query usage in procedural language functions (Robert Haas, Rafia Sabih)

备注:今天主要演示下前两种情况,后面几种情况有兴趣的朋友可自行构建测试模型测试。

环境准备

创建测试表并插入1000万数据

1
2
3
4
5
6
7
8
9
10
create table test_big1(id int4 primary key, create_time timestamp without time zone default clock_timestamp(), name character varying(32));
insert into test_big1(id,name) select n,n*random()*10000 from generate_series(1,10000000) n ;

francs=> select * from test_big1 limit 3;
id | create_time | name
----+----------------------------+------------------
1 | 2017-05-21 16:02:24.921751 | 2298.13809040934
2 | 2017-05-21 16:02:24.922051 | 7580.18649183214
3 | 2017-05-21 16:02:24.922064 | 24218.4893181548
(3 rows)

Parallel Index Only Scan

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
francs=> show max_parallel_workers;
max_parallel_workers
----------------------
4
(1 row)

francs=> explain analyze select count(*) from test_big1 where id <1000000 ;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=18576.59..18576.60 rows=1 width=8) (actual time=73.362..73.362 rows=1 loops=1)
-> Gather (cost=18576.17..18576.58 rows=4 width=8) (actual time=73.200..73.355 rows=5 loops=1)
Workers Planned: 4
Workers Launched: 4
-> Partial Aggregate (cost=17576.17..17576.18 rows=1 width=8) (actual time=68.992..68.992 rows=1 loops=5)
-> Parallel Index Only Scan using test_big1_pkey on test_big1(cost=0.43..16947.37 rows=251523 width=0) (actual time=0.053..54.343 rows=200000 loops=5)
Index Cond: (id < 1000000)
Heap Fetches: 174195
Planning time: 0.105 ms
Execution time: 74.572 ms
(10 rows)

备注:从执行计划中看到了 “Parallel Index Only Scan”。

关闭并行

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
francs=>  set max_parallel_workers=0;
SET

francs=> explain analyze select count(*) from test_big1 where id <1000000 ;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=18576.59..18576.60 rows=1 width=8) (actual time=257.585..257.585 rows=1 loops=1)
-> Gather (cost=18576.17..18576.58 rows=4 width=8) (actual time=257.579..257.579 rows=1 loops=1)
Workers Planned: 4
Workers Launched: 0
-> Partial Aggregate (cost=17576.17..17576.18 rows=1 width=8) (actual time=257.251..257.251 rows=1 loops=1)
-> Parallel Index Only Scan using test_big1_pkey on test_big1(cost=0.43..16947.37 rows=251523 width=0) (actual time=0.042..183.384 rows=999999 loops=1)
Index Cond: (id < 1000000)
Heap Fetches: 999999
Planning time: 0.102 ms
Execution time: 257.717 ms
(10 rows)

备注:开启并行查询时,此SQL执行时间为74ms,关闭并行查询时执行时间257ms,慢了3倍多。

Parallel Index Scan

1
2
3
4
5
6
7
8
9
10
11
12
13
francs=> explain analyze select count(name)  from test_big1 where id <1000000  ;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=18580.76..18580.77 rows=1 width=8) (actual time=76.601..76.601 rows=1 loops=1)
-> Gather (cost=18580.34..18580.75 rows=4 width=8) (actual time=75.541..76.595 rows=5 loops=1)
Workers Planned: 4
Workers Launched: 4
-> Partial Aggregate (cost=17580.34..17580.35 rows=1 width=8) (actual time=71.752..71.752 rows=1 loops=5)
-> Parallel Index Scan using idx_test_big1_id on test_big1(cost=0.43..16951.53 rows=251527 width=16) (actual time=0.036..54.157 rows=200000 loops=5)
Index Cond: (id < 1000000)
Planning time: 0.126 ms
Execution time: 76.919 ms
(9 rows)

备注:从执行计划中,走了“Parallel Index Scan”。

Parallel Bitmap Heap Scan

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
francs=> explain analyze select count(*)  from test_big1 where id <1000000  or id >  9000000;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=72069.09..72069.10 rows=1 width=8) (actual time=194.016..194.017 rows=1 loops=1)
-> Gather (cost=72068.67..72069.08 rows=4 width=8) (actual time=193.676..194.011 rows=5 loops=1)
Workers Planned: 4
Workers Launched: 4
-> Partial Aggregate (cost=71068.67..71068.68 rows=1 width=8) (actual time=189.722..189.722 rows=1 loops=5)
-> Parallel Bitmap Heap Scan on test_big1(cost=21277.11..69887.94 rows=472292 width=0) (actual time=84.034..155.426 rows=400000 loops=5)
Recheck Cond: ((id < 1000000) OR (id > 9000000))
Heap Blocks: exact=1291
-> BitmapOr (cost=21277.11..21277.11 rows=1987954 width=0) (actual time=85.468..85.468 rows=0 loops=1)
-> Bitmap Index Scan on test_big1_pkey(cost=0.00..10290.13 rows=1006093 width=0) (actual time=42.868..42.868 rows=999999 loops=1)
Index Cond: (id < 1000000)
-> Bitmap Index Scan on test_big1_pkey(cost=0.00..10042.39 rows=981861 width=0) (actual time=42.597..42.597 rows=1000000 loops=1)
Index Cond: (id > 9000000)
Planning time: 0.120 ms
Execution time: 195.811 ms
(15 rows)

备注:从执行计划中看到了“Parallel Bitmap Heap Scan on test_big1”。

参考

最后推荐和张文升共同编写的《PostgreSQL实战》,本书基于PostgreSQL 10 编写,共18章,重点介绍SQL高级特性、并行查询、分区表、物理复制、逻辑复制、备份恢复、高可用、性能优化、PostGIS等,涵盖大量实战用例!

购买链接:https://item.jd.com/12405774.html

PostgreSQL实战
感谢支持!
0%