query plan is different between select from table and select from view
Hi Vertica Community,
I have a question regarding the query plan that is different between select directly from the table and select from the view. If you flatten out the view's code, its exactly the same as directly select from the table.
check below as an example:
Test query 1:
select * from dev.v_table1_table2 where clt = 'XXX' and date > '2020-01-01';
Test query 2:
select coalesce(v_table2_view1.id, v_table2_view2.id, 1) as id, table1.* from dev.table1 left outer join dev.v_table2_view1 on table1.clt = v_table2_view1.clt and table1.sty = v_table2_view1.sty and table1.ncc = v_table2_view1.ncc and table1.sc = v_table2_view1.sc left outer join dev.v_table2_view2 on table1.clt = v_table2_view2.clt and table1.sty = v_table2_view2.sty and table1.cd = v_table2_view2.cd and table1.sc = v_table2_view2.sc where table1.code_num = '123456' and table1.clt = 'XXX' and table1.date > '2020-01-01';
Notes:
Both queries are using the exact same projections, but the storage access part in query plans are different.
Both queries have the exact same query underneath if you flatten them out.
Question:
1.
Test query1 only has "Filter: (table1.clt = 'XXX')" in storage acess table1 part;
While Test query 2 has "Filter: (table1.clt = 'XXX')" in storage acess table1 part, "Filter: (table2.clt = 'XXX')" in storage acess table2 part (for dev.v_table2_view1), "Filter: (table2.clt = 'XXX')" again in storage acess table2 part (for dev.v_table2_view2).
Which means the Test query 2 can filter out data faster than Test query 1, wondering what's the reason for this, is it because Vertica treats views different from the table?
2.
I also see "local round robin" in the execution plan, I am not very familiar with this round-robin concept.
Can someone please share some light about what is "round robin" in general? and is there a "global round robin" in vertica? and what does "local round robin" in the execution plan means.
Built below 2 tables and 3 views for this testing ~~
Tables Used
1. dev.table1
CREATE TABLE dev.table1 ( clt char(3) NOT NULL, date date NOT NULL, sty varchar(50) NOT NULL, ncc varchar(50) NOT NULL, sc varchar(30) NOT NULL, code_num varchar(15) NOT NULL, cd varchar(80), CONSTRAINT C_PRIMARY PRIMARY KEY (clt, date, sty, ncc, sc, code_num) ENABLED );
- only has one super projection, and the sort order is the same as the column order;
- SEGMENTED BY hash(table1.clt, table1.sty) ALL NODES KSAFE 1.
- dev.table2
CREATE TABLE dev.table2 ( clt char(3) NOT NULL, id int NOT NULL, sty varchar(25), ncc varchar(20), cc varchar(10), sc varchar(80), cd varchar(80), flag boolean DEFAULT true, CONSTRAINT C_PRIMARY PRIMARY KEY (clt, id) ENABLED );
- only has one super projection, and the sort order is the same as the column order;
- SEGMENTED BY hash(table2.clt, table2.id) ALL NODES KSAFE 1.
Views Used
1._ dev.v_table2_view1_
CREATE VIEW dev.v_table2_view1 AS SELECT table2.clt, table2.sty, table2.ncc, table2.sc, max(table2.cc) AS cc, max(table2.id) AS id, max(table2.cd) AS cd FROM dev.table2 WHERE (table2.flag = true) GROUP BY table2.clt, table2.sty, table2.ncc, table2.sc;
2._ dev.v_table2_view2_
CREATE VIEW dev.v_table2_view2 AS SELECT table2.clt, table2.sty, table2.cd, table2.sc, max(table2.id) AS id, max(table2.ncc) AS ncc FROM dev.table2 WHERE (table2.flag = true) GROUP BY table2.clt, table2.sty, table2.cd, table2.sc;
- dev.v_table1_table2
create view dev.v_table1_table2 as select coalesce(v_table2_view1.id, v_table2_view2.id, 1) as id, table1.* from dev.table1 left outer join dev.v_table2_view1 on table1.clt = v_table2_view1.clt and table1.sty = v_table2_view1.sty and table1.ncc = v_table2_view1.ncc and table1.sc = v_table2_view1.sc left outer join dev.v_table2_view2 on table1.clt = v_table2_view2.clt and table1.sty = v_table2_view2.sty and table1.cd = v_table2_view2.cd and table1.sc = v_table2_view2.sc where table1.code_num = '123456';
Comments
Hello,
Which version are you using?
Can you also share the full plans of the 2 queries on table vs view (where you noticed difference at storage access level)
The Vertica Version we are using is: "Vertica Analytic Database v9.3.0-1".
The query plans are attached, the column header indicates which plan belongs to which test query.
And I highlighted the "Filter: (table*.clt = 'XXX')" part that I was talking about.
For this testing, I inserted only one row (some dummy data) into each table, dev.table1 and dev.table2.
The data I inserted can be found in the "table data" tab in the same file I attached.
Note: We are running queries on a four-node on-prem environment.