query plan is different between select from table and select from view

sma · July 2020

Hi Vertica Community,
I have a question regarding the query plan that is different between select directly from the table and select from the view. If you flatten out the view's code, its exactly the same as directly select from the table.
check below as an example:
Test query 1:

select * from dev.v_table1_table2
where clt = 'XXX' and date > '2020-01-01';

Test query 2:

select coalesce(v_table2_view1.id, v_table2_view2.id, 1) as id, table1.*
from dev.table1
left outer join dev.v_table2_view1
   on table1.clt = v_table2_view1.clt
  and table1.sty = v_table2_view1.sty
  and table1.ncc = v_table2_view1.ncc
  and table1.sc = v_table2_view1.sc
left outer join dev.v_table2_view2
   on table1.clt = v_table2_view2.clt
  and table1.sty = v_table2_view2.sty
  and table1.cd = v_table2_view2.cd
  and table1.sc = v_table2_view2.sc
where table1.code_num = '123456'
  and table1.clt = 'XXX'
  and table1.date > '2020-01-01';

Notes:
Both queries are using the exact same projections, but the storage access part in query plans are different.
Both queries have the exact same query underneath if you flatten them out.
Question:
1.
Test query1 only has "Filter: (table1.clt = 'XXX')" in storage acess table1 part;
While Test query 2 has "Filter: (table1.clt = 'XXX')" in storage acess table1 part, "Filter: (table2.clt = 'XXX')" in storage acess table2 part (for dev.v_table2_view1), "Filter: (table2.clt = 'XXX')" again in storage acess table2 part (for dev.v_table2_view2).
Which means the Test query 2 can filter out data faster than Test query 1, wondering what's the reason for this, is it because Vertica treats views different from the table?
2.
I also see "local round robin" in the execution plan, I am not very familiar with this round-robin concept.
Can someone please share some light about what is "round robin" in general? and is there a "global round robin" in vertica? and what does "local round robin" in the execution plan means.

Built below 2 tables and 3 views for this testing ~~
Tables Used
1. dev.table1

CREATE TABLE dev.table1
(
    clt char(3) NOT NULL,
    date date NOT NULL,
    sty varchar(50) NOT NULL,
    ncc varchar(50) NOT NULL,
    sc varchar(30) NOT NULL,
    code_num varchar(15) NOT NULL,
    cd varchar(80),
    CONSTRAINT C_PRIMARY PRIMARY KEY (clt, date, sty, ncc, sc, code_num) ENABLED
);

only has one super projection, and the sort order is the same as the column order;
SEGMENTED BY hash(table1.clt, table1.sty) ALL NODES KSAFE 1.

dev.table2

CREATE TABLE dev.table2
(
    clt char(3) NOT NULL,
    id int NOT NULL,
    sty varchar(25),
    ncc varchar(20),
    cc varchar(10),
    sc varchar(80),
    cd varchar(80),
    flag boolean DEFAULT true,
    CONSTRAINT C_PRIMARY PRIMARY KEY (clt, id) ENABLED
);

only has one super projection, and the sort order is the same as the column order;
SEGMENTED BY hash(table2.clt, table2.id) ALL NODES KSAFE 1.
Views Used
1._ dev.v_table2_view1_

CREATE VIEW dev.v_table2_view1 AS
 SELECT table2.clt,
        table2.sty,
        table2.ncc,
        table2.sc,
        max(table2.cc) AS cc,
        max(table2.id) AS id,
        max(table2.cd) AS cd
 FROM dev.table2
 WHERE (table2.flag = true)
 GROUP BY table2.clt,
          table2.sty,
          table2.ncc,
          table2.sc;

2._ dev.v_table2_view2_

CREATE VIEW dev.v_table2_view2 AS
 SELECT table2.clt,
        table2.sty,
        table2.cd,
        table2.sc,
        max(table2.id) AS id,
        max(table2.ncc) AS ncc
 FROM dev.table2
 WHERE (table2.flag = true)
 GROUP BY table2.clt,
          table2.sty,
          table2.cd,
          table2.sc;

dev.v_table1_table2

create view dev.v_table1_table2 as
select coalesce(v_table2_view1.id, v_table2_view2.id, 1) as id, table1.*
from dev.table1
left outer join dev.v_table2_view1
   on table1.clt = v_table2_view1.clt
  and table1.sty = v_table2_view1.sty
  and table1.ncc = v_table2_view1.ncc
  and table1.sc = v_table2_view1.sc
left outer join dev.v_table2_view2
   on table1.clt = v_table2_view2.clt
  and table1.sty = v_table2_view2.sty
  and table1.cd = v_table2_view2.cd
  and table1.sc = v_table2_view2.sc
where table1.code_num = '123456';

chaima · July 2020

Hello,
Which version are you using?
Can you also share the full plans of the 2 queries on table vs view (where you noticed difference at storage access level)

sma · July 2020

The Vertica Version we are using is: "Vertica Analytic Database v9.3.0-1".

The query plans are attached, the column header indicates which plan belongs to which test query.
And I highlighted the "Filter: (table*.clt = 'XXX')" part that I was talking about.

For this testing, I inserted only one row (some dummy data) into each table, dev.table1 and dev.table2.
The data I inserted can be found in the "table data" tab in the same file I attached.

Note: We are running queries on a four-node on-prem environment.

query plan is different between select from table and select from view

Comments

Leave a Comment