query plan is different between select from table and select from view

smasma Vertica Customer
edited July 2020 in General Discussion

Hi Vertica Community,
I have a question regarding the query plan that is different between select directly from the table and select from the view. If you flatten out the view's code, its exactly the same as directly select from the table.
check below as an example:
Test query 1:

select * from dev.v_table1_table2
where clt = 'XXX' and date > '2020-01-01';

Test query 2:

select coalesce(v_table2_view1.id, v_table2_view2.id, 1) as id, table1.*
from dev.table1
left outer join dev.v_table2_view1
   on table1.clt = v_table2_view1.clt
  and table1.sty = v_table2_view1.sty
  and table1.ncc = v_table2_view1.ncc
  and table1.sc = v_table2_view1.sc
left outer join dev.v_table2_view2
   on table1.clt = v_table2_view2.clt
  and table1.sty = v_table2_view2.sty
  and table1.cd = v_table2_view2.cd
  and table1.sc = v_table2_view2.sc
where table1.code_num = '123456'
  and table1.clt = 'XXX'
  and table1.date > '2020-01-01';

Notes:
Both queries are using the exact same projections, but the storage access part in query plans are different.
Both queries have the exact same query underneath if you flatten them out.
Question:
1.
Test query1 only has "Filter: (table1.clt = 'XXX')" in storage acess table1 part;
While Test query 2 has "Filter: (table1.clt = 'XXX')" in storage acess table1 part, "Filter: (table2.clt = 'XXX')" in storage acess table2 part (for dev.v_table2_view1), "Filter: (table2.clt = 'XXX')" again in storage acess table2 part (for dev.v_table2_view2).
Which means the Test query 2 can filter out data faster than Test query 1, wondering what's the reason for this, is it because Vertica treats views different from the table?
2.
I also see "local round robin" in the execution plan, I am not very familiar with this round-robin concept.
Can someone please share some light about what is "round robin" in general? and is there a "global round robin" in vertica? and what does "local round robin" in the execution plan means.

Built below 2 tables and 3 views for this testing ~~
Tables Used
1. dev.table1

CREATE TABLE dev.table1
(
    clt char(3) NOT NULL,
    date date NOT NULL,
    sty varchar(50) NOT NULL,
    ncc varchar(50) NOT NULL,
    sc varchar(30) NOT NULL,
    code_num varchar(15) NOT NULL,
    cd varchar(80),
    CONSTRAINT C_PRIMARY PRIMARY KEY (clt, date, sty, ncc, sc, code_num) ENABLED
);
  • only has one super projection, and the sort order is the same as the column order;
  • SEGMENTED BY hash(table1.clt, table1.sty) ALL NODES KSAFE 1.
  1. dev.table2
CREATE TABLE dev.table2
(
    clt char(3) NOT NULL,
    id int NOT NULL,
    sty varchar(25),
    ncc varchar(20),
    cc varchar(10),
    sc varchar(80),
    cd varchar(80),
    flag boolean DEFAULT true,
    CONSTRAINT C_PRIMARY PRIMARY KEY (clt, id) ENABLED
);
  • only has one super projection, and the sort order is the same as the column order;
  • SEGMENTED BY hash(table2.clt, table2.id) ALL NODES KSAFE 1.
    Views Used
    1._ dev.v_table2_view1_
CREATE VIEW dev.v_table2_view1 AS
 SELECT table2.clt,
        table2.sty,
        table2.ncc,
        table2.sc,
        max(table2.cc) AS cc,
        max(table2.id) AS id,
        max(table2.cd) AS cd
 FROM dev.table2
 WHERE (table2.flag = true)
 GROUP BY table2.clt,
          table2.sty,
          table2.ncc,
          table2.sc;

2._ dev.v_table2_view2_

CREATE VIEW dev.v_table2_view2 AS
 SELECT table2.clt,
        table2.sty,
        table2.cd,
        table2.sc,
        max(table2.id) AS id,
        max(table2.ncc) AS ncc
 FROM dev.table2
 WHERE (table2.flag = true)
 GROUP BY table2.clt,
          table2.sty,
          table2.cd,
          table2.sc;
  1. dev.v_table1_table2
create view dev.v_table1_table2 as
select coalesce(v_table2_view1.id, v_table2_view2.id, 1) as id, table1.*
from dev.table1
left outer join dev.v_table2_view1
   on table1.clt = v_table2_view1.clt
  and table1.sty = v_table2_view1.sty
  and table1.ncc = v_table2_view1.ncc
  and table1.sc = v_table2_view1.sc
left outer join dev.v_table2_view2
   on table1.clt = v_table2_view2.clt
  and table1.sty = v_table2_view2.sty
  and table1.cd = v_table2_view2.cd
  and table1.sc = v_table2_view2.sc
where table1.code_num = '123456';

Comments

  • chaimachaima Vertica Employee Employee

    Hello,
    Which version are you using?
    Can you also share the full plans of the 2 queries on table vs view (where you noticed difference at storage access level)

  • smasma Vertica Customer
    edited July 2020

    The Vertica Version we are using is: "Vertica Analytic Database v9.3.0-1".

    The query plans are attached, the column header indicates which plan belongs to which test query.
    And I highlighted the "Filter: (table*.clt = 'XXX')" part that I was talking about.

    For this testing, I inserted only one row (some dummy data) into each table, dev.table1 and dev.table2.
    The data I inserted can be found in the "table data" tab in the same file I attached.

    Note: We are running queries on a four-node on-prem environment.

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file