We're Moving!

The Vertica Forum is moving to a new OpenText Analytics Database (Vertica) Community.

Join us there to post discussion topics, learn about

product releases, share tips, access the blog, and much more.

Create My New Community Account Now


Resegmentation During GROUP BY — Vertica Forum

Resegmentation During GROUP BY

Hi,
i am worried about the resegmentation during Group BY
say

-- to create super projection use ENCODING AUTO
create table foo (a int ENCODING AUTO, b int, c int)

CREATE PROJECTION foo_pj
(a,b,c) AS
select a,b,c from foo
SEGMENTED BY HASH(a,b,c)
ALL NODES OFFSET 0;

CREATE PROJECTION foo_pj1
(b,c) AS
select b,c from foo
SEGMENTED BY HASH(b,c)
ALL NODES OFFSET 0;

---Check projection is fine
select projection_basename,segment_expression,is_super_projection from projections where anchor_table_name='foo'

foo hash(foo.a, foo.b, foo.c) true
foo_pj hash(foo.a, foo.b, foo.c) true
foo_pj1 hash(foo.b, foo.c, foo.a) true

Now, when i execute following

explain SELECT F.b,F.a FROM foo AS F, foo WHERE foo.b = F.a GROUP BY 1,2;

this is pipelined as

+-GROUPBY PIPELINED [Cost: 4, Rows: 1 (NO STATISTICS)] (PATH ID: 1)
| Group By: F.a, F.b
| +---> JOIN HASH [Cost: 3, Rows: 1 (NO STATISTICS)] (PATH ID: 2)

but for

explain SELECT F.b,F.c FROM foo AS F, foo WHERE F.b = F.c GROUP BY 1,2;\

i an getting HASH,

+-GROUPBY HASH (LOCAL RESEGMENT GROUPS) [Cost: 5, Rows: 1 (NO STATISTICS)] (PATH ID: 1)
| Group By: F.b, F.c
| +---> JOIN (CROSS JOIN) [Cost: 4, Rows: 1 (NO STATISTICS)] (PATH ID: 2)
| | +-- Outer -> STORAGE ACCESS for foo [Cost: 1, Rows: 1 (NO STATISTICS)] (PATH ID: 3)

This means the second projection (foo_pj1) created has no effect on the SEGMENTATION?
Or my understanding of this is wrong please clarify.

Karthik S G

Comments

  • Jim_KnicelyJim_Knicely - Select Field - Administrator
    edited October 2017

    Hi,

    I'm not sure I understand your join condition. Your WHERE clause is comparing data in the same table. You are probably getting a cross join.

    Do you mean to join foo to foo? Then you need to alias each instance of foo...

    SELECT F.b,F.a FROM foo AS F, foo E WHERE foo.b = E.a GROUP BY 1,2;
    SELECT F.b,F.c FROM foo AS F, foo E WHERE F.b = E.c GROUP BY 1,2;
    

    Also, try ti use SQL99 join syntax:

    SELECT F.b,F.a FROM foo AS F JOIN foo E ON foo.b = E.a GROUP BY 1,2;
    SELECT F.b,F.c FROM foo AS F JOIN foo E ON F.b = E.c GROUP BY 1,2;
    

    See:
    https://my.vertica.com/docs/9.0.x/HTML/index.htm#Authoring/SQLReferenceManual/Statements/SELECT/joined-table.htm

  • hi Jim,

    It was a typo,

    queries are

    explain SELECT F.b,F.c FROM foo AS F, foo WHERE Foo.b = F.c GROUP BY 1,2;
    and
    explain SELECT F.b,F.a FROM foo AS F, foo WHERE foo.b = F.a GROUP BY 1,2;

    for first i am getting

    +-GROUPBY HASH (LOCAL RESEGMENT GROUPS) [Cost: 4, Rows: 1 (NO STATISTICS)] (PATH ID: 1)

    and second

    +-GROUPBY PIPELINED [Cost: 4, Rows: 1 (NO STATISTICS)] (PATH ID: 1)

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file