Please take this survey to help us learn more about how you use third party tools. Your input is greatly appreciated!

Resegmentation During GROUP BY

Hi,
i am worried about the resegmentation during Group BY
say

-- to create super projection use ENCODING AUTO
create table foo (a int ENCODING AUTO, b int, c int)

CREATE PROJECTION foo_pj
(a,b,c) AS
select a,b,c from foo
SEGMENTED BY HASH(a,b,c)
ALL NODES OFFSET 0;

CREATE PROJECTION foo_pj1
(b,c) AS
select b,c from foo
SEGMENTED BY HASH(b,c)
ALL NODES OFFSET 0;

---Check projection is fine
select projection_basename,segment_expression,is_super_projection from projections where anchor_table_name='foo'

foo hash(foo.a, foo.b, foo.c) true
foo_pj hash(foo.a, foo.b, foo.c) true
foo_pj1 hash(foo.b, foo.c, foo.a) true

Now, when i execute following

explain SELECT F.b,F.a FROM foo AS F, foo WHERE foo.b = F.a GROUP BY 1,2;

this is pipelined as

+-GROUPBY PIPELINED [Cost: 4, Rows: 1 (NO STATISTICS)] (PATH ID: 1)
| Group By: F.a, F.b
| +---> JOIN HASH [Cost: 3, Rows: 1 (NO STATISTICS)] (PATH ID: 2)

but for

explain SELECT F.b,F.c FROM foo AS F, foo WHERE F.b = F.c GROUP BY 1,2;\

i an getting HASH,

+-GROUPBY HASH (LOCAL RESEGMENT GROUPS) [Cost: 5, Rows: 1 (NO STATISTICS)] (PATH ID: 1)
| Group By: F.b, F.c
| +---> JOIN (CROSS JOIN) [Cost: 4, Rows: 1 (NO STATISTICS)] (PATH ID: 2)
| | +-- Outer -> STORAGE ACCESS for foo [Cost: 1, Rows: 1 (NO STATISTICS)] (PATH ID: 3)

This means the second projection (foo_pj1) created has no effect on the SEGMENTATION?
Or my understanding of this is wrong please clarify.

Karthik S G

Comments

  • Jim_KnicelyJim_Knicely Administrator
    edited October 2017

    Hi,

    I'm not sure I understand your join condition. Your WHERE clause is comparing data in the same table. You are probably getting a cross join.

    Do you mean to join foo to foo? Then you need to alias each instance of foo...

    SELECT F.b,F.a FROM foo AS F, foo E WHERE foo.b = E.a GROUP BY 1,2;
    SELECT F.b,F.c FROM foo AS F, foo E WHERE F.b = E.c GROUP BY 1,2;
    

    Also, try ti use SQL99 join syntax:

    SELECT F.b,F.a FROM foo AS F JOIN foo E ON foo.b = E.a GROUP BY 1,2;
    SELECT F.b,F.c FROM foo AS F JOIN foo E ON F.b = E.c GROUP BY 1,2;
    

    See:
    https://my.vertica.com/docs/9.0.x/HTML/index.htm#Authoring/SQLReferenceManual/Statements/SELECT/joined-table.htm

  • hi Jim,

    It was a typo,

    queries are

    explain SELECT F.b,F.c FROM foo AS F, foo WHERE Foo.b = F.c GROUP BY 1,2;
    and
    explain SELECT F.b,F.a FROM foo AS F, foo WHERE foo.b = F.a GROUP BY 1,2;

    for first i am getting

    +-GROUPBY HASH (LOCAL RESEGMENT GROUPS) [Cost: 4, Rows: 1 (NO STATISTICS)] (PATH ID: 1)

    and second

    +-GROUPBY PIPELINED [Cost: 4, Rows: 1 (NO STATISTICS)] (PATH ID: 1)

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file

Can't find what you're looking for? Search the Vertica Documentation, Knowledge Base, or Blog for more information.