Resegmentation During GROUP BY
Hi,
i am worried about the resegmentation during Group BY
say
-- to create super projection use ENCODING AUTO
create table foo (a int ENCODING AUTO, b int, c int)
CREATE PROJECTION foo_pj
(a,b,c) AS
select a,b,c from foo
SEGMENTED BY HASH(a,b,c)
ALL NODES OFFSET 0;
CREATE PROJECTION foo_pj1
(b,c) AS
select b,c from foo
SEGMENTED BY HASH(b,c)
ALL NODES OFFSET 0;
---Check projection is fine
select projection_basename,segment_expression,is_super_projection from projections where anchor_table_name='foo'
foo hash(foo.a, foo.b, foo.c) true
foo_pj hash(foo.a, foo.b, foo.c) true
foo_pj1 hash(foo.b, foo.c, foo.a) true
Now, when i execute following
explain SELECT F.b,F.a FROM foo AS F, foo WHERE foo.b = F.a GROUP BY 1,2;
this is pipelined as
+-GROUPBY PIPELINED [Cost: 4, Rows: 1 (NO STATISTICS)] (PATH ID: 1)
| Group By: F.a, F.b
| +---> JOIN HASH [Cost: 3, Rows: 1 (NO STATISTICS)] (PATH ID: 2)
but for
explain SELECT F.b,F.c FROM foo AS F, foo WHERE F.b = F.c GROUP BY 1,2;\
i an getting HASH,
+-GROUPBY HASH (LOCAL RESEGMENT GROUPS) [Cost: 5, Rows: 1 (NO STATISTICS)] (PATH ID: 1)
| Group By: F.b, F.c
| +---> JOIN (CROSS JOIN) [Cost: 4, Rows: 1 (NO STATISTICS)] (PATH ID: 2)
| | +-- Outer -> STORAGE ACCESS for foo [Cost: 1, Rows: 1 (NO STATISTICS)] (PATH ID: 3)
This means the second projection (foo_pj1) created has no effect on the SEGMENTATION?
Or my understanding of this is wrong please clarify.
Karthik S G
Comments
Hi,
I'm not sure I understand your join condition. Your WHERE clause is comparing data in the same table. You are probably getting a cross join.
Do you mean to join foo to foo? Then you need to alias each instance of foo...
Also, try ti use SQL99 join syntax:
See:
https://my.vertica.com/docs/9.0.x/HTML/index.htm#Authoring/SQLReferenceManual/Statements/SELECT/joined-table.htm
hi Jim,
It was a typo,
queries are
explain SELECT F.b,F.c FROM foo AS F, foo WHERE Foo.b = F.c GROUP BY 1,2;
and
explain SELECT F.b,F.a FROM foo AS F, foo WHERE foo.b = F.a GROUP BY 1,2;
for first i am getting
+-GROUPBY HASH (LOCAL RESEGMENT GROUPS) [Cost: 4, Rows: 1 (NO STATISTICS)] (PATH ID: 1)
and second
+-GROUPBY PIPELINED [Cost: 4, Rows: 1 (NO STATISTICS)] (PATH ID: 1)