How to use live aggregated projection ?

I have situation where I need to use aggregated query within a sub query.

select
x.col1,
x.col2,
x.col3,
b.col5,
b.col6
(
select
a.col1,
sum(a.col2) as col2,
sum(a.col3) as col3,
sum(nvl(col3,0)) as col4
from
table1 a
where
col4 between to_date('09/01/2016', 'MM/dd/yyyy') AND to_date('08/31/2018', 'MM/dd/yyyy')
group by
col1
) x
join table2 b on x.col1 = b.col1

Here in this query, inner sub-query is taking to long time. So I try to use live aggregated projection. After creating a projection like:

CREATE PROJECTION table1_lap AS
SELECT
col1
SUM(col2) col2,
SUM(col3) col3
FROM table1
GROUP BY col1;

But execution time did not reduce. Please tell me the solution.
Thank you....
Bimod

Answers

  • s_crossmans_crossman Vertica Employee Employee

    Hi bdpanta,

    I think you are close. The LAP projection needs to specify columns to store the selected column data in. So your create would be more like:

    CREATE PROJECTION table1_lap (col1, col2, col3) AS
    SELECT
    col1
    SUM(col2) col2,
    SUM(col3) col3
    FROM table1
    GROUP BY col1;

    If it syntactically qualifies as a LAP definition you should get a couple WARNING messages similar to below:
    WARNING 6852: Live Aggregate Projection "table1_lap" will be created for "table1". Data in "table1" will be neither updated nor deleted
    WARNING 4468: Projection is not available for query processing. Execute the select start_refresh() function to copy data into this projection.
    The projection must have a sufficient number of buddy projections and all nodes must be up before starting a refresh

    Once refreshed the LAP should get used if the optimizer determines it satisfies the query. Here's a very simple example.

    dbadmin=> create table table1 (col1 int, col2 int, col3 int);
    CREATE TABLE
    dbadmin=> insert /+direct/ into table1 values('1','1','1');

    OUTPUT

      1
    

    dbadmin=> insert /+direct/ into table1 values('2','2','2');

    OUTPUT

      1
    

    dbadmin=> commit;
    COMMIT

    note the projection used is the default super projection

    dbadmin=> explain select col2,col3 from table1;

    QUERY PLAN

    QUERY PLAN DESCRIPTION:


    explain select col2,col3 from table1;
    Access Path:
    +-STORAGE ACCESS for table1 [Cost: 18, Rows: 2 (NO STATISTICS)] (PATH ID: 1)
    | Projection: public.table1_super
    | Materialize: table1.col2, table1.col3

    create the LAP projection

    dbadmin=> create projection table1_lap (col2, col3) as select col2,col3 from table1;
    WARNING 4468: Projection <public.table1_lap> is not available for query processing. Execute the select start_refresh() function to copy data into this projection.
    The projection must have a sufficient number of buddy projections and all nodes must be up before starting a refresh
    CREATE PROJECTION

    refresh

    dbadmin=> select refresh();
    refresh
    ---------------------------------------------------------------------------------------------------------------- Refresh completed with the following outcomes:

    Projection Name: [Anchor Table] [Status] [Refresh Method] [Error Count] [Duration (sec)]

    "public"."table1_lap": [table1] [refreshed] [scratch] [0] [0]

    rerun the explain select col2,col3 and notice it now chose the LAP projection

    also notice the Cost: is 10 vs 18 with the super projection, indicating it should be faster

    dbadmin=> explain select col2,col3 from table1;

    QUERY PLAN

    QUERY PLAN DESCRIPTION:


    explain select col2,col3 from table1;
    Access Path:
    +-STORAGE ACCESS for table1 [Cost: 10, Rows: 2 (NO STATISTICS)] (PATH ID: 1)
    | Projection: public.table1_lap
    | Materialize: table1.col2, table1.col3

    I hope it helps,

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file