I have a situation

Hi ,

I have a situation where I need to use different queries on the same tables based on the value in the date field. For example if the date is less than 20170201 , use QUERY1 else use QUERY 2.

I was wondering as to how I could generate projections for these queries. One option would be to generate projections for both queries separately. But in this case, both sets of projections would be updated if one adds new data, whereas actually only one of the sets would need to be updated (since the other set is used only on historical old data).

Is there a way in vertica to handle this particular scenario ? Feel free to let me know if you need more information.

Comments

  • Jim_KnicelyJim_Knicely Administrator
    edited October 2018

    Hi,

    You do not have to create 2 sets of projections to service multiple queries. A single super projection can handle ALL queries against its anchor table.

    Are you looking to create query-specific projections for performance reasons?

    Note that you can not create a projection that contains a WHERE clause. That is, for an an anchor table, you won't be able to store only data less than 02/01/2017 in one projection and data greater than or equal to 02/01/2017 in another.

    Example:

    dbadmin=> CREATE TABLE test (a INT, b INT, c DATE);
    CREATE TABLE
    
    dbadmin=> CREATE PROJECTION test_query_1_pr AS SELECT a FROM test WHERE c < '02/01/2017';
    ERROR 5600:  Invalid predicate in projection-select. Only PK=FK equijoins are allowed
    

    What's the issue with having additional query specific projections for all of the data in the table? Storage?

    To avoid that you could separate the data into 2 tables, where one table contains data less than 02/01/2017 (i.e. a history table), and the other contains everything else. Then a database view can be used to combine the data into a single object.

    Example:

    dbadmin=> CREATE TABLE test (a INT, b INT, c DATE);
    CREATE TABLE
    
    dbadmin=> CREATE PROJECTION test_pr AS SELECT a, b, c FROM test ORDER BY c, a;
    CREATE PROJECTION
    
    dbadmin=> CREATE TABLE test_hist (a INT, b INT, c DATE);
    CREATE TABLE
    
    dbadmin=> CREATE PROJECTION test_hist_pr AS SELECT a, b, c FROM test_hist ORDER BY c, b;
    CREATE PROJECTION
    
    dbadmin=> INSERT INTO test SELECT 1, 1, '02/01/2017';
     OUTPUT
    --------
          1
    (1 row)
    
    dbadmin=> INSERT INTO test_hist SELECT 2, 2, '01/31/2017';
     OUTPUT
    --------
          1
    (1 row)
    
    dbadmin=> CREATE VIEW test_vw AS
    dbadmin-> SELECT a, b, c
    dbadmin->   FROM test
    dbadmin-> UNION ALL
    dbadmin-> SELECT a, b, c
    dbadmin->   FROM test_hist;
    CREATE VIEW
    
    dbadmin=> SELECT *
    dbadmin->   FROM test_vw
    dbadmin->  WHERE c >= '02/01/2017';
     a | b |     c
    ---+---+------------
     1 | 1 | 2017-02-01
    (1 row)
    
    dbadmin=> SELECT *
    dbadmin->   FROM test_vw
    dbadmin->  WHERE c < '02/01/2017';
     a | b |     c
    ---+---+------------
     2 | 2 | 2017-01-31
    (1 row)
    

    Not sure if I'd ever implement this suggestion in a production system :wink: If you are truly concerned with space, I wouldn't worry about it as Vertica's going to compress the data for each projection between 50-90%!

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file