Partitioning problem + question

kfruchtmankfruchtman Vertica Customer

Hi Experts!
I have been trying to partition a billion rows table according to time by CALENDAR_HIERARCHY_DAY algorithm.
when I run the statement:
ALTER TABLE public.spins
PARTITION BY time::DATE
GROUP BY CALENDAR_HIERARCHY_DAY(time::DATE, 2, 1) REORGANIZE;

I get the error:
ROLLBACK 2648: Column time in PARTITION BY expression is not allowed, since it is not present in some projections
HINT: Projection "spins_pk" does not contain column "time"

When I try to drop the projection (I don't really need it..) I get:

WARNING 7000: The projection 'spins_pk' was used to enforce the enabled key constraint 'pk', and may be regenerated to validate a DML statement on the base table

how can I workaround that issue?should I add the column time to the pk projection?

And a general partitioning question
If I want my partitions to be as follows: 24 hours - by hours, 7 days , 1 month, 3 month, 1 year..
what is the group by clause needed for the
GROUP BY CALENDAR_HIERARCHY algorithm.

Many thanks!!
Keren

Answers

  • Jim_KnicelyJim_Knicely - Select Field - Administrator
    edited April 2021

    This smells llike a bug... But, if you disable the PK you should be able to add the PARTITIONing...

    Example:

    verticademos=> CREATE TABLE spins(c INT NOT NULL, time TIMESTAMP) ORDER BY time;
    CREATE TABLE
    
    verticademos=> ALTER TABLE spins ADD CONSTRAINT spins_pk PRIMARY KEY (c) ENABLED;
    ALTER TABLE
    
    verticademos=> INSERT INTO spins SELECT 1, SYSDATE;
     OUTPUT
    --------
          1
    (1 row)
    
    verticademos=> ALTER TABLE spins PARTITION BY time::DATE GROUP BY CALENDAR_HIERARCHY_DAY(time::DATE, 2, 1) REORGANIZE;
    ROLLBACK 2648:  Column time in PARTITION BY expression is not allowed, since it is not present in some projections
    HINT:  Projection "spins_pk" does not contain column "time"
    
    verticademos=> DROP PROJECTION public.spins_pk;
    WARNING 7000:  The projection 'spins_pk' was used to enforce the enabled key constraint 'spins_pk', and may be regenerated to validate a DML statement on the base table
    DROP PROJECTION
    
    verticademos=> ALTER TABLE spins ALTER CONSTRAINT spins_pk DISABLED;
    ALTER TABLE
    
    verticademos=> ALTER TABLE spins PARTITION BY time::DATE GROUP BY CALENDAR_HIERARCHY_DAY(time::DATE, 2, 1) REORGANIZE;
    NOTICE 8364:  The new partitioning scheme will produce partitions in 1 physical storage containers per projection
    WARNING 9249:  Using PARTITION expression that may result in NULL values
    HINT:  DML statements which produce NULL values for partition expressions will roll back
    NOTICE 4785:  Started background repartition table task
    ALTER TABLE
    
    verticademos=> ALTER TABLE spins ALTER CONSTRAINT spins_pk ENABLED;
    ALTER TABLE
    
    
  • Jim_KnicelyJim_Knicely - Select Field - Administrator

    And for your general partitioning question... You can get pretty creative with the PARTITION GROUP BY clause!

    Example:

    verticademos=> SELECT c FROM t;
                 c
    ----------------------------
     2021-08-14 00:00:00
     2023-04-14 00:00:00
     2024-04-15 00:00:00
     2021-06-14 00:00:00
     2021-04-29 02:46:58.528717
     2021-04-17 02:46:58.516394
     2021-04-14 02:46:58.470838
     2021-04-14 04:46:58.488554
     2021-04-14 04:46:58.500997
     2021-04-15 02:46:58.510026
    (10 rows)
    
    verticademos=> SELECT c,
    verticademos->        CASE
    verticademos->          WHEN DATEDIFF('HOUR', NOW(), c) <= 24
    verticademos->            THEN 1
    verticademos->          WHEN DATEDIFF('DAY', NOW(), c) <= 7
    verticademos->            THEN 2
    verticademos->          WHEN DATEDIFF('MONTH', NOW(), c) <= 1
    verticademos->            THEN 3
    verticademos->          WHEN DATEDIFF('MONTH', NOW(), c) <= 3
    verticademos->            THEN 4
    verticademos->          ELSE 5
    verticademos->        END
    verticademos-> FROM t;
                 c              | case
    ----------------------------+------
     2021-08-14 00:00:00        |    5
     2023-04-14 00:00:00        |    5
     2024-04-15 00:00:00        |    5
     2021-06-14 00:00:00        |    4
     2021-04-29 02:46:58.528717 |    3
     2021-04-17 02:46:58.516394 |    2
     2021-04-14 02:46:58.470838 |    1
     2021-04-14 04:46:58.488554 |    1
     2021-04-14 04:46:58.500997 |    1
     2021-04-15 02:46:58.510026 |    1
    (10 rows)
    
    verticademos=> ALTER TABLE t PARTITION BY (c)
    verticademos->   GROUP BY (
    verticademos(>     CASE WHEN DATEDIFF('HOUR', NOW(), c) <= 24
    verticademos(>            THEN 1
    verticademos(>          WHEN DATEDIFF('DAY', NOW(), c) <= 7
    verticademos(>            THEN 2
    verticademos(>          WHEN DATEDIFF('MONTH', NOW(), c) <= 1
    verticademos(>            THEN 3
    verticademos(>          WHEN DATEDIFF('MONTH', NOW(), c) <= 3
    verticademos(>            THEN 4
    verticademos(>          ELSE 5
    verticademos(>      END) REORGANIZE;
    WARNING 6100:  Using PARTITION expression that returns a Timestamp value
    HINT:  This PARTITION expression may cause too many data partitions.  Use of an expression that returns a more accurate value, such as a regular VARCHAR or INT, is encouraged
    NOTICE 4785:  Started background repartition table task
    ALTER TABLE
    
  • kfruchtmankfruchtman Vertica Customer

    Trying that now.Thank you so much!

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file