Why is merge_partition deprecated?

blaublau Registered User

What should be done as an alternative?

We want to partition by day but do not want the number of partitions to grow too much that it impinges performance?

Comments

  • Rahul_ChoudharyRahul_Choudhary Registered User

    Hi

     

    It has been deprecated due to internal complicacies where it gets too complex and resource intensive when the ROS splitting occurs.

     

    For e.g: in case after merge partition if you want to un-merge and drop any partition, it has to do ROS splitting - big task and that time it would could increase the ROS container limit too.

     

    The alternatives to merge_partitions are:

    1. You can use MOVE_PARTITIONS_TO_TABLE and create a view combining these tables.
    2. Redesign partition key so that we will have fewer partitions.

     

     

    Thanks

    Rahul Choudhary

  • blaublau Registered User

    Thanks Rahul.  I suppose that the problems with dropping part of a merged partition is problematic.

     

    However, I have a use case for merge_partition() for which I can't find a good alternative

     

    I have a very wide table (over 200 columns) and it needs to be maintained by updates instead of inserts (only partial information is available much of the time)

    To prevent the accumulation of delete vectors, I'd like to drop the last day and replace it with a fresh copy.

    But the width of the table precludes partitioning by day unless we can merge all the older partitions.

  • Sharon_CutterSharon_Cutter Registered User

    You mentioned that the width of the table precludes partitioning by day.  There isn't any relationship between table width and the number of partitions.

     

    But if the issue is that you have a retention policy that includes more than 2 years, and still want to be able to partition at least the most recent day's data by date - in addition to the view option that Rahul mentioned, you can also use a partition expression that uses a CASE statement to, for example, partition the most recent year's worth of data by day and the older data by month.  Then once per year update the partition expression and reorganize the table.  

     

      --Sharon

     

  • victtimvicttim Registered User

    Can you provide an example of partitioning using a CASE statement?

    Also, I don't understand how this will work for incoming data. For instance, assume I have a "Date" field in my table that I want to partition using a CASE statement as follows:

    (note Date is an INT in the form 20170915 for Sep 15, 2017.

    PARTITION BY
    CASE 
            WHEN DATEDIFF(day, TO_DATE(TO_CHAR(Date), 'yyyyMMdd'), CURRENT_DATE) < 1 then Date 
            WHEN DATEDIFF(week, TO_DATE(TO_CHAR(Date), 'yyyyMMdd'), CURRENT_DATE) = 1 then Date // 10
            ELSE Date // 100
        END
    

    My partition requirement is that if DATE is TODAY(), then partition by DATE. If DATE is up to 1 week ago, partition by week, otherwise partition by MONTH.

    So when data is loaded for today, it gets loaded into partition 20170915. Tomorrow when data comes in for 20170916, wouldn't the data loaded for 29170915 still remain in the "by date" partition. Ideally what I want is to MERGE partitions older than a certain time frame. But now that MERGE is gone, I am not sure if this CASE statement methodology works. At least not the way I have it now.

  • victtimvicttim Registered User

    I suppose, the way to do what I'm asking is to
    a) Initially PARTITION BY DATE
    b) Periodically, REORGANIZE the partitions so as to consolidate older dates into fewer partitions.

    Example:

    ALTER TABLE MyTable 
    PARTITION BY
    CASE 
            WHEN DATEDIFF(day, TO_DATE(TO_CHAR(Date), 'yyyyMMdd'), CURRENT_DATE) < 1 then Date 
            WHEN DATEDIFF(week, TO_DATE(TO_CHAR(Date), 'yyyyMMdd'), CURRENT_DATE) = 1 then Date // 10
            ELSE Date // 100
    END
    REORGANIZE;
    

    This would need to be done periodically.

    If anyone has a better idea, please let me know. Thanks;

  • jheffnerjheffner Employee, Registered User

    The upcoming Vertica 9 release will have hierarchical partition management, which is very close what you're asking for, and is designed to address the shortcomings of merge_partition. https://www.microfocus.com/about/press-room/article/2017/micro-focus-introduces-vertica-9/