Reorganizing tables appears to hang

Hello,
On top of our database, we have an application that, at startup, tries to reorganize partitions on the tables that we want to do housekeeping on.
We do this with queries such as :
ALTER TABLE "SCHEMA"."TABLE" PARTITION BY EXTRACT(MINUTE FROM TASK_START_TIME) + EXTRACT(HOUR FROM TASK_START_TIME) * 100 + EXTRACT(DAY FROM TASK_START_TIME) * 10000 + EXTRACT(MONTH FROM TASK_START_TIME) * 1000000 + EXTRACT(YEAR FROM TASK_START_TIME) * 100000000 REORGANIZE
This used to work well when the database was "empty", but as the database grows, we almost systematically gets error like:
ERROR: Unavailable: [Txn 0xa0000003cfe36f] O lock table - timeout error Timed out O locking Table:SCHEMA.TABLE
And the application fails to start.
Any advise on what we could try to avoid this ?
Thanks

Comments

  • 1. Check what do a lock:
        => select * from locks where transaction_id = 45035996337660783;
    => select * from lock_usage;
    2. Check partition reorganize errors:
        => select * from partition_reorganize_errors;
    3. Check the transaction that holds a lock:
        => select * from query_requests where transaction_id = 45035996337660783;
    PS
    INTEGER(0xa0000003cfe36f) = 45035996337660783 
  • Thanks Daniel!

    If I may add a question to his response:  Do you need to perform this reorganization on database startup?  It's a very expensive operation, especially with large data sets; I'm not surprised that you're seeing lock timeouts as a result.  And once you've set the table partitioning once, unlike some other systems, it will generally stay that way (Vertica will automatically file all new data according to the table's partitioning, sort order, etc); there should be no need to regularly re-apply the data layout.

    Adam
  • Hi,
    Thanks for answering,

    I need to talk with our application's developper, reorganizing tables each time our apps starts might not be a good idea, and this can certainly be optimized.
    We tried to understand what was happening by directly doing the ALTER TABLE from VSQL, and it seems that it only "hangs" if the reorganization percentage for the table is not null, in other words, if a re-organize on that table was already asked.
    If this is the case, we have a simple way to avoid the problem by testing this percentage before starting a new re-organize.

    Does it make sense ?
  • Hi Luc,

    Yes, that makes sense.  Let us know if you're able to optimize the application, or if there's some reason that the data must always be explicitly reorganized.

    The "hang" makes sense in this case:  You can't run two reorganizes at the same time.  (Say you had someone sorting a filing cabinet.  How much sense would it make to have someone else come along and try to sort the filing cabinet, potentially in a different way, at the same time?  It'd be a bit awkward and silly, and I'm not sure why you'd do it.)  So we queue these requests; the second one will sit in the queue (and therefore appear to hang) until the first has finished.

    Checking for a NULL in that field would likely be an effective workaround.  You could also look for locks on the table in question using the queries that Daniel suggested above.

    Adam

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file