analyze_statistics takes comparatively longer when the table has no data

I observed that if analyze_statistics is invoked for a table with no data (and hence no projections), it takes more than 2 minutes to complete.

When analyze_statistics is issued for such a table, the message "<INFO> AnalyzeStatsForTable: No accessable, up to date super projection found for table " is logged in the Vertica log file, and around 3 minutes after that, "Commit Complete" is logged.
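For reference, a minimal way to reproduce what I am seeing (the table name is illustrative):

create table t_empty (col1 int);
-- no data is ever loaded, so no superprojection exists
select analyze_statistics('t_empty');  -- takes more than 2 minutes here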

Is this a known issue?

Comments

  • Hi Rupendra,

    Is the table actually empty?  ("TRUNCATE TABLE t;")  Or is there just no data accessible to the current transaction? ("DELETE FROM t;")

    Vertica's performance degrades, sometimes dramatically, in the face of lots of DELETEs.  (Partition- or table-level operations, staging tables and INSERT .. SELECT, etc., should generally be preferred instead.)  This is because Vertica keeps around old data -- you can query recently-DELETEd rows with the AT EPOCH syntax; it may also still be in use by other long-running transactions.  If you need to deal with DELETEs, there's a bunch of reading in our documentation about how old data is cleaned up, how to recover performance on tables with lots of deleted data, etc.
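    For illustration, a hypothetical sketch of the difference (the table name and epoch numbers are made up):
    DELETE FROM t;                        -- rows are only marked deleted; old versions stick around
    COMMIT;                               -- suppose this commits at epoch 43
    AT EPOCH 42 SELECT COUNT(*) FROM t;   -- historical query against the epoch before the DELETE: the rows are still visible
    TRUNCATE TABLE t;                     -- by contrast, removes the storage outright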

    If the table really is empty, I suspect you have a more-general issue with your database installation.  Are any other queries unexpectedly slow?

    Adam
  • [EDIT] Right, no up-to-date projections.  The above may still apply if you have only out-of-date projections (though then you likely have other issues as well); and the last question applies regardless.
  • Thanks, Adam, for the response.

    The table was created and then TRUNCATE TABLE was issued, but data was never loaded into the table. So the table really is empty and there are no projections for it. For other, non-empty tables, analyze_statistics completes in seconds, though.

    No significant issues with other queries, but I frequently see warnings like the one below in the log file:
    <WARNING> MemoryPool static OPT::Plan* OPT::OptimizerInterface::makePlan(CAT::VQuery*, OPT::OptimizerConfig&) is using more memory than allowed. 176689112 bytes (PG memory 1, used 176687456, free 83343008) allocated, limit was 100 MB

    Thanks,
    Rupendra
  • Hi Rupendra,

    Hm...  That warning shouldn't be a major issue.  It is a little weird, though.

    Just to double-check -- when you see that warning, does your system actually run out of memory? Is it swapping or similar?  (The "v_monitor"."memory_usage" system table should be able to give you some information about that; see the sketch below.)
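    A minimal sketch of that check (column names vary across Vertica versions, hence the SELECT *):
    -- Inspect recent node-level memory usage around the time the warning appears:
    SELECT * FROM "v_monitor"."memory_usage" LIMIT 10;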

    Also, out of curiosity, is this a very wide table (with lots of columns and/or very big columns)?

    If you have an Enterprise account, I'd encourage you to file a support case.  There is a bunch of information that would be useful to gather in order to figure out what's going on with your system; they have the tools to do so.  (I don't see this issue when I create an empty table and run analyze_statistics on it.)

    Thanks,
    Adam
  • I can't reproduce this.
    1. What Vertica version are you running?
    2. How long has the database been running? (On a fresh v6 or v7 database it works immediately.)
    3. Do you get the issue with ANALYZE_WORKLOAD?
    4. What are the LGE and the current epoch? Can you monitor for Stale Checkpoint events? (See the sketch after the link below.)

    About stale: http://vertica-forums.com/viewtopic.php?f=7&t=1710&p=5648#p5648
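    A hedged sketch of how to check these (GET_CURRENT_EPOCH, GET_LAST_GOOD_EPOCH, and GET_AHM_EPOCH are Vertica meta-functions; the Stale Checkpoint filter string is an assumption):
    -- Compare the current epoch with the last good epoch (LGE);
    -- a large gap suggests checkpointing is falling behind:
    SELECT GET_CURRENT_EPOCH(), GET_LAST_GOOD_EPOCH(), GET_AHM_EPOCH();
    -- Look for Stale Checkpoint events (event name assumed):
    SELECT node_name, event_code_description, event_posted_timestamp
      FROM v_monitor.active_events
     WHERE event_code_description ILIKE '%Stale Checkpoint%';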

    PS

    Example for the epoch -- you wrote:
    "and around 3 minutes after that, the message in log file was 'Commit Complete'."
    The full log line looks like this:
    [Txn] <INFO> Commit Complete: Txn: fff0000000000cc1 at epoch 0x1
    What is the epoch in that line, and what is your LGE?

  • Hi Adam,

    I am using v6.1.3-0, and the database has been running for the last 93 days.

    The commit completed at epoch 0xe9c25, and the node's current LGE (checkpoint epoch) at that time was 0xe9c24.
    2014-05-19 17:46:50.962 Timer Service:0x7d3f2c0 [Recover] <INFO>   My local node LGE = 0xe9c24 and current epoch = 0xe9c25 
    I have not seen any issues with analyze_workload.

    To answer the questions from your previous comment: the table is not a wide table -- it has only 11 columns -- and when I got the memory-related warnings, only about 44% of system memory was in use.

    Let me create a support case for this.

    Thanks,
    Rupendra
  • Hi,
    there is a noticeable slowdown if there is no superprojection on the table, although 2 minutes is a really big difference. I tried the following test scenario and repeatably get an order-of-magnitude difference (10 ms vs. 100 ms):
    \timing
    --Create test schema
    create schema s1;
    --Create table without superprojection.
    create table s1.t1 (col1 int);
    --Run analyze_statistics - takes more than 100ms
    select analyze_statistics('s1.t1');
    --Create table with superprojection.
    create table s1.t2 as select col1 from s1.t1;
    --Run analyze_statistics - takes less than 20ms
    select analyze_statistics('s1.t2');
    --Cleanup test schema
    drop schema s1 cascade;
    Vladimir
