Trouble inserting into temp tables

Recently we've had trouble inserting into temp tables.  For instance, we have a SELECT query that outputs only 2,500 rows with 8 columns and takes less than 10 seconds to run.  When we add "INSERT INTO <temp_table>" or "CREATE TABLE <temp_table> AS" to the beginning of that query, it runs forever, never finishes, and spikes resources (CPU and disk I/O), slowing down the entire cluster.
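
For illustration, the pattern is roughly this (table, column, and temp table names below are made up; the real query joins several source tables and returns 8 columns):

-- The bare SELECT returns ~2,500 rows in well under 10 seconds
SELECT f.sale_date, f.amount, d.region
FROM fact_sales f
JOIN dim_store d ON f.store_key = d.store_key;

-- Wrapping that exact same SELECT never finishes and spikes CPU and disk I/O
-- (the same happens with INSERT INTO my_temp_table SELECT ...)
CREATE TABLE my_new_table AS
SELECT f.sale_date, f.amount, d.region
FROM fact_sales f
JOIN dim_store d ON f.store_key = d.store_key;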

This is a new problem that started this week and occurs for multiple queries.  Some queries are not affected, though.

I looked at all settings in config_parameters and they are all set to default values.  We have not applied any patches or updates recently.

Is there anything else we should be looking at to try and diagnose this issue?

Comments

  • Hi!

    What Vertica version are you running (with all fixes, please)? I hit the same issue a long time ago, but it was specific to one version and was repaired in the next fix.
  • Hi Daniel, we are currently using Vertica Analytic Database v7.0.0-1.
  • Hi!

    Thanks. I hit the same issue with 6.0.0-0, and after the fix I never saw it again.
    Can I ask you to try one more thing? Try dumping the query output to a CSV file and then loading that file into the table. Does the file load get stuck too, or does this happen with INSERT only?
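
    For example, something along these lines from vsql (the file path and table name are just placeholders):

    -- In vsql: unaligned, tuples-only, comma-separated output into a file
    \a
    \t
    \f ','
    \o /tmp/query_output.csv
    SELECT 1;   -- run your original SELECT here instead
    \o

    -- Then load that file into the target table
    COPY my_temp_table FROM LOCAL '/tmp/query_output.csv' DELIMITER ',';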
  • Dumping as a CSV and importing works fine.  Problem only exists with INSERT.
  • Hi Jason!

    1. What you are reporting is a very serious issue. Are you an enterprise customer? Can you open a ticket?

    2. Can you install the latest fixes? Does the problem still appear? (As far as I know, the latest version is 7.0.1-1.)

    3. The only thing I can offer so far is help investigating the problem. Do you have time for that?
    Can you enable QUERY PROFILING and try to locate the step where the query gets stuck? Does the SELECT (fetching the data) run forever, or is it actually the INSERT (flushing the data)?
    (I suspect it's the INSERT, since your dump to a CSV file succeeded.)
    Can you monitor memory? Does it keep growing or not?
    Can you monitor the CPU? Is it doing anything (sorting/encoding) or nothing? A profiling sketch is below.
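
    A rough way to do the profiling part (I'm assuming PROFILE accepts your INSERT ... SELECT form; the angle-bracket placeholders come from the hint that PROFILE prints):

    -- Prefix the problem statement with PROFILE; vsql prints a hint with the
    -- transaction_id and statement_id to use below.
    PROFILE INSERT INTO my_temp_table SELECT 1;   -- put your real INSERT ... SELECT here

    -- Then look at the per-operator counters to see where time and memory go.
    SELECT operator_name, counter_name, counter_value
    FROM v_monitor.execution_engine_profiles
    WHERE transaction_id = <transaction_id> AND statement_id = <statement_id>
    ORDER BY counter_value DESC;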
  • Hi Daniel,
    I appreciate your help.  We are an enterprise customer and we've opened a ticket, so hopefully we can get it solved through that system.

    Regarding your other points, we are currently waiting until we have a second cluster before trying the new version (7.0.1), in case anything goes wrong with the upgrade.

    We did QUERY PROFILING and, oddly, the point where the queries got stuck was during the SELECT.

    I was hoping that this was a known issue which is why I posted in the community, but now that we've determined it to be a unique issue I think it's best to try and solve through the ticket system like you mentioned.

    If/when we solve the problem, I will post the solution here.

    Thanks a lot for your help Daniel!

    - Jason
  • Hi Jason,

    Did you ever try this:

    1) This should give you the table structure just like the old one.
    CREATE TABLE TEST_NEW LIKE TEST_OLD;

    2) This should insert all of your records with no problems.
    INSERT /*+ DIRECT */ INTO TEST_NEW SELECT * FROM TEST_OLD;

    Try the above steps and let me know how it works.

    Thanks,
    Sashi

  • It's not an exact copy of a table.  It's creating a new table from a join query that uses multiple source tables.

    Also, the problem suddenly went away after 2 weeks.  We didn't do anything to fix it, which is concerning because we are worried it will come back again.
  • CREATE TABLE Sashi_Sales AS
    SELECT A.sale_date_key, A.ship_date_key, A.call_center_key, B.online_page_key,
           A.sales_quantity, A.sales_dollar_amount, B.page_number
    FROM online_sales.online_sales_fact A
    JOIN online_sales.online_page_dimension B ON A.online_page_key = B.online_page_key;

    I just did that with 5 million records and had no issues. There could have been some resource contention at that time.

    We can wait to see if it happens again.

  • There was no resource contention.  We were consistently able to reproduce the problem even during off hours when no other users were logged in.  The SELECT query would take only a few seconds, but a "CREATE TABLE AS" with the exact same SELECT would run forever and spike all resources until we cancelled it.

    After two weeks the problem magically went away.  It is concerning.
  • So the only thing I can think of is that your source tables had delete vectors that had not been purged from the database. When a CTAS query runs, it probably has to filter out records marked as deleted before writing to the new table. A delete in Vertica is not a complete delete; the records are only marked as deleted, which effectively leads to a replay delete, and that is not desired.

    Please always monitor the AHM epoch and AHM time. SELECT MAKE_AHM_NOW(); will move the AHM forward, and it will also tell you if there is a problem. At a minimum, make sure the source tables of the CTAS query do not show up in SELECT * FROM delete_vectors; (ideally that returns zero records).

    If you have a lot of records in delete_vectors, running a purge on individual tables will be less resource intensive and will take care of the deleted records for each table. Running SELECT PURGE(); will purge all deleted records from the whole database.

    Vertica's internal processes will purge the data on their own, but sometimes there are reasons the AHM is held back. Monitoring these will help; a sketch of the relevant queries is below.
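
    Roughly, the checks look like this (I'm reusing the fact table from my example above; substitute your own schema and table names):

    -- Where are deleted rows piling up?
    SELECT schema_name, projection_name, SUM(deleted_row_count) AS deleted_rows
    FROM v_monitor.delete_vectors
    GROUP BY 1, 2
    ORDER BY deleted_rows DESC;

    -- How far back is the AHM?
    SELECT GET_AHM_EPOCH(), GET_AHM_TIME();

    -- Advance the AHM, then purge a single table (cheaper) or the whole database (heavier).
    SELECT MAKE_AHM_NOW();
    SELECT PURGE_TABLE('online_sales.online_sales_fact');
    -- SELECT PURGE();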

    I hope this helps.

    -Sashi
  • Hi Jason,

    Random question -- what is the projection design of the table that you are SELECT'ing from?  "CREATE TABLE AS SELECT" will create a table with default projections; if the segmentation and sort order don't line up, that means a resegment/re-sort, which can be expensive.  "CREATE TABLE LIKE" mirrors the projection layout of the source table.
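
    For example, to check the projection layout of the source tables (reusing Sashi's sample tables here; substitute your own):

    -- Dump the projection DDL, including the ORDER BY and SEGMENTED BY clauses
    SELECT EXPORT_OBJECTS('', 'online_sales.online_sales_fact');

    -- Or just list the projections anchored on the tables in the join
    SELECT projection_schema, projection_name, anchor_table_name, is_segmented
    FROM v_catalog.projections
    WHERE anchor_table_name IN ('online_sales_fact', 'online_page_dimension');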

    [EDIT] Also, Sashi has some good points.

    Adam
  • Thanks Sashi!  That makes sense.  We will make sure to purge tables that have deletes so this doesn't occur again.

    We also ran a purge around the time the problem went away, so that must have been what fixed it.
  • Hi Adam,
    We are selecting from multiple tables through join queries.  The outputs do not match the schema of any existing tables.

    For example, one query's result set is only about 2,000 rows by 8 columns.  Running it as a plain "SELECT" takes only about 4 seconds, but the "CREATE TABLE AS" would run indefinitely and spike all resources until we killed it.

    The problem is now resolved due to purging deleted rows, but I still do not understand why a "SELECT" would have no problems and a "CREATE TABLE AS" would blow up the cluster.

    Jason