Copy running for long time

hi

 

I found a Copy was running for long hours in Dev. When i looked at load_streams, it showed 21% progress, but doesn't move further. It's a simple copy table from local '/home/file1' with delimiter as '~' rejected data '/home/reject1';

 

Query ran for almost 2 hours and failed due to runtimecap set.

 

query_requests - memory_acquired_mb showed null, though is_executing is true (i felt this is strange?)

locks - "I" lock was placed on the table but was immediately acquired

execution_engine_profiles - it shows stats around memory acquired, send and receive between nodes, etc. Not sure what else to look for in this table. I specially pulled counters such as memory and execution, but cannot make out how to interpret those.

query_events - did not showed me any pushback from WOS as query did not had any direct hint

 

I was stuck at a point, trying to understand - how to determine if the transaction was truly running, and if running, what is it actually doing or is it waiting on something?

 

Table that is being loaded has around 35 cols, max length col is 100 bytes varchar, just one projection which is lazy with literally all the cols in the order by clause and segmented on around first 13 cols. It already has 3.6M records.

 

searched thru vertica log, did not see any alarming though. So the question is, how to determine if a transaction is truly running or is it hung? if it is running, where it is spending most of the time or waiting on?

 

Thanks much for any inputs

 

Comments

  • System view QUERY_PLAN_PROFILES provides detailed execution status for queries that are currently running in the system.
    Output from the table shows the real-time flow of data and the time and resources consumed for each path in each query plan.


    Run this

     

    select * from sessions where current_statement ilike '%copy%';

     

    Find your COPY command and take values from transaction_id and statement_id columns

    Run this multiple times and monitor if any numbers changing

     

    select * from QUERY_PLAN_PROFILES
    where transaction_id = xxxxxxxxxxxxxx and statement_id = xxxxx
    order by path_id desc, path_line_index;


    it has RUNNING_TIME, READ_FROM_DISK_BYTES, RECEIVED_BYTES and SENT_BYTES.


    My assumptions that for running request those numbers should be changing.


    Hope this helps.

  • Thanks for the update. I was waiting to see if the scenario re-occurs and it did today.

     See the attached screenshot. First line where it shows is executing is for DML INSERT operation. It seems to have memory allocated, but doesn't do anything? Why is the running time showing negative?

     

    How to identify and rectify these  kind of requests that are making the cluster to slow down as these are holding huge memory and hanging around for a long time.

     

     

     

     

     

     

     

  • Also, even after refresh, I don't see any of the digits moving though. Only I see the running time is changing.

     

  • What version of vertica are you running? How much data are you trying to load? Size of file and how many records?

    Is the table partitioned? Try running the copy statement with abort on error keyword. Is the table segmented? 

  •  

     

    Looks like we need to go into details.

     

    Can you share actual COPY statement ?

  • It may not issue with the copy, I found few other select queries as well, that has 5-6 table joins, when the query is run, it just hangs, nothing seems to be happening. Thanks to RunTimeCap, query is getting killed after an hour. 

     

    It looks to be some issue with the network distribution at first instance, tried to involve the support for the past 3 weeks, they weren't able to figure out the issue. I would be more happy if the query just fails either due to resource constraints or network but not to hang on to the resource and doesn't do anything.

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file