CPU utilization high for 3 hours after COPY DIRECTs

I posted this problem before but didn't get an answer.  Since we have to resolve the issue, I am posting it again.

 

 

I loaded data into Vertica using multiple (parallel) COPY DIRECTs (via an old version of the Vertica connector).  The load took about 40 minutes to finish.  But after it finished, our Ganglia charts showed that CPU utilization stayed high for about 3 more hours.  What was the CPU doing?  Can we speed up the computation needed after COPY DIRECTs?   Thanks.

 

 

Ey-Chih Chow

Comments

  • If you are running many COPY DIRECT statements, you may be creating work for the Tuple Mover, specifically for Mergeout. The documentation ought to be able to point you towards monitoring and tuning choices for the Tuple Mover.

     

    - Derrick

  • Thanks for the information.  We have some reporting jobs running in the background.  When CPU utilization is high due to the Tuple Mover's mergeout, the reporting jobs do not make progress.  Can we lower the priority of the Tuple Mover so that the reporting and interactive analytics jobs stay on schedule?  Thanks.

     

    Best regards,

     

    Ey-Chih Chow 

  • Ey-Chih Chow,

     

    The "Tuning the Tuple Mover" documentation page may help. It's immediately after the Tuple Mover documentation page I linked in my last reply.

     

    Derrick
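
One way to confirm that Mergeout is what keeps the CPU busy after the loads is to look at the Tuple Mover's running operations while utilization is high. The sketch below is only an illustration, not something from the thread: it assumes the `v_monitor.tuple_mover_operations` system table and the column names shown are available in your Vertica version (check them against the documentation for your release), and the helper names `running_operations` and `count_by_operation` are made up for this example. It works with any Python DB-API 2.0 cursor.

```python
from collections import Counter

# Assumption: this system table and these column names exist in your
# Vertica version; verify them against the documentation before relying on it.
TM_ACTIVITY_SQL = """
SELECT operation_name, table_schema, table_name, projection_name, operation_status
FROM v_monitor.tuple_mover_operations
WHERE is_executing
ORDER BY operation_start_timestamp
"""


def running_operations(cursor):
    """Return the Tuple Mover operations that are executing right now.

    `cursor` is any DB-API 2.0 cursor connected to the database.
    """
    cursor.execute(TM_ACTIVITY_SQL)
    return cursor.fetchall()


def count_by_operation(rows):
    """Count rows per operation name (first column of each row)."""
    return Counter(row[0] for row in rows)
```

With a cursor from whichever Python driver you use for Vertica, `count_by_operation(running_operations(cursor))` gives a quick tally per operation type. If most of the rows are Mergeout operations on the projections you just loaded, that supports the explanation above.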
