COPY command and disk IO >70% — Vertica Forum

Hi everyone!
I have a CSV file that I want to load into a Vertica cluster (3 nodes).
The file contains 3 fields (integer, varchar(3), datetime), is ~700 MB, and has 10 million rows.
My table is segmented across all nodes by the first field (an integer identity column).
When I run the COPY command (with the "direct" parameter), it takes a very long time to execute (15 minutes).
I also see that the cluster's disk IO is above 70% (the "bottleneck cluster" report in Management Console).
Is the high IO the problem, or is the load slow for some other reason?
How can I resolve this issue?
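
For a sense of scale, the figures in the question can be turned into an average load rate. The sketch below is only arithmetic on the numbers quoted above; it assumes "Mb" means megabytes and that rows are spread evenly across the 3 nodes, neither of which the thread confirms. It works out to roughly 0.78 MB/s for the whole cluster.

```python
# Figures taken from the question above; "Mb" is assumed to mean megabytes.
FILE_SIZE_MB = 700
ROW_COUNT = 10_000_000
LOAD_SECONDS = 15 * 60
NODE_COUNT = 3


def load_rates(size_mb, rows, seconds, nodes):
    """Return average cluster-wide and per-node load rates."""
    mb_per_s = size_mb / seconds
    rows_per_s = rows / seconds
    return {
        "cluster_mb_per_s": mb_per_s,
        "cluster_rows_per_s": rows_per_s,
        # Assumes the segmentation spreads rows evenly across nodes.
        "per_node_mb_per_s": mb_per_s / nodes,
    }


rates = load_rates(FILE_SIZE_MB, ROW_COUNT, LOAD_SECONDS, NODE_COUNT)
for name, value in rates.items():
    print(f"{name}: {value:.2f}")
```

These are averages over the whole run, so they say nothing about where the time was spent (client, network, or disk).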

Comments

  • edited March 2017

    Is there an estimate of the "typical" COPY load speed in MB/s on a single node?

  • I figured it out.

  • Can you share?

  • I use SSIS to build ETL for Vertica.
    After adjusting the package's parameters (buffer size and so on), load performance improved and disk IO on the cluster dropped to an acceptable 15%.
    However, the ADO provider's autocommit causes a lot of pain (I would like to load all the data and only then commit).
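
On the "load everything, then commit once" wish in the last comment: Vertica's COPY statement accepts a NO COMMIT option, which may help when the statement text is under your control. The sketch below only assembles the SQL strings and executes nothing. The table name `events`, the file names, and the use of `FROM LOCAL` are hypothetical placeholders, and whether this fits an SSIS/ADO pipeline is not something the thread confirms.

```python
def build_copy_statement(table, csv_path, delimiter=",", direct=True, no_commit=True):
    """Assemble a Vertica COPY statement as a string (nothing is executed).

    Illustration only: inputs are not escaped, so do not pass untrusted values.
    """
    parts = [f"COPY {table} FROM LOCAL '{csv_path}'", f"DELIMITER '{delimiter}'"]
    if direct:
        parts.append("DIRECT")      # the "direct" parameter mentioned in the question
    if no_commit:
        parts.append("NO COMMIT")   # leave the transaction open after the load
    return " ".join(parts) + ";"


# Hypothetical table and file names: load two files, then commit once at the end.
statements = [build_copy_statement("events", path) for path in ("part1.csv", "part2.csv")]
statements.append("COMMIT;")

for statement in statements:
    print(statement)
```

The idea is that each COPY leaves its transaction open and a single COMMIT at the end makes all the loaded rows visible together.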
