Huge files need to be loaded into Vertica using COPY

Hi,
We have a 3-node Vertica cluster and need to load data into it.
Each compressed file is 3 GB or more.

Can you please let me know if there is an efficient way to load these files into Vertica?
Currently we are using COPY with DIRECT, but it still takes a huge amount of time and sometimes fails.

Thanks in advance..

Comments

  • What does your table DDL look like?

  • Is it a wide table? If so, a performance impact is to be expected; the product developers have been improving load performance significantly with each version.

    From your first statement, I assume there are multiple files to load. Can you try loading them with multiple parallel load streams and see if that helps?

    Please follow the doc below for this.
    https://my.vertica.com/docs/7.2.x/HTML/index.htm#Authoring/AdministratorsGuide/BulkLoadCOPY/UsingParallelLoadStreams.htm

    One more thing to consider is resource pool optimization. Check whether the COPY is running from a pool that is already burdened with other queries, which would deprive your load of sufficient resources. You can check this by querying resource_pool_status while the load is running.

    You would probably want to create a new pool and run this load from it, with plannedconcurrency set very low and executionparallelism set to auto. Increasing the number of CPU cores should help as well.
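
The two suggestions above (parallel load streams and a dedicated resource pool) can be sketched roughly as follows. This is only an illustration: the Python helpers, the table name, the file paths, the node names, and the pool name `load_pool` are all made up for the example, and it assumes each file already exists on the local disk of the node it is assigned to. The helpers only build the SQL text; you would still run the output through vsql or your client of choice.

```python
from itertools import cycle

# Query to run while the load is in progress, to see how busy each pool is.
POOL_STATUS_QUERY = (
    "SELECT node_name, pool_name, running_query_count, memory_inuse_kb "
    "FROM resource_pool_status;"
)


def build_parallel_copy(table, files, nodes, delimiter="|"):
    """Build one COPY statement that spreads gzip files across nodes."""
    if not files or not nodes:
        raise ValueError("need at least one file and one node")
    # Round-robin assignment: file i is parsed on node i mod len(nodes).
    sources = [
        f"'{path}' ON {node} GZIP"
        for path, node in zip(files, cycle(nodes))
    ]
    return (
        f"COPY {table} FROM\n    "
        + ",\n    ".join(sources)
        + f"\nDELIMITER '{delimiter}' DIRECT;"
    )


def build_load_pool(pool="load_pool", planned_concurrency=2):
    """Statements that create a dedicated pool and route this session to it."""
    return [
        f"CREATE RESOURCE POOL {pool} "
        f"PLANNEDCONCURRENCY {planned_concurrency} "
        "EXECUTIONPARALLELISM AUTO;",
        f"SET SESSION RESOURCE_POOL = {pool};",
    ]


if __name__ == "__main__":
    # Hypothetical names; replace with your own table, files, and nodes.
    nodes = ["v_mydb_node0001", "v_mydb_node0002", "v_mydb_node0003"]
    files = [f"/data/part{i}.csv.gz" for i in range(1, 7)]
    for stmt in build_load_pool():
        print(stmt)
    print(build_parallel_copy("public.big_table", files, nodes))
    print(POOL_STATUS_QUERY)
```

Run the pool statements first in the same session as the COPY, so that the load picks up the new pool. Whether a planned concurrency of 2 is the right value depends on your hardware and on what else is running, so treat it as a starting point.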
