Sample ETL load script - Help

Can someone help me with a Python script to load a file into a Vertica table? Since it is a huge file, I need to load it with 4 parallel threads. Please help.


  • The "cleanest" way to get an apportioned load is to put the load file in a directory that is locally mounted under the exact same path on all existing Vertica nodes.

    You transfer the uncompressed flat file to that directory, and it is immediately visible to all Vertica nodes under the same path name.

    The file has to be uncompressed because, for apportioned load, each of the (say) 8 parsing threads positions itself at the beginning and end of "its own" eighth of the file using fseek(), and then advances byte by byte until it finds the next record delimiter, to determine its own portion.
    With a compressed file, you can't do that.
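
To make the seek-and-scan idea above concrete, here is a minimal Python sketch. It only illustrates the principle and is not Vertica's actual implementation; the function names `next_record_start` and `apportion` are made up for this example, and a single-byte record delimiter is assumed.

```python
import os


def next_record_start(f, offset, size, delimiter=b"\n"):
    """Seek to `offset`, then advance byte by byte to just past the next delimiter."""
    f.seek(offset)
    while True:
        byte = f.read(1)
        if not byte:           # hit end of file without finding a delimiter
            return size
        if byte == delimiter:  # the next record starts right after the delimiter
            return f.tell()


def apportion(path, n_parts, delimiter=b"\n"):
    """Split a file into n_parts (start, end) byte ranges aligned to record boundaries.

    Hypothetical helper for illustration only. Assumes an uncompressed file
    and a single-byte record delimiter.
    """
    size = os.path.getsize(path)
    bounds = [0]
    with open(path, "rb") as f:
        for i in range(1, n_parts):
            nominal = size * i // n_parts  # where the i-th portion would start naively
            bounds.append(next_record_start(f, nominal, size, delimiter))
    bounds.append(size)
    # Consecutive boundaries form half-open ranges [start, end).
    return list(zip(bounds[:-1], bounds[1:]))
```

Calling `apportion("/some/path/file.csv", 8)` returns eight byte ranges that together cover the file exactly once, each ending on a record boundary (a range can be empty for very small files). This kind of positioning is what a compressed stream does not allow.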

  • I would recommend trying an Apportioned Load; it is the best way to have a Python script do the load. Hope this is of some use, friend.
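
A rough sketch of what such a script could look like, assuming the third-party `vertica_python` client is installed and the file sits at the same path on every node. The table name, file path, delimiter and connection settings are placeholders, and `build_copy_statement` and `run_copy` are hypothetical helper names. As far as I know, the script does not set the thread count itself: it issues `COPY ... ON ANY NODE`, and whether and how the load is apportioned is decided by the server.

```python
import re

# Very conservative check for "table" or "schema.table" (an assumption of this sketch).
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?$")


def build_copy_statement(table, path, delimiter="|"):
    """Build a COPY statement for a file visible under the same path on all nodes."""
    if not _IDENT.match(table):
        raise ValueError(f"unexpected table name: {table!r}")
    if "'" in path or "'" in delimiter:
        raise ValueError("single quotes are not supported in this sketch")
    return f"COPY {table} FROM '{path}' ON ANY NODE DELIMITER '{delimiter}'"


def run_copy(conn_info, sql):
    """Execute the statement through vertica_python (imported lazily)."""
    import vertica_python  # third-party client, assumed to be installed

    with vertica_python.connect(**conn_info) as conn:
        cur = conn.cursor()
        cur.execute(sql)
        conn.commit()


if __name__ == "__main__":
    # Placeholder values: adjust to your own table, path and cluster.
    sql = build_copy_statement("public.big_table", "/data/load/big_file.csv")
    print(sql)
    # run_copy({"host": "...", "port": 5433, "user": "...",
    #           "password": "...", "database": "..."}, sql)
```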

  • edited January 19

    Thanks all for your answers! The main constraint I have is defining the number of parallel threads:
    if I have fewer than 1 billion records, I would like to load with 6 parallel threads;
    if I have more than 1 billion records, I would like to load with 8 parallel threads, and so on.

    Apologies for the late reply! I was travelling.
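
The rule described above could be captured in a small lookup like the sketch below. Only the two stated tiers are included; further tiers ("and so on") would be appended to the two lists. Treating exactly 1 billion records as the 8-thread case is an assumption, since the post does not say, and `choose_thread_count` is a hypothetical name. How the resulting number is then applied on the Vertica side is outside this snippet.

```python
from bisect import bisect_right

# Record-count thresholds and the thread count for each band.
# THREADS must always have one more entry than THRESHOLDS.
THRESHOLDS = [1_000_000_000]
THREADS = [6, 8]


def choose_thread_count(record_count):
    """Return the desired number of parallel threads for a given record count."""
    if record_count < 0:
        raise ValueError("record_count must be non-negative")
    # bisect_right puts a count equal to a threshold into the higher band (assumption).
    return THREADS[bisect_right(THRESHOLDS, record_count)]
```

For example, `choose_thread_count(500_000_000)` picks the 6-thread band and `choose_thread_count(2_000_000_000)` the 8-thread band.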
