
Upload data to Vertica EC2

What are the options for uploading data to Vertica, and what are their benefits and limitations? Is S3 the only method? As I understand it, S3 also costs money. Is there a good way to export data from a PostgreSQL instance on EC2 to Vertica on EC2 that is quick and not very costly?

Comments

  • When loading data into an EC2 instance, you need to make use of multi-threading:

     

     Create an AWS Direct Connect link, which establishes a dedicated private network connection between AWS and your source.

     Break the file into chunks (e.g. a 500 GB file into 50 x 10 GB files) and upload them in parallel; tools for this task include Bucket Explorer, S3 Explorer, and CloudBerry.

     Look into Tsunami-UDP, a UDP-based file transfer protocol.
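
     To illustrate the chunking step, here is a minimal Python sketch, not a tested production tool. It assumes the source is a line-oriented text export (e.g. CSV); the function name `split_file`, the `chunk_%05d.csv` naming, and the chunk size are hypothetical choices made for illustration. It splits only at line boundaries so that no row ends up cut across two chunks.

```python
import os


def split_file(src_path, out_dir, chunk_bytes):
    """Split src_path into numbered chunk files of roughly chunk_bytes each.

    Splits only at line boundaries, so a chunk may slightly exceed
    chunk_bytes. Returns the list of chunk paths in order.
    """
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    out = None
    written = 0
    with open(src_path, "rb") as src:
        for line in src:
            # Start a new chunk when none is open or the current one is full.
            if out is None or written >= chunk_bytes:
                if out is not None:
                    out.close()
                # Hypothetical naming scheme: chunk_00000.csv, chunk_00001.csv, ...
                path = os.path.join(out_dir, "chunk_%05d.csv" % len(paths))
                out = open(path, "wb")
                paths.append(path)
                written = 0
            out.write(line)
            written += len(line)
    if out is not None:
        out.close()
    return paths
```

     For the 500 GB example above you would call something like `split_file("export.csv", "chunks", 10 * 1024**3)` and then upload the resulting files in parallel with whichever tool you prefer.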

     

    One way I do it:

     - My S3 bucket receives loads from my OLTP system on a daily basis.

     - Each node has a mount point to the S3 bucket (mounted at boot time).

     - A 5-node sleeping Vertica cluster.

     - I load my data using "all nodes" from the S3 bucket (you need chunks for this).
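
     The last step could look roughly like the sketch below. This is an assumption about what "all nodes" refers to: Vertica's COPY statement accepts an `ON ANY NODE` clause for files that are visible on every node, which fits a bucket mounted on each node, and it accepts a glob in the file path. The helper name `build_copy_statement`, the table name, and the mount path are hypothetical; the helper only builds the SQL text, so you still need to run it through vsql or a client driver.

```python
def build_copy_statement(table, chunk_glob, delimiter=","):
    """Return a Vertica COPY statement that loads every file matching chunk_glob.

    Assumes the path is readable on every node (e.g. the S3 bucket is
    mounted at the same location on all of them).
    """
    # Reject quote and semicolon characters rather than try to escape them
    # in a sketch; real code should validate its inputs more carefully.
    for value in (table, chunk_glob, delimiter):
        if "'" in value or ";" in value:
            raise ValueError("unsupported character in %r" % value)
    return "COPY %s FROM '%s' ON ANY NODE DELIMITER '%s';" % (
        table,
        chunk_glob,
        delimiter,
    )


# Hypothetical table and mount point, for illustration only.
print(build_copy_statement("staging.orders", "/mnt/s3/orders/chunk_*.csv"))
```

     With many chunk files behind the glob, the cluster has separate files to distribute across nodes, which is presumably why the chunks are needed.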

     
