Upload data to Vertica EC2

What are the options for uploading data to Vertica, and what are their benefits and limitations? Is S3 the only method? As I understand it, S3 also costs something. Is there a good way to export data from a PostgreSQL instance on EC2 to Vertica on EC2 that is quick and not very costly?

Comments

  • When loading data to an EC2 instance, you need to make use of multi-threaded (parallel) transfers.

     

     Create an AWS Direct Connect link (which establishes a dedicated private network connection between AWS and your source).

     Break the file into chunks (e.g. a 500 GB file into 50 x 10 GB files) and upload them in parallel; tools for this task include Bucket Explorer, S3 Explorer and CloudBerry.

     Look into using Tsunami-UDP.
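The chunking step above can be sketched in Python. This is only a minimal illustration, assuming the export is a line-delimited file (for example CSV dumped from PostgreSQL); the function name `split_file` and the `chunk_NNNN.csv` naming are made up for this example and are not part of any tool mentioned here.

```python
import os


def split_file(src_path, out_dir, chunk_bytes):
    """Split a line-delimited file into chunks of roughly chunk_bytes each.

    Rows are never cut in half: a new chunk is started only between lines,
    so a chunk may exceed chunk_bytes by at most one row.
    Returns the list of chunk paths in order.
    """
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    out = None
    written = 0
    with open(src_path, "rb") as src:
        for line in src:
            # Open a new chunk when none is open or the current one is full.
            if out is None or written >= chunk_bytes:
                if out is not None:
                    out.close()
                # Hypothetical naming scheme, chosen for this sketch only.
                path = os.path.join(out_dir, f"chunk_{len(paths):04d}.csv")
                out = open(path, "wb")
                paths.append(path)
                written = 0
            out.write(line)
            written += len(line)
    if out is not None:
        out.close()
    return paths
```

For the 500 GB example, `split_file("export.csv", "chunks", 10 * 1024**3)` would produce about 50 files of roughly 10 GB, which can then be uploaded in parallel.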

     

     One way I do it:

     - My S3 bucket receives loads from my OLTP system on a daily basis.

     - Each node has a mount point to the S3 bucket (mounted at boot time).

     - A 5-node sleeping Vertica cluster.

     - I load my data from the S3 bucket using all nodes (you need chunks).
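The last step could look roughly like the sketch below, which only builds the SQL text for a Vertica `COPY ... ON ANY NODE` load over a glob of chunk files. It assumes the bucket is mounted at the same path on every node; the table name, mount path and helper name `build_copy_statement` are hypothetical, and the exact statement used in the setup above is not given, so treat this as one possible form.

```python
def build_copy_statement(table, chunk_glob, delimiter=","):
    """Build a Vertica COPY statement that lets any node read the chunks.

    chunk_glob is a path pattern that must resolve on every node,
    e.g. a mount point backed by the S3 bucket (assumed here).
    """
    return (
        f"COPY {table} FROM '{chunk_glob}' ON ANY NODE "
        f"DELIMITER '{delimiter}';"
    )


# Hypothetical table and mount path, for illustration only.
print(build_copy_statement("public.sales", "/mnt/s3/sales/chunk_*.csv"))
```

With many chunk files behind the glob, `ON ANY NODE` allows Vertica to spread the parsing work across the cluster, which is why the data needs to be split into chunks first.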

     
