Upload data to Vertica EC2
What are the options for uploading data to Vertica, and what are their benefits and limitations? Is S3 the only method? As I understand it, S3 also costs something. Is there a good way to export data from an EC2 PostgreSQL instance to Vertica on EC2 that is quick and not very costly?
When loading data to an EC2 instance, you need to make use of multithreading.
Create an AWS Direct Connect link (which establishes a dedicated private network connection between AWS and your source).
Break the file into chunks (e.g. a 500 GB file into 50 x 10 GB files) and transfer them in parallel; tools for this task include Bucket Explorer, S3 Explorer and CloudBerry.
Look into using Tsunami-UDP.
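To illustrate the chunking advice, here is a minimal Python sketch, assuming the source is a row-per-line text export. The names `split_file_by_lines`, `run_in_parallel` and `transfer` are hypothetical helpers made up for this example, not part of any AWS or Vertica tool; `transfer` stands in for whatever upload call you actually use.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def split_file_by_lines(src_path, out_dir, max_bytes):
    """Split src_path into chunk files of at most roughly max_bytes each,
    always cutting on a line boundary so no row is split across two chunks."""
    os.makedirs(out_dir, exist_ok=True)
    chunk_paths = []
    out = None
    written = 0
    with open(src_path, "rb") as src:
        for line in src:
            # Open a new chunk if none is open yet, or if adding this line
            # would push a non-empty chunk past the size limit.
            if out is None or (written > 0 and written + len(line) > max_bytes):
                if out is not None:
                    out.close()
                path = os.path.join(out_dir, "chunk_%04d.dat" % len(chunk_paths))
                out = open(path, "wb")
                chunk_paths.append(path)
                written = 0
            out.write(line)
            written += len(line)
    if out is not None:
        out.close()
    return chunk_paths


def run_in_parallel(chunk_paths, transfer, max_workers=8):
    """Call transfer(path) for every chunk on a thread pool.
    `transfer` is a placeholder for your own upload function (assumption).
    Results are returned in the same order as chunk_paths."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(transfer, chunk_paths))
```

A single line longer than `max_bytes` still ends up whole in its own chunk, so the limit is approximate. Threads are a reasonable fit here because uploads spend most of their time waiting on the network.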
One way I do it:
- I have an S3 bucket receiving loads from my OLTP system on a daily basis.
- Each node has a mount point to the S3 bucket (mounted at boot time).
- I have a 5-node sleeping Vertica cluster.
- I load my data from the S3 bucket using all nodes (you need chunks for this).
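The "all nodes" load can be sketched as a single Vertica COPY statement that names a different node for each chunk. The Python helper below only builds the SQL text; `build_copy_statement`, the table name, the mount path and the node names are all hypothetical, and the exact COPY options you need (delimiter, load method, rejected-data handling) depend on your Vertica version and data.

```python
def build_copy_statement(table, chunk_paths, nodes, delimiter="|"):
    """Build a COPY statement that spreads chunk files round-robin over nodes.

    Each path must be readable on the node it is assigned to, which holds
    when every node mounts the same bucket at the same mount point.
    """
    if not chunk_paths or not nodes:
        raise ValueError("need at least one chunk path and one node name")
    sources = []
    for i, path in enumerate(chunk_paths):
        node = nodes[i % len(nodes)]  # round-robin assignment
        safe_path = path.replace("'", "''")  # escape single quotes for SQL
        sources.append("'%s' ON %s" % (safe_path, node))
    return "COPY %s FROM\n  %s\nDELIMITER '%s';" % (
        table,
        ",\n  ".join(sources),
        delimiter,
    )


if __name__ == "__main__":
    # Example values are placeholders, not real cluster settings.
    print(build_copy_statement(
        "public.sales",
        ["/mnt/s3/chunk_0000.dat", "/mnt/s3/chunk_0001.dat"],
        ["v_db_node0001", "v_db_node0002"],
    ))
```

With one file per node in the statement, each node parses its own chunks, which is why the data has to be split beforehand.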