Options/tools to migrate Vertica data to Azure Cloud ADLS
vasanthvk
Vertica Customer
Hi,
I am looking for options/tools to migrate Vertica data to Azure cloud (ADLS). Can someone please provide some guidance? Any reference architecture would also be appreciated.
Thanks,
Vasanth Kumar
Answers
You can use EXPORT TO PARQUET and copy the data to Azure.
https://www.vertica.com/docs/12.0.x/HTML/Content/Authoring/SQLReferenceManual/Statements/EXPORTTOPARQUET.htm
Vertica can read external data created using ADLS Gen2, and data that Vertica exports can be read using ADLS Gen2.
https://www.vertica.com/docs/12.0.x/HTML/Content/Authoring/SQLReferenceManual/FileSystems/AzureBlob.htm
Thank you. Are there any other options? The data we are looking to migrate ranges from a few GB to 100 TB.
Another option would be to use Azure Data Factory.
https://learn.microsoft.com/en-us/azure/data-factory/connector-vertica?tabs=data-factory
EXPORT TO PARQUET is faster, and the Parquet files will be much smaller than the output of other export options. It is also useful if you want to create external tables and query the Parquet files in ADLS.
If you partition your EXPORT, Vertica will parallelize the export based on the number of partition keys and the number of cores per node, so it is fairly fast and may be limited only by network bandwidth.
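To illustrate the partitioned export described above, here is a minimal sketch that only builds the EXPORT TO PARQUET statement as a string. Nothing is executed. The helper name, the table public.sales, the column sale_year, and the storage account and container are all hypothetical. The azb:// path form is assumed from the Azure Blob Storage file system page linked earlier in this thread, and Azure credentials would still need to be configured separately.

```python
def build_export_sql(table, directory, partition_by=None):
    """Return an EXPORT TO PARQUET statement as a string (nothing is executed).

    table        -- hypothetical source table, e.g. "public.sales"
    directory    -- destination path, e.g. "azb://myaccount/mycontainer/sales"
    partition_by -- optional list of column names for OVER (PARTITION BY ...)
    """
    # Escape single quotes so the path stays a valid SQL string literal.
    safe_dir = directory.replace("'", "''")

    # Add the OVER clause only when partition columns are given.
    over = ""
    if partition_by:
        over = " OVER (PARTITION BY " + ", ".join(partition_by) + ")"

    return (
        f"EXPORT TO PARQUET (directory = '{safe_dir}')"
        f"{over} AS SELECT * FROM {table};"
    )


if __name__ == "__main__":
    # All names below are placeholders, not real objects.
    print(build_export_sql(
        "public.sales",
        "azb://myaccount/mycontainer/sales",
        ["sale_year"],
    ))
```

The generated statement can be pasted into vsql or sent through any client. The table and column names are inserted as-is, so this sketch should only be used with trusted input.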
However, are you already on Azure? Or migrating to Azure? Is the intent to export data for other applications, or move the Vertica DB to the cloud? The Enterprise to Eon migration script can help move the entire DB to Eon mode on Azure.
When you export to S3, GCS, or Azure, Vertica writes files directly to the destination path, so you must wait for the export to finish before reading the files.