Performance considerations for bulk loading
We have a 1-node Vertica database ...
How can we increase the performance of loading data with the COPY command in Vertica?
Based on our tests, on a server with 2 CPU cores and 8 GB RAM, loading a 900 MB JSON file with 1 million records takes 70 seconds. The same file on a server with 32 CPU cores and 128 GB RAM takes 30 seconds.
If we want loading this file to take, for example, 10 seconds, what should we do?
Answers
I think you should look at both your I/O and network throughput. Use vioperf and vnetperf to find the bottleneck and check whether the results are consistent with your hardware specs.
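To put the numbers from the question in perspective, here is a rough back-of-the-envelope calculation of the effective load throughput. It uses only the figures quoted above (900 MB, 70 s, 30 s, and the 10 s target); the function and variable names are just for illustration.

```python
def throughput_mb_per_s(size_mb: float, seconds: float) -> float:
    """Effective load throughput in MB/s for a file of size_mb loaded in seconds."""
    return size_mb / seconds


FILE_SIZE_MB = 900  # file size quoted in the question

small_server = throughput_mb_per_s(FILE_SIZE_MB, 70)  # 2 cores, 8 GB RAM
large_server = throughput_mb_per_s(FILE_SIZE_MB, 30)  # 32 cores, 128 GB RAM
target = throughput_mb_per_s(FILE_SIZE_MB, 10)        # desired 10-second load

# How much faster the large server is, and how much faster it still needs to be.
observed_speedup = large_server / small_server
speedup_still_needed = target / large_server

print(f"small server: {small_server:.1f} MB/s")
print(f"large server: {large_server:.1f} MB/s")
print(f"target:       {target:.1f} MB/s")
print(f"observed speedup from bigger hardware: {observed_speedup:.2f}x")
print(f"further speedup needed:                {speedup_still_needed:.1f}x")
```

Going from 2 to 32 cores (16x) gave only about a 2.3x speedup, so the load probably does not scale with core count alone. Comparing the roughly 30 MB/s you get now, and the 90 MB/s you want, with what vioperf reports for your disks should tell you whether storage is the limit.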
I don't know if you'll find this useful, but in terms of improving load performance, my understanding is that one of your options is to try apportioned load.
https://my.vertica.com/docs/8.1.x/HTML/index.htm#Authoring/ExtendingVertica/UDx/UDL/ApportionedLoad.htm?Highlight=apportioned load
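For anyone unfamiliar with the idea: apportioned load means a single large input is divided into portions that are parsed in parallel, with each portion aligned to a record boundary. The sketch below only illustrates that concept in plain Python. It is not the Vertica UDL API, `portion_offsets` is a hypothetical helper, and it assumes one JSON record per line, which may not match your file. You would also need to check in the documentation whether the parser you use supports apportioned load.

```python
import os
import tempfile
from typing import List, Tuple


def portion_offsets(path: str, portions: int) -> List[Tuple[int, int]]:
    """Split a newline-delimited file into at most `portions` byte ranges.

    Every range ends right after a newline, so no record is cut in half.
    """
    size = os.path.getsize(path)
    if size == 0:
        return []
    target = max(1, size // portions)  # approximate bytes per portion
    ranges = []
    start = 0
    with open(path, "rb") as f:
        while start < size:
            # Jump roughly one portion ahead, then read on to the end of
            # the record we landed in so the boundary falls on a newline.
            f.seek(min(start + target, size))
            f.readline()
            end = min(f.tell(), size)
            ranges.append((start, end))
            start = end
    return ranges


# Small demo with made-up data: 1000 one-line JSON records.
data = b"".join(b'{"id": %d}\n' % i for i in range(1000))
with tempfile.NamedTemporaryFile("wb", suffix=".json", delete=False) as tmp:
    tmp.write(data)
    sample_path = tmp.name

ranges = portion_offsets(sample_path, portions=4)
os.remove(sample_path)
print(ranges)
```

Each (start, end) range could then be handed to a separate worker. In Vertica itself this splitting is done by the source and parser, not by your own code.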