Load performance expectations
Hi,

I have a 3-node cluster with Vertica 6.1.2 freshly installed on Red Hat 6.4. I am new to Vertica but have plenty of experience with other databases. I am loading a moderate-sized database onto it.

Each machine in the cluster is identical: hexa-core AMD processors, 16GB of memory, fast SSD drives (I/O tests show they are capable of 500MB/s), and a private 1Gb network. All machines are physical, with no virtualization.

The file I am loading is about 150GB. The command I am issuing to run the load is:

COPY relationship FROM :input_file DELIMITER '|' NULL '' DIRECT;

It is pretty much copied from the example. There are 3 integer columns in the table, every row has 3 values (no nulls), and there are about 8 billion rows.

It is taking 4.75 hours to load this data, which seems like a long time. The machine running the load goes to about 80% CPU utilization and its network traffic reaches about 20MB/s. The other 2 nodes go to about 30% CPU utilization, with network utilization around 10MB/s. All the machines report very little I/O via iostat.

I have a private 10GbE network available for these machines, but I do not want to deploy it until I have a baseline on the 1Gb network.

Do the numbers I am seeing look right to experienced Vertica users? Is there anything I can do to reduce these times?

Thanks for your time.
Comments
Based on the info you provided: no.
If your bottleneck is I/O (and the CPU is 80% idle at the same time), nothing you do with the CPU will improve I/O, and so it will not improve load times either.
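As a rough sanity check on the figures quoted in the question, the sketch below converts them into per-second rates and compares them with the nominal hardware ceilings. It assumes decimal units (1 GB = 1e9 bytes), a theoretical 1Gb link rate with no protocol overhead, and that the 500MB/s disk figure applies to sustained reads; the function and variable names are made up for illustration. It only restates the numbers and does not by itself identify the bottleneck.

```python
# Figures quoted in the question (decimal units assumed: 1 GB = 1e9 bytes).
FILE_BYTES = 150e9          # size of the input file
ROWS = 8e9                  # approximate row count
LOAD_HOURS = 4.75           # observed load time
LINK_GBIT = 1.0             # private network speed, gigabits per second
DISK_MB_S = 500.0           # disk throughput measured by the poster
OBSERVED_NET_MB_S = 20.0    # network rate seen on the loading node


def load_rates(file_bytes, rows, hours):
    """Return average load rates implied by file size, row count and duration."""
    seconds = hours * 3600
    return {
        "rows_per_s": rows / seconds,
        "mb_per_s": file_bytes / seconds / 1e6,
        "bytes_per_row": file_bytes / rows,
    }


rates = load_rates(FILE_BYTES, ROWS, LOAD_HOURS)

# Theoretical ceiling of a 1Gb link: 1e9 bits/s divided by 8 = 125 MB/s.
link_mb_s = LINK_GBIT * 1e9 / 8 / 1e6

# Fractions of each nominal ceiling actually in use.
net_fraction = OBSERVED_NET_MB_S / link_mb_s
disk_fraction = rates["mb_per_s"] / DISK_MB_S

print(f"average input rate : {rates['mb_per_s']:.2f} MB/s")
print(f"average row rate   : {rates['rows_per_s']:,.0f} rows/s")
print(f"average row size   : {rates['bytes_per_row']:.2f} bytes")
print(f"network in use     : {net_fraction:.0%} of a {link_mb_s:.0f} MB/s link")
print(f"input rate vs disk : {disk_fraction:.1%} of {DISK_MB_S:.0f} MB/s")
```

Under these assumptions, both the average input rate and the observed network rate sit well below the nominal disk and link limits, which fits the low iostat readings reported in the question.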