Most efficient way to insert a huge amount of data into Vertica
Hello!
I'm looking for a way to load 20B events per day into my Vertica database.
So far, the fastest approach I have found (in terms of both speed and thread count) is COPY with a custom UDx source, the default delimited extractor, and a pinned projection on the target global temp table.
I also accidentally found out that no query runs on fewer than 4 threads. Even select 1 results in 4 threads:
explain verbose select 1;
Estimated resources for plan:
Scratch Memory MB: 0
File Handles: 0
Worker Threads: 4
Blocking Threads: 0
Externalizing Ops: 0
Unbounded Mem Ops: 0
Max Threads: 56
1) What are these 4 worker threads used for? Can I reduce them to 1? I have already set executionparallelism to 1.
2) Is there any way to avoid resegmentation on copy other than guessing the resulting segment at the source or using a pinned projection? Or is 220 threads = 4 (initiator threads) + 4*18 (initiator blocking threads) + 8*18 (executor threads) on a single copy the best I can get? It could be 10 times fewer if I could somehow avoid resegmentation on copy.
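To spell out the arithmetic behind the 220 figure, here is a small Python sketch. The variable names are only illustrative, and the 18 multiplier is kept as an opaque factor because I have not said above what it stands for.

```python
# Thread breakdown for a single COPY, using the numbers from question 2.
# MULTIPLIER is an opaque factor: its meaning is not stated in the post.
MULTIPLIER = 18

initiator_threads = 4
initiator_blocking_threads = 4 * MULTIPLIER
executor_threads = 8 * MULTIPLIER

total_threads = (
    initiator_threads + initiator_blocking_threads + executor_threads
)
print(total_threads)  # 220

# Rough target if resegmentation could be avoided ("10 times fewer").
target_threads = total_threads // 10
print(target_threads)  # 22
```

So the goal would be roughly 22 threads per copy instead of 220.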