Most efficient way to insert a huge amount of data into Vertica

edited March 2019 in General Discussion


I'm looking for a way to load 20B events per day into my Vertica database.
So far, the fastest approach I have found (in terms of speed and thread count) is COPY with a custom UDx source, the default delimited extractor, and a pinned projection on the target global temp table.

Along the way I accidentally found out that no query runs on fewer than 4 threads. Even select 1 results in 4 threads:

explain verbose select 1;

Estimated resources for plan:

Scratch Memory MB: 0
File Handles: 0
Worker Threads: 4
Blocking Threads: 0
Externalizing Ops: 0
Unbounded Mem Ops: 0
Max Threads: 56

1) What are these 4 worker threads used for? Can I reduce the count to 1? I have already set executionparallelism to 1.
2) Is there any way to avoid resegmentation on COPY other than guessing the resulting segment at the source or using a pinned projection? Or is 220 threads = 4 (initiator threads) + 4*18 (initiator blocking threads) + 8*18 (executor threads) for a single COPY the best I can get? The count could be 10 times lower if I could somehow avoid resegmentation on COPY.
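
To make the arithmetic in question 2 easy to check, here is a minimal Python sketch of the breakdown. The function and parameter names are made up for illustration, and the post does not say what the multiplier 18 counts, so it appears here only as a generic factor taken from the formula.

```python
# Thread breakdown for a single COPY, using the numbers from the post.
# All names are hypothetical; the multiplier 18 comes from the post's
# formula, and what it counts is not stated there.

def copy_thread_breakdown(initiator=4, blocking_per_unit=4,
                          executor_per_unit=8, multiplier=18):
    """Return the per-category thread counts and their total."""
    initiator_blocking = blocking_per_unit * multiplier
    executor = executor_per_unit * multiplier
    return {
        "initiator": initiator,
        "initiator_blocking": initiator_blocking,
        "executor": executor,
        "total": initiator + initiator_blocking + executor,
    }


if __name__ == "__main__":
    for name, count in copy_thread_breakdown().items():
        print(f"{name}: {count}")
```

With the defaults this gives 4 + 72 + 144 = 220, matching the figure above.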
