most efficient way to insert huge amount of data into vertica — Vertica Forum

phil2phil2
edited March 2019 in General Discussion

Hello!

I'm looking for a way to load 20B events per day into my Vertica db.
So far, the fastest approach I have found (in terms of speed and thread count) is copy with a custom UDx source, the default delimited extractor, and a pinned projection on the target global temp table.

I also accidentally found out that no query runs on fewer than 4 threads. Even select 1 results in 4 threads:

explain verbose select 1;

Estimated resources for plan:

Scratch Memory MB: 0
File Handles: 0
Worker Threads: 4
Blocking Threads: 0
Externalizing Ops: 0
Unbounded Mem Ops: 0
Max Threads: 56

1) What are these threads used for? Can I reduce the count to 1? I have already set executionparallelism to 1.
2) Is there any way to avoid resegmentation on copy, other than guessing the resulting segment at the source or using a pinned projection? Or is 220 threads = 4 (initiator threads) + 4*18 (initiator blocking threads) + 8*18 (executor threads) on a single copy the best I can get? It could be 10 times lower if I could somehow avoid resegmentation on copy.
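
To make the thread arithmetic in question 2 easy to check, here is a small sketch that just adds up the three terms from the formula. The variable names are my own labels for those terms, and the factor 18 is taken as-is from the formula without assuming what it stands for.

```python
# Terms copied from the formula in question 2; names are illustrative labels only.
initiator_threads = 4
initiator_blocking_threads = 4 * 18
executor_threads = 8 * 18

total_threads = (
    initiator_threads + initiator_blocking_threads + executor_threads
)
print(total_threads)  # 220
```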
