how to avoid resegmentation in a insert sql with Ksafe=1

vincent_wu · January 2015

Hi experts,
I have a 3-node vertica cluster with ksafe=1.
In our application, we lots of insert statement such as "insert into Table1 select col1,col2,... from Table2". The target table and the source table have almost the same structure, segmented on the same column.
But the insert performance is not good since the resegment.
The execute plan as below:
Access Path:
+-DML INSERT [Cost: 0, Rows: 0]
| Target Projection: dws.xxxxx_1_b1
| Target Projection: dws.xxxxx_1_b0 (RESEGMENT)

if I change the ksafe to 0, the performance is better. And I notice that no resegment happen in Ksafe=0.
Is there any way to avoid the resegment in ksafe=1? or any suggestion to improve insert performance?

David_Rydzewski_1 · January 2015

That's a feature. K-safety is achieved using buddy projections, so data will always resegment on insert with ksafe 1.

vincent_wu · January 2015

Thank you David.
Vertica suggest user to avoid resegment to improve performance. But we cannot avoid resegment for insert statement in ksafe=1. I think most customer use Ksafe=1 at product env.
So is there any way/suggestion to optimize the insert statement?

Thanks
Vincent

David_Rydzewski_1 · January 2015

If your insert isn't fast enough, add more nodes or get stronger nodes or try different segmentation and/or sort orders on your projections.

Vertica doc is talking about avoiding resegmentation during querying when possible.

The theory is you pay for resegmentation during insert, because you insert the data once, and you try to avoid it on query, because you have many users querying the data over and over.

vincent_wu · January 2015

Thanks a lot. David.
I wonder if I can create a projection with Ksafe=0 in a database with Ksafe=1. so the performance of insert statement will be better.
Notes: it's no matter if the data is lost since the application is always start from flat files.

id10t · January 2015

Hi!

Its possible, but not recommended. Because I do not recommend it I even will not explain a thing, just will point that in such case all database must be in k-safety=0. If database will be with k-safety=1 and some projections with k-safety=0 you will not be able to do a lot of things like advance epoch, refresh projection, etc and database will be unstable, with no HA and data lost guaranteed in case of any error, even a minor one. Is it what you want? I don't think so.

vincent_wu · January 2015

thank you very much

vincent_wu · January 2015

in 3 nodes cluster, The insert sql "insert into Table1 select col1,col2,... from Table2" last 40 seconds.
Now I extend the cluster from 3 nodes to 6 nodes. But it seems the performance is the same. No any improve. I'm really confused.
By the way, there are only one projections in both source target tables.

We're Moving!

Create My New Community Account Now

how to avoid resegmentation in a insert sql with Ksafe=1

Comments

Leave a Comment