how to avoid resegmentation in a insert sql with Ksafe=1

Hi experts,
   I have a 3-node vertica cluster with ksafe=1. 
   In our application, we lots of insert statement such as "insert into Table1 select col1,col2,... from Table2". The target table and the source table have almost the same structure, segmented on the same column. 
   But the insert performance is not good since the resegment. 
   The execute plan as below:
Access Path:
+-DML INSERT [Cost: 0, Rows: 0]
|  Target Projection: dws.xxxxx_1_b1
|  Target Projection: dws.xxxxx_1_b0 (RESEGMENT)

if I change the ksafe to 0, the performance is better. And I notice that no resegment  happen in Ksafe=0.
Is there any way to avoid the resegment in ksafe=1? or any suggestion to improve insert performance?

   

Comments

  • That's a feature. K-safety is achieved using buddy projections, so data will always resegment on insert with ksafe 1.
  • Thank you David.
    Vertica suggest user to avoid resegment to improve performance. But we cannot avoid resegment for insert statement in ksafe=1.  I think most customer use Ksafe=1 at product env. 
    So is there any way/suggestion to optimize the insert statement?

    Thanks
    Vincent
  • If your insert isn't fast enough, add more nodes or get stronger nodes or try different segmentation and/or sort orders on your projections.

    Vertica doc is talking about avoiding resegmentation during querying when possible.

    The theory is you pay for resegmentation during insert, because you insert the data once, and you try to avoid it on query, because you have many users querying the data over and over.
  • Thanks a lot. David.
    I wonder if I can create  a projection with Ksafe=0 in a database with Ksafe=1. so the performance of insert statement will be better.  
    Notes: it's no matter if the data is lost since the application is always start from flat files.

  • Hi!

    Its possible, but not recommended. Because I do not recommend it I even will not explain a thing, just will point that in such case all database must be in k-safety=0. If database will be with k-safety=1 and some projections with k-safety=0 you will not be able to do a lot of things like advance epoch, refresh projection, etc and database will be unstable, with no HA and data lost guaranteed in case of any error, even a minor one. Is it what you want? I don't think so.
  • thank you very much
  • in 3 nodes cluster, The insert sql "insert into Table1 select col1,col2,... from Table2"  last 40 seconds.
    Now I extend the cluster from 3 nodes to 6 nodes. But it seems the performance is the same. No any improve.  I'm really confused. 
    By the way, there are only one projections in both source target tables.

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file