
Should I run a manual table rebalance?

Hi,

I have a 6-node Vertica cluster, and I have an issue with the data distribution of 2 super projections and their buddy projections, ksafe = 1.

 

You can see from the following output how the data is distributed:

node_name | projection_name | used_compressed_gb
-----------------------+---------------------------------+--------------------
v_zksainsprd_node0003 | cxt_hua_u_tgt_cells_3g_super_b0 | 0.211387122049928
v_zksainsprd_node0004 | cxt_hua_u_tgt_cells_3g_super_b1 | 0.211508643813431
v_zksainsprd_node0001 | cxt_hua_u_tgt_cells_3g_super_b1 | 0.325459931977093
v_zksainsprd_node0006 | cxt_hua_u_tgt_cells_3g_super_b0 | 0.325497413985431
v_zksainsprd_node0003 | cxt_hua_u_tgt_cells_3g_super_b1 | 0.373414414003491
v_zksainsprd_node0002 | cxt_hua_u_tgt_cells_3g_super_b0 | 0.373473711311817
v_zksainsprd_node0001 | cxt_hua_u_tgt_cells_3g_super_b0 | 0.385713445022702
v_zksainsprd_node0002 | cxt_hua_u_tgt_cells_3g_super_b1 | 0.385810352861881
v_zksainsprd_node0004 | cxt_hua_u_tgt_cells_3g_super_b0 | 0.410186663269997
v_zksainsprd_node0005 | cxt_hua_u_tgt_cells_3g_super_b1 | 0.410329761914909
v_zksainsprd_node0003 | cxt_hua_l_tgt_cells_4g_super_b1 | 0.692139703780413
v_zksainsprd_node0002 | cxt_hua_l_tgt_cells_4g_super_b0 | 0.892960877157748
v_zksainsprd_node0005 | cxt_hua_l_tgt_cells_4g_super_b0 | 1.14525959454477
v_zksainsprd_node0006 | cxt_hua_l_tgt_cells_4g_super_b1 | 1.14530864171684
v_zksainsprd_node0005 | cxt_hua_l_tgt_cells_4g_super_b1 | 1.36490591801703
v_zksainsprd_node0004 | cxt_hua_l_tgt_cells_4g_super_b0 | 1.36506684683263
v_zksainsprd_node0002 | cxt_hua_l_tgt_cells_4g_super_b1 | 2.13218771573156
v_zksainsprd_node0001 | cxt_hua_l_tgt_cells_4g_super_b0 | 2.13254337385297
v_zksainsprd_node0001 | cxt_hua_l_tgt_cells_4g_super_b1 | 2.69299864303321
v_zksainsprd_node0006 | cxt_hua_l_tgt_cells_4g_super_b0 | 2.69310360588133

v_zksainsprd_node0004 | cxt_hua_l_tgt_cells_4g_super_b1 | 49.0983770284802
v_zksainsprd_node0003 | cxt_hua_l_tgt_cells_4g_super_b0 | 49.6091840816662
v_zksainsprd_node0006 | cxt_hua_u_tgt_cells_3g_super_b1 | 54.1744314581156
v_zksainsprd_node0005 | cxt_hua_u_tgt_cells_3g_super_b0 | 54.2100570099428

 

So the last 4 rows show that nodes 3, 4, 5 and 6 have much more data than the rest of the nodes for each projection.

My question is: what can cause this to happen, and should we run a table rebalance to fix it?
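For reference, a query along these lines against the v_monitor.projection_storage system table produces this kind of per-node, per-projection size listing (the anchor table filter below is only a guess at the exact table names; adjust it to your schema):

SELECT node_name,
       projection_name,
       SUM(used_bytes) / (1024*1024*1024) AS used_compressed_gb
FROM v_monitor.projection_storage
WHERE anchor_table_name IN ('cxt_hua_u_tgt_cells_3g', 'cxt_hua_l_tgt_cells_4g')
GROUP BY node_name, projection_name
ORDER BY used_compressed_gb;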

 

 

Comments

  • Options

    You should check segmentation first. It may be better to change the segmentation key.

  • Options

    What is the table definition?

    Like Fin said - the key you use for segmentation is important!

    Try to use a column that is a strong candidate as a predicate column, or, highly preferred, a join column.

    If no good key is found in the table, you can create your own unique column using a sequence; doing this allows you to create a uniform distribution of data across the cluster and minimize data skew. See the sketch below.
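    For example, a rough sketch of that idea (all schema, table and column names here are placeholders, not objects from this thread):

    -- surrogate key fed by a sequence, then segment the projection on it
    CREATE SEQUENCE my_schema.my_fact_seq;

    CREATE TABLE my_schema.my_fact (
        row_key   INT DEFAULT NEXTVAL('my_schema.my_fact_seq'),
        date_id   INT,
        cell_name VARCHAR(128)
    );

    CREATE PROJECTION my_schema.my_fact_super AS
    SELECT * FROM my_schema.my_fact
    ORDER BY date_id
    SEGMENTED BY hash(row_key) ALL NODES KSAFE 1;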

     

    Also, it is important to understand that the cost of dynamically resegmenting data for a join or a group by, when the projection isn't segmented on the join key or grouping columns, will usually be greater than the extra time caused by some skew. If the data is skewed in a big way, you might want to think about avoiding the skew.
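    As an illustration of that trade-off, you can compare plans with EXPLAIN; on a projection segmented by hash(seg_key), the second query below will typically show a RESEGMENT step because the rows have to be reshuffled across nodes (fact_t, seg_key and other_col are placeholder names):

    EXPLAIN SELECT seg_key, COUNT(*) FROM fact_t GROUP BY seg_key;     -- grouped locally on each node
    EXPLAIN SELECT other_col, COUNT(*) FROM fact_t GROUP BY other_col; -- usually resegments data first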

     

  • Options

    Hi,

    It's a big table, so I'll just send the end of the projection definition for that table.

     


    SELECT cxt_hua_u_tgt_cells_3g.audit_sid,
    cxt_hua_u_tgt_cells_3g.date_id,
    cxt_hua_u_tgt_cells_3g.srv_technology,
    cxt_hua_u_tgt_cells_3g.srv_vendor,
    cxt_hua_u_tgt_cells_3g.srv_rncname,
    cxt_hua_u_tgt_cells_3g.srv_rncid,
    cxt_hua_u_tgt_cells_3g.srv_nodebname,
    cxt_hua_u_tgt_cells_3g.srv_nodebid,
    cxt_hua_u_tgt_cells_3g.srv_cell_network_sid,
    cxt_hua_u_tgt_cells_3g.srv_cellname,
    cxt_hua_u_tgt_cells_3g.srv_cellid,
    cxt_hua_u_tgt_cells_3g.srv_cgi,
    cxt_hua_u_tgt_cells_3g.srv_lac,
    cxt_hua_u_tgt_cells_3g.srv_ci,
    cxt_hua_u_tgt_cells_3g.srv_rac,
    cxt_hua_u_tgt_cells_3g.srv_sac,
    cxt_hua_u_tgt_cells_3g.srv_uarfcndownlink,
    cxt_hua_u_tgt_cells_3g.srv_pscrambcode,
    cxt_hua_u_tgt_cells_3g.srv_cellstatus,
    cxt_hua_u_tgt_cells_3g.srv_latitude,
    cxt_hua_u_tgt_cells_3g.srv_longitude,
    cxt_hua_u_tgt_cells_3g.srv_azimuth,
    cxt_hua_u_tgt_cells_3g.srv_region,
    cxt_hua_u_tgt_cells_3g.srv_adj_path,
    cxt_hua_u_tgt_cells_3g.srv_adj_id,
    cxt_hua_u_tgt_cells_3g.srv_adj_scellrncid,
    cxt_hua_u_tgt_cells_3g.srv_adj_srccellid,
    cxt_hua_u_tgt_cells_3g.srv_adj_ncellrncid,
    cxt_hua_u_tgt_cells_3g.srv_adj_nbrcellid,
    cxt_hua_u_tgt_cells_3g.srv_uext3gcell_ncellrncid,
    cxt_hua_u_tgt_cells_3g.srv_uext3gcell_cellid,
    cxt_hua_u_tgt_cells_3g.srv_uext3gcell_cellname,
    cxt_hua_u_tgt_cells_3g.srv_uext3gcell_lac,
    cxt_hua_u_tgt_cells_3g.srv_uext3gcell_rac,
    cxt_hua_u_tgt_cells_3g.srv_uext3gcell_pscrambcode,
    cxt_hua_u_tgt_cells_3g.srv_uext3gcell_uarfcn,
    cxt_hua_u_tgt_cells_3g.relation_type,
    cxt_hua_u_tgt_cells_3g.tgt_technology,
    cxt_hua_u_tgt_cells_3g.tgt_vendor,
    cxt_hua_u_tgt_cells_3g.tgt_rncname,
    cxt_hua_u_tgt_cells_3g.tgt_rncid,
    cxt_hua_u_tgt_cells_3g.tgt_nodebname,
    cxt_hua_u_tgt_cells_3g.tgt_nodebid,
    cxt_hua_u_tgt_cells_3g.tgt_cell_network_sid,
    cxt_hua_u_tgt_cells_3g.tgt_cellname,
    cxt_hua_u_tgt_cells_3g.tgt_cellindex,
    cxt_hua_u_tgt_cells_3g.tgt_cgi,
    cxt_hua_u_tgt_cells_3g.tgt_lac,
    cxt_hua_u_tgt_cells_3g.tgt_ci,
    cxt_hua_u_tgt_cells_3g.tgt_rac,
    cxt_hua_u_tgt_cells_3g.tgt_sac,
    cxt_hua_u_tgt_cells_3g.tgt_uarfcn_dl,
    cxt_hua_u_tgt_cells_3g.tgt_priscrcode,
    cxt_hua_u_tgt_cells_3g.tgt_cellstatus,
    cxt_hua_u_tgt_cells_3g.tgt_latitude,
    cxt_hua_u_tgt_cells_3g.tgt_longitude,
    cxt_hua_u_tgt_cells_3g.tgt_azimuth,
    cxt_hua_u_tgt_cells_3g.tgt_region
    FROM zksinsprd.cxt_hua_u_tgt_cells_3g
    ORDER BY cxt_hua_u_tgt_cells_3g.date_id,
    cxt_hua_u_tgt_cells_3g.audit_sid
    SEGMENTED BY hash(cxt_hua_u_tgt_cells_3g.date_id) ALL NODES KSAFE 1;

  • Options

    And this shows how the partitions are distributed; for no apparent reason, one of the nodes has more than 2 billion records.

     


    partition_key | projection_id | table_schema | projection_name | ros_id | ros_size_bytes | ros_row_count | node_name | deleted_row_count | location_label
    ---------------+-------------------+--------------+---------------------------------+-------------------+----------------+---------------+-----------------------+-------------------+----------------
    201603 | 54043195528474104 | zksinsprd | cxt_hua_u_tgt_cells_3g_super_b0 | 45035996867996617 | 304617508 | 4109685 | v_zksainsprd_node0001 | 0 |
    201603 | 54043195528474104 | zksinsprd | cxt_hua_u_tgt_cells_3g_super_b0 | 49539596424204609 | 234093011 | 3155426 | v_zksainsprd_node0002 | 0 |
    201604 | 54043195528474104 | zksinsprd | cxt_hua_u_tgt_cells_3g_super_b0 | 49539596481359317 | 111464008 | 1488392 | v_zksainsprd_node0002 | 0 |
    201602 | 54043195528474104 | zksinsprd | cxt_hua_u_tgt_cells_3g_super_b0 | 54043195699187385 | 4754368 | 66308 | v_zksainsprd_node0003 | 0 |
    201603 | 54043195528474104 | zksinsprd | cxt_hua_u_tgt_cells_3g_super_b0 | 54043195991343195 | 222220826 | 3036490 | v_zksainsprd_node0003 | 0 |
    201604 | 54043195528474104 | zksinsprd | cxt_hua_u_tgt_cells_3g_super_b0 | 54043196113831451 | 111192599 | 1487320 | v_zksainsprd_node0003 | 0 |
    201602 | 54043195528474104 | zksinsprd | cxt_hua_u_tgt_cells_3g_super_b0 | 58546795327960613 | 80765692 | 1637935 | v_zksainsprd_node0004 | 0 |
    201603 | 54043195528474104 | zksinsprd | cxt_hua_u_tgt_cells_3g_super_b0 | 58546795503345733 | 359668884 | 4998042 | v_zksainsprd_node0004 | 0 |
    201604 | 54043195528474104 | zksinsprd | cxt_hua_u_tgt_cells_3g_super_b0 | 58546795702159457 | 55690104 | 743521 | v_zksainsprd_node0004 | 0 |
    201602 | 54043195528474104 | zksinsprd | cxt_hua_u_tgt_cells_3g_super_b0 | 63050394954771049 | 85454497 | 1694333 | v_zksainsprd_node0005 | 0 |
    201603 | 54043195528474104 | zksinsprd | cxt_hua_u_tgt_cells_3g_super_b0 | 63050395075350911 | 58066279600 | 2044063270 | v_zksainsprd_node0005 | 0 |
    201603 | 54043195528474104 | zksinsprd | cxt_hua_u_tgt_cells_3g_super_b0 | 63050395262115993 | 55871396 | 742373 | v_zksainsprd_node0005 | 0 |
    201603 | 54043195528474104 | zksinsprd | cxt_hua_u_tgt_cells_3g_super_b0 | 67553994994516189 | 293634341 | 3960998 | v_zksainsprd_node0006 | 0 |

  • Options

    Hi 

    Thanks for your reply . 

    As far as I can see, the table is well segmented (please check the table structure I posted earlier).

    But I don't know why the data is not well balanced; the projection and its buddy projection are created on all nodes.

     

    What I am suspecting is that it might be related to partitioning, since checking the partitioning now, I don't see the partitions for 201604 and 201602 created on all nodes.
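    A query along these lines against v_monitor.partitions (which looks like the source of the output above) makes the per-node distribution of each partition easier to see:

    SELECT partition_key,
           node_name,
           SUM(ros_row_count) AS row_count
    FROM v_monitor.partitions
    WHERE projection_name LIKE 'cxt_hua_u_tgt_cells_3g_super%'
    GROUP BY partition_key, node_name
    ORDER BY partition_key, node_name;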

     

  • Options

    Hi

    I guess date_id is something like a date key?

    I think it's not a good choice for segmentation.

     

    You should take a field or group of fields with good cardinality.

     

    Try a count(*) of all records in the table and a count of distinct values of the date_id field.

    What will be the result?

  • Options

    You are right, one of the dates has more than a billion records, while the rest of the dates have at most 14 million records.

     

    Thanks!

  • Options

    select count (distinct date_id) from cxt_hua_u_tgt_cells_3g

  • Options

    select count (*),date_id from cxt_hua_u_tgt_cells_3g group by date_id 

    This shows that the data is evenly spread for each date except one date, which has more than 2 billion records, and this is why we have the difference in space.

  • Options

    OK. As I wrote, date_id is not a good choice for segmentation.

  • Options

    Why is it not a good choice for super projection segmentation?

    The data is well balanced between nodes by date_id.

    It is just one exception for one date, and we are now checking why that date has all of these records; we probably have an issue with that specific date.

     

    But why are you saying that date_id is not good for segmentation?

  • Options

    cell_id would be a better choice because I guess it has more unique values than date_id. With hash segmentation, every row that shares the same date_id lands on the same node (and its buddy), so a single very heavy date ends up entirely on one node; a higher-cardinality key spreads those rows across the cluster (see the sketch below).
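    A rough sketch of what that change could look like; the new projection name is a placeholder, and I'm assuming srv_cellid is the column meant by cell_id, so verify the column and make sure the new projection is refreshed before dropping anything:

    -- new super projection segmented on a higher-cardinality key
    CREATE PROJECTION zksinsprd.cxt_hua_u_tgt_cells_3g_cellseg AS
    SELECT * FROM zksinsprd.cxt_hua_u_tgt_cells_3g
    ORDER BY date_id, audit_sid
    SEGMENTED BY hash(srv_cellid, date_id) ALL NODES KSAFE 1;

    SELECT REFRESH('zksinsprd.cxt_hua_u_tgt_cells_3g');  -- populate the new projection

    -- once the new projection is up to date, drop the skewed ones
    DROP PROJECTION zksinsprd.cxt_hua_u_tgt_cells_3g_super_b0, zksinsprd.cxt_hua_u_tgt_cells_3g_super_b1;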
