
Should I run a manual table rebalance?

Hi,

I have a 6-node Vertica cluster, and I have an issue with the data distribution of 2 super projections and their buddy projections, ksafe = 1.

 

You can see from the following output how the data is distributed:

node_name | projection_name | used_compressed_gb
-----------------------+---------------------------------+--------------------
v_zksainsprd_node0003 | cxt_hua_u_tgt_cells_3g_super_b0 | 0.211387122049928
v_zksainsprd_node0004 | cxt_hua_u_tgt_cells_3g_super_b1 | 0.211508643813431
v_zksainsprd_node0001 | cxt_hua_u_tgt_cells_3g_super_b1 | 0.325459931977093
v_zksainsprd_node0006 | cxt_hua_u_tgt_cells_3g_super_b0 | 0.325497413985431
v_zksainsprd_node0003 | cxt_hua_u_tgt_cells_3g_super_b1 | 0.373414414003491
v_zksainsprd_node0002 | cxt_hua_u_tgt_cells_3g_super_b0 | 0.373473711311817
v_zksainsprd_node0001 | cxt_hua_u_tgt_cells_3g_super_b0 | 0.385713445022702
v_zksainsprd_node0002 | cxt_hua_u_tgt_cells_3g_super_b1 | 0.385810352861881
v_zksainsprd_node0004 | cxt_hua_u_tgt_cells_3g_super_b0 | 0.410186663269997
v_zksainsprd_node0005 | cxt_hua_u_tgt_cells_3g_super_b1 | 0.410329761914909
v_zksainsprd_node0003 | cxt_hua_l_tgt_cells_4g_super_b1 | 0.692139703780413
v_zksainsprd_node0002 | cxt_hua_l_tgt_cells_4g_super_b0 | 0.892960877157748
v_zksainsprd_node0005 | cxt_hua_l_tgt_cells_4g_super_b0 | 1.14525959454477
v_zksainsprd_node0006 | cxt_hua_l_tgt_cells_4g_super_b1 | 1.14530864171684
v_zksainsprd_node0005 | cxt_hua_l_tgt_cells_4g_super_b1 | 1.36490591801703
v_zksainsprd_node0004 | cxt_hua_l_tgt_cells_4g_super_b0 | 1.36506684683263
v_zksainsprd_node0002 | cxt_hua_l_tgt_cells_4g_super_b1 | 2.13218771573156
v_zksainsprd_node0001 | cxt_hua_l_tgt_cells_4g_super_b0 | 2.13254337385297
v_zksainsprd_node0001 | cxt_hua_l_tgt_cells_4g_super_b1 | 2.69299864303321
v_zksainsprd_node0006 | cxt_hua_l_tgt_cells_4g_super_b0 | 2.69310360588133

v_zksainsprd_node0004 | cxt_hua_l_tgt_cells_4g_super_b1 | 49.0983770284802
v_zksainsprd_node0003 | cxt_hua_l_tgt_cells_4g_super_b0 | 49.6091840816662
v_zksainsprd_node0006 | cxt_hua_u_tgt_cells_3g_super_b1 | 54.1744314581156
v_zksainsprd_node0005 | cxt_hua_u_tgt_cells_3g_super_b0 | 54.2100570099428

 

So the last 4 rows show that nodes 3, 4, 5 and 6 have much more data than the rest of the nodes for each projection.

My question is: what can cause this to happen, and should we run a table rebalance to fix it?
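For reference, a query along these lines against the v_monitor.projection_storage system table produces this kind of per-node, per-projection size listing (the anchor table filter below is only a guess at the exact table names; adjust it to your schema):

SELECT node_name,
       projection_name,
       SUM(used_bytes) / (1024*1024*1024) AS used_compressed_gb
FROM v_monitor.projection_storage
WHERE anchor_table_name IN ('cxt_hua_u_tgt_cells_3g', 'cxt_hua_l_tgt_cells_4g')
GROUP BY node_name, projection_name
ORDER BY used_compressed_gb;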

 

 

Comments

  • Options

    You should check segmentation first. It may be better to change the segmentation key.

  • Options

    What is the table definition?

    Like Fin said - the key you use for segmentation is important!

    Try to use a column that is a strong candidate as a predicate column, or, highly preferred, a join column.

    If no good key is found in the table, you can create your own unique column using a sequence; doing this allows you to create a uniform distribution of data across the cluster and minimize data skew. See the sketch below.
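    For example, a rough sketch of that idea (all schema, table and column names here are placeholders, not objects from this thread):

    -- surrogate key fed by a sequence, then segment the projection on it
    CREATE SEQUENCE my_schema.my_fact_seq;

    CREATE TABLE my_schema.my_fact (
        row_key   INT DEFAULT NEXTVAL('my_schema.my_fact_seq'),
        date_id   INT,
        cell_name VARCHAR(128)
    );

    CREATE PROJECTION my_schema.my_fact_super AS
    SELECT * FROM my_schema.my_fact
    ORDER BY date_id
    SEGMENTED BY hash(row_key) ALL NODES KSAFE 1;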

     

    Also, it is important to understand that the cost of dynamically resegmenting data for a join or a group by, when the projection isn't segmented on the join key or grouping columns, will usually be greater than the extra time caused by some skew. If the data is skewed in a big way, you might want to think about avoiding the skew.
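    As an illustration of that trade-off, you can compare plans with EXPLAIN; on a projection segmented by hash(seg_key), the second query below will typically show a RESEGMENT step because the rows have to be reshuffled across nodes (fact_t, seg_key and other_col are placeholder names):

    EXPLAIN SELECT seg_key, COUNT(*) FROM fact_t GROUP BY seg_key;     -- grouped locally on each node
    EXPLAIN SELECT other_col, COUNT(*) FROM fact_t GROUP BY other_col; -- usually resegments data first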

     

  • Options

    Hi,

    It's a big table, so I'll just send the end of the projection definition for that table.

     


    SELECT cxt_hua_u_tgt_cells_3g.audit_sid,
    cxt_hua_u_tgt_cells_3g.date_id,
    cxt_hua_u_tgt_cells_3g.srv_technology,
    cxt_hua_u_tgt_cells_3g.srv_vendor,
    cxt_hua_u_tgt_cells_3g.srv_rncname,
    cxt_hua_u_tgt_cells_3g.srv_rncid,
    cxt_hua_u_tgt_cells_3g.srv_nodebname,
    cxt_hua_u_tgt_cells_3g.srv_nodebid,
    cxt_hua_u_tgt_cells_3g.srv_cell_network_sid,
    cxt_hua_u_tgt_cells_3g.srv_cellname,
    cxt_hua_u_tgt_cells_3g.srv_cellid,
    cxt_hua_u_tgt_cells_3g.srv_cgi,
    cxt_hua_u_tgt_cells_3g.srv_lac,
    cxt_hua_u_tgt_cells_3g.srv_ci,
    cxt_hua_u_tgt_cells_3g.srv_rac,
    cxt_hua_u_tgt_cells_3g.srv_sac,
    cxt_hua_u_tgt_cells_3g.srv_uarfcndownlink,
    cxt_hua_u_tgt_cells_3g.srv_pscrambcode,
    cxt_hua_u_tgt_cells_3g.srv_cellstatus,
    cxt_hua_u_tgt_cells_3g.srv_latitude,
    cxt_hua_u_tgt_cells_3g.srv_longitude,
    cxt_hua_u_tgt_cells_3g.srv_azimuth,
    cxt_hua_u_tgt_cells_3g.srv_region,
    cxt_hua_u_tgt_cells_3g.srv_adj_path,
    cxt_hua_u_tgt_cells_3g.srv_adj_id,
    cxt_hua_u_tgt_cells_3g.srv_adj_scellrncid,
    cxt_hua_u_tgt_cells_3g.srv_adj_srccellid,
    cxt_hua_u_tgt_cells_3g.srv_adj_ncellrncid,
    cxt_hua_u_tgt_cells_3g.srv_adj_nbrcellid,
    cxt_hua_u_tgt_cells_3g.srv_uext3gcell_ncellrncid,
    cxt_hua_u_tgt_cells_3g.srv_uext3gcell_cellid,
    cxt_hua_u_tgt_cells_3g.srv_uext3gcell_cellname,
    cxt_hua_u_tgt_cells_3g.srv_uext3gcell_lac,
    cxt_hua_u_tgt_cells_3g.srv_uext3gcell_rac,
    cxt_hua_u_tgt_cells_3g.srv_uext3gcell_pscrambcode,
    cxt_hua_u_tgt_cells_3g.srv_uext3gcell_uarfcn,
    cxt_hua_u_tgt_cells_3g.relation_type,
    cxt_hua_u_tgt_cells_3g.tgt_technology,
    cxt_hua_u_tgt_cells_3g.tgt_vendor,
    cxt_hua_u_tgt_cells_3g.tgt_rncname,
    cxt_hua_u_tgt_cells_3g.tgt_rncid,
    cxt_hua_u_tgt_cells_3g.tgt_nodebname,
    cxt_hua_u_tgt_cells_3g.tgt_nodebid,
    cxt_hua_u_tgt_cells_3g.tgt_cell_network_sid,
    cxt_hua_u_tgt_cells_3g.tgt_cellname,
    cxt_hua_u_tgt_cells_3g.tgt_cellindex,
    cxt_hua_u_tgt_cells_3g.tgt_cgi,
    cxt_hua_u_tgt_cells_3g.tgt_lac,
    cxt_hua_u_tgt_cells_3g.tgt_ci,
    cxt_hua_u_tgt_cells_3g.tgt_rac,
    cxt_hua_u_tgt_cells_3g.tgt_sac,
    cxt_hua_u_tgt_cells_3g.tgt_uarfcn_dl,
    cxt_hua_u_tgt_cells_3g.tgt_priscrcode,
    cxt_hua_u_tgt_cells_3g.tgt_cellstatus,
    cxt_hua_u_tgt_cells_3g.tgt_latitude,
    cxt_hua_u_tgt_cells_3g.tgt_longitude,
    cxt_hua_u_tgt_cells_3g.tgt_azimuth,
    cxt_hua_u_tgt_cells_3g.tgt_region
    FROM zksinsprd.cxt_hua_u_tgt_cells_3g
    ORDER BY cxt_hua_u_tgt_cells_3g.date_id,
    cxt_hua_u_tgt_cells_3g.audit_sid
    SEGMENTED BY hash(cxt_hua_u_tgt_cells_3g.date_id) ALL NODES KSAFE 1;

  • Options

    And this shows how the partitions are distributed; for no apparent reason, one of the nodes has more than 2 billion records.

     


    partition_key | projection_id | table_schema | projection_name | ros_id | ros_size_bytes | ros_row_count | node_name | deleted_row_count | location_label
    ---------------+-------------------+--------------+---------------------------------+-------------------+----------------+---------------+-----------------------+-------------------+----------------
    201603 | 54043195528474104 | zksinsprd | cxt_hua_u_tgt_cells_3g_super_b0 | 45035996867996617 | 304617508 | 4109685 | v_zksainsprd_node0001 | 0 |
    201603 | 54043195528474104 | zksinsprd | cxt_hua_u_tgt_cells_3g_super_b0 | 49539596424204609 | 234093011 | 3155426 | v_zksainsprd_node0002 | 0 |
    201604 | 54043195528474104 | zksinsprd | cxt_hua_u_tgt_cells_3g_super_b0 | 49539596481359317 | 111464008 | 1488392 | v_zksainsprd_node0002 | 0 |
    201602 | 54043195528474104 | zksinsprd | cxt_hua_u_tgt_cells_3g_super_b0 | 54043195699187385 | 4754368 | 66308 | v_zksainsprd_node0003 | 0 |
    201603 | 54043195528474104 | zksinsprd | cxt_hua_u_tgt_cells_3g_super_b0 | 54043195991343195 | 222220826 | 3036490 | v_zksainsprd_node0003 | 0 |
    201604 | 54043195528474104 | zksinsprd | cxt_hua_u_tgt_cells_3g_super_b0 | 54043196113831451 | 111192599 | 1487320 | v_zksainsprd_node0003 | 0 |
    201602 | 54043195528474104 | zksinsprd | cxt_hua_u_tgt_cells_3g_super_b0 | 58546795327960613 | 80765692 | 1637935 | v_zksainsprd_node0004 | 0 |
    201603 | 54043195528474104 | zksinsprd | cxt_hua_u_tgt_cells_3g_super_b0 | 58546795503345733 | 359668884 | 4998042 | v_zksainsprd_node0004 | 0 |
    201604 | 54043195528474104 | zksinsprd | cxt_hua_u_tgt_cells_3g_super_b0 | 58546795702159457 | 55690104 | 743521 | v_zksainsprd_node0004 | 0 |
    201602 | 54043195528474104 | zksinsprd | cxt_hua_u_tgt_cells_3g_super_b0 | 63050394954771049 | 85454497 | 1694333 | v_zksainsprd_node0005 | 0 |
    201603 | 54043195528474104 | zksinsprd | cxt_hua_u_tgt_cells_3g_super_b0 | 63050395075350911 | 58066279600 | 2044063270 | v_zksainsprd_node0005 | 0 |
    201603 | 54043195528474104 | zksinsprd | cxt_hua_u_tgt_cells_3g_super_b0 | 63050395262115993 | 55871396 | 742373 | v_zksainsprd_node0005 | 0 |
    201603 | 54043195528474104 | zksinsprd | cxt_hua_u_tgt_cells_3g_super_b0 | 67553994994516189 | 293634341 | 3960998 | v_zksainsprd_node0006 | 0 |

  • Options

    Hi 

    Thanks for your reply . 

    As far as I can see, the table is well segmented (please check the table structure I posted earlier).

    But I don't know why the data is not well balanced; the projection and its buddy projection are created on all nodes.

     

    What I am suspecting is that it might be related to partitioning, since checking the partitioning now, I don't see the partitions for 201604 and 201602 created on all nodes.
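    A query along these lines against v_monitor.partitions (which looks like the source of the output above) makes the per-node distribution of each partition easier to see:

    SELECT partition_key,
           node_name,
           SUM(ros_row_count) AS row_count
    FROM v_monitor.partitions
    WHERE projection_name LIKE 'cxt_hua_u_tgt_cells_3g_super%'
    GROUP BY partition_key, node_name
    ORDER BY partition_key, node_name;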

     

  • Options

    Hi

    I guess date_id is something like a date key?

    I think it's not a good choice for segmentation.

     

    You should take a field or group of fields with good cardinality.

     

    Try a count(*) of all records in the table and a count of distinct values of the date_id field.

    What will be the result?

  • Options

    You are right, one of the dates has more than a billion records, while the rest of the dates have at most 14 million records.

     

    Thanks!

  • Options

    select count (distinct date_id) from cxt_hua_u_tgt_cells_3g

  • Options

    select count (*),date_id from cxt_hua_u_tgt_cells_3g group by date_id 

    This shows that the data is evenly spread for each date except one date, which has more than 2 billion records, and this is why we have the difference in space.

  • Options

    OK. As I wrote, date_id is not a good choice for segmentation.

  • Options

    Why is it not a good choice for super projection segmentation?

    The data is well balanced between nodes by date_id.

    It is just one exception for one date, and we are now checking why that date has all of these records; we probably have an issue with that specific date.

     

    But why are you saying that date_id is not good for segmentation?

  • Options

    cell_id would be a better choice because I guess it has more unique values than date_id. With hash segmentation, every row that shares the same date_id lands on the same node (and its buddy), so a single very heavy date ends up entirely on one node; a higher-cardinality key spreads those rows across the cluster (see the sketch below).
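    A rough sketch of what that change could look like; the new projection name is a placeholder, and I'm assuming srv_cellid is the column meant by cell_id, so verify the column and make sure the new projection is refreshed before dropping anything:

    -- new super projection segmented on a higher-cardinality key
    CREATE PROJECTION zksinsprd.cxt_hua_u_tgt_cells_3g_cellseg AS
    SELECT * FROM zksinsprd.cxt_hua_u_tgt_cells_3g
    ORDER BY date_id, audit_sid
    SEGMENTED BY hash(srv_cellid, date_id) ALL NODES KSAFE 1;

    SELECT REFRESH('zksinsprd.cxt_hua_u_tgt_cells_3g');  -- populate the new projection

    -- once the new projection is up to date, drop the skewed ones
    DROP PROJECTION zksinsprd.cxt_hua_u_tgt_cells_3g_super_b0, zksinsprd.cxt_hua_u_tgt_cells_3g_super_b1;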
