Options

Delete vectors not removed during mergeout

We have nightly jobs that do some deletes and updates on accumulating facts. Not a huge number, mind you, and it's on data that is not easily partitioned for a delete using 'drop partition.' I would expect the regular mergeouts to purge the deleted rows and, along with them, the delete vectors. When I check in the morning, however, the delete vectors are still listed in the delete_vectors table. If I then run a manual purge_table, the delete vectors are removed just fine. Can someone please shed some light on why the purge during mergeout does not remove the delete vectors? AHM is being advanced just fine, btw. Thanks, Ben

Comments

  • Options
    Nimmi_guptaNimmi_gupta - Select Field - Employee
    Hi Ben, Normally the deleted records should be purged as mergeout occurs, but if a mass delete was executed after mergeout had consolidated containers into a large container in an upper tuple mover stratum, the deletes remain un-purged because Mergeout will not consider the large containers. The PurgeMergeoutPercent config parameter (currently undocumented but shows up in the system table configuration_parameters) was not effective with its default value of 40%. Their table was very large, and 40% was too high of a threshold to make the large ROS eligible for purge/merge. So a large ROS container was stuck in the high stratum with little chance of ever getting purged. Every query was impacted. Force the purge using the process below. => select make_ahm_now(); => select set_config_parameter('MaxMrgOutROSSizeMB', 1024); => select set_config_parameter('PurgeMergeoutPercent',1); Now let the ROSes get purged through Tuple Mover Mergout operations, checking with the query at the top of this Solution. After deletes have been purged, => select set_config_parameter('MaxMrgOutROSSizeMB', -1); => select set_config_parameter('PurgeMergeoutPercent',20); If large deletes of old data can occur again following this operation, check periodically for the need to re-run the process. Use the following queries to find projections and nodes with the highest count of ROS containers and delete vectors. => select count(*), node_name, projection_name from storage_containers group by 2,3 order by 1 desc limit 20; => select count(*), node_name, projection_name from delete_vectors group by 2,3 order by 1 desc limit 20; Nimmi
  • Options
    This is great info. It's nice to know what's going on under the covers, and this makes good sense. Thanks, Ben

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file