Mergeout operation does not run on time?
Hello,
We're getting "too many ROS containers" errors during a MERGE operation. I tried to fix that by setting MergeOutInterval to 300, but the mergeout operation doesn't seem to run at the specified frequency (the current time is 16:25):
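For reference, this is roughly how such an interval change is made and verified in Vertica (a sketch using the built-in SET_CONFIG_PARAMETER function and the v_monitor.configuration_parameters view; not output from this cluster):

```sql
-- Ask the Tuple Mover to check for mergeout work every 300 seconds
SELECT SET_CONFIG_PARAMETER('MergeOutInterval', 300);

-- Verify the current setting
SELECT parameter_name, current_value
FROM v_monitor.configuration_parameters
WHERE parameter_name = 'MergeOutInterval';
```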
Please advise: how can I make sure that ROS containers are merged on time?

aa1=> select * from v_monitor.tuple_mover_operations where projection_name='client_dim_node0001' and operation_name='Mergeout' order by operation_start_timestamp desc limit 10;
operation_start_timestamp | node_name | operation_name | operation_status | table_schema | table_name | projection_name | projection_id | column_id | earliest_container_start_epoch | latest_container_end_epoch | ros_count | total_ros_used_bytes | plan_type | session_id | is_executing | runtime_priority
-------------------------------+----------------+----------------+-----------------------------------+--------------+------------+---------------------+-------------------+-----------+--------------------------------+----------------------------+-----------+----------------------+---------------+------------------------+--------------+------------------
2014-04-27 15:56:52.262467+03 | v_aa1_node0001 | Mergeout | Complete | public | client_dim | client_dim_node0001 | 45035996274167882 | 0 | 985929 | 987488 | 84 | 24547348 | Replay Delete | idc-sci-2-2557:0x28e9c | f | MEDIUM
2014-04-27 15:56:37.932777+03 | v_aa1_node0001 | Mergeout | Change plan type to Replay Delete | public | client_dim | client_dim_node0001 | 45035996274167882 | 0 | 985929 | 987488 | 84 | 24547348 | Replay Delete | idc-sci-2-2557:0x28e9c | f | MEDIUM
2014-04-27 15:56:28.522484+03 | v_aa1_node0001 | Mergeout | Start | public | client_dim | client_dim_node0001 | 45035996274167882 | 0 | 985929 | 987488 | 84 | 24547348 | Mergeout | idc-sci-2-2557:0x28e9c | f | MEDIUM
2014-04-27 15:56:28.516812+03 | v_aa1_node0001 | Mergeout | Complete | public | client_dim | client_dim_node0001 | 45035996274167882 | 0 | 985918 | 987568 | 249 | 41446397 | Replay Delete | idc-sci-2-2557:0x28e9c | f | MEDIUM
2014-04-27 15:55:14.129814+03 | v_aa1_node0001 | Mergeout | Change plan type to Replay Delete | public | client_dim | client_dim_node0001 | 45035996274167882 | 0 | 985918 | 987568 | 249 | 41446397 | Replay Delete | idc-sci-2-2557:0x28e9c | f | MEDIUM
2014-04-27 15:53:51.177644+03 | v_aa1_node0001 | Mergeout | Start | public | client_dim | client_dim_node0001 | 45035996274167882 | 0 | 985918 | 987568 | 249 | 41446397 | Mergeout | idc-sci-2-2557:0x28e9c | f | MEDIUM
2014-04-27 14:02:32.294431+03 | v_aa1_node0001 | Mergeout | Complete | public | client_dim | client_dim_node0001 | 45035996274167882 | 0 | 906263 | 914651 | 169 | 55822133 | Replay Delete | idc-sci-2-2557:0x27eac | f | HIGH
2014-04-27 14:01:22.314386+03 | v_aa1_node0001 | Mergeout | Change plan type to Replay Delete | public | client_dim | client_dim_node0001 | 45035996274167882 | 0 | 906263 | 914651 | 169 | 55822133 | Replay Delete | idc-sci-2-2557:0x27eac | f | HIGH
2014-04-27 14:00:49.654613+03 | v_aa1_node0001 | Mergeout | Start | public | client_dim | client_dim_node0001 | 45035996274167882 | 0 | 906263 | 914651 | 169 | 55822133 | Mergeout | idc-sci-2-2557:0x27eac | f | HIGH
2014-04-27 14:00:49.640884+03 | v_aa1_node0001 | Mergeout | Complete | public | client_dim | client_dim_node0001 | 45035996274167882 | 0 | 906915 | 914607 | 249 | 72754494 | Replay Delete | idc-sci-2-2557:0x27eac | f | HIGH
(10 rows)
Thanks,
Michael
Comments
Can you run the procedure print_next_mergeout_job?

>> We're getting "too many ROS containers" during MERGE operation
This looks like a strata issue, but I'm not sure.
1. If you don't plan to scale out, you should disable the scale factor. That will reduce the number of segments.
2. Check whether you're hitting what's called a "strata issue". Zvika explains the "strata issue" at 00:41:00 here:
https://www.youtube.com/watch?v=ISa9BNGK1Dg
3. MERGEOUT can't consolidate all containers in the case of a "strata issue", but PURGE can, so you may need to run PURGE from time to time.
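A hedged sketch of points 1 and 3, assuming Vertica's built-in DISABLE_LOCAL_SEGMENTS and PURGE_TABLE / PURGE_PROJECTION functions (check the exact names against your version's documentation; the table and projection names are the ones from the question):

```sql
-- 1. On a cluster that won't scale out, disable local (elastic) segmentation
--    so data is no longer split into ScaleFactor-many local segments
SELECT DISABLE_LOCAL_SEGMENTS();

-- 3. Force-merge containers that mergeout skips because of deleted rows
SELECT PURGE_TABLE('public.client_dim');
-- or purge a single projection:
SELECT PURGE_PROJECTION('client_dim_node0001');
```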
P.S. You were warned about "too many ROS containers" here: https://community.vertica.com/vertica/topics/partition_by_timestamptz_field_how
I've disabled the scale factor (we're currently running on a single node anyway), and it reduced the number of partitions to ~160.
I also tried tuning the Tuple Mover's resource pool settings (number of threads and memory), and that seems to have fixed the issue for now. I'll let it run and then report back in this thread on whether the issue has been fixed completely.
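For anyone landing here later, the Tuple Mover pool tuning mentioned above would look roughly like this (the built-in tm pool and ALTER RESOURCE POOL syntax are standard Vertica; the specific values are placeholders, not necessarily what was used here):

```sql
-- Give the Tuple Mover more memory and allow more concurrent mergeout threads
ALTER RESOURCE POOL tm MEMORYSIZE '4G' MAXCONCURRENCY 3;

-- Check the pool's current settings
SELECT name, memorysize, maxconcurrency
FROM v_catalog.resource_pools
WHERE name = 'tm';
```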
Thanks for your help and for the great video!