Resource Management best practices for the “WOS Full; failover to DIRECT” message

Hi, what is the best-practice configuration for Resource Management when the vertica.log file shows many “WOS Full; failover to DIRECT” messages, assuming a trickle load whose frequency cannot be adjusted?
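One way to confirm that the wosdata pool is actually the bottleneck is to watch its usage directly. A minimal sketch, querying Vertica's resource_pool_status system table (column names may vary slightly by version):

```sql
-- Inspect current usage of the built-in WOS resource pool (wosdata).
-- If memory_inuse_kb keeps hitting memory_size_kb during loads,
-- the "WOS Full; failover to DIRECT" messages are expected.
SELECT pool_name, memory_size_kb, memory_inuse_kb
FROM resource_pool_status
WHERE pool_name = 'wosdata';
```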

Comments

  • The switch to ROS dramatically degrades the load rate.
  • It has a clear negative impact.
  • Hi, how big is your WOS pool? Vertica recommends 2GB. If it gets full because you are doing a single load bigger than that, you should load DIRECT; that is too much data to lose in case of a power failure. If it gets full because you are doing too many loads, perhaps you can lower the MoveOutInterval so the data is moved out more often, but consider the MergeOutInterval too so you don't end up with ROS pushback. However, be careful not to make those parameters too small, or the Tuple Mover ends up processing the files too many times. (I hope that makes sense; there are many details to consider.) I recommend the section of the documentation on tuning the Tuple Mover; it could help shed some light: https://my.vertica.com/docs/6.1.x/HTML/index.htm#14361.htm Hope this helps, Eugenia
  • Thanks. To minimize ROS pushback I will change WOSDATA to 4G.
  • Hi, sorry, I think I was not clear. I do not recommend increasing WOSDATA to 4GB; it is too big, and Vertica recommends 2GB. ROS pushback is what happens when you reach the maximum number of ROS containers, which by default is 1024. One way to get ROS pushback is moving data from WOS to ROS too often: that can create too many ROS containers, because the smaller files do not merge out fast enough. ROS pushback is not the WOS spilling to disk. Hope this is clearer. If you want the WOS to empty more often, you can lower the MoveOutInterval (5 minutes by default), but also change the MergeOutInterval so the new ROS containers get merged. However, do not set those parameters too small. You can check resource_pool_status to see how much space is being used: select * from resource_pool_status; Eugenia
  • It was very clear before! When I changed the moveout and mergeout frequency, performance dropped, probably as a result of TM locks. See below the resource_pool_status output captured during the “WOS Full; failover to DIRECT” messages.
  • Hi, since changing to 4G there have been no more messages like this. The right moveout and mergeout settings depend heavily on the load type; for my load scenario I find 4G for wosdata more useful.
  • Hi, what are your moveout and mergeout intervals? select * from configuration_parameters where parameter_name ilike '%outinterval%'; Eugenia
  • MergeOutInterval = 300, MoveOutInterval = 100
  • Hi, of course if you increase it to 4GB the message will stop, but be aware that if for some reason Vertica ends abruptly or the machine powers off, you can lose up to 4GB of data. In addition, you have 4GB of memory held for the WOS and not for your queries. The change is OK, but I want you to know the implications so you can make an informed decision. Eugenia
  • Thanks, I understand that, but the loss should only occur in the case of a single-node implementation.
  • Well, if you lose power to one node in a cluster (and you have at least K=1 safety, which is the default), then you're correct that you won't lose data. But when was the last time you lost power to just one machine in a rack?
  • Sure. Anyway, you do not provide a solution for the data loss, only advice on how to minimize it; it looks like something that should be taken into consideration. Thanks a lot.
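Pulling the thread together, the knobs discussed above can be inspected and changed roughly as follows. This is a sketch only; the parameter names and trade-offs are those mentioned in the thread, but verify the syntax against your Vertica version's documentation before running anything:

```sql
-- Current Tuple Mover intervals (in seconds).
SELECT parameter_name, current_value
FROM configuration_parameters
WHERE parameter_name ILIKE '%outinterval%';

-- Move data out of the WOS more often, and merge the resulting
-- ROS containers aggressively enough to avoid ROS pushback.
SELECT SET_CONFIG_PARAMETER('MoveOutInterval', 100);
SELECT SET_CONFIG_PARAMETER('MergeOutInterval', 300);

-- Alternatively, grow the wosdata pool to 4G, accepting up to 4G
-- of data at risk on an abrupt shutdown (the trade-off noted above).
ALTER RESOURCE POOL wosdata MAXMEMORYSIZE '4G';
```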
