Tuning Vertica query performance during mergeout

minatverma · April 2018

Hi,

This query is regarding mergeout and its impact on query performance.

What I have found out:
1. The mergeout happens in strata levels (smaller to bigger until 1 ROS per partition/table is reached)
2. Lower strata mergeout happens rather quickly, then higher strata mergeout.
3. Increasing the plannedcocurrency of tm pool, only affects mergeout of lower strata.

Below are the questions I have,
1. Suppose if there are 4 strata layers, until which strata layer multiple threads of mergeout can work? Does only 1 thread operate on the uppermost strata layer?
2. Does mergeout happen in WOS? If it does, is it a good idea to make the frequency of mergeout more than moveout?
3. If I have set both MergeOutInterval = 600 and MoveOutSizePct='20%', which variable has higher precedence?
4. What is the difference between ROSPerStratum and MaxROSPerStratum? Will MaxROSPerStratum override ROSPerStratum if I have set MaxMrgOutROSSizeMB to a predefined value?
6. If my query answers are coming from multiple small ROS containers which are to be merged, will increasing executionparallelism (for the analytical query pool) help in any way?
5. If I have defined MaxMrgOutROSSizeMB to a particular value, can I manually merge a projection to get a bigger ROS Container?
For example when I archived the older partition with may small ROS, and that point I am willing to consolidate the bigger containers into one.

Lastly, the reason for all these questions above is that,
I have a very large table(200 columns, billion rows per week, 5-6 weeks)
with partition size of around 10GB per partition for the super-projection.
And real-time loading from Kafka write a batch of around 20-30MB/batch (1 or 2 batches in 1 minute)
After 20-26 hours when mini-ROS are consolidated to larger ROS, then the performance of queries(100 queries per minute) tanks,
so much so that the application SLA is breached.

Please feel free to correct me where I am wrong.
Any advice for performance improvement will be helpful.

Let me know for any clarifications.

Below is a very related question with my scenarios, unanswered in stackexchange.

How can I reduce the Tuple Mover's disk I/O impact for mergeouts?

Thanks,
Minat

We're Moving!

Create My New Community Account Now

Tuning Vertica query performance during mergeout

How can I reduce the Tuple Mover's disk I/O impact for mergeouts?

Leave a Comment