data collector retention
I'm seeing some odd or possibly undocumented behavior from the data collector retention policy.
We have our policy set at 50,000 KB with no time limit on the RequestsIssued, RequestsCompleted, Errors, and ResourceAcquisitions components, which together determine the overall retention for v_monitor.query_requests.
I'm doing some analysis over the last few months and have data for all users over that time period except for one. That user is a service account that issues far more queries than any of the others. The history for that user only has data for the last few weeks.
Sometime in the past one of our admins had set up a dump of query_requests to another table at regular intervals, maybe because the data collector had been seen as unreliable in the past, but I couldn't confirm the reason. Anyway, I have at least some of the historical data I need for that user because of that mechanism.
My question:
Is there some sort of undocumented selection process for purging data collector data? Maybe the system is purging data at the user level when the max size is reached? I would think that the oldest data would roll off regardless of any criteria, but that doesn't seem to be what I'm experiencing.
Comments
Oldest data rolls off first - there aren't any exceptions to that. Is the service account connecting to a specific node or a subset of the nodes? If this one account is executing the most queries, and the workload isn't distributed across all of the nodes, then the nodes that the service account connects to will have far more queries than the other nodes and those nodes will then roll off older data earlier than the other nodes. You can check this by querying the DATA_COLLECTOR table ordered by first_time to see the differences across the nodes.
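To make the per-node effect concrete, here is a minimal simulation sketch. It is not Vertica code: the node names, users, query counts, and the record-count cap (standing in for the KB limit) are all made-up assumptions for illustration.

```python
from collections import deque

def build_events(days, per_day):
    """Chronological (day, user) events; per_day maps user -> queries per day."""
    events = []
    for day in range(days):
        for user, count in per_day.items():
            events.extend((day, user) for _ in range(count))
    return events

def simulate_node(events, max_records):
    """Keep only the newest max_records events, dropping the oldest first."""
    buffer = deque(maxlen=max_records)
    for event in events:
        buffer.append(event)  # a full deque discards its oldest entry
    return list(buffer)

def oldest_day_per_user(retained):
    """Earliest day still retained for each user."""
    oldest = {}
    for day, user in retained:
        oldest[user] = min(day, oldest.get(user, day))
    return oldest

DAYS = 90   # hypothetical observation window
CAP = 500   # hypothetical per-node cap, in records rather than KB

# Assumption: the service account only ever connects to node_a.
node_a = simulate_node(build_events(DAYS, {"svc": 50}), CAP)
node_b = simulate_node(build_events(DAYS, {"alice": 2, "bob": 3}), CAP)

print(oldest_day_per_user(node_a))  # {'svc': 80}
print(oldest_day_per_user(node_b))  # {'alice': 0, 'bob': 0}
```

Both nodes apply the same oldest-first rule with the same cap, yet the busy node only reaches back 10 days while the quiet node still holds all 90. Nothing is purged per user; the service account just happens to live on the node that fills up fastest.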
If you want to keep a minimum time duration of data, set that as the time-based policy, with a very generous disk space policy that would never get hit, so that you'll always reach the time limit first rather than the disk space limit.
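As a rough way to pick a disk limit that is "generous enough", the arithmetic can be sketched like this. The growth rates, the 90-day window, and the 2x headroom factor are hypothetical, and the sketch assumes the same limit is applied to every node, so it sizes for the busiest one.

```python
import math

def min_disk_kb(kb_per_day, retention_days, headroom=2.0):
    """Disk limit (KB) large enough that the time policy expires data first."""
    if kb_per_day < 0 or retention_days <= 0 or headroom < 1:
        raise ValueError("need kb_per_day >= 0, retention_days > 0, headroom >= 1")
    return math.ceil(kb_per_day * retention_days * headroom)

# Hypothetical per-node growth rates (KB/day) for one component.
rates = {"node01": 4000.0, "node02": 250.0}

# Size for the busiest node, since the limit applies to each node separately.
needed_kb = max(min_disk_kb(rate, retention_days=90) for rate in rates.values())
print(needed_kb)  # 720000
```

With these made-up numbers the quiet node would need only 45,000 KB, but the busy node needs 720,000 KB, which is why a limit sized for the average node can still cut the busy node's history short.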
--Sharon
I didn't even think about the fact that the retention policy is applied per node. I was thinking about the whole cluster. If that is the case, is the size set in the retention policy also per node? Would specifying 100k use that much space on each node, or would it be divided among all nodes?
Yes - the size is per node.
--Sharon