Currently we are looking to assign 80% of the cluster's resources to a single (exclusive) pool, but for CPU it has not been possible to make the assignment exclusive, so requests are queued. How can CPUs be assigned exclusively?
Max concurrency is based on the single-node core count. If you run EXPLAIN on the query, the plan should indicate that it runs on all nodes, taking at least 1 core per node.
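For example (table and column names here are hypothetical, just to illustrate reading the plan):

```sql
-- The plan output lists the participating nodes; a query that appears
-- on every node takes at least one core per node while it runs.
EXPLAIN SELECT customer_id, COUNT(*)
FROM sales
GROUP BY customer_id;
```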
I'm not really sure why you'd want to exclusively lock CPU to a specific resource pool, but you can do that with the CPU affinity parameters in the resource pool definition: set CPUAFFINITYMODE to EXCLUSIVE. CPUs can also be shared between pools. Most clients don't do this because it's pretty edge-case stuff.
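As a sketch, pinning a pool to a dedicated CPU set looks like this (the pool name and CPU range are illustrative, not from this thread):

```sql
-- Pin a hypothetical pool to CPUs 0-7 on each node.
-- EXCLUSIVE means no other pool, including GENERAL, may use those CPUs;
-- SHARED lets other pools use them too; ANY (the default) imposes no affinity.
ALTER RESOURCE POOL reporting_pool
    CPUAFFINITYSET '0-7'
    CPUAFFINITYMODE EXCLUSIVE;
```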
Thank you very much, dear Curtis. The question arises because I'm confused about the difference between SHARED and ANY in the CPUAFFINITYMODE parameter, since sometimes heavy tasks consume the CPU, and when short tasks need to run there are no resources left for them.
You could set 80% of CPU to a specific resource pool by setting CPUAFFINITYSET to 80%. Are you sure that the issue is lack of CPU? Queries are queued if there are no concurrency slots in the target pool, which may or may not be caused by lack of CPUs. It's more likely that you want to tune PLANNEDCONCURRENCY and MAXCONCURRENCY to match the expected workload and allocate memory to match. You might also consider setting a cascading pool to move long running or high memory queries.
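A sketch of those options (pool names, percentages, and the runtime cap are illustrative):

```sql
-- Reserve 80% of each node's CPUs for a pool (percentage form of CPUAFFINITYSET):
ALTER RESOURCE POOL app_pool
    CPUAFFINITYSET '80%'
    CPUAFFINITYMODE EXCLUSIVE;

-- More commonly, tune concurrency to match the expected workload instead:
ALTER RESOURCE POOL app_pool
    PLANNEDCONCURRENCY 10
    MAXCONCURRENCY 20;

-- And cascade long-running queries to a secondary pool after a runtime cap:
ALTER RESOURCE POOL app_pool
    RUNTIMECAP '5 minutes'
    CASCADE TO batch_pool;
```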
Can you share resource pool configuration (SELECT * FROM RESOURCE_POOLS) as well as resource usage when requests are queued (SELECT * FROM RESOURCE_POOL_STATUS)?
Or, open a support case and submit a scrutinize so we can review all system metrics.
Thanks a lot Bryan:
Here is the resource pool config:
Basically, what we are looking for is to guarantee the response times of queries from a web application and from BI tools such as Tableau, as opposed to the batch processing jobs.
I recommend allocating memory to a pool to ensure queries will always run, and also setting planned concurrency and max concurrency to ensure queries receive enough resources. Please see the "CEO query" example at https://docs.vertica.com/12.0.x/en/admin/managing-db/managing-workloads/workload-best-practices/managing-workloads-with-resource-pools-and-user-profiles/ceo-query/
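A minimal sketch of that pattern (pool name, user name, and sizes are illustrative, not taken from your configuration):

```sql
-- Dedicated pool so BI queries always have reserved memory available:
CREATE RESOURCE POOL bi_pool
    MEMORYSIZE '4G'          -- memory reserved up front for this pool
    PLANNEDCONCURRENCY 4     -- expected number of concurrent BI queries
    MAXCONCURRENCY 8;        -- hard cap before requests queue

-- Route the Tableau/web-application user to the reserved pool:
ALTER USER tableau_user RESOURCE POOL bi_pool;
```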
Please also check the query budget listed in resource_pool_status and adjust memory size and concurrency such that each query will receive a budget suggested by PROFILE for query memory usage. There is more information about query budget in resource planning at https://docs.vertica.com/12.0.x/en/admin/managing-db/managing-workloads/resource-pool-architecture/query-budgeting/
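For example, to compare the configured budget against actual usage (the profiled query is hypothetical):

```sql
-- Per-query budget each pool currently computes, per node:
SELECT node_name, pool_name, query_budget_kb
FROM resource_pool_status
ORDER BY node_name, pool_name;

-- Profile a representative query, then compare its reported memory
-- usage against the budget above:
PROFILE SELECT COUNT(*) FROM big_table;
```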
Why is general pool limited to 40%? This likely constrains all other pools, since they borrow from general. Assigning memory to a specific pool also reduces overall memory available since memory that is reserved by MEMORYSIZE can't be borrowed again.
Thanks very much, Bryan, I'll try the CEO option. But how can I assign CPU exclusively, like memory? Is it better to assign SHARED, or to change to ANY? Best regards.
I'm not sure CPU affinity has much effect for current releases. If there is a requirement to have reserved compute, I would ensure that planned concurrency and max concurrency in other pools are set to allow at least 1-2 cores available for the Tableau pool. In this case, assume that 1 concurrency = 1 core. So to ensure there are concurrency slots available for Tableau, sum of max concurrency in other pools should be less than the core count. It's also possible to run some historic queries to determine actual concurrency of other pools to find out whether it is possible to reduce max concurrency of other pools.
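A sketch of that arithmetic (pool names and values are illustrative): with 128 cores per node and 1 concurrency slot ≈ 1 core, keeping the sum of MAXCONCURRENCY across the other pools below 128 leaves slots free for Tableau.

```sql
-- Check current concurrency settings per pool:
SELECT name, plannedconcurrency, maxconcurrency
FROM resource_pools;

-- Lower MAXCONCURRENCY on a batch pool so cores remain for the Tableau pool,
-- and give the Tableau pool its own slots:
ALTER RESOURCE POOL batch_pool MAXCONCURRENCY 100;
ALTER RESOURCE POOL tableau_pool
    PLANNEDCONCURRENCY 8
    MAXCONCURRENCY 16;
```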
Thank you very much, dear Bryan, I appreciate your recommendation and will do it that way. Just one question about reviewing the query history: would that be the thread information in the v_monitor tables?
There are two queries I typically use.
For current resource usage:

select * from resource_pool_status order by node_name, pool_name;

For historic resource usage, the following query shows the number of running queries, total memory used, and how many minutes were observed at each concurrency level. This can help identify high concurrency in a pool, where concurrency is at the maximum or greater than the number of threads. It can also show when memory is the limit: for example, if concurrency increases but allocated memory does not, either more memory is needed, or some work should move to another resource pool by moving the user or setting a cascade resource pool for long-running or high memory/CPU jobs.

select pool_name, reserved_queries_max_value, max(reserved_memory_kb_max_value) as memoryKB, count(*) as minutes from dc_resource_pool_status_by_minute group by 1,2 order by 1 asc, 2 desc;
You can also look at query_consumption table for resource use by a single query.
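For example, something like the following (column names assumed from the query_consumption documentation) surfaces the heaviest recent queries:

```sql
-- Top queries by peak memory, with CPU time and duration for context:
SELECT transaction_id, statement_id, resource_pool,
       cpu_cycles_us, peak_memory_kb, duration_ms
FROM query_consumption
ORDER BY peak_memory_kb DESC
LIMIT 20;
```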
Thank you very much, Bryan. One last question about "sum of max concurrency in other pools should be less than the core count": if my cluster has 128 cores per node, is the maximum 128 or 512 cores?
Thank you very much, dear Bryan, you helped me a lot. Just one remaining doubt: should EXECUTIONPARALLELISM be equal to MAXCONCURRENCY, or can it vary with the physical number of cores? What would be the best practice?