Slow query execution
Hi.
Currently, we are building a Vertica server with the following configuration
Number of servers: 3
Number of CPU cores per server: 16
It uses only about four cores per server when performing complex queries, and
If you look at v_monitor.cpu_usage, it is only using about 20 percent of the peak time.
To use more CPU resources during query execution to reduce execution time, you can use
What should I do?
If this is possible by changing the settings, please tell me how to do it.
(The query itself is very inefficient.
Since the creation of the query is done by the user, query modification is considered a last resort)
Tagged:
0
Comments
P.S. Memory per server:192GB
1) To optimize performance, begin by running Vertica Database Designer.
See: https://www.vertica.com/docs/10.0.x/HTML/Content/Authoring/AnalyzingData/Optimizations/InitialProcessForImprovingQueryPerformance.htm
2) You can use Directed Queries to redirect an input of an existing query to a query that uses different semantics - for example, map a join query to a SELECT statement that queries a flattened table.
See: https://www.vertica.com/docs/10.0.x/HTML/Content/Authoring/AdministratorsGuide/QueryManagement/DirectedQueries/DirectedQueries.htm
3) You can configure the built-in GENERAL pool based on actual concurrency and performance requirements, or create custom Resource Pools to handle various classes of workloads and optionally restrict user requests to your custom pools.
See: https://www.vertica.com/docs/10.0.x/HTML/Content/Authoring/AdministratorsGuide/ManageWorkloads/ResourcePoolArchitecture.htm
Dear @mosheg
Thank you for your answer.
We've reviewed the three that were presented.
The resource pool settings are initial and the RESOURCE_POOLS.PLANNEDCONCURRENCY of "general" is set to AUTO.
Each node has 4 physical CPUs and 16 logical cores, in this case, can we expect an improvement by setting PLANNEDCONCURRENCY to 16?
(I was not able to see how many cores Vertica determines to have when the value is set to AUTO).
Yes.
PLANNEDCONCURRENCY = The number of concurrent queries you expect to run against the resource pool, an integer ≥ 4.
If set to AUTO (default), Vertica automatically sets PLANNEDCONCURRENCY at query runtime,
choosing the lower of these two values:
Number of cores
Memory/2GB
In systems with a large number of cores, the default AUTO setting of PLANNEDCONCURRENCY is liable to be too low.
Try to set the parameter to the actual number of cores:
In your case, ALTER RESOURCE POOL general PLANNEDCONCURRENCY 16;
See: https://www.vertica.com/docs/10.0.x/HTML/Content/Authoring/SQLReferenceManual/Statements/Built-inPoolConfiguration.htm
And https://www.vertica.com/kb/BestPracticesforManagingResourcePools/Content/BestPractices/BestPracticesforManagingResourcePools.htm
@mosheg
I found out why it was not performing as expected.
It was very rudimentary, but the data had not been changed when the PoC was conducted previously.
In contrast, there are now a large number of daily data updates, but the statistics had not been re-fetched.
By modifying the statistics to be retrieved every time, the performance is now as good as it was during PoC.
I added a note to prevent anyone else from getting stuck in the same situation.