Vertica Threads
scottpedersoli
Vertica Customer ✭
Does anybody know of any documentation/posting that explains how Vertica Threads work?
I see queries run using 1000’s of threads. Trying to understand things like how does Vertica determine the number of threads needed to execute a query and does a high number of threads indicate anything specific. I read that you use “EXECUTIONPARALLELISM to limit the number of threads used to process any single query issued in this resource pool.” We have users in pool that is set to EXECUTIONPARALLELISM =32 but then I see in Management Console those queries using more than 32.
0
Comments
The EXECUTIONPARALLELISM controls the number of threads that a single operator in the query plan may use. For example, if your query is a GROUP BY and EXECUTIONPARALLELISM is set to 8, then the Scan operator that reads data from the projection would use 8 threads, and the GROUP BY operator would use 8 threads. If the "query budget" happened to be very low - as can be the case in misconfigured resource pools (see RESOURCE_POOL_STATUS) - then there might not be enough memory for the operators to actually fan out to that number of threads.
The actual number of threads can be seen in the EXECUTION_ENGINE_PROFILES data for the executed query. For example looking at the rows with counter_name "rows produced", each row in the system table for that counter_name represents a thread. For the example above, you'd see 8 Scan operators on each node and 8 Group By Hash/Pipeline operators on each node.
Btw it's worth doing some performance testing to see if EXECUTIONPARALLELISM of 32 is actually faster for your use case than 8 or 16. In my experience it's often not, unless the queries are very CPU-expensive.
--Sharon
Thanks Sharon, I appreciate the response!