

Insufficient resources to execute plan on pool general [Timedout waiting for resource request: Request exceeds limits: Memory(KB) Exceeded]

We recently upgraded our Vertica database server from 6.4 to Vertica Analytic Database v7.1.0-1. We use Talend, and since the upgrade we have had Talend jobs failing on our Vertica server due to memory issues. The most reliable repro is a daily job which previously always ran successfully in less than 4 minutes, and which now times out and dies. The error in the db logs:

    2014-09-19 10:56:15.533 Init Session:0x2b3e70013220-a000000742bfd5 @v_pdw_node0001: 53000/3587:
    Insufficient resources to execute plan on pool general [Timedout waiting for resource request:
    Request exceeds limits: Memory(KB) Exceeded: Requested = 4288595, Free = 118927
    (Limit = 19981816, Used = 19862889) (queueing threshold)]
    LOCATION: Exec_compilePlan, /scratch_a/release/vbuild/vertica/Dist/Dist.cpp:1217
    2014-09-19 10:56:16.263 Init Session:0x2b3e701a4390-a000000742bfdf [Txn] Starting Commit: Txn: a000000742bfdf

The query:

    select atl.id, atl.APP_DOMAIN, atl.CREATE_DATE, atl.TRANS_ID, atl.SOURCE_PARTNER,
           atl.LAST_NAME, atl.FIRST_NAME, bcf.LP_ID LOYALTY_PROFILE_ID, atl.MEMBER_ID,
           bcf.TX_ID ORDER_ID, atl.AMOUNT, atl.BATCH_ID, atl.BATCH_KEY
    from "STAGING".JPMC_ATL_VW atl
    left outer join "STAGING".JPMC_BCF_VW bcf
           on atl.APP_DOMAIN = bcf.APP_DOMAIN and atl.TRANS_ID = bcf.MINOR_KEY
    where atl.BATCH_KEY > 12303 or bcf.BATCH_KEY > 12303;

Some interesting notes:
• We have a single node with 28GB RAM.
• The identical job runs on our Staging Vertica server (also Vertica v7.1.0-1, the same as production in terms of data volume, single node with less memory) with no problems.
• Execution of the identical job through Talend Studio succeeds – Vertica is happy enough to run the query and return the data when run in that mode.
• Talend version is 5.4.1.

Things that we have tried that have not made any difference:
• Increased the queuetimeout of the general resource pool from 600 to 900 to 1200. Performance only degraded.
• Increased the maximum memory maps configuration of the db server.
• Modified plannedconcurrency to AUTO.
• Created new projections for the tables the query is using.

Our questions:
• Has anyone seen this sort of memory issue come up after upgrading to Vertica v7.1.0?
• How can we resolve this? What else should we try?

Thanks in advance for your help.
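For readers hitting the same error: the numbers inside the message can be read directly against the pool state. A sketch of the check, assuming the standard Vertica 7.x `resource_pool_status` system table (column names may vary slightly by version):

```sql
-- The error says: Requested = 4288595 KB (~4.1 GB) but Free = 118927 KB,
-- i.e. the general pool is already almost at its 19981816 KB queueing threshold.
-- Inspect the pool's limits and per-query budget to confirm:
SELECT pool_name,
       max_memory_size_kb,      -- hard limit for the pool
       memory_inuse_kb,         -- currently reserved
       queueing_threshold_kb,   -- requests queue once usage passes this
       query_budget_kb          -- memory planned per query
FROM   resource_pool_status
WHERE  pool_name = 'general';
```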

Comments

  • Prasanta_Pal - Employee
    You may check how many queries are running / memory in use etc in the pools

    select node_name, pool_name, memory_inuse_kb, running_query_count from resource_pool_status;

    is it the same pool where the query used to run?

  • How will I know which pool the query uses?

        node_name      | pool_name | memory_inuse_kb | running_query_count
        ---------------+-----------+-----------------+--------------------
        v_pdw_node0001 | general   |          209152 |                   0
        v_pdw_node0001 | sysquery  |               0 |                   0
        v_pdw_node0001 | sysdata   |          102400 |                   0
        v_pdw_node0001 | wosdata   |               0 |                   0
        v_pdw_node0001 | tm        |               0 |                   0
        v_pdw_node0001 | refresh   |               0 |                   0
        v_pdw_node0001 | recovery  |               0 |                   0
        v_pdw_node0001 | dbd       |               0 |                   0
        v_pdw_node0001 | jvm       |               0 |                   0
  • Prasanta_Pal - Employee
    select * from query_request where request ilike '%<part of your query>%';
    Get the transaction_id and statement_id

    dbadmin=> select node_name, request_type, pool_name, memory_kb, queries, succeeded, failing_resource from dc_resource_acquisitions where transaction_id = <> and statement_id = <>;

    Also check query_budget_kb in all the pools (especially the pool where you are running the query).

    select * from sessions; should also give some idea of how many sessions are running.

    Check the other columns in resource_pool_status as well.

    Try lowering planned_concurrency in the resource_pools table for the general pool where you are running the query. This should allow a larger memory budget per query.
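A sketch of the change described above, assuming the standard ALTER RESOURCE POOL syntax (the value 4 is only an example; tune it for your workload):

```sql
-- Lower PLANNEDCONCURRENCY so each query is budgeted a larger slice of the pool
-- (the per-query budget is roughly the queueing threshold divided by the
-- planned concurrency).
ALTER RESOURCE POOL general PLANNEDCONCURRENCY 4;

-- Confirm the new setting and the resulting per-query budget:
SELECT name, plannedconcurrency FROM resource_pools WHERE name = 'general';
SELECT pool_name, query_budget_kb FROM resource_pool_status WHERE pool_name = 'general';
```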





  • When the ETL with the query is running:

        node_name      | pool_name | memory_inuse_kb | running_query_count
        ---------------+-----------+-----------------+--------------------
        v_pdw_node0001 | general   |        19862889 |                   1
        v_pdw_node0001 | sysquery  |               0 |                   0
        v_pdw_node0001 | sysdata   |          102400 |                   0
        v_pdw_node0001 | wosdata   |               0 |                   0
        v_pdw_node0001 | tm        |               0 |                   0
        v_pdw_node0001 | refresh   |               0 |                   0
        v_pdw_node0001 | recovery  |               0 |                   0
        v_pdw_node0001 | dbd       |               0 |                   0
        v_pdw_node0001 | jvm       |               0 |                   0
  • Prasanta_Pal - Employee
    19GB is in use during that time
  • I have lowered the planned_concurrency to 2 and the ETL still failed.

        node           | request_type      | pool    | memory_kb | queries | succeeded | failing_resource
        ---------------+-------------------+---------+-----------+---------+-----------+-----------------
        v_pdw_node0001 | Acquire           | general |    102400 |       0 | t         |
        v_pdw_node0001 | Reserve           | general |   9986812 |       1 | t         |
        v_pdw_node0001 | AcquireAdditional | general |  11377853 |       1 | t         |
        v_pdw_node0001 | AcquireAdditional | general |  14159934 |       1 | t         |
        v_pdw_node0001 | AcquireAdditional | general |  19724095 |       1 | t         |
        v_pdw_node0001 | AcquireAdditional | general |  19857216 |       1 | t         |
        v_pdw_node0001 | AcquireAdditional | general |  20123457 |       1 | t         |
        v_pdw_node0001 | AcquireAdditional | general |  20655938 |       1 | t         |
        v_pdw_node0001 | AcquireAdditional | general |  21720899 |       1 | f         | Memory(KB)
        v_pdw_node0001 | AcquireAdditional | general |  21720899 |       1 | f         | Memory(KB)
  • Hi,
    It should be simpler than that.

    Check and compare the execution plans between Staging and the current (production) server.
    Do you gather statistics?
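A sketch of that comparison, using the view names from the query in the original post (ANALYZE_STATISTICS with an empty string refreshes statistics for all tables; you can also target the base tables behind the views):

```sql
-- Refresh optimizer statistics, then compare plans between Staging and production.
SELECT ANALYZE_STATISTICS('');

EXPLAIN
SELECT atl.id, bcf.LP_ID
FROM "STAGING".JPMC_ATL_VW atl
LEFT OUTER JOIN "STAGING".JPMC_BCF_VW bcf
       ON atl.APP_DOMAIN = bcf.APP_DOMAIN
      AND atl.TRANS_ID = bcf.MINOR_KEY
WHERE atl.BATCH_KEY > 12303 OR bcf.BATCH_KEY > 12303;
```

Run the EXPLAIN on both servers and diff the output; a statistics warning in the plan, or a different join order, would go a long way toward explaining the memory gap.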
