Enable Join Spill

Hi, As far as I know, by default the join spill is enabled. I am using Vertica 6.1.2. But still I get the error " Join doesn't fit in memory". My question is if it doesn't fit in the memory then it should spill to disk automatically. But when I explicitly define the enable then the query runs: select /* add_vertica_options('EE','Enable_Join_Spill')*/ (Query)..... How to set the Join Spill globally so that if the query doesn't fit in the memory, it should spill to disk? Regards, Abhi

Comments

  • Prasanta_PalPrasanta_Pal - Select Field - Employee
    select set_vertica_options('EE','ENABLE_JOIN_SPILL'); set_vertica_options -------------------------------------------------------------- EE Vertica Options -------------------- ENABLE_JOIN_SPILL
  • Thanks Prasanta !!!! I tried with said SQL select set_vertica_options('EE','ENABLE_JOIN_SPILL'). But the same error message I get."Join didn't fit in memory" Is there any other way out? Regards, Abhi
  • Prasanta_PalPrasanta_Pal - Select Field - Employee
    Hi Abhi, Can you try the below options? Setting this option from a SQL client does not work. Try inline option i.e. build the query with setting or by setting DB wise. examples below: 1. Enable join spill by trying one or more of the following: * To enable join spilling for a session: => SELECT add_vertica_options('EE', 'ENABLE_JOIN_SPILL'); * To disable join spilling for a session: => SELECT clr_vertica_options('EE', 'ENABLE_JOIN_SPILL'); * To enable join spilling for a single query, use hint syntax: => SELECT /*+set_vertica_options(EE, ENABLE_JOIN_SPILL)*/ ; Only one hint can be used in a statement. For example, in an "INSERT /*+direct*/ SELECT ..." query where the select list includes a join that requires ENABLE_JOIN_SPILL, set ENABLE_JOIN_SPILL with the following: => SELECT add_vertica_options('EE', 'ENABLE_JOIN_SPILL'); and => SELECT clr_vertica_options('EE', 'ENABLE_JOIN_SPILL') * To globally enable this option for all new sessions started in the database: => SELECT set_config_parameter('EEVerticaOptions','ENABLE_JOIN_SPILL'); IMPORTANT: Setting ENABLE_JOIN_SPILL globally or leaving it on can degrade performance for subsequent queries, especially for queries that do not need it. Thus, Vertica recommends that you enable the join spill mechanism only when needed and then disable it immediately after the query successfully runs. 2. Re-execute the query.
  • Thanks a lot !!!! It really worked for me. Regards, Abhi
  • Hi Prasanta, Is there any way to find out what are the queries with highest memory consumption executed on system on a particular day? Can you provide me the SQL? I tried with many system tables but in vain. Thanks in advance !!!!
  • Prasanta_PalPrasanta_Pal - Select Field - Employee
    dbadmin=> select node_name, transaction_id, statement_id, memory_inuse_kb, duration_ms from RESOURCE_ACQUISITIONS order by memory_inuse_kb desc; get the query details from the belwo query by entering the transaction_id nd statement_id from the above. dbadmin=> select * from dc_requests_issued where transaction_id = <> and statement_id = <>;
  • Thanks !!! I tried with the following SQL also: select user_name, session_id, statement_start, sum_memory_in_use_kb, sum_thread_count,current_statement from sessions a, (select transaction_id, statement_id, sum(memory_inuse_kb) sum_memory_in_use_kb, sum(thread_count) sum_thread_count from resource_acquisitions where is_executing = 'true' group by 1,2) subn where a.transaction_id = subn.transaction_id and a.statement_id = subn.statement_id;
  • Hi Prasanta,

    I want to know resource pool memory allocation behaviour.

    Lets say I have three users A, Band C. I have created three user defined pools and allocated maxmemory 80% to each. Now what happens when users A and B are executing their task but C is idle. Does the memory allocated to C is reserved or it will be used by A and B if required.

    Is it a good approach to define separate pool for every users? If not, then how to overcome this?
    Suppose the query run by ETL sometimes fail with the error "Insufficient resource to execute on general pool". Next time they run, it gets executed. So I am bit surprised and confused.

    Thanks in advance !!!!

    Regards,
    Abhishek
  • Thanks, this option is great but how to turn it off globaly?

    SELECT set_config_parameter('EEVerticaOptions','ENABLE_JOIN_SPILL');

  • The post doesn't have the "+" at the beginning. It should be:

    select /*+add_vertica_options('EE','Enable_Join_Spill')*/ (Query)...
  • Came across this old conversation and wanted to comment.

      => SELECT set_config_parameter('EEVerticaOptions','ENABLE_JOIN_SPILL'); 

    This isn't usually a desirable thing to do - if you do this, then every single hash join will be forced into using the often-slower join spill method.  It would be far better to enable join spill on a specific query using a hint or session-specific option setting.  

      --Sharon


  • Hi!

    My 5 cents: its very important to install last fixes of Vertica:
    VER-30973 (Optimizer) Inconsistent results could occur with spilled joins. The inconsistencies occurred when the outer join input was sorted with a different column sequence ordering from the join key columns. This issue has been resolved.
  • We are on Vertica 7.0.1. When running a query, we observed spikes in disk usage. We suspected join spilled to disk. But we did not enable join spills. I thought join spills are disabled by default. Is this still the case in 7.0.1?

    Can we query any of the system tables to see if join spills are occurring? And how much it spills to disk?
  • >> Can we query any of the system tables to see if join spills are occurring? And how much it spills to disk?
    How to profile a single statement
    ...
    How to Find Out Which Counters are Available

    HP Vertica keeps track of the following counters during query execution:

    => SELECT DISTINCT(counter_name) FROM EXECUTION_ENGINE_PROFILES;
    counter_name
    -----------------------------------------------------
    estimated rows produced
    bytes spilled
    ...


  • Abhisek:

    Unless you have specified a resource pool parameter "MEMORYSIZE", none of those 3 users will hog up the memory. 

    I'm not entirely sure if it is a good approach to define separate pool for each user since I have no context. But, it is going to become more and more tedious to manage so many users. I recommend looking into HP Vertica Resource Manager documentation to find what your needs are. 

    -Bohyun

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file