
How to check slow queries in Vertica

A few read operations are taking more time than expected. How do I analyze this, and which command fetches the slow queries?

Answers

  • Top 50 queries that have been executing for more than 30 minutes.
    A query running for an abnormally long duration can consume and hold resources, which can impact other queries.
    Can you try the query below?

    select /*+label(LongRunningQueries)*/
           s.node_name, s.user_name, transaction_id, statement_id,
           DATEDIFF('MINUTE', s.statement_start, ss.max_timestamp) query_runtime_minutes,
           s.current_statement::varchar(150) || '...'
    from sessions s,
         (select max(login_timestamp) max_timestamp from sessions) ss
    where statement_id is not null
      and DATEDIFF('MINUTE', s.statement_start, ss.max_timestamp) > 30
    order by 6 desc limit 50;

  • Try this query without the label.

    select s.node_name, s.user_name, transaction_id, statement_id,
           DATEDIFF('MINUTE', s.statement_start, ss.max_timestamp) query_runtime_minutes,
           s.current_statement::varchar(150) || '...'
    from sessions s,
         (select max(login_timestamp) max_timestamp from sessions) ss
    where statement_id is not null
      and DATEDIFF('MINUTE', s.statement_start, ss.max_timestamp) > 30
    order by 6 desc limit 50;

  • Can you show how you are reading a file?

  • From our application, we are triggering select queries with dynamic WHERE conditions.

    The above query does not return any records, even after I changed the value from 30 to 1.

  • mosheg Employee
    edited January 12

    In addition, see this: https://www.vertica.com/blog/vsql-visual-storytelling-through-query-language/
    Or try something like the following query to find Top10 long running queries in the last 6 hours:

    SELECT QC.TRANSACTION_ID,QC.STATEMENT_ID, QC.CPU_CYCLES_US, QC.DURATION_MS, LEFT(QR.REQUEST,30) QUERY
    FROM QUERY_CONSUMPTION QC, QUERY_REQUESTS QR
    WHERE QC.request_type='QUERY' AND
          QC.success AND
          ( NOW() - QC.END_TIME) < '6 hour' AND
          QC.TRANSACTION_ID = QR.TRANSACTION_ID AND
          QC.STATEMENT_ID = QR.STATEMENT_ID
    ORDER BY 3 DESC,4
    LIMIT 10;
    
      TRANSACTION_ID   | STATEMENT_ID | CPU_CYCLES_US | DURATION_MS |             QUERY
    -------------------+--------------+---------------+-------------+--------------------------------
     45035996274039564 |           40 |         58233 |          64 | SELECT 'tests.mogo_campaigns.N
     45035996274039646 |           38 |         36983 |          43 | SELECT 'tests.mogo_campaigns.N
     45035996274039792 |            4 |         31320 |          72 | SELECT QC.TRANSACTION_ID,QC.ST
     45035996274039792 |            5 |         28960 |          69 | SELECT QC.TRANSACTION_ID,QC.ST
     45035996274039731 |           40 |         27826 |          33 | SELECT 'tests.mogo_campaigns.N
     45035996274039792 |            3 |         18350 |          69 | SELECT transaction_id,statemen
     45035996274039792 |            2 |          8766 |          23 | Select * from dc_optimizer_sta
     45035996274039564 |           47 |          8711 |          13 | SELECT 'tests.mogo_campaigns.i
     45035996274039731 |           47 |          8651 |          13 | SELECT 'tests.mogo_campaigns.i
     45035996274039646 |           45 |          8554 |          13 | SELECT 'tests.mogo_campaigns.i
    (10 rows)
    
    
    
  • @PavanVejju , as you mentioned that some of the read operations are taking longer, do you know which queries they are?
    If you want to know:
    [1] How long a query has been running, you can try the query below.
    select query_start::date, query_duration_us/1000000/60/60 from query_profiles where query ilike '%%';
    [2] If you want to know why those read operations are taking longer, and where, get the transaction ID and statement ID of the slow query and run them against the execution engine profiles table.

    [a] The query below shows the max execution time on each operator. If execution time is high on the Network operator, check the data sent and received over the network; a slow network can impact read and write performance.

    select node_name, path_id, operator_name, counter_name, max(counter_value)
    from execution_engine_profiles
    where counter_name ilike '%us%'
    and transaction_id = <transaction_id> and statement_id = <statement_id>
    group by 1,2,3,4 order by 5 desc;

    [3] Run the query below to check the number of bytes sent and received.

    select node_name, path_id, operator_name, counter_name, max(counter_value)
    from execution_engine_profiles
    where counter_name ilike '%byte%'
    and transaction_id = <transaction_id> and statement_id = <statement_id>
    group by 1,2,3,4 order by 5 desc;

    [4] Since you are seeing slowness on the read operator, check the rows produced by running the query below.

    select node_name, path_id, operator_name, counter_name, sum(counter_value)
    from execution_engine_profiles
    where counter_name ilike 'rows%produced'
    and transaction_id = <transaction_id> and statement_id = <statement_id>
    group by 1,2,3,4 order by 5 desc;

    If you need more help analyzing the query, we suggest opening a support case.

  • @PavanVejju
    You can also try something like the query below against query_profiles to find when a query started, how long it has been running, and whether it is still executing.

    The example below shows a delete query that took 28 hours to execute.

    select query_start::date, substr(query,0,30), floor(query_duration_us/1000000/60/60) as query_duration_hour, is_executing from query_profiles order by 3 desc limit 10;

     query_start |            substr             | query_duration_hour | is_executing
    -------------+-------------------------------+---------------------+--------------
     2021-01-08  | DELETE /*+ DIRECT */ FROM EPP |                  28 | f
     2021-01-10  | INSERT /*+DIRECT*/ INTO v_tem |                  17 | f
     2021-01-10  | CREATE LOCAL TEMP TABLE TAAA  |                  17 | f
     2021-01-10  | update TEMP_ABB               |                  14 | f
     2021-01-10  | INSERT /*+DIRECT*/ INTO PTT   |                  13 | f

    (10 rows)

  • Thank you @Nimmi_gupta / @mosheg for providing the queries.
    They helped me a lot in finding the slow queries and their timings.
    We recently upgraded the Vertica machine to 64 processors, but query performance is still slow.
    My query uses count(distinct a.x) with group by b.y, and I have added projections, but the fetch is still slow.

  • @mosheg
    I tried APPROXIMATE_COUNT_DISTINCT, but it does not return an accurate distinct count.

    I didn't use Database Designer to create an optimized projection; I will look into it and let you know.

    Thanks

  • Jim_Knicely Administrator

    For more accurate counts at the cost of performance, you can set the error tolerance to a minimum value of 0.88. Vertica imposes no maximum restriction, but any value greater than 5 is implemented with 5% error tolerance.

    See: https://www.vertica.com/docs/10.1.x/HTML/Content/Authoring/SQLReferenceManual/Functions/Aggregate/APPROXIMATE_COUNT_DISTINCT.htm
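    For illustration, a quick sketch of both forms; the table and column names here are placeholders, not taken from this thread:

    -- default error tolerance (1.25):
    SELECT APPROXIMATE_COUNT_DISTINCT(member_id) FROM my_schema.my_table;

    -- tighter tolerance (0.88): slower, but closer to the exact count
    SELECT APPROXIMATE_COUNT_DISTINCT(member_id, 0.88) FROM my_schema.my_table;

    If only an exact count will do, the approximate family is not the right tool, and the projection design discussed below matters more.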

  • @Jim_Knicely I tried this option as well, but my requirement is to get the exact count.
    @mosheg I have used Database Designer to create an optimized projection; it is still taking a long time.

  • @Jim_Knicely
    I tried this, but it still takes minutes, and we are expecting seconds.
    I don't know where I am going wrong.

  • Before I pump the front tire of my Gold Wing, I measure the pressure; I don't just kick it.

    Can you provide:

    • the query
    • for each table involved in the query, a SELECT EXPORT_OBJECTS('','<schema>.<table>',FALSE)
    • row counts for the tables involved

    I have a strong suspicion about what the reason could be, but I hate poking around and fighting just the symptoms. I need to find the root cause and fix it at the root.

  • Hi @marcothesane

    Can you please refer to this link: https://pastebin.com/BTQa9Gvw

  • Let's look at your query and the physical design you presented. Re-formatted and commented.

    SELECT 
        count(DISTINCT dms.member_id) AS unique_players -- varchar(80) , candidate for a big HASH table in memory
      , b.game AS game                                  -- varchar(500)   -- candidate for a 4.25 bigger HASH table in memory than above
           -- unless the optimizer manages a GROUPBY pushdown, grouping by game_id first, and then joining with games_info
    FROM pp_vertica.daily_member_summary dms
    LEFT JOIN pp_vertica.games_info b
      ON (  -- 3 longer VARCHARs - with no commonly sorted projections. bigger HASH table for join probably needed
              dms.game_id = b.game_id  -- varchar(100) IN "dms", int in "b" !                              -- type conversion for join
          AND dms.SERVER = b.SERVER    -- varchar(30)  in "dms", varchar(255) in "b"     -- type conversion for join
          AND dms.product = b.product  -- varchar(80)  in "dms", varchar(100) in "b"       -- type conversion for join
          )
    -- filter condition forces an unnecessary type conversion, and the range predicate could be replaced by an equi filter
    WHERE dms.summary >= '2020-12-01'::TIMESTAMPTZ(3) -- dms.summary is date, not timestamptz, and 10th in sort order
      AND dms.summary < '2021-01-01'::TIMESTAMPTZ(3)  -- < if date_month in daily_member_summary contains '2020-12-01'
                                                      -- use an equi predicate
    GROUP BY b.game; -- VARCHAR(500)
    
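    Putting just the date-predicate comments above into practice, a possible rewrite might look like the sketch below. This is an assumption that dms.summary is a DATE column, as the comments state; it only removes the per-row timestamptz conversion, and the join-type mismatches and projection sort orders still need DDL changes.

    SELECT
        count(DISTINCT dms.member_id) AS unique_players
      , b.game AS game
    FROM pp_vertica.daily_member_summary dms
    LEFT JOIN pp_vertica.games_info b
      ON  dms.game_id = b.game_id
      AND dms.SERVER  = b.SERVER
      AND dms.product = b.product
    -- compare DATE to DATE: no type conversion in the filter
    WHERE dms.summary >= '2020-12-01'::date
      AND dms.summary <  '2021-01-01'::date
    GROUP BY b.game;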

    This query, and the design of the tables it works on, could serve as a wonderful example for a lecture to intermediate SQL students.
    It contains several problematic traits that illustrate how to write queries, and how to create tables and projections, so that Vertica can do a good job at being performant.

    I suspect you are running into a HASH JOIN (due to the join condition and the non-matching sort order of the projections of the two tables in question) and a HASH GROUP BY, as the grouping column - game - is nowhere useful in a sort clause of a projection definition.
    Hash tables sit in memory and occupy, for each entry, the longest possible number of bytes. So, a VARCHAR(500) that contains 'Joe', by default, occupies 500 bytes, not 3. And this effect multiplies by the number of rows to use in that hash table. You will quickly run out of memory.

    Plus: the three columns used for the join are nowhere useful in an ORDER BY clause of a projection definition to enable a MERGE JOIN - Vertica's most efficient and parsimonious join, which is based on projections that are sorted the same way.
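    To illustrate the point, a projection pair sorted on the three join columns might look like the sketch below. The projection names are made up, and this assumes the column types have first been aligned between the two tables; otherwise the type conversions will still prevent a MERGE JOIN.

    -- both projections sorted (and segmented) the same way on the join columns,
    -- so the optimizer can choose a MERGE JOIN instead of building a hash table
    CREATE PROJECTION pp_vertica.daily_member_summary_join_p AS
    SELECT game_id, server, product, member_id, summary
    FROM pp_vertica.daily_member_summary
    ORDER BY game_id, server, product
    SEGMENTED BY hash(game_id) ALL NODES;

    CREATE PROJECTION pp_vertica.games_info_join_p AS
    SELECT game_id, server, product, game
    FROM pp_vertica.games_info
    ORDER BY game_id, server, product
    SEGMENTED BY hash(game_id) ALL NODES;

    SELECT REFRESH('pp_vertica.daily_member_summary, pp_vertica.games_info');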

    Can you run the command:

    EXPLAIN
    SELECT 
        count(DISTINCT dms.member_id) AS unique_players
      , b.game AS game
    FROM pp_vertica.daily_member_summary dms
    LEFT JOIN pp_vertica.games_info b
      ON (
              dms.game_id = b.game_id
          AND dms.SERVER = b.SERVER
          AND dms.product = b.product 
          )
    WHERE dms.summary >= '2020-12-01'::TIMESTAMPTZ(3)
      AND dms.summary < '2021-01-01'::TIMESTAMPTZ(3)
    GROUP BY b.game;
    

    for me?

  • @marcothesane
    Thanks for your valuable inputs.
    I have modified the query based on your inputs; please refer to the link below.
    https://pastebin.com/UvQHD6bD

  • Hi @marcothesane
    https://pastebin.com/UvQHD6bD
    After the modifications, I am still facing the same slow performance. Can you please let me know where exactly I am going wrong?
