How to check slow queries in Vertica
PavanVejju
Vertica Customer ✭
A few read operations are taking more time than expected. How do I analyze them, and what command can I use to fetch slow queries?
Best Answer
mosheg Vertica Employee Administrator
Did you use Database Designer to create optimized projections for your long-running query?
See: https://www.vertica.com/docs/10.1.x/HTML/Content/Authoring/AdministratorsGuide/ConfiguringTheDB/PhysicalSchema/DBD/UsingAdministrationToolsToCreateADesign.htm
If your query includes a COUNT DISTINCT, did you try APPROXIMATE_COUNT_DISTINCT?
See: https://www.vertica.com/docs/10.1.x/HTML/Content/Authoring/AnalyzingData/Optimizations/OptimizingCOUNTDISTINCTByCalculatingApproximateCounts.htm
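If you prefer to run Database Designer from SQL rather than from Administration Tools, a sketch like the following can work. The function names are from Vertica's SQL Database Designer API; the design name, table, query text, and file paths here are placeholders you would substitute:

```sql
-- Create a design scoped to the slow query (names and paths are examples)
SELECT DESIGNER_CREATE_DESIGN('slow_query_design');
SELECT DESIGNER_ADD_DESIGN_TABLES('slow_query_design', 'public.my_table');
SELECT DESIGNER_ADD_DESIGN_QUERY('slow_query_design',
       'SELECT ... FROM public.my_table WHERE ...', false);
-- Generate and deploy the optimized projections
SELECT DESIGNER_RUN_POPULATE_DESIGN_AND_DEPLOY('slow_query_design',
       '/tmp/slow_query_design.sql', '/tmp/slow_query_deploy.sql');
-- Clean up the design workspace afterwards
SELECT DESIGNER_DROP_DESIGN('slow_query_design');
```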
Answers
Top 50 queries that have been executing for more than 30 minutes.
A query running for an abnormally long duration can consume and hold resources, which can impact other queries.
Can you try the query below?
select /*+label(LongRunningQueries)*/
       s.node_name,
       s.user_name,
       transaction_id,
       statement_id,
       DATEDIFF('MINUTE', s.statement_start, ss.max_timestamp) query_runtime_minutes,
       s.current_statement::varchar(150) || '...'
from sessions s,
     (select max(login_timestamp) max_timestamp from sessions) ss
where statement_id is not null
  and DATEDIFF('MINUTE', s.statement_start, ss.max_timestamp) > 30
order by 5 desc
limit 50;
Or try this query without the label:
select s.node_name,
       s.user_name,
       transaction_id,
       statement_id,
       DATEDIFF('MINUTE', s.statement_start, ss.max_timestamp) query_runtime_minutes,
       s.current_statement::varchar(150) || '...'
from sessions s,
     (select max(login_timestamp) max_timestamp from sessions) ss
where statement_id is not null
  and DATEDIFF('MINUTE', s.statement_start, ss.max_timestamp) > 30
order by 5 desc
limit 50;
Can you show how you are reading a file?
From our application, we trigger SELECT queries with dynamic WHERE conditions.
The above query doesn't return any records, even after I changed the value from 30 to 1.
In addition, see this: https://www.vertica.com/blog/vsql-visual-storytelling-through-query-language/
Or try something like the following query to find the top 10 long-running queries in the last 6 hours:
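The referenced query was not preserved in the thread; a minimal sketch against the `query_requests` system table (column names per the Vertica 10.1 documentation) might look like:

```sql
-- Top 10 longest-running requests started in the last 6 hours (a sketch)
select node_name,
       user_name,
       transaction_id,
       statement_id,
       request_duration_ms / 1000 / 60 as runtime_minutes,
       substr(request, 1, 100)         as request_snippet
from query_requests
where start_timestamp > sysdate - interval '6 hours'
order by request_duration_ms desc
limit 10;
```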
@PavanVejju, as you mentioned some of the read operations are taking longer, do you know which queries those are?
If you want to know:
[1] How long the query has been running, you can try the query below.
select query_start::date,
       query_duration_us/1000000/60/60 as query_duration_hours
from query_profiles
where query ilike '%%';
[2] If you want to know why and where those read operations take longer, get the transaction ID and statement ID of the slow query and run them against the execution_engine_profiles table:
select node_name, path_id, operator_name, counter_name, max(counter_value)
from execution_engine_profiles
where counter_name ilike '%us%'
  and transaction_id = <transaction_id> and statement_id = <statement_id>
group by 1,2,3,4 order by 5 desc;

select node_name, path_id, operator_name, counter_name, max(counter_value)
from execution_engine_profiles
where counter_name ilike '%byte%'
  and transaction_id = <transaction_id> and statement_id = <statement_id>
group by 1,2,3,4 order by 5 desc;

select node_name, path_id, operator_name, counter_name, sum(counter_value)
from execution_engine_profiles
where counter_name ilike 'rows%produced'
  and transaction_id = <transaction_id> and statement_id = <statement_id>
group by 1,2,3,4 order by 5 desc;
If you need more help analyzing the query, we suggest opening a support case.
@PavanVejju
You can also try something like the query below against query_profiles to find when a query started, how long it has been running, and whether it is still executing (is_executing).
 query_start |            substr             | query_duration_hour | is_executing
-------------+-------------------------------+---------------------+--------------
 2021-01-08  | DELETE /*+DIRECT*/ FROM EPP   |                  28 | f
 2021-01-10  | INSERT /*+DIRECT*/ INTO v_tem |                  17 | f
 2021-01-10  | CREATE LOCAL TEMP TABLE TAAA  |                  17 | f
 2021-01-10  | update TEMP_ABB               |                  14 | f
 2021-01-10  | INSERT /*+DIRECT*/ INTO PTT   |                  13 | f
(10 rows)
Thank you @Nimmi_gupta / @mosheg for providing the queries.
They helped me a lot in finding slow queries and their timings.
Recently, we upgraded the Vertica machine to 64 processors, but query performance is still slow.
My query uses count(distinct a.x) and group by b.y, and I added projections, but the fetch is still slow.
@mosheg
I tried APPROXIMATE_COUNT_DISTINCT, but I'm not getting an accurate distinct count.
I didn't use Database Designer to create an optimized projection; I will look into it and let you know.
Thanks
For more accurate counts at the cost of performance, you can set the error tolerance down to its minimum value of 0.88. Vertica imposes no maximum restriction, but any value greater than 5 is implemented with 5% error tolerance.
See: https://www.vertica.com/docs/10.1.x/HTML/Content/Authoring/SQLReferenceManual/Functions/Aggregate/APPROXIMATE_COUNT_DISTINCT.htm
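For example (the table and column names here are made up), the second argument is the error tolerance in percent:

```sql
-- Exact but potentially slow:
select count(distinct user_id) from events;

-- Approximate, with the error tolerance tightened from the
-- documented 1.25% default toward the 0.88 minimum:
select approximate_count_distinct(user_id, 0.88) from events;
```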
@Jim_Knicely I tried this option too, but my requirement is to get the exact count.
@mosheg I have used Database Designer to create optimized projections, but it is still taking a long time.
@PavanVejju -
If you have a single DISTINCT aggregate, you can try to rewrite the query to avoid using count distinct:
https://www.vertica.com/docs/10.1.x/HTML/Content/Authoring/AnalyzingData/Optimizations/IfTheQueryHasASingleDISTINCTAggregate.htm
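The rewrite described in that link replaces the DISTINCT aggregate with a GROUP BY subquery; a sketch with made-up names:

```sql
-- Instead of:
select count(distinct x) from t;

-- Rewrite as a GROUP BY subquery, which Vertica can often satisfy
-- efficiently with a projection sorted on x:
select count(x) from (select x from t group by x) q;
```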
Or worse, if you have multiple DISTINCT aggregates, you can throw more memory at the query!
https://www.vertica.com/docs/10.1.x/HTML/Content/Authoring/AnalyzingData/Optimizations/IfTheQueryHasMultipleDISTINCTAggregates.htm?tocpath=Analyzing Data|Query Optimization|DISTINCT in a SELECT Query List|_____5
@Jim_Knicely
I tried this, but it still takes minutes; we are expecting seconds.
I don't know where I am going wrong.
Before I pump the front tire of my Gold Wing, I measure the pressure; I don't just kick it.
Can you provide:
SELECT EXPORT_OBJECTS('', '<schema>.<table>', FALSE);
I have a strong suspicion about what the reason could be, but I hate poking around and fighting just the symptoms. I need to find the root cause and fix it at the root.
Hi @marcothesane
Can you please refer to this link: https://pastebin.com/BTQa9Gvw
Let's look at your query and the physical design you presented, re-formatted and commented.
This query, and the design of the tables it works on, could be used as a wonderful example for a lecture to intermediate SQL students.
It contains several problematic traits that can illustrate how to write queries and create tables and projections that support Vertica in doing a good job at being performant.
I suspect you are running into a HASH JOIN (due to the join condition and the non-matching sort order of the projections of the two tables in question) and a HASH GROUP BY, as the grouping column - game - is nowhere useful in a sort clause of a projection definition.
Hash tables sit in memory and occupy, for each entry, the longest possible number of bytes. So a VARCHAR(500) that contains 'Joe' occupies, by default, 500 bytes, not 3. And this effect multiplies by the number of rows in that hash table. You will quickly run out of memory.
Plus: the three columns used for the join are nowhere useful in an ORDER BY clause of a projection definition to enable a MERGE JOIN - Vertica's most efficient and parsimonious join, which is based on projections that are sorted the same way.
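As an illustration (the table and column names below are invented, not from the thread), a pair of projections sorted the same way on the join columns would let the optimizer choose a MERGE JOIN, with the grouping column leading the sort order to avoid a HASH GROUP BY:

```sql
-- Both projections share the same ORDER BY over the join keys,
-- with the grouping column (game) first:
create projection fact_scores_p as
  select game, player_id, match_id, score
  from fact_scores
  order by game, player_id, match_id
  segmented by hash(game, player_id, match_id) all nodes;

create projection dim_players_p as
  select game, player_id, match_id, player_name
  from dim_players
  order by game, player_id, match_id
  segmented by hash(game, player_id, match_id) all nodes;
```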
Can you run the command:
for me?
@marcothesane
Thanks for your valuable inputs.
I have modified the query based on your inputs; please refer to the link below.
https://pastebin.com/UvQHD6bD
Hi @marcothesane
https://pastebin.com/UvQHD6bD
After the modifications, I am still facing the same slow performance. Can you please let me know where exactly I am going wrong?