ANALYZE_STATICS stuck
Hello,
I run a 3 node cluster on AWS with Vertica 7.0.1. The DB itself is very small, about 20G compressed on disk.
When running ANALYZE_STATISTICS, it seems to stuck very frequently. Sometimes it runs in a few seconds and sometimes it doesn't finish even in 30 minutes.
And when this happens, if I try to close the session , that doesn't help either, even after issuing close_session, ANALYZE_STATISTICS will just stuck in there.
Can you please advise what is the work around on this?
Thanks,
Tibor
I run a 3 node cluster on AWS with Vertica 7.0.1. The DB itself is very small, about 20G compressed on disk.
When running ANALYZE_STATISTICS, it seems to stuck very frequently. Sometimes it runs in a few seconds and sometimes it doesn't finish even in 30 minutes.
And when this happens, if I try to close the session , that doesn't help either, even after issuing close_session, ANALYZE_STATISTICS will just stuck in there.
Can you please advise what is the work around on this?
Thanks,
Tibor
0
Comments
If you want to check if ANALYSE_STATISTICS is still running ...open a duplicate session and checking query_profiles table if the statement is still running. The query_type column will be 'UTILITY', check with the is_executing column.
Secondly if you wish to stop ANALYZE_STATISTICS after a certain period sue INTERUPT_STATEMENT form a duplicate statement
https://my.vertica.com/docs/7.0.x/HTML/index.htm#Authoring/SQLReferenceManual/Functions/VerticaFunctions/INTERRUPT_STATEMENT.htm
Hope this helps
But the real issue is, why this is happening? Here is the output from the sessions table for the running queries. Again, the db is on 3 node cluster, with very little data (20Gb).
And as you can see, SELECT ANALYZE_STATISTICS('') is running for over 8 minutes now.
And why this is happening, other queries get hold up. See SELECT * FROM query_profiles LIMIT 100;
Is this a 7.0.1 specific issue? This is a pretty serious issue, we cannot safely updating statistics.
datediff | session_id | statement_id | current_statement
----------+-------------------+--------------+---------------------------------------------------------------------------------------------------------------------------------------------
460 | vdb1-1934:0x13422 | 1 | SELECT ANALYZE_STATISTICS('');
249 | vdb1-1934:0x13a78 | 1 | SELECT * FROM query_profiles LIMIT 100;
0 | vdb1-1934:0x13cac | 1 | SELECT datediff(second, statement_start, GETDATE()), session_id, statement_id, current_statement FROM SESSIONS WHERE current_statement <>''
At the same time, if I am trying to interrupt the held up query, my request in VSQL seem to hang up too!
dbadmin=> SELECT * FROM query_profiles LIMIT 100;
^CCancel request sent
^CCancel request sent
^CCancel request sent
^CCancel request sent
^CCancel request sent
^CCancel request sent
^CCancel request sent
^CCancel request sent
Please advise.,Thank you,
Tibor
Unless someone from support or engineering has seen the issue before and knows a solution.
So, some generic questions:
What kind of disk?
Where is the disk (co-located with the CPU, or is it some sort of SAN implementation in the cloud?)
What is the disk performance?
Are the nodes in the same rack?
What kind of hardware?
The specs:
1) 3x cc3.2xl nodex (15G RAM, 8 core)
2) Using local storage with the 2x SSDs in RAID 0 (to avoid EBS perf issues)
The local SSD are decent, Write: 150 MB/s, Read: 200 MB/s, SkipReads: 6400 seeks/s based on vioperf.
I seriously doubt that this is a storage/instance performance issue, as don't see that we with anything else. Everything runs fast consistently, except ANALYZE_STATISTICS(''), which runs sometimes 30 seconds and sometimes just stucks until I manage to close the session that it was started in.
I did at least 30 Vertica installations on various systems, but this is the first time I saw this issue, so I wonder if this is a 7.0.1 issue.
Thanks for any pointers.
Tibor
So the only way I can interrupt the statement if either shut down the connection physically between the client and the server.
Tried the same with vsql running on one of the server nodes itself and saw exact same issue, ie. I couldn't even close the session that was opened by vsql when ANALYZE_STATISTICS was running. So I had kill the vsql process, which in tern ended the session.
This is such a huge issue, can you please provide some assistance?
Thanks,
Tibor
Tibor, I'm not sure I can debug this without really debugging your system more thoroughly. Frankly, I'm not even sure where to start. I've never encountered this problem before, so don't really know what could be causing. I am curious, though. how often are you running analyze_statistics? It sounds like you're running it way more frequently than might be necessary. You could also try running analyze_histogram. It does the same thing, but you can lower the sample percentage. So, a smaller sample size with analyze_histogram might succeed where analyze_statistics is failing.