Queries very slow at high concurrency
Hi! We recently migrated an analytics software to Vertica and it seems our setup has issues with concurring requests / queries. We use lots of AJAX widgets and thus have usually about 15-30 concurring queries at a time. The higher the concurrency is, the longer all queries take. We are looking for a hint on how to reduce the query time. A quite basic query like:
SELECT * FROM account
runs in less than 50ms, at a concurrency of 25 (same query) the last query takes more than 500ms, at a concurrency of 50 it goes up to over 1300ms and so on. We played around with the resource pool configuration and found that for this test case a "Max Concurrency" of 1 is the fastest way (less than 900ms with 50 concurrent queries) but this still feels way too much, especially for such a basic query.
When we tested it with a more complex query that normally takes less than 350ms to execute, at a concurrency of 25 we get the first result after around 4000ms and the last after almost 10 seconds with "Max Concurrency" set to 1.
The box is a 8 core all SSD server with 16gb of RAM. CPU usage never gets higher than 400%, memory usage is below 10%.
We'd be happy about any input on how to optimize the performance of our setup. Thanks in advance!
SELECT * FROM account
runs in less than 50ms, at a concurrency of 25 (same query) the last query takes more than 500ms, at a concurrency of 50 it goes up to over 1300ms and so on. We played around with the resource pool configuration and found that for this test case a "Max Concurrency" of 1 is the fastest way (less than 900ms with 50 concurrent queries) but this still feels way too much, especially for such a basic query.
When we tested it with a more complex query that normally takes less than 350ms to execute, at a concurrency of 25 we get the first result after around 4000ms and the last after almost 10 seconds with "Max Concurrency" set to 1.
The box is a 8 core all SSD server with 16gb of RAM. CPU usage never gets higher than 400%, memory usage is below 10%.
We'd be happy about any input on how to optimize the performance of our setup. Thanks in advance!
0
Comments
vcpuperf is a little slower than good, but this should not be the bottleneck since the box is more bored than anything over all: vioperf is more than good, probably because of the SSD drives, network performance shouldn't matter since we're running the text on the database server itself.
We can't find any system bottlenecks on the management console. With 300 concurring queries CPU load never exceeds 50%, everything else doesn't even show an amplitude on the graphs.
I see your CPU stats above , however we face similar situation and overcome it with the parallel setting i advice you to give it a try ( it may related thread management , using parallel setting you reduce their count ) .
- max_concurrency
- execution_parallelism
- other ?
Tx
instead of 'select * from' have you tried 'select c1,c2 from' and how does that stack up in concurrency?
------------------------------------------------------+--------------------
SELECT account_id, name FROM account WHERE state = 1 | 100
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
------------------------------
QUERY PLAN DESCRIPTION:
------------------------------
EXPLAIN SELECT account_id, name FROM account WHERE state = 1;
Access Path:
+-STORAGE ACCESS for account [Cost: 52, Rows: 46] (PATH ID: 1)
| Projection: public.account_tmp_super
| Materialize: account.account_id, account.name
| Filter: (account.state = 1)
------------------------------
-----------------------------------------------
PLAN: BASE QUERY PLAN (GraphViz Format)
-----------------------------------------------
digraph G {
graph [rankdir=BT, label = "BASE QUERY PLAN\nQuery: EXPLAIN SELECT account_id, name FROM account WHERE state = 1;\n\nAll Nodes Vector: \n\n node[0]=v_biddy_node0001 (initiator) Up\n", labelloc=t, labeljust=l ordering=out]
0[label = "Root \nOutBlk=[UncTuple(2)]", color = "green", shape = "house"];
1[label = "NewEENode \nOutBlk=[UncTuple(2)]", color = "green", shape = "box"];
2[label = "StorageUnionStep: account_tmp_super\nUnc: Integer(8)\nUnc: Varchar(255)", color = "purple", shape = "box"];
3[label = "ScanStep: account_tmp_super\n(account.state = 1)\naccount_id\nstate (not emitted)\nname\nUnc: Integer(8)\nUnc: Varchar(255)", color = "brown", shape = "box"];
1->0 [label = "V[0] C=2",color = "black",style="bold", arrowtail="inv"];
2->1 [label = "0",color = "blue"];
3->2 [label = "0",color = "blue"];
}
(27 rows)