Why is this query much slower than in Postgresql?
This query (joining two tables) takes about 250 seconds in Postgresql 9.1 and 28 minutes (consistently) in Vertica 6.1.3 on Linux on a host with 15G memory:
select
    lead.score > 0,
    count(*) as effort,
    count(distinct lead.id) as leads,
    count(distinct case lead.isconverted when false then lead.id end) as unconv,
    count(nullif(lead.isconverted, true)) as unconv_effort
from lead
left outer join task on lead.id = task.whoid
where not lead.isdeleted
  and lead.createddate between '2013-01-01' and '2013-11-01'
group by lead.score > 0;

The schemas are simple:
CREATE TABLE lead (
    id varchar(65000) NOT NULL,
    score real,
    isconverted boolean,
    isdeleted boolean,
    createddate timestamp without time zone,
    ... 328 columns total ...
);

CREATE TABLE task (
    id varchar(65000) NOT NULL,
    whoid varchar(65000),
    ... 88 columns total ...
);

EXPLAIN shows nothing outrageous:
Access Path:
+-GROUPBY HASH (LOCAL RESEGMENT GROUPS) [Cost: 447M, Rows: 1 (NO STATISTICS)] (PATH ID: 1)
| Aggregates: count(DISTINCT lead.id), sum_of_count(*), sum_of_count(), count(DISTINCT )
| Group By:
| +---> GROUPBY HASH (LOCAL RESEGMENT GROUPS) [Cost: 447M, Rows: 10K (NO STATISTICS)] (PATH ID: 2)
| | Aggregates: count(*), count(CASE WHEN (lead.isconverted = true) THEN NULL::boolean ELSE lead.isconverted END)
| | Group By: (lead.score > 0), lead.id, CASE lead.isconverted WHEN false THEN lead.id ELSE NULL END
| | Grouping Sets: (, lead.id, , ), (, )
| | +---> JOIN HASH [RightOuter] [Cost: 437M, Rows: 6M (NO STATISTICS)] (PATH ID: 3)
| | | Join Cond: (lead.id = task.whoid)
| | | +-- Outer -> STORAGE ACCESS for task [Cost: 19M, Rows: 6M (NO STATISTICS)] (PATH ID: 4)
| | | | Projection: task_super
| | | | Materialize: task.whoid
| | | | Runtime Filter: (SIP1(HashJoin): task.whoid)
| | | +-- Inner -> STORAGE ACCESS for lead [Cost: 1M, Rows: 433K (NO STATISTICS)] (PATH ID: 5)
| | | | Projection: lead_super
| | | | Materialize: lead.id, lead.isconverted, lead.score
| | | | Filter: (NOT lead.isdeleted)
| | | | Filter: ((lead.createddate >= '2013-01-01 00:00:00'::timestamp) AND (lead.createddate <= '2013-11-01 00:00:00'::timestamp))
------------------------------
-----------------------------------------------
PLAN: BASE QUERY PLAN (GraphViz Format)
-----------------------------------------------
digraph G {
graph [rankdir=BT, label = "BASE QUERY PLAN\nQuery: explain\nselect\n lead.score \> 0,\n count(*) as effort,\n count(distinct lead.id) as leads,\n count(distinct case lead.isconverted when false then lead.id end) as unconv,\n count(nullif(lead.isconverted, true)) as unconv_effort\nfrom lead\nleft outer join task on lead.id = task.whoid\nwhere not lead.isdeleted\n and lead.createddate between \'2013-01-01\' and \'2013-11-01\'\ngroup by lead.score \> 0;\n\nAll Nodes Vector: \n\n node[0]=v_pod_node0001 (initiator) Up\n", labelloc=t, labeljust=l ordering=out]
0[label = "Root \nOutBlk=[UncTuple(5)]", color = "green", shape = "house"];
1[label = "NewEENode \nOutBlk=[UncTuple(5)]", color = "green", shape = "box"];
2[label = "ParallelUnionStep: \nCombine\nUnc: Boolean(1)\nUnc: Integer(8)\nUnc: Integer(8)\nUnc: Integer(8)\nUnc: Integer(8)", color = "green", shape = "box"];
3[label = "ExprEval: \n \\n \\n \\n \\n \\nUnc: Boolean(1)\nUnc: Integer(8)\nUnc: Integer(8)\nUnc: Integer(8)\nUnc: Integer(8)", color = "green", shape = "box"];
4[label = "GroupByHash(Spill): 1 keys\nAggs:\n count(DISTINCT lead.id)\n sum_of_count(*)\n sum_of_count(\)\n count(DISTINCT \)\nUnc: Boolean(1)\nUnc: Integer(8)\nUnc: Integer(8)\nUnc: Integer(8)\nUnc: Integer(8)", color = "green", shape = "box"];
5[label = "ParallelUnionStep: \nLocally Resegment\nUnc: Boolean(1)\nUnc: Integer(8)\nUnc: Integer(8)\nUnc: Integer(8)\nUnc: Integer(8)", color = "green", shape = "box"];
6[label = "GroupByPipe(HashPrepass): 1 keys\nAggs:\n count(DISTINCT lead.id)\n sum_of_count(*)\n sum_of_count(\)\n count(DISTINCT \)\nUnc: Boolean(1)\nUnc: Integer(8)\nUnc: Integer(8)\nUnc: Integer(8)\nUnc: Integer(8)", color = "green", shape = "box"];
7[label = "ExprEval: \n (lead.score \> 0)\n lead.id\n count(*)\n count(CASE WHEN (lead.isconverted = true) THEN NULL::boolean ELSE lead.isconverted END)\n CASE lead.isconverted WHEN false THEN lead.id ELSE NULL END\nUnc: Boolean(1)\nUnc: Varchar(65000)\nUnc: Integer(8)\nUnc: Integer(8)\nUnc: Varchar(65000)", color = "green", shape = "box"];
8[label = "GroupByHash(Spill): 4 keys\nAggs:\n count(*)\n count(CASE WHEN (lead.isconverted = true) THEN NULL::boolean ELSE lead.isconverted END)\nUnc: Boolean(1)\nUnc: Varchar(65000)\nUnc: Varchar(65000)\nUnc: Integer(8)\nUnc: Integer(8)\nUnc: Integer(8)", color = "green", shape = "box"];
9[label = "ParallelUnionStep: \nLocally Resegment\nUnc: Boolean(1)\nUnc: Varchar(65000)\nUnc: Varchar(65000)\nUnc: Integer(8)\nUnc: Integer(8)\nUnc: Boolean(1)", color = "green", shape = "box"];
10[label = "ExprEval: \n (lead.score \> 0)\n lead.id\n CASE lead.isconverted WHEN false THEN lead.id ELSE NULL END\n \\n count(*)\n count(CASE WHEN (lead.isconverted = true) THEN NULL::boolean ELSE lead.isconverted END)\nUnc: Boolean(1)\nUnc: Varchar(65000)\nUnc: Varchar(65000)\nUnc: Integer(8)\nUnc: Integer(8)\nUnc: Boolean(1)", color = "green", shape = "box"];
11[label = "GroupGeneratorStep\nkeys: 5\ngExprs: 1\nAggregates: \n count(*)\n count(CASE WHEN (lead.isconverted = true) THEN NULL::boolean ELSE lead.isconverted END)\nUnc: Boolean(1)\nUnc: Varchar(65000)\nUnc: Integer(8)\nUnc: Boolean(1)\nUnc: Varchar(65000)\nUnc: Integer(8)", color = "green", shape = "box"];
12[label = "ExprEval: \n (lead.score \> 0)\n lead.id\n count(*)\n count(CASE WHEN (lead.isconverted = true) THEN NULL::boolean ELSE lead.isconverted END)\n CASE lead.isconverted WHEN false THEN lead.id ELSE NULL END\n 1\nUnc: Boolean(1)\nUnc: Varchar(65000)\nUnc: Integer(8)\nUnc: Boolean(1)\nUnc: Varchar(65000)\nUnc: Integer(8)", color = "green", shape = "box"];
13[label = "ExprEval: \n (lead.score \> 0)\n lead.id\n CASE lead.isconverted WHEN false THEN lead.id ELSE NULL END\n 1\n CASE WHEN (lead.isconverted = true) THEN NULL::boolean ELSE lead.isconverted END\nUnc: Boolean(1)\nUnc: Varchar(65000)\nUnc: Varchar(65000)\nUnc: Integer(8)\nUnc: Boolean(1)", color = "green", shape = "box"];
14[label = "Join: Hash-Join: \n(task x lead) using task_super and lead_super (PATH ID: 3)\n[RightOuter]\n\nUnc: Varchar(65000)\nUnc: Boolean(1)\nUnc: Float(8)", color = "green", shape = "box"];
15[label = "StorageUnionStep: task_super\nUnc: Varchar(65000)", color = "purple", shape = "box"];
16[label = "ScanStep: task_super\nSIP1(HashJoin): task.whoid\nid (not emitted)\nwhoid\nUnc: Varchar(65000)", color = "brown", shape = "box"];
17[label = "StorageUnionStep: lead_super\nUnc: Varchar(65000)\nUnc: Boolean(1)\nUnc: Float(8)", color = "purple", shape = "box"];
18[label = "ScanStep: lead_super\n(NOT lead.isdeleted)\n((lead.createddate \>= \'2013-01-01 00:00:00\'::timestamp) AND (lead.createddate \<= \'2013-11-01 00:00:00\'::timestamp))\nid\nisdeleted (not emitted)\ncreateddate (not emitted)\nisconverted\nscore\nUnc: Varchar(65000)\nUnc: Boolean(1)\nUnc: Float(8)", color = "brown", shape = "box"];
1->0 [label = "V[0] C=5",color = "black",style="bold", arrowtail="inv"];
2->1 [label = "0",color = "blue"];
3->2 [label = "0",color = "blue"];
4->3 [label = "0",color = "blue"];
5->4 [label = "0",color = "blue"];
6->5 [label = "0",color = "blue"];
7->6 [label = "0",color = "blue"];
8->7 [label = "0",color = "blue"];
9->8 [label = "0",color = "blue"];
10->9 [label = "0",color = "blue"];
11->10 [label = "0",color = "blue"];
12->11 [label = "0",color = "blue"];
13->12 [label = "0",color = "blue"];
14->13 [label = "0",color = "blue"];
15->14 [label = "0",color = "blue"];
16->15 [label = "0",color = "blue"];
17->14 [label = "1",color = "blue"];
18->17 [label = "0",color = "blue"];
}
(82 rows)

I also don't think I'm grokking the cost numbers, because when I run the following query, it finishes in a couple of seconds, which is closer to the performance I'm expecting:
select id, whoid from task order by id, whoid offset 5946020 limit 10;

Access Path:
+-SELECT LIMIT 10 OFFSET 6M [Cost: 915M, Rows: 10 (NO STATISTICS)] (PATH ID: 0)
| Output Only: 5946030 tuples
| +---> SORT [Cost: 915M, Rows: 6M (NO STATISTICS)] (PATH ID: 1)
| | Order: task.id ASC, task.whoid ASC
| | Output Only: 5946030 tuples
| | +---> STORAGE ACCESS for task [Cost: 38M, Rows: 6M (NO STATISTICS)] (PATH ID: 2)
| | | Projection: task_super
| | | Materialize: task.id, task.whoid

Note: I'm explicitly caring about performance without projections. These are examples of ad-hoc queries, and I'm seeking query latencies on the order of several seconds.
Thanks for any answers!
Comments
http://pastebin.com/Nix7FCQz
I'm not sure what the cause of this slowdown is. Could you post more about what PostgreSQL is doing with the query?
One immediate thing that you could experiment with -- there's a known performance issue in current versions of Vertica with multiple COUNT(DISTINCT [...]) aggregates in a single query. We're quite aware of the issue; if you have the Enterprise Edition, you can contact Vertica Support about it if you have questions. I don't personally know of a good universal workaround in currently-released versions of Vertica, though specific queries can sometimes be rewritten to avoid using multiple distinct aggregates.
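If you want to experiment, one rewrite shape that sometimes helps is to pre-aggregate per lead.id in a subquery so that the outer query doesn't need any DISTINCT aggregates at all. This is only a sketch, not something I've tested against your schema or data, and it assumes lead.id is unique in the lead table:

select
    scored,
    sum(effort) as effort,
    count(*) as leads,  -- one row per lead below, so no DISTINCT needed
    count(case when isconverted = false then 1 end) as unconv,
    sum(unconv_effort) as unconv_effort
from (
    -- collapse the join result down to one row per lead first
    select
        lead.score > 0 as scored,
        lead.id,
        lead.isconverted,
        count(*) as effort,
        count(nullif(lead.isconverted, true)) as unconv_effort
    from lead
    left outer join task on lead.id = task.whoid
    where not lead.isdeleted
      and lead.createddate between '2013-01-01' and '2013-11-01'
    group by lead.score > 0, lead.id, lead.isconverted
) per_lead
group by scored;

Whether the optimizer actually does better with that form is something you'd have to measure; it just avoids the multiple-DISTINCT code path.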
Also, I notice that your 'id' column is a VARCHAR(65000). Are your 'id' values really tens of thousands of characters? If not, Vertica tends to do much better with smaller VARCHAR fields. I would expect the same from PostgreSQL, though likely to a lesser degree.
One other note, regarding projections -- are you possibly used to optimizing using indices? It's true that indices probably won't help random ad-hoc queries all that much, so it may not be worth adding them. However, projections are different -- you say that you're explicitly caring about performance without projections, yet you do have projections here, and as you can see from the EXPLAIN plans, they are being used. (You always have at least one projection, usually two; Vertica can't store data without them. "SELECT * FROM PROJECTIONS;" will display more details. Note further that all projections are sorted, have some compression options set, etc. There really is no such thing as an unoptimized table in Vertica, just a poorly-optimized one.)

And there are lots of optimizations that can be performed on that storage that are workload-independent and will help any query. For example, the raw data can be reorganized so that it compresses better on disk; this means that doing a full table scan is cheaper. Unless you're only querying a table a few times before dropping it (in which case load time isn't small compared to query time), we always recommend at least running the Database Designer to automatically tune the projection design. Note that you do not need to give it a list of queries to work with; if you don't, it will still perform the types of optimizations described above.
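For instance, something along these lines will show what you currently have for these two tables (column names from the v_catalog.projections table, from memory, so double-check against your version):

select projection_name, anchor_table_name, is_super_projection, create_type
from v_catalog.projections
where anchor_table_name in ('lead', 'task');

That's just a narrower view of the same PROJECTIONS table mentioned above.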
Adam
What do you want? You aren't following the Vertica guidelines: no statistics, no DBD run, no optimal projections built -- everything goes against the super projection. You haven't learned how to work with Vertica and you're expecting good results?
Learn how to work with Vertica (read the documentation); it isn't PostgreSQL. Don't be a lazy person.
Relatedly, there's no need to call anyone lazy -- the topic starter has clearly done some looking around, posted a bunch of information, etc. (As they note, the query plan looks reasonable, so statistics probably wouldn't help.) Pretty much everyone I've seen migrating to Vertica has fallen into the "all databases are the same, right?" trap at some point or another. A good response (and I think mine was too verbose, sorry about that!) will teach something about Vertica: give them a fact they didn't know, a reference to some bit of the docs that they may have missed, etc. That's helpful both to the topic starter and to other new Vertica users who might stumble across this question while learning how we work.
https://my.vertica.com/docs/6.1.x/PDF/HP_Vertica_6.1.x_AdminGuide.pdf
(The DBD section starts on page 77; note that it does a lot of things, only one of which needs to know the queries that you most-often run.)
I also feel like I should plug our standard training pages, just for good measure; probably overkill for this specific issue but maybe interesting:
http://www.vertica.com/customer-experience/vertica-101/
http://www.vertica.com/customer-experience/training/
Anyway, as I mentioned, this could be a real performance bug. If you have further information, feel free to post it! We are, of course, always working to improve our performance; I can't speak to specific roadmaps or features or anything like that, but if you have the chance, I'd encourage you to try this again on new Vertica releases as they come out.
Adam
PG is just scanning the full datasets, doing a hash join, and sorting for the group by. Even without the count(distinct), the query is slow; it seems to be in the join, and I can boil much of the slowness down to just that.

On other points:
Our text fields can in fact be longer than 65K, but most are short, and it just so happens that in this example dataset the IDs are <65K.
Regarding projections: we understand the data representation can be optimized, but at this point we're just interested in performance-debugging the naive arrangement, even with unordered, uncompressed data, etc. (which is why I mentioned the time needed to sort a column of the task table). As a strawman, a sort-merge join should theoretically take less time than half an hour.
Just to verify that unused columns were not being read or anything silly like that, I tried this on pared-down versions of the lead and task table with all unused columns removed (same results).
I'm happy to provide further details—feel free to let me know if there's anything else that would be helpful here. Thanks in advance for any help.
Hm... Interesting that switching to a single COUNT(*) didn't help.
From the EXPLAIN plan, it looks like "lead" has a little under 450k rows and "task" has about 6M rows. Is that roughly correct? Trying to get a sense of the size of the data.
I'm also curious to get a better sense of where all that time is going. When the system is executing the query, is it pegging one CPU core? Several CPU cores? Is it disk IO-bound?
You could also poke at the "execution_engine_profiles" system table. It's extremely verbose and its contents can, unfortunately, be a little hard to decipher. But it will give you lots of stats about each operator in the query plan of the query as it's executing and after it has finished. It would be interesting to know if one operator is running for dramatically longer than others, for example.
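For example, something roughly like this (counter names are from memory, so double-check them on your version) will show which operators the time is going to:

select operator_name, path_id, sum(counter_value) as total_exec_us
from v_monitor.execution_engine_profiles
where counter_name = 'execution time (us)'
group by operator_name, path_id
order by total_exec_us desc;

Filtering on the transaction_id and statement_id of the query in question keeps the output manageable.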
It would also be interesting to know if the join is spilling to disk. (I believe you'll see a message about join spills in the Vertica log if this happens, though it's been a little while since I poked at join spills...) I'm not sure why that would happen on a data set of this size; but I'm also not sure why it's slow, so, might as well ask.
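If your build has the v_monitor.query_events table, that's another place where spills tend to show up as events; I'm going from memory on the exact event names, so treat this only as a pointer:

select event_type, event_description, suggested_action
from v_monitor.query_events
where event_type in ('JOIN_SPILLED', 'GROUP_BY_SPILLED');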
Lastly, regarding projections, I feel like I should reiterate "no really you want optimized projections, it's always worth the time!" But it sounds like you're trying to do something specific. I also agree with you that, at first glance at least, it seems like you ought to get better performance than this even with a pessimal projection design; in which case something else is going on here. (Though I'm sure there's a reason; just not sure what it is yet.)
Thanks,
Adam
Hi Adam—
Quick answers to the first questions (I'll dig into the rest later): all four cores on the machine max out during query execution, and dstat shows only a trickle of IO (50-100KB per second). And yep, 450K and 6M are the table sizes.
For the sort spilling: I'll need to re-run with more verbose logging, but looking at the file handles the workers have open, there are files like Sort_Temp_358.dat, which suggests sorts are spilling to disk (not surprising).