
ENABLE_JOIN_SPILL Usage

I've been having problems with some queries that contain one or more joins. The queries look like this:
select
    "view"."unit" as "c0"
from (
    select "tabl"."c1", "tabl"."c2", "tabl"."c3", "tabl"."id", "multi_c3"."unit" as "unit"
    from
        "tabl" as "tabl"
    left join
        "multi_c3" as "multi_c3"
    on
        "tabl"."c3" = "multi_c3"."id") as "view"
group by "view"."unit"
having NOT((count("view"."id") is null)) 
order by CASE WHEN "view"."unit" IS NULL THEN 1 ELSE 0 END, "view"."unit" ASC

Nothing too fancy. The tables might have several million rows each.


While executing this type of query I often get the exception:


com.vertica.util.ServerException: [Vertica][VJDBC](3814) ERROR: Join inner did not fit in memory


This is somewhat expected, since the hardware might not be the best for this amount of data, but that is a separate problem.
While googling I found references to an option that allows the join to be spilled to disk if required, but I failed to find any mention of it in the documentation (https://my.vertica.com/docs/7.0.x/HTML/index.htm).


Looking at https://community.vertica.com/vertica/topics/enable_join_spill and https://community.vertica.com/vertica/topics/join_inner_did_not_fit_in_memory, I've tried to test this option to evaluate whether it would solve the problem or do more harm than good. Unfortunately I haven't been able to use it; it seems to do nothing.


I'm running the query over a JDBC connection. I've tried to set the property on the database permanently with:
SELECT set_config_parameter('EEVerticaOptions','ENABLE_JOIN_SPILL') 
And running it within the same statement as the SQL query above, like:
select add_vertica_options('EE', 'ENABLE_JOIN_SPILL'); 
select
    "view"."unit" as "c0"
from (
    select "tabl"."c1", "tabl"."c2", "tabl"."c3", "tabl"."id", "multi_c3"."unit" as "unit"
    from
        "tabl" as "tabl"
    left join
        "multi_c3" as "multi_c3"
    on
        "tabl"."c3" = "multi_c3"."id") as "view"
group by "view"."unit"
having NOT((count("view"."id") is null)) 
order by CASE WHEN "view"."unit" IS NULL THEN 1 ELSE 0 END, "view"."unit" ASC

But the error remains and the option seems to have no effect. Any idea what I'm doing wrong?

Comments

  • Options
    Hi,

    This is an option that sets a flag that is used internally by the Vertica binary.  We have not blocked it as it is occasionally a useful debugging aid for us developers, but we may do so in the future because it's a potential source of user confusion.  It's not for general use.

    Its name is historic, and now inaccurate.  It might be more accurately called "FORCE_JOIN_SPILL", but even that is only sometimes true and only covers part of what this option does.

    I would strongly advise against setting it generally as one of its other side effects is to disable several algorithms that can be used to make queries in general faster.  (Basically, it tells Vertica to use only a specific category of query plan, and ignore other types of plans that are generally faster but require newer / more-complex runtime logic to spill in case of join.)  So it will slow many queries and benefit few or none.  (When we don't document an option, there's usually a reason :-) )

    Join spilling in Vertica is always "enabled"; we will always spill a join if we need to do so and we are able to do so.  There do exist some types of joins that Vertica is unable to spill.  Typically these are certain types of cross-joins where you're requesting every row in the outer *for each row* in the inner ("O(n^2)"); so if even the inner has so many records that its part of the JOIN doesn't fit in memory, that query will take years to finish...  So if you get this error, it's usually worth looking at the query to see if you're really asking what you're intending to ask, or if there's a better way to express your question given your data model.

    Usually such queries are much weirder than this, though; this looks like a pretty straightforward operation.  I'm guessing that one of your GROUP BY, HAVING, or ORDER BY clauses is somehow tripping things up.  Could you try removing them one at a time, just as an experiment?  Then we can go from there.
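
    For example, one experiment would be just the inner join on its own, with all of the outer clauses removed (this is simply the subquery from your post):

    select "tabl"."c1", "tabl"."c2", "tabl"."c3", "tabl"."id", "multi_c3"."unit" as "unit"
    from
        "tabl" as "tabl"
    left join
        "multi_c3" as "multi_c3"
    on
        "tabl"."c3" = "multi_c3"."id";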

    Thanks,
    Adam
  • Options
    Hi Adam,

    Thank you for your answer. I'm working on this with Ruben, so I'll answer on his behalf.

    I've tried executing a simpler version of the query:
    select "tabl"."c1", "tabl"."c2", "tabl"."c3", "tabl"."id", "multi_c3"."unit" as "unit" from    "tabl" as "tabl" left join    "multi_c3" as "multi_c3" on     "tabl"."c3" = "multi_c3"."id"
    The result was the same.

    The join columns are varchar, with about 60-100 characters. This is being run on a single-node installation on a quad-core machine with 6 GB of RAM.

    The tables have about 200-250 thousand records.

    Do you think the problem is caused by the core/RAM ratio?

    A weird thing I've noticed is that I only get the problem when executing the query from the running application (DataSource, JDBC); the same query runs fine when I execute it from the command line.

  • Options
    Hi Adam,
    Join spilling in Vertica is always "enabled"; we will always spill a join if we need to do so and we are able to do so.  
    One scenario that can warrant setting enable_join_spill on a per-query basis is where there are many subqueries, and it takes many minutes for the query execution to get far enough along to discover that the hash table doesn't fit in memory and then do the automatic retry with join spill enabled.  I've seen this many times, at least in 5.x and 6.x, where it's better for a set of queries to just enable join spill up front with a hint rather than waste that time on every execution, only to find that the query requires join spill anyway.  I assume that this scenario is still possible in 7.0.   Because this is a scenario where it's useful to explicitly turn on join spill, I am puzzled as to why the option was removed from the documentation, leaving people to search the web for hints of what to do.  I accept, though, that any time someone explicitly enables join spill for a query, the query should be revisited with each upgrade to ensure that the default query execution still warrants enabling join spill at all.
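
    For reference, the pattern I mean is just the add_vertica_options call shown in the original post, issued up front in the same session as the big query:

    select add_vertica_options('EE', 'ENABLE_JOIN_SPILL');
    -- ...then run the query with the many subqueries in the same session...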

     --Sharon



  • Options
    Hey Sharon,

    Good question:  So, as you correctly note, there are scenarios where this knob can be useful.  (We'd like to get rid of them by automatically optimizing those situations, but there are likely still a few left in 7.)  So, in the hands of a power-user who understands exactly what it does, it does have its uses.

    However, there are also many scenarios where it can be quite harmful.

    The real issue, I think, is the name -- ENABLE_JOIN_SPILL is, simply put, false.  The option doesn't actually enable join spill; it does something significantly more complicated with joins.  (Not sure how familiar you are with its exact effects; obviously undocumented so not sure how much I can discuss it here.)  Vertica has lots of new users these days; we've seen many people take this option at face value without really reading the docs to understand what it does, then get themselves into trouble using it.

    (Unfortunately, that's the name we used; backwards compatibility and all that...)

    It's good to know that you use this often -- another sign that we could use a better solution here, that gives users an option without leading many people entirely astray.

    Adam
  • Options
    Hi Hilario,

    Hm...  I'm surprised both that this is erroring out, and that it's causing the join to spill in the first place.  Those are tiny tables; at 100 characters and 250k rows, that's about 25 MB of data.  Clearly much less than your 6 GB of RAM.

    Let me throw a few questions out there; see if anything sticks:

    - How wide is the 'id' column?  (Not the data; the actual column declaration.)  The same goes for the other columns you're querying.  If the column size is much bigger than the actual data width, in some (not all) situations we'll need to allocate more memory just in case a much-wider row comes along.  (A quick way to check the declarations is sketched after these questions.)

    - Have you declared a PK/FK relationship between "id" and "c3"?  (Would one be appropriate in this case?  Does "id" contain unique values?)

    - Have you run the Database Designer for this query?  This join should be amenable to a pre-sorted merge join, which would be very fast and use minimal memory.
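
    (If it helps, a quick sketch for checking the first two -- the declared widths come from the v_catalog.columns system table, and the constraint line is only appropriate if "id" really does contain unique values:)

    select table_name, column_name, data_type
    from v_catalog.columns
    where table_name in ('tabl', 'multi_c3');

    -- only if "id" turns out to be unique:
    alter table "multi_c3" add constraint "pk_multi_c3" primary key ("id");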

    Adam
  • Options
    Well, you might've hit a soft spot there: the mv_c3.id and tabl.c3 columns are varchar(10000). What is loaded into memory in this type of query? All the selected columns? We could probably improve the join columns by turning them into a fixed-size hash of the data. Do you believe that would greatly decrease the memory footprint?


    There is no PK/FK relation between those columns because mv_c3.id is not by itself unique to the table; it repeats a lot. The pair mv_c3.id and mv_c3.unit is unique, though.

    The problem with using DB Designer on our project is the dynamic nature of the table structure. The table "tabl" can have multiple "c3"-like columns that link to a multi_* table. They are created on demand, and there can be several joins into the main table occurring, depending on what the user asks. The table "tabl" itself is dynamic; there is usually a good number of these tables and they are created on demand. We've tried using projections, but we got to a point where we had 30 projections on a table, and these were only for simple joins like the one I posted. When we requested 2+ joins the performance would decrease as expected, since only the first join would benefit from the projection. Also, since "tabl" is dynamic it could be dropped by the user at any time, and this proved to be a painful task for Vertica given a high number of projections.
  • Options
    Hi Hilario,

    Hm...  You say that there are many duplicate values for 'id'.  What do you intend this query to do?  Are you sure it does what you are intending?:

    dbadmin=> create table t1 (i int, v varchar);
    CREATE TABLE
    dbadmin=> create table t2 (i int, v varchar);
    CREATE TABLE
    aseering=> copy t1 from stdin;
    Enter data to be copied followed by a newline.
    End with a backslash and a period on a line by itself.
    >> 1|a
    >> 1|a
    >> \.
    dbadmin=> copy t2 from stdin;
    Enter data to be copied followed by a newline.
    End with a backslash and a period on a line by itself.
    >> 1|b
    >> 1|b
    >> \.
    dbadmin=> select * from t1 left join t2 on t1.i = t2.i;
     i | v | i | v
    ---+---+---+---
     1 | a | 1 | b
     1 | a | 1 | b
     1 | a | 1 | b
     1 | a | 1 | b
    (4 rows)

    In other words, duplicate join keys mean a cross join.  (If you had 20 rows with a given value of "id" and 20 rows of tabl.c3 with that same value, that's 400 output rows; etc.)

    That could be what you intend; just making sure.  It may be what's getting this join into trouble -- if you could re-work this query to aggregate first and then join, I suspect it would be happier.  Though, doing so would likely require knowing something about your data or changing the results that you're asking for.
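
    As a sketch of that shape (whether collapsing the inner side to distinct ("id", "unit") pairs is valid depends on what you need from the result):

    select "tabl"."c1", "mv"."unit"
    from
        "tabl" as "tabl"
    left join
        (select distinct "id", "unit" from "multi_c3") as "mv"
    on
        "tabl"."c3" = "mv"."id";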

    Regarding memory usage:  Depending on the type of join operator used (we're working to improve this; newer Vertica versions are better), the worst-possible case is that you use memory that scales with the widths of all columns queried.  (Column widths, not actual width of the data.  Again, something we're working to improve.  Can't commit to a timeline here, though.  If it's especially important to you, and if you have an EE account, you're certainly welcome to get in touch with your sales rep or to file a support case for more details.)

    If a hash of the join keys would be sufficient for you, that would very likely help here.  If a 64-bit hash (such as Vertica's HASH() function) would be sufficient, you could put the value into an INTEGER; Vertica has highly-optimized integer operations, so this would be both more memory-efficient and faster.
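
    A rough sketch of what that could look like (the *_hash column names here are made up, and you'd need to keep the hash columns in sync with the originals):

    alter table "multi_c3" add column "id_hash" int;
    update "multi_c3" set "id_hash" = hash("id");
    alter table "tabl" add column "c3_hash" int;
    update "tabl" set "c3_hash" = hash("c3");

    -- then join on the integer columns instead of the wide varchars:
    select "tabl"."c1", "tabl"."c2", "tabl"."c3", "tabl"."id", "multi_c3"."unit" as "unit"
    from "tabl" as "tabl"
    left join "multi_c3" as "multi_c3"
    on "tabl"."c3_hash" = "multi_c3"."id_hash";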

    Regarding the DBD -- ah, that's an interesting use case.  And indeed one where the DBD (or projections in general) aren't so helpful.  It'd be interesting to hear more about it.  A comment on a response on a forum post is possibly the wrong channel -- if you have the chance to talk with one of us at an event, or happen to blog about what you do, etc., let me know.

    Thanks,
    Adam
  • Options
    My data is something as follows:

    Tabl(c1,c2)
    1|a,b
    2|a,c
    3|a,b,c

    mv(c2,c3)
    a,b|a
    a,b|b
    a,c|a
    a,c|c
    a,b,c|a
    a,b,c|b
    a,b,c|c

    Join result expected(c1,c3):
    1|a
    1|b
    2|a
    2|c
    3|a
    3|b
    3|c

    It is returning what I expected, and I need those c3 values separated and linked to c1.

    As for the hash, I looked into the HASH function and I'm not sure I can use it, because I need it to be unique for the given string (so I can migrate the data) and I need to be able to calculate it outside of Vertica. I couldn't find anywhere which algorithm is used for this function, so I was thinking of going with an MD5. Can you point me to any information that would allow me to code a hash function with the same output? Will the difference between joining on a varchar(32) vs. an INTEGER be significant?

    Thanks
  • Options
    Hi Hilario,

    Regarding performance -- if you want to know what the difference is, test it :-)  A varchar(32) is certainly smaller than a varchar(10000).  But an INTEGER is 8 bytes, so it's smaller still.  Also, "integer" is a fixed-size type; it's always exactly 8 bytes, and 8 bytes in particular happens to be a very efficient size for x86_64 processors to work with, so we can optimize around that.
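
    One quick way to compare (a sketch using vsql's client-side \timing and the hypothetical integer hash columns from the earlier sketch):

    \timing
    -- join on the original wide varchar keys
    select count(*) from "tabl" left join "multi_c3" on "tabl"."c3" = "multi_c3"."id";
    -- join on the integer hash keys
    select count(*) from "tabl" left join "multi_c3" on "tabl"."c3_hash" = "multi_c3"."id_hash";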

    If you'd like to use MD5, Vertica does provide an MD5() function:

    https://my.vertica.com/docs/6.1.x/HTML/index.htm#12080.htm
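
    For example, a varchar(32) join key could be populated with it (hypothetical column names again):

    alter table "multi_c3" add column "id_md5" varchar(32);
    update "multi_c3" set "id_md5" = md5("id");
    alter table "tabl" add column "c3_md5" varchar(32);
    update "tabl" set "c3_md5" = md5("c3");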


    Please be aware, though, that all hash functions can and do have collisions: two different strings that hash to the same thing.  This mathematically must be the case -- there are many more strings than there are hash codes, since every possible hash code is a valid string and many strings are not valid hash codes.  Just, some hash functions (such as MD5) are quite good at avoiding collisions, so it's very unlikely on real data.
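
    To put rough numbers on "very unlikely" (a back-of-the-envelope birthday-bound estimate, not a guarantee): with about 250,000 distinct keys, a 64-bit hash gives roughly a 250,000^2 / 2^65 chance of any collision at all -- on the order of a couple in a billion -- while a 128-bit hash like MD5 brings that down to around 250,000^2 / 2^129, which is astronomically small.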

    Adam
  • Options
    Thank you again for all your help Adam!

    I had a question in my last comment that I believe I didn't express correctly. What's the algorithm behind the Vertica HASH() function? I would need to calculate it in my code outside of Vertica as well if I were to use it.

    Thanks
  • Options
    Hi Hilario,

    Ah -- that's a good question; unfortunately, I don't believe we have published the algorithm behind our HASH() implementation...  So if you do need the external implementation, you may be stuck with MD5().

    If you have an algorithm that you prefer, you can implement it as a scalar function using our C++ add-on API.  (We support other languages too; I mention C++ because, in its un-fenced mode, it can provide essentially the same performance as a built-in operator, depending only on the efficiency of your implementation.)  You can also use this API to link in an existing library/implementation; etc.
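
    The SQL side of wiring in such a function looks roughly like this (the library path, factory class, and function name below are all made-up placeholders; the C++ source itself is compiled separately against the SDK):

    create library HashLib as '/path/to/libhashudf.so';
    create function my_hash as language 'C++' name 'MyHashFactory' library HashLib not fenced;

    select my_hash("c3") from "tabl" limit 5;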

    Re-implementing hash functions can be tricky -- with an off-by-one error or a difference in rounding, they can be correct virtually always, but once in a billion or a trillion records they'll give an incorrect/different answer.  (So it's possible to reimplement one, but it's prone to impossible-to-track-down bugs.)

    Adam
