Please take this survey to help us learn more about how you use third party tools. Your input is greatly appreciated!

Bug in WHERE clause?

Hi all,
I believe I must have run into a bug with the WHERE clause in SQL on our Vertica Analytic Database v9.3.0-0.
I have the following relatively simple WHERE on a complex query:
WHERE txnYesterday.AverageBasketSizeEUR > agg90Days.avg90DayAverageBasketSizeEUR + 4 * agg90Days.stddev90DayAverageBasketSizeEUR
The values of come from subqueries that make the calculations over different time spans and result in the following:

The problem is that the resulting WHERE should be FALSE:
WHERE 20 > 20 + 4 * 0
But the row is part of the result set as if "<" would be the same as "<=".
I tried to debug the problem by replacing the columns in the query with literals, but the problem continues until I replace every column with a literal.
Test 1:

Test 2:

Test 3:

Tagged:

Best Answers

  • marcothesanemarcothesane Administrator
    Answer ✓

    This smells like a rounding problem ...
    As a test: Try to put into the select list of the query:

    , txnYesterday.AverageBasketSizeEUR
    , agg90Days.avg90DayAverageBasketSizeEUR + 4 * agg90Days.stddev90DayAverageBaskedSizeEUR
    

    Try working with:

    , txnYesterday.AverageBasketSizeEUR::NUMERIC(18,2)
    , (agg90Days.avg90DayAverageBasketSizeEUR + 4 * agg90Days.stddev90DayAverageBaskedSizeEUR)::NUMERIC(18,2)
    

    Or also INTEGERs instead of NUMERICs ...

  • Answer ✓

    Hi @marcothesane,
    Thank you for your answer.
    I have put the columns into the select list and added a few variation for debugging.
    I discovered that agg90Days.stddev90DayAverageBasketSizeEUR is 0 and inconsequential for the Bug so I removed it to make the debugging clearer.
    Rounding should not be the problem since both values are exactly 20, but rounding and typecasting indeed solves the issue:

    So I tried looking at the data types (is there a good way to do this?) by causing some errors:
    test(txnYesterday.AverageBasketSizeEUR) --> ERROR: Function test(numeric) does not exist, or permission is denied for test(numeric)
    test(agg90Days.avg90DayAverageBasketSizeEUR) --> ERROR: Function test(float) does not exist, or permission is denied for test(float)
    So I guess vertica uses different data types on those numbers and therefore the binary comparison done by the greater than operator comes up with a wring result.

Answers

  • marcothesanemarcothesane Administrator

    I am charmed by your way of determining the data type of an expression, @peter_krammer ! No, I don't know of any better way. I usually went creating a table with an expression and then having the table described - and dropped again ...
    Not only in SQL, floats in their internal representation often can't express certain values with fractions in absolutely precise way, so they are never equal to what would be the same literal typecast as a NUMERIC(n,m). By saying 'rounding problem' I meant this specific issue with floats.

  • I have tried to construct other queries to recreate this problem, but I can't.
    Vertica seems to force all expressions into the numeric type before comparison.
    Then again in my specific query on that particular set of data, it happens to not do it, which I assume is a bug and can lead to undesired results, like in my case.

  • marcothesanemarcothesane Administrator

    Not only in Vertica, but in any database, I personally don't leave it to the discretion of the database to cast expressions as they "like" before performing operations on them. I also avoid testing for equality of floats, as they can often differ in a small fraction. Explicit cast makes you the master instead of the victim of what is to happen.

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file

Can't find what you're looking for? Search the Vertica Documentation, Knowledge Base, or Blog for more information.