Bug in WHERE clause?
Hi all,
I believe I must have run into a bug with the WHERE clause in SQL on our Vertica Analytic Database v9.3.0-0.
I have the following relatively simple WHERE on a complex query:WHERE txnYesterday.AverageBasketSizeEUR > agg90Days.avg90DayAverageBasketSizeEUR + 4 * agg90Days.stddev90DayAverageBasketSizeEUR
The values of come from subqueries that make the calculations over different time spans and result in the following:
The problem is that the resulting WHERE should be FALSE: WHERE 20 > 20 + 4 * 0
But the row is part of the result set as if "<" would be the same as "<=".
I tried to debug the problem by replacing the columns in the query with literals, but the problem continues until I replace every column with a literal.
Test 1:
Test 2:
Test 3:
Best Answers
-
marcothesane - Select Field - Administrator
This smells like a rounding problem ...
As a test: Try to put into the select list of the query:, txnYesterday.AverageBasketSizeEUR , agg90Days.avg90DayAverageBasketSizeEUR + 4 * agg90Days.stddev90DayAverageBaskedSizeEUR
Try working with:
, txnYesterday.AverageBasketSizeEUR::NUMERIC(18,2) , (agg90Days.avg90DayAverageBasketSizeEUR + 4 * agg90Days.stddev90DayAverageBaskedSizeEUR)::NUMERIC(18,2)
Or also INTEGERs instead of NUMERICs ...
1 -
peter_krammer Vertica Customer
Hi @marcothesane,
Thank you for your answer.
I have put the columns into the select list and added a few variation for debugging.
I discovered that agg90Days.stddev90DayAverageBasketSizeEUR is 0 and inconsequential for the Bug so I removed it to make the debugging clearer.
Rounding should not be the problem since both values are exactly 20, but rounding and typecasting indeed solves the issue:
So I tried looking at the data types (is there a good way to do this?) by causing some errors:
test(txnYesterday.AverageBasketSizeEUR) --> ERROR: Function test(numeric) does not exist, or permission is denied for test(numeric)
test(agg90Days.avg90DayAverageBasketSizeEUR) --> ERROR: Function test(float) does not exist, or permission is denied for test(float)
So I guess vertica uses different data types on those numbers and therefore the binary comparison done by the greater than operator comes up with a wring result.0
Answers
I am charmed by your way of determining the data type of an expression, @peter_krammer ! No, I don't know of any better way. I usually went creating a table with an expression and then having the table described - and dropped again ...
Not only in SQL, floats in their internal representation often can't express certain values with fractions in absolutely precise way, so they are never equal to what would be the same literal typecast as a NUMERIC(n,m). By saying 'rounding problem' I meant this specific issue with floats.
I have tried to construct other queries to recreate this problem, but I can't.
Vertica seems to force all expressions into the numeric type before comparison.
Then again in my specific query on that particular set of data, it happens to not do it, which I assume is a bug and can lead to undesired results, like in my case.
Not only in Vertica, but in any database, I personally don't leave it to the discretion of the database to cast expressions as they "like" before performing operations on them. I also avoid testing for equality of floats, as they can often differ in a small fraction. Explicit cast makes you the master instead of the victim of what is to happen.