I though of that But you cannot UNION ALL a VARCHAR with an INT.
dbadmin=>\d a
List of Fields by Tables
Schema|Table|Column| Type | Size |Default|NotNull|PrimaryKey|ForeignKey--------+-------+---------+------+------+---------+----------+-------------+-------------public| a | user_id | int |8|| f | f |(1row)
dbadmin=>\d b
List of Fields by Tables
Schema|Table|Column| Type | Size |Default|NotNull|PrimaryKey|ForeignKey--------+-------+---------+-------------+------+---------+----------+-------------+-------------public| b | user_id | varchar(10)|10|| f | f |(1row)
dbadmin=>SELECT*FROM a;
user_id
---------26712(1row)
dbadmin=>SELECT*FROM b;
user_id
---------26712(1row)
dbadmin=>SELECT user_id, hash(user_id,'a')FROM a;
user_id | hash
---------+---------------------26712|6622807832315002947(1row)
dbadmin=>SELECT user_id, hash(user_id,'a')FROM b;
user_id | hash
---------+---------------------26712|3488463230331277936(1row)
dbadmin=>SELECT user_id, hash(user_id,'a')FROM a
dbadmin->UNIONALL
dbadmin->SELECT user_id, hash(user_id,'a')FROM b;
ERROR 3429:For'UNION', types int and varchar are inconsistent
dbadmin=>SELECT user_id, hash(user_id,'a')FROM b
dbadmin->UNIONALL
dbadmin->SELECT user_id, hash(user_id,'a')FROM a;
ERROR 3429:For'UNION', types varchar and int are inconsistent
DETAIL: Columns: user_id and user_id
I went through it all again, hashing with user_id and 'a':
\x
SELECT
HASH(26712,'a')AS hash_with_INT
, HASH(26712::NUMERIC(18),'a')AS hash_with_NUMERIC_18
, HASH(26712::NUMERIC(32),'a')AS hash_with_NUMERIC_32
, HASH(26712::NUMERIC(18,4),'a')AS hash_with_NUMERIC_18_4
, HASH(26712::FLOAT ,'a')AS hash_with_FLOAT
, HASH(26712::VARCHAR(256),'a')AS hash_with_VARCHAR_256
, HASH(26712::VARCHAR(64),'a')AS hash_with_VARCHAR_64
, HASH(26712::VARCHAR(5),'a')AS hash_with_VARCHAR_5
, HASH(26712::CHAR(256),'a')AS hash_with_CHAR_256
, HASH(26712::CHAR(5),'a')AS hash_with_CHAR_5
;-- out Expanded display is on.-- out -[ RECORD 1 ]----------+---------------------- out hash_with_INT | 6622807832315002947-- out hash_with_NUMERIC_18 | 6622807832315002947-- out hash_with_NUMERIC_32 | 2877367361208514378-- out hash_with_NUMERIC_18_4 | 8063475701858845121-- out hash_with_FLOAT | 304723032549234641-- out hash_with_VARCHAR_256 | 3488463230331277936-- out hash_with_VARCHAR_64 | 3488463230331277936-- out hash_with_VARCHAR_5 | 3488463230331277936-- out hash_with_CHAR_256 | 3488463230331277936-- out hash_with_CHAR_5 | 3488463230331277936
If we look at hanans's output, his 1st and 2nd output row in the report, with 2877367361208514378 as the hash value must come from dwh.sm_user_profile, where user_id must be between a NUMERIC(19,n) to a NUMERIC(37,n), and the 3rd output row must come from agg.agg_sm_daily_users_stats_by_platform, where user_id is either an INT or between a NUMERIC(1) and a NUMERIC(18).
Comments
That seems highly unlikely.
What client tool are you using? Try running your query in vsql.
Is user_id the same data type in both tables? I get one result as integer, and a different result as varchar:
I though of that
But you cannot UNION ALL a VARCHAR with an INT.
Maybe the client tool is doing the conversion?
To illustrate what Jim and Bryan were mentioning:
Here's a test I just ran:
So it's not only the value , but also its type, and sometimes also the size and precision, that you're hashing, that makes the difference....
@hanans: Can you share with us the exact type of the columns in the table definition of the columns that you're hashing?
Heh heh. Must be a mix of data types that Vertica is auto coercing allowing the UNION to succeed
I went through it all again, hashing with
user_id
and'a'
:If we look at hanans's output, his 1st and 2nd output row in the report, with 2877367361208514378 as the hash value must come from
dwh.sm_user_profile
, whereuser_id
must be between a NUMERIC(19,n) to a NUMERIC(37,n), and the 3rd output row must come fromagg.agg_sm_daily_users_stats_by_platform
, whereuser_id
is either an INT or between a NUMERIC(1) and a NUMERIC(18).wow!you guys are awesome!