Hash and Re Hash
Hi,
I need help , we are looking to use hash function to our bigest text .
For example
select 'VerticaGM'as Name, hash('VerticaGM') as Name_Value;
Result :-
Name Name_Value
VerticaGM 4870610383584265370
can I rehash 4870610383584265370 to get VerticaGM ?
Thank you.
0
Comments
Hash functions are unidirectional, that is they are not be reversible.
you need ot use other functions
what are other functions or UDF available?
I appreciate for your help.
There is an encryption pkg at
https://github.com/vertica/Vertica-Extension-Packages/tree/master/encryption_package
- this might doing the trick for you if you are trying to hyde it.
If you don`t wanna have it in your result sets becouse is to big then you can choose the relation aproach and create Key to it instead of having it draged around your most queried table.
Do you have built in function's to hash and re hash.
Our problem we have very large text column as PK , charter joins are very costly .
So we are looking to hash it , so Integer joins are faster . At same time we need to rehash it to orginal value .
Ex : Vertica hash value is 5605021174313838236
If I rehase 5605021174313838236 = Vertica
I'm looking for funcation to do this.
Hi
You can easily create projection that used an expression , the projection will include both , your original column and the hash based column , then you can explicitly used this projection for your queries :
Eg:
I hope you will find it useful
Thanks
ok ,
Why not adding a new default column to hold the hash for the text field.
Insert some data :
See the data:
See the data in both tables:
Make you idhash primary key:
Make fk on idhaah of the tbl 2 :
Your model is optimized now , i guess
Also adapt your projection build based on your query requirments
Thank you , Much appricated!
Does hash value will change if we add more nodes
Ex : we have 30 nodes , we updated to 60 nodes
Not sure i understood your qustion ?
Nodes = Vertica Cluster Nodes ?
If you add new nodes to the database should and wont influence or change any of the columns and table definition,
When adding more nodes to your Vertica cluster you need to rebalance the cluster. This is a different task and won't affect any of the hash default column definiton.
Can you cross check , I See this issue with other MPP platfrom's.
yes , Nodes = Vertica Cluster Nodes.
You mean to tell me that by altering the infrastructure (adding or removing nodes) will affect the default values stored in a column of a table ?
I have never encounterd any similar situation that this happened.
Try to reproduce this and i will belive it.
It might be that when you have added the new node you haven't rpelicated the proejctions across all nodes, or look at your projection description where you have custom segmentation rules and one projection sits only in node A,B, and C out of A,B,C,D,E,F. becouse you decided so.
See this page for more info:
https://community.dev.hpe.com/t5/Vertica-Knowledge-Base/Understanding-Rebalancing-Part-1-What-Happens-During-Rebalancing/ta-p/231792
Thank you for your info ,I will try in my dev system adding new nodes and re run hash function .
I will let you know,
Hash function for a string will always be unique.
try these, they are all different.
select hash('abc')
select hash('ABC')
select hash('cba')
String Hash function provided by Vertica is meant for external use by developers, its not used by Vertica internal operations. So the distribution of the data using hash function is different that the one used above.
So you can hash your values and store it, maybe even use it as an dictionary(don't do it if many values and will be used as dimension).