row_number vs. rank
Hi,
I find it hard to fully understand the uniqueness of each function.
If I understand correctly they don't work as same as in different db's.
Moreover, the only difference between them in Vertica is in rank when the order by is the same so the function will give the same number.
That is the only difference?
If so, what will be the equivalent for row_number and rank in SQL SERVER for example in Vertica?
Are there any "Cost" differences, or optimization wise between the two?
Thanks.
Best Answer
-
Bryan_H Vertica Employee Administrator
RANK complies with SQL99 specification and should behave the same in other DB's. Otherwise, Vertica implementation is Postgres-like and should work like Postgres or Oracle, though I checked SQL Server doc and it looks like T-SQL should behave the same as Vertica, though SQL Server considers these functions to be nondeterministic while Vertica considers them immutable (deterministic).
A quick example from Vertica showing behavior of analytic functions RANK, DENSE_RANK, ROW_NUMBER:=> select state,city,population,rank() over (partition by state order by population desc) as rank,dense_rank() over (partition by state order by population desc) as dense_rank,row_number() over (partition by state order by population desc) as row_number from ranks;
state | city | population | rank | dense_rank | row_number
------+------------+------------+------+------------+-----------
NY | New York | 8500000 | 1 | 1 | 1
NY | Buffalo | 800000 | 2 | 2 | 2
NY | Hempstead | 800000 | 2 | 2 | 3
NY | Syracuse | 400000 | 4 | 3 | 4
NY | Rochester | 400000 | 4 | 3 | 5
NY | Plattsburg | 50000 | 6 | 4 | 6
1
Answers
Thank you, Bryan!