Query
rajuk99
✭
Hi Vertica team, I have requirement to get element count based on two column uniqueness. Here, for example, if element is equal in both fields, i.e. 1=1 then count for 1 is just 1. Here is the result I'm getting in Spark query which is correct but it uses explode and array.
Result:
{"count":2,"element":1}
{"count":2,"element":3}
{"count":4,"element":2}
col1 col2
1 2
1 1
2 2
2 3
3 2
Please help, I'm using version: "Vertica Analytic Database v9.3.1-0".
0
Comments
I'm not quite getting what you are looking for.
For element 1, you want to count 1, because there is one row where col1 is equal to col2.
But for element 2, you want to count 3, even if there is one row where col1 is equal to col2.
I don't see the logic of what you are looking for.
Or do you simply need the last physical "col2" value per "col1" ?
In this latter case - remember: a table's rows are never ordered. In any database.
Adding a column
seq
that contains a sorting number for each row, it would look like this:Hi marcothesane, Thanks for looking into it. I should have been more clearer with my requirement. Here is what is required-
I want distinct element count based upon col1 & col2. That's why 1 should have count 2 (not 3 though it apeear 3 times). Element 2 has 4 element and element 3 has count 2. Whenever col1 and col2 has same element then it's count as 1.
What am I missing?
In this table:
col1 col2
1 2
1 1
2 2
2 3
3 2
Element (col1)
1
appears 2 times, not 3, as you say aboveElement (col1)
2
appears 2 times, tooElement(col1)
3
appears once.What you write above seems to be in contradiction with what I can see in your example data ....
Did I get something wrong?
Hi marcothesane:
Element 1 appears in row1 and row2, so, total 2 times. Element 2 appears 4 times - row1, row3, row4 and row5. Element 3 appears in row4 and row5, so, total count 2.
Hope this helps.
Not sure if it is exactly what you want or if it is the best way to do it. Some array support for internal tables coming in V10 that could be potentially beneficial for you.
select c1, sum(countc1)
from (select a c1, count() countc1 from f1 group by a
union all
select b c1, count() countc1 from f1 where a<>b group by b) x
group by c1 order by c1;
Hi DaveT,
I'm getting "Function count() does not exist, or permission is denied for count()". I'm using version: "Vertica Analytic Database v9.3.1-0".
It worked if I mention _fieldname _for count() funtion
query-
select c1, sum(countc1)
from (select site1resourceid c1, count(site1resourceid) countc1 from conbineflow_testdata group by site1resourceid
union all
select site2resourceid c1, count(site2resourceid) countc1 from conbineflow_testdata where site1resourceid<>site2resourceid group by site2resourceid) x
group by c1 order by c1
Thank you very much. I'll play with it little more and see if all scenarios getting covered.
Sorry, I guess it got removed when I pasted it. I has placed an asterisk in the parens. You can also use a column name if you wish.
Sure, np. Thank you.