Query to select first row based on column with duplicates

slc1axjslc1axj Vertica Customer

I have a primary key column called (PK) - I'd like to select the first PK and the row along with that PK and just remove the others. I don't really care if it's the first row but just need 1 row associated with each PK

Example
PK col1 col2
A Hector 263
A James 2
B Stel 58
B Jones 85
B Emmons 3
C Tobin 15
D Dyer 3
D Formine 6

RESULTS
PK col1 col2
A Hector 263
B Stel 58
C Tobin 15
D Dyer 3

Tagged:

Best Answer

  • Jim_KnicelyJim_Knicely - Select Field - Administrator
    edited November 2021 Answer ✓

    Put the LIMIT after the WHERE

    Something like:

    select
    "ebin added by" || "Intake ID" || Source as PK
    ,cast("intake id" as varchar) as IntakeID
    ,p.*
    from ABC_PRD.abc_portal p
    where "Request Submit Date" >= CURRENT_DATE()-11
    and (Business = 'CORE' OR Business = 'SMS')
    and (Source = 'ECIN' or Source = 'Curaspan')
    LIMIT 1 OVER(PARTITION BY "ebin added by" || "Intake ID" || Source ORDER BY "intake id" desc)

Answers

  • Jim_KnicelyJim_Knicely - Select Field - Administrator

    One way:

    dbadmin=> SELECT * FROM z;
     PK |  col1   | col2
    ----+---------+------
     A  | Hector  |  263
     A  | James   |    2
     B  | Stel    |   58
     B  | Jones   |   85
     B  | Emmons  |    3
     C  | Tobin   |   15
     D  | Dyer    |    3
     D  | Formine |    6
    (8 rows)
    
    dbadmin=> SELECT PK, col1, col2 FROM z LIMIT 1 OVER (PARTITION BY PK ORDER BY col2 DESC);
     PK |  col1   | col2
    ----+---------+------
     A  | Hector  |  263
     B  | Jones   |   85
     C  | Tobin   |   15
     D  | Formine |    6
    (4 rows)
    
    
  • slc1axjslc1axj Vertica Customer

    I'm getting an error: SQL Error [4856] [42601]: [Vertica]VJDBC ERROR: Syntax error at or near "where"

    Here's my SQL:

    select
    "ebin added by" || "Intake ID" || Source as PK
    ,cast("intake id" as varchar) as IntakeID
    ,p.*
    from ABC_PRD.abc_portal p LIMIT 1 OVER(PARTITION BY "ebin added by" || "Intake ID" || Source ORDER BY "intake id" desc)
    where "Request Submit Date" >= CURRENT_DATE()-11
    and (Business = 'CORE' OR Business = 'SMS')
    and (Source = 'ECIN' or Source = 'Curaspan')

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file