We're Moving!

The Vertica Forum is moving to a new OpenText Analytics Database (Vertica) Community.

Join us there to post discussion topics, learn about

product releases, share tips, access the blog, and much more.

Create My New Community Account Now


Remove Duplicate Values from a String — Vertica Forum

Remove Duplicate Values from a String

Jim Knicely authored this tip.

Do you ever wonder how to get rid of those pesky duplicate values from a string? One way of doing that is via Vertica's REGEXP_REPLACE function.

Example:

dbadmin=> SELECT c1, REGEXP_REPLACE(c1, '(?<=\, |^)(.+?)\, (?=(.+\, )?\1(\, |$))', '') FROM test;
           c1           | regexp_replace
------------------------+----------------
 A, B, B, A             | B, A
 A, B, C, A, T          | B, C, A, T
 A, B                   | A, B
 D, A, B, C, C, D, D    | A, B, C, D
 Z, X, Z, X, A, A, A, Q | Z, X, A, Q
 *, *, *, *, *          | *
(6 rows)

Have Fun!

Comments

  • Jim_KnicelyJim_Knicely - Select Field - Administrator
    edited June 2018

    Here is an update REGEX for removing duplicate values from a string:

    dbadmin=> select c1, regexp_replace(regexp_replace(c1, '\s*([^|]*?)\s*'), '(?<=\,|^)(.+?)\,(?=(.+\,)?\1(\,|$))') from test;
                                                          c1                                                      |                       regexp_replace
    --------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------
    A, B                                                                                                         | A,B
    D, A, B, C, C, D, D                                                                                          | A,B,C,D
    Z, X, Z, X, A, A, A, Q                                                                                       | Z,X,A,Q
    A, B, B, A                                                                                                   | B,A
    65281,65283,65288,65296,65312,65320,65328,65352,65360,65368, 65281,65283,65288,65296,65320,65352,65360,65368 | 65312,65328,65281,65283,65288,65296,65320,65352,65360,65368
    A, B, C, A, T                                                                                                | B,C,A,T
          65283,     65283, 65320,65320                                                                          | 65283,65320
    (7 rows)
    

    It handles spaces following the commas!

Sign In or Register to comment.