Remove Characters in IP Address

edited November 2017 in General Discussion

Hello Vertica friends,

Need some help. I have a table with a column that holds IP Addresses. The column datatype is VARCHAR, so users have been putting all sorts of things in that column. For example: 'Unknown', 'p-alive', and lots of other strings.

So what I need is a query that removes all characters and only gives me IP Addresses from this table. I was able to remove some of the junk using LEFT function, but I need something that only retrieves numbers and periods. I have also used WHERE IPAddress LIKE '%Unknown%', but this logic will fail at some point.

Perhaps Vertica has a way of identifying numeric values and dots (periods). That would work

Any help would be appreciated.

Comments

  • Jim_KnicelyJim_Knicely Administrator
    edited November 2017

    I think a REGEXP can help...

    Example:

    dbadmin=> select ip, case regexp_count(ip, '(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b') when 1 then ip else '' end valid_ip from ip_test;
          ip       |   valid_ip
    ---------------+--------------
     10.10.12.111  | 10.10.12.111
     192.168.1.1   | 192.168.1.1
     192.168.1.X   |
     Unknown       |
     p-alive       |
     192.168.1.256 |
    (6 rows)
    
  • Thank you very much. I will try this out now. You guys are always a great help!!!!!!!!!!!!

  • Jim_KnicelyJim_Knicely Administrator

    No problem! Thanks for the positive feedback :)

  • Ben_VandiverBen_Vandiver Employee
    edited November 2017

    You can also use the INET_ATON() function to explicitly parse IP addresses
    select INET_ATON('1.2.3.4') is null, INET_ATON('unknown') is null;

    ..and then INET_NTOA to convert it back to a clean address.

    See https://my.vertica.com/docs/8.1.x/HTML/index.htm#Authoring/SQLReferenceManual/Functions/String/INET_ATON.htm?TocPath=SQL%20Reference%20Manual|SQL%20Functions|IP%20Conversion%20Functions|_____1

  • Jim_KnicelyJim_Knicely Administrator
    edited November 2017

    Yeah! The INET_ATON function is cleaner and most likely performs much better than a REGEX!

    Nice recommendation @Ben_Vandiver

    Example:

    dbadmin=> select ip, case when INET_ATON(ip) is not null then ip else '' end valid_ip from ip_test;
          ip       |   valid_ip
    ---------------+--------------
     10.10.12.111  | 10.10.12.111
     192.168.1.1   | 192.168.1.1
     192.168.1.X   |
     Unknown       |
     p-alive       |
     192.168.1.256 |
     1.2.3.4       | 1.2.3.4
    (7 rows)
    
    dbadmin=> select ip, case regexp_count(ip, '(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b') when 1 then ip else '' end valid_ip from ip_test;
          ip       |   valid_ip
    ---------------+--------------
     10.10.12.111  | 10.10.12.111
     192.168.1.1   | 192.168.1.1
     192.168.1.X   |
     Unknown       |
     p-alive       |
     192.168.1.256 |
     1.2.3.4       | 1.2.3.4
    (7 rows)
    

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file