Remove Characters in IP Address
Hello Vertica friends,
Need some help. I have a table with a column that holds IP Addresses. The column datatype is VARCHAR, so users have been putting all sorts of things in that column. For example: 'Unknown', 'p-alive', and lots of other strings.
So what I need is a query that removes all characters and only gives me IP Addresses from this table. I was able to remove some of the junk using LEFT function, but I need something that only retrieves numbers and periods. I have also used WHERE IPAddress LIKE '%Unknown%', but this logic will fail at some point.
Perhaps Vertica has a way of identifying numeric values and dots (periods). That would work
Any help would be appreciated.
0
Comments
I think a REGEXP can help...
Example:
dbadmin=> select ip, case regexp_count(ip, '(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b') when 1 then ip else '' end valid_ip from ip_test; ip | valid_ip ---------------+-------------- 10.10.12.111 | 10.10.12.111 192.168.1.1 | 192.168.1.1 192.168.1.X | Unknown | p-alive | 192.168.1.256 | (6 rows)Thank you very much. I will try this out now. You guys are always a great help!!!!!!!!!!!!
No problem! Thanks for the positive feedback
You can also use the INET_ATON() function to explicitly parse IP addresses
select INET_ATON('1.2.3.4') is null, INET_ATON('unknown') is null;
..and then INET_NTOA to convert it back to a clean address.
See https://my.vertica.com/docs/8.1.x/HTML/index.htm#Authoring/SQLReferenceManual/Functions/String/INET_ATON.htm?TocPath=SQL%20Reference%20Manual|SQL%20Functions|IP%20Conversion%20Functions|_____1
Yeah! The INET_ATON function is cleaner and most likely performs much better than a REGEX!
Nice recommendation @Ben_Vandiver
Example:
dbadmin=> select ip, case when INET_ATON(ip) is not null then ip else '' end valid_ip from ip_test; ip | valid_ip ---------------+-------------- 10.10.12.111 | 10.10.12.111 192.168.1.1 | 192.168.1.1 192.168.1.X | Unknown | p-alive | 192.168.1.256 | 1.2.3.4 | 1.2.3.4 (7 rows) dbadmin=> select ip, case regexp_count(ip, '(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b') when 1 then ip else '' end valid_ip from ip_test; ip | valid_ip ---------------+-------------- 10.10.12.111 | 10.10.12.111 192.168.1.1 | 192.168.1.1 192.168.1.X | Unknown | p-alive | 192.168.1.256 | 1.2.3.4 | 1.2.3.4 (7 rows)