The Vertica Forum recently got a makeover! Let us know what you think by filling out this short, anonymous survey.

What’s the Distance Between Two Zip Codes?

Jim_KnicelyJim_Knicely Administrator
edited February 6 in Tips from the Team

Vertica has a built-in function called DISTANCEV which returns the distance (in kilometers) between two points using the Vincenty formula. Because the Vincenty formula includes the parameters of the WGS-84 ellipsoid model, you need not specify a radius of curvature. You specify the latitude and longitude of both the starting point and the ending point.

Example:

dbadmin=> SELECT DISTANCE(40.4406, -79.9959, 28.3852, -81.5639) Disney_is_this_far_from_home;
Disney_is_this_far_from_home
------------------------------
             1348.12778743491
(1 row)

But what if I want to measure the distance between two zip codes? No problem! But first we’ll need a table that stores the longitude and latitude by zip code.

Longitude and Latitude by zip code data is readily available online. For this example, I grabbed the needed data from here:

https://gist.githubusercontent.com/erichurst/7882666/raw/5bdc46db47d9515269ab12ed6fb2850377fd869e/US%20Zip%20Codes%20from%202013%20Government%20Data

I put the data from the above link into a file called /home/dbadmin/zips.txt and loaded it into a Vertica table:

dbadmin=> CREATE TABLE zipcode_geography (zipcode VARCHAR(5), latitude NUMERIC(25,6), longitude NUMERIC(25,6));
CREATE TABLE

dbadmin=> COPY zipcode_geography FROM '/home/dbadmin/zips.txt' DIRECT;
Rows Loaded
-------------
       33144
(1 row)

I can find the longitude and latitudes for two zips code like this:

dbadmin=> SELECT z1.zipcode, z1.latitude, z1.longitude, z2.zipcode, z2.latitude, z2.longitude
dbadmin->   FROM zipcode_geography z1
dbadmin->  CROSS JOIN zipcode_geography z2
dbadmin->  WHERE z1.zipcode = '15090'  -- Wexford, PA
dbadmin->    AND z2.zipcode = '43210'; -- OSU (Columbus, OH)
zipcode | latitude  | longitude  | zipcode | latitude  | longitude
--------+-----------+------------+---------+-----------+------------
15090   | 40.625015 | -80.067058 | 43210   | 40.005435 | -83.023227
(1 row)

Now I can use the DISTANCEV function to get the distance between the two zip codes in kilometers and miles!

dbadmin=> SELECT distancev(z1.latitude, z1.longitude, z2.latitude, z2.longitude) kms,
dbadmin->        distancev(z1.latitude, z1.longitude, z2.latitude, z2.longitude) / 1.609 miles
dbadmin->   FROM zipcode_geography z1
dbadmin->  CROSS JOIN zipcode_geography z2
dbadmin->  WHERE z1.zipcode = '15090'  -- Wexford, PA
dbadmin->    AND z2.zipcode = '43210'; -- OSU (Columbus, OH)
       kms       |      miles
-----------------+-----------------
260.507361625431 | 161.90637764166
(1 row)

Helpful Link:
https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/SQLReferenceManual/Functions/Mathematical/DISTANCEV.htm

Have fun!

Comments

  • MAWMAW Employee

    Jim, thanks for sharing this Forum posting which I found especially useful as I had not used the DISTANCEV function before.

    Not being a resident of the US meant I was unable to check on the distances from my home Post Code (I’m in the UK) to elsewhere.

    Thankfully, I came across another Postal Codes data set from GeoNames which covers the whole of the world - geonames.org/postal-codes/ (well at least for 86 different countries). This links to a set of downloadable text files – one per country, one complete set (allCountries.zip), plus a larger single file (GB_full.zip) for the UK/GB which includes both outward and inward parts of the postcode. These zip files are to be found here: download.geonames.org/export/zip/.

    These UTF8 encoded, tab-delimited files contain the following fields:

    country code      : iso country code, 2 characters
    postal code       : varchar(20)
    place name        : varchar(180)
    admin name1       : 1. order subdivision (state) varchar(100)
    admin code1       : 1. order subdivision (state) varchar(20)
    admin name2       : 2. order subdivision (county/province) varchar(100)
    admin code2       : 2. order subdivision (county/province) varchar(20)
    admin name3       : 3. order subdivision (community) varchar(100)
    admin code3       : 3. order subdivision (community) varchar(20)
    latitude          : estimated latitude (wgs84)
    longitude         : estimated longitude (wgs84)
    accuracy          : accuracy of lat/lng from 1=estimated to 6=centroid
    

    Having downloaded and unzipped these files to a new folder (/home/dbadmin/projects/geonames), I then created a Vertica schema (geonames), a table (postal_codes) and loaded the data:

    CREATE SCHEMA geonames;
    
    CREATE TABLE geonames.postal_codes
       (
       country_code      CHAR(2),
       postal_code       VARCHAR(20),
       place_name        VARCHAR(180),
       admin_name1       VARCHAR(100),
       admin_code1       VARCHAR(20),
       admin_name2       VARCHAR(100),
       admin_code2       VARCHAR(20),
       admin_name3       VARCHAR(100),
       admin_code3       VARCHAR(20),
       latitude          NUMERIC(25,6),
       longitude         NUMERIC(25,6),
       accuracy          INTEGER
       );
    COPY geonames.postal_codes FROM '/home/dbadmin/projects/geonames/allCountries.txt' DELIMITER E'\t' DIRECT;
    COPY geonames.postal_codes FROM '/home/dbadmin/projects/geonames/GB_full.txt' DELIMITER E'\t' DIRECT;
    

    The allCountries.txt will result in ~1.3m postal codes being loaded, with a further 1.7m from the GB_full.txt file.

    Now, repeating some similar queries to Jim’s I can find the distance from Wexford, PA in the US to Columbus, Ohio, my home postcode in the UK to Newcastle-upon-Tyne and across the world between my home and Wexford.

    SELECT
      'Distance between ' || z1.place_name || ' in ' || z1.admin_name2 || ', ' || z1.admin_name1 ||  ', ' || z1.country_code || ' (' || z1.postal_code || ') ' ||
      ' and ' || z2.place_name || ' in ' || z2.admin_name2 || ', ' || z2.admin_name1 || ', ' || z2.country_code || ' (' || z2.postal_code || ') is ' ||
      distancev(z1.latitude, z1.longitude, z2.latitude, z2.longitude) || ' KM or ' ||
      distancev(z1.latitude, z1.longitude, z2.latitude, z2.longitude) / 1.609 || ' miles' as From_A_to_B
    FROM
      geonames.postal_codes z1
    CROSS JOIN
      geonames.postal_codes z2
    WHERE
      z1.postal_code = '15090' AND
      z2.postal_code = '43210' AND
      z1.country_code = 'US'   AND
      z2.country_code = 'US'
    
    UNION ALL
    
    SELECT
      'Distance between ' || z1.place_name || ' in ' || z1.admin_name2 || ', ' || z1.admin_name1 ||  ', ' || z1.country_code || ' (' || z1.postal_code || ') ' ||
      ' and ' || z2.place_name || ' in ' || z2.admin_name2 || ', ' || z2.admin_name1 || ', ' || z2.country_code || ' (' || z2.postal_code || ') is ' ||
      distancev(z1.latitude, z1.longitude, z2.latitude, z2.longitude) || ' KM or ' ||
      distancev(z1.latitude, z1.longitude, z2.latitude, z2.longitude) / 1.609 || ' miles'
    FROM
      geonames.postal_codes z1
    CROSS JOIN
      geonames.postal_codes z2
    WHERE
      z1.postal_code = 'SA3 2DR' AND
      z2.postal_code = 'NE1 1SG' AND
      z1.country_code = 'GB'   AND
      z2.country_code = 'GB'
    
    UNION ALL
    
    SELECT
      'Distance between ' || z1.place_name || ' in ' || z1.admin_name2 || ', ' || z1.admin_name1 ||  ', ' || z1.country_code || ' (' || z1.postal_code || ') ' ||
      ' and ' || z2.place_name || ' in ' || z2.admin_name2 || ', ' || z2.admin_name1 || ', ' || z2.country_code || ' (' || z2.postal_code || ') is ' ||
      distancev(z1.latitude, z1.longitude, z2.latitude, z2.longitude) || ' KM or ' ||
      distancev(z1.latitude, z1.longitude, z2.latitude, z2.longitude) / 1.609 || ' miles'
    FROM
      geonames.postal_codes z1
    CROSS JOIN
      geonames.postal_codes z2
    WHERE
      z1.postal_code = 'SA3 2DR' AND
      z2.postal_code = '15090' AND
      z1.country_code = 'GB'   AND
      z2.country_code = 'US';
    
                                                                                        From_A_to_B
    -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
     Distance between Wexford in Allegheny, Pennsylvania, US (15090)  and Columbus in Franklin, Ohio, US (43210) is 259.850262859359 KM or 161.497988104014 miles
     Distance between Southgate in West Glamorgan, Wales, GB (SA3 2DR)  and Newcastle upon Tyne in Tyne & Wear, England, GB (NE1 1SG) is 411.973681496387 KM or 256.043307331502 miles
     Distance between Southgate in West Glamorgan, Wales, GB (SA3 2DR)  and Wexford in Allegheny, Pennsylvania, US (15090) is 5727.5700557061 KM or 3559.70792772287 miles
    (3 rows)
    

    If this has in any way whetted your appetite, GeoNames also has a number of other interesting and useful data sets (download.geonames.org/export/dump/) all available under the Creative Commons Attribution 4.0 Licence (https://creativecommons.org/licenses/by/4.0/).

    The world is your oyster – get out there and explore!

  • Jim_KnicelyJim_Knicely Administrator

    @MAW - Awesome post and thanks for sharing! So you and I are only "5727.5700557061 KM or 3559.70792772287 miles" from one another? Want to get together for a beer tomorrow? :)

Sign In or Register to comment.

Can't find what you're looking for? Search the Vertica Documentation, Knowledge Base, or Blog for more information.