load geography table from table of lat/long using ST_GeographyFromText



I have a Vertica table with millions of positions (positiondate timestamp, latitude numeric, longitude numeric) and I want to load these into another Vertica table with a geography type column.


Most of the examples I have found use an insert with values as below (this works fine):

insert into table(geog) values(ST_GeographyFromText('POINT(-85 15)');


I want to insert by selecting from the positions table similar to below (this does not work: could not find function defintion):

insert into positions(geog) select ST_GeographyFromText('POINT('||cast(longitude as varchar(10)) ||' '|| cast(latitude as varchar(10))||')') from positions;


I am stumped. Any ideas or tips? I am used to casting as geography type in SQL Server, is there something similar in Vertica?


Thanks in advance for any ideas!


  • Options

    Hi jdobbins,


    Have you installed HP Vertica Place? You didn't mention it in your post so I just want to make sure you have that installed before we get too far ahead of ourseleves. You'll need to have the matching server version. For example: HP Vertica 7.1.1 and HP Vertica Place 7.1.1. You won't get access to the GEOGRAPHY or GEOMETRY type until you have the HP Vertica Place package installed.


    Here's a link to the install guide if you don't know where to find it: http://my.vertica.com/docs/7.1.x/HTML/index.htm#Authoring/Place/InstallingPlaceOverview.htm


    I think you are really close to the answer. Your use case is very similar to a blog post I wrote recently. Give our GitHub example a look (https://github.com/vertica/Place/blob/master/HP_Vertica_Place_Example_Git.sql).


    I think this line in particular might get you on the right track:

    COPY usr_data(usr_id, date_time, x filler long varchar, y filler long varchar, g as ST_GeomFromText('POINT(' || x || ' ' || y || ')')) from LOCAL 'place_output.csv' delimiter ',' enclosed by '';


    Doing the load like this will allow you to avoid loading the data once and then minipulating it. This method performs the data minipulation on load.


    Give this a try and then let me know if you have any other issues. I'm happy to help you get through this.




  • Options

    Hi Casey,


    Thanks for the quick response. I will give your method a try (I am about to be out of the office for a week). I do have Place installed and I can access the functions. 


    I have been loading Vertica using Kettle (a job loads data from Oracle into Vertica). I had hoped there would be a way to do table to table loading/conversion, but maybe I can figure out how to automate this using Kettle.


    Thanks again,







  • Options

    Hi Jimmy,


    I needed a little more time to think about this one, but I've got a solution for you:


    CREATE TABLE tmp AS SELECT posistiondate, ST_GeographyFromText('POINT('||long||' '||lat||')') from foo_place;


    This will take your table and create a new table with a column for timestamps and a column for GEOGRAPHY data-- combining your data from your long and lat columns.


    Hopefully you see this before you head out of town, or you see it when you get back.




  • Options

    Hi Casey,


    I am in for a little bit today before I am out tomorrow and Wednesday. Thanks for the suggested SQL, I am still getting the 2858 error, "could not find function definition." To make things simple, I have everything in the public schema (which is where the Vertica Place functions are stored). I think everything is installed correctly since I can call the ST_GeographyFromText function without error (as follows):


    SELECT public.ST_GeographyFromText('POINT(-85.2312345 33.2645612)' USING PARAMETERS ignore_errors='y');


    In the positions table I have latitude and longitude stored as numeric(15,6). In the following query, I cast them to varchar(15) in the following script that returns the same "could not find function definition" 2858 error:


    create table tmp as 
    select positiondate, public.ST_GeographyFromText('POINT('||cast(longitude as varchar(15))||' '||cast(latitude as varchar(15))||')' USING PARAMETERS ignore_errors='y') as geogfield
    from public.positions;



    I am somewhat new to Vertica. If I need to do this through a text file load (I am going to try that now), is there a suggested way or best practice for automating that? These positions come from Oracle and arrive every ten minutes. I have been very happy using Kettle to load the data hourly thus far. 





  • Options

    Hi Jimmy,


    You don't need to cast the long and lat column. Since the columns are already a numeric column type (int, numeric, etc.) you don't need to cast them. I recreated your workflow using a slightly modified version of your query. When I performed these queries I was able to load my test data:

    dbadmin=> \d foo_place
    List of Fields by Tables
    Schema | Table | Column | Type | Size | Default | Not Null | Primary Key | Foreign Key
    public | foo_place | posistiondate | timestamp | 8 | | f | f |
    public | foo_place | lat | numeric(37,15) | 16 | | f | f |
    public | foo_place | long | numeric(37,15) | 16 | | f | f |
    (3 rows)

    -- run the create table statement
    dbadmin=> create table tmp_2 as select posistiondate, public.ST_GeographyFromText('POINT('||long||' '||lat||')' USING PARAMETERS ignore_errors='y') as geogfield from foo_place;

    -- view the tmp_2 table
    dbadmin=> select st_astext(geogfield) from tmp_2;
    POINT (55.98 45.32)
    POINT (54.12 78.65)
    POINT (33.23 67.45)
    (3 rows)


    Try running the following query on your data:

    CREATE TABLE tmp AS SELECT positiondate, public.ST_GeographyFromText('POINT('||longitude||' '||latitude||')' USING PARAMETERS ignore_errors='y') AS geogfield FROM public.positions;

    The only thing I changed in the query was the casting. 


    Let me know if that works.





  • Options



    That did it! Thank you so much for your help! As a side note, the insert statement below will work as follows for anyone else who might be running into this (I also reinstalled Vertica Place and I think I might not have installed it properly on all nodes...that probably contributed to the errors I was getting):


    insert into tmp2(positiondate, geogfield)
    SELECT positiondate, public.ST_GeographyFromText('POINT('||longitude||' '||latitude||')' USING PARAMETERS ignore_errors='y') AS geogfield
    FROM public.positions;


    Thanks again!



Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file