Options

How to load a csv into a table

Hi Guys,
I have a simple csv file containing three columns ABC,DEF,XYZ
I have a table named masterdata in vertica with the exact three columns.

How do i get data from the csv into the data table?

i tried:
COPY masterdata FROM '/home/dbadmin/Desktop/data.csv' DELIMITER ',';

But all it says is 
ROWS LOADED 
-------------------------
                           0
(1 row)

What to do?

Comments

  • Options
    Navin_CNavin_C Vertica Customer
    Hello Anmol,

    Vertica accepts simple Text delimited files for loading data into database.
    But when it comes to CSV file, you need to use a additional parser.

    Parser name is TraditionalCsvParser.cpp.

    First you need to install the Traditional CSV parser in Vertica.

    Steps to Install:
    1. Find the parser at '/opt/vertica/sdk/examples/ParserFunctions:'
    2. check the file ParserFunctions.sql,this will help you to install Traditional CSV parser
    3. Create Library
    4. Create Parser

    Then coming to your example on loading data through CSV files.

    Suppose you gave a csv file like this having 3 columns

    Test.csv
    ABC,DEF,XYZ
    SDS,EDE,RFR
    WDW,EDF,DFG
    QWE,TYU,FGH
    You have a table called masterdata.
    CREATE TABLE NAVIN.MASTERDATA( COLUMN1 VARCHAR,
    COLUMN2 VARCHAR,
    COLUMN3 VARCHAR
    );
    nnani=> select * from NAVIN.MASTERDATA;
     COLUMN1 | COLUMN2 | COLUMN3
    ---------+---------+---------
    (0 rows)
    Time: First fetch (0 rows): 6.445 ms. All rows formatted: 6.485 ms
    Now using this parser to load data into MASTERDATA table.
    nnani=> copy navin.masterdata from '/home/dbadmin/navin/test.csv' parser       csvparser();     Rows Loaded
    -------------
               4
    (1 row)
    Time: First fetch (1 row): 2027.225 ms. All rows formatted: 2027.272 ms
    Checking the Data in table.
    nnani=> select * from navin.masterdata;
     COLUMN1 | COLUMN2 | COLUMN3
    ---------+---------+---------
     WDW     | EDF     | DFG
     ABC     | DEF     | XYZ
     QWE     | TYU     | FGH
     SDS     | EDE     | RFR
    (4 rows)
    Time: First fetch (4 rows): 18.602 ms. All rows formatted: 18.648 ms


    Hope this helps.








  • Options
    Thats awesome Naveen. But i tried this command after creating a masterdata table :

    copy masterdata from local '/home/dbadmin/Desktop/data.csv' WITH DELIMITER ';';

    And it worked. My question, what is the difference between parsing data this way and using the csv parser that you mentioned? Is the data going into a normal table or the flex table when you use a parser. Looking forward to hearing from you.
  • Options
    Navin_CNavin_C Vertica Customer
    Hi Anmol,

    Good that it worked.

    Vertica accepts data in Delimited format , but csv file is not a standard format for a delimiter.
    Plus csv is not a accepted file format by Vertica.

    This is where a parser comes into action and tries to simplify things for Vertica.
    Vertica does not understand a .csv file and what a parser does is, it changes/converts the file data which can be understood by vertica .

    Under the hood CSV parser simply tokenizes the strings inside the file and make it acceptable by Vertica.

    The data is directed to the table you wish to (whatever is in your COPY command a flex table / normal columnar table)

    Hope this helps.
  • Options
    HI Navin,

    While creating the Library for CSV paerser.. . I am getting and error.. Below is the details:

    CX=> \set csv_libfile '\'''pwd''/build/TraditionalCsvParser.so\'';CX=> CREATE LIBRARY TraditionalCsvParserLib AS :csv_libfile;
    ROLLBACK 3399:  Failure in UDx RPC call InvokeSetExecContext(): Error calling setupExecContext() in User Defined Object [] at [/scratch_a/release/30493/vbuild/vertica/OSS/UDxFence/vertica-udx-C++.cpp:195], error code: 0, message: Error happened in dlopen(): [/home/dbadmin/build/TraditionalCsvParser.so: cannot open shared object file: No such file or directory]

    Can you please help me with this... 

  • Options
    Navin_CNavin_C Vertica Customer
    Hi Ravi,

    Can you describe the steps you followed to set this package.

    Below are detailed steps to install this package

    1. This package come pre bundled with Vertica in this path 
    /opt/vertica/sdk/examples
    Here you can see a folder named ParserFunction, this is where your TraditionalCsvParser.cpp file exists


    2. First you need to compile all these C++ (cpp) files, Compiling is done using make command

    Come to this path
    /opt/vertica/sdk/examples
    Enter following command
    make clean
    make 
    these two command first clean the build directory and then make  / compile your make file (makefile is already existing in your same directory). This will compile all the C++ packages in this directory

    If everything goes good, you should be having \build directory in same location now.
    Inside build directory will be your .so files which will be used by Vertica

    3. Now run the install file for TraditionalCsvParser
    vsql -f parserfunctions.sql
    This will create libraries --> create function in database.


    This should solve the issue.

    Note- If TraditionalCsvParser.so is not created in \build directory then it might be an issue, you might be missing some BOOST headers. Get some boost libraries and this should fix it.
  • Options
    Hi Navin,

    I installed boost-devel package and followed the similar steps and got the libraries created.

    Thanks for helping me.

    Any chance, can you also help me in loading CSV data files into Vertica Flex tables.

    My requirement is that, I have some CSV files which same header names but not the exact number of cloumns in each file. Need to create a table (flex table) which can load all the data from these files... and then move to Main Table.

    Thanks in advance.
  • Options
    Navin_CNavin_C Vertica Customer
    Hi Ravi,

    This might help you.

    How to Load CSV files into Flex Tables

    Hope this helps.
    NC


  • Options
    Hi,

    Thanks so much :)

    can we get in touch personally, over office Lync or FB ?

    Ravi
  • Options
    Navin_CNavin_C Vertica Customer
    Hi Ravi,

    You can get in touch over here.

    Vertica Professionals

    Thanks
    NC
  • Options
    Help me !  
    when i use 'make' command . it occur error 
    'make: *** [/opt/vertica/sdk/examples/build/JavaScalarLib.jar] Error 1' 
    _____________________________________________

    contact via email : wipawee.nay11@lamduan.mfu.ac.th 
  • Options

    Hi NC,

     

    I am also trying to load some data into Vertica and I was able to create the libraries and the /build folder with respective .so no problem. The 'make' completed okay.

     

    However, I am now at this step:

    vsql -f parserfunctions.sql

     

    And, I am getting the following error when I perform that command. Could you please advice at what steps to follow next? or what could be the root cause of my problem?

     

    This is the issue seen: "vsql: FATAL 3781: Invalid username or password"

     

    Thanks!

  • Options

     I guess we need to put the gray metter ot work a bit ?!!!!

    The error is quite obvius ! don`t you think ? You need ot provide it with your credentials:

     

    vsql -U username -w password -d database name -f parserfunctions.sql

     

    Even better, run this : 

    vsql --help
  • Options

    Hi,

     

    I am getting below error while trying to install parser, 

     


    [dbadmin@localhost verticaSDK]$ vsql -U dbadmin -w foo -d GLBVRTC -f ParserFunctions.sql
    vsql:ParserFunctions.sql:12: ROLLBACK 3399: Failure in UDx RPC call InvokeSetExecContext(): Error calling setupExecContext() in User Defined Object [] at [/scratch_a/release/16125/vbuild/vertica/OSS/UDxFence/vertica-udx-C++.cpp:195], error code: 0, message: Error happened in dlopen(): [/home/dbadmin/verticaSDK/build/BasicIntegerParser.so: cannot open shared object file: No such file or directory]

     

    dbadmin=> select version();
    version
    ------------------------------------
    Vertica Analytic Database v7.1.2-0

     

    br,

    İsmail

  • Options

     Hi there is a similar post that was answered by @Navin_C

    Chekc the link of the post.

     

    Hope this helped

  • Options
    try this - copyt databse.tablename FROM LOCAL 'file path.csv' ENCLOSED BY '\"' delimiter ',' SKIP 1; "
  • Options

    @Navin and @VC thank you for posting such clear instructions.  Any insight how to get past an error finding the csv parser?

     

    $> yum install boost-devel
    $> cd /opt/vertica/sdk/examples
    $> make clean 
    $> make
    $> vsql -f ParserFunctions.sql # plus credentials
    $> vsql -U 
    ==> create table scratch.test (column1 varchar, column2 varchar, column3 varchar);

     

    All of the above steps went great.  The following lines yielded the same error:

     

    copy scratch.test from '/home/root/data/test.csv' parser csvparser();

     

    ERROR 3457:  Function csvparser() does not exist, or permission is denied for csvparser()
    HINT: No function matches the given name and argument types. You may need to add explicit type casts

     

    I get the same results trying other variations:

    copy scratch.test from '/home/root/data/test.csv' with parser csvParser();
    copy scratch.test from '/home/root/data/test.csv' with parser CsvParser();
    copy scratch.test from '/home/root/data/test.csv' with parser TraditionalCsvParser();
    copy scratch.test from '/home/root/data/test.csv' with parser Rfc4180CsvParser();

     

     

     @VC's solution with the delimiter works great for many (even most?) CSVs, but can fail with "rows loaded: 0" for RFC4180 compliant files that may have embedded quotes, line breaks, etc.

     

    copy scratch.test FROM LOCAL 'file path.csv' ENCLOSED BY '"' delimiter ',' SKIP 1;

     

    EDIT:  this was resolved by omitting the DROP LIBRARY statements at the end of ParserFunctions.sql and adding with to the value.

  • Options

    Hi Navin, I tried make clean and then i got the following results: -

     

    $ make clean
       rm -rf /tmp/libcsv-3.0.1
       rm -f /opt/vertica/sdk/examples/build/*.so
       rm -f /opt/vertica/sdk/examples/build/*.jar
       rm -rf /opt/vertica/sdk/examples/build/Java*
       rmdir /opt/vertica/sdk/examples/build >/dev/null 2>&1 || true

     

    then when I tried

    $ make

    I get the following error:

     

    g++ -I /opt/vertica/sdk/include -I HelperLibraries -g -Wall -Wno-unused-value -shared -fPIC -o /opt/vertica/sdk/examples/build/ScalarFunctions.so ScalarFunctions/*.cpp /opt/vertica/sdk/include/Vertica.cpp
    /bin/sh: g++: command not found
    make: *** [/opt/vertica/sdk/examples/build/ScalarFunctions.so] Error 127

  • Options
    SruthiASruthiA Vertica Employee Administrator

    Hi,

     

       it looks like you dont have g++ installed. Please install it

     

     

    Sruthi

     

     

  • Options

    Update, Vertica 7.1+ has an awesome RFC4180 compliant CSV parser built in, avoiding the need to compile C++ and install custom libraries.

     

     

    COPY my_schema.my_table 
    FROM STDIN 
    PARSER fcsvparser(reject_on_materialized_type_error=true)
    REJECTMAX 1 
    ABORT ON ERROR 
    REJECTED DATA AS TABLE my_schema.mytable_rejects;

     

  • Options

    Oh!! I missed that completely. Thank you ShruthiA

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file