Ctrl A Delimeter

Hi All, 

 

I'm trying to copy the file which contains ^A as delimeter and every line is terminated by ^B for next line

 

File :

10000^AA17^Aa17^A2423^B
10001^AA18^Aa18^A2423^B
10002^AA19^Aa19^A2423^B
10003^AA20^Aa20^A2423^B
10004^AA21^Aa21^A2423^B
10005^AA22^Aa22^A2423^B
10006^AA23^Aa23^A2423^B
10007^AA24^Aa24^A2423^B
10008^AA25^Aa25^A2423^B

 

Can you please help in loading the file using copy command

 

Thanks 

Santosh

Comments

  • Hi!

     

    Is it <Ctrl> + <A> or 2 chars <^A> ?

     

    By docs:

    DELIMITER -

    Indicates the single ASCII character used to separate columns within each record of a file. The default in HP Vertica is a vertical bar (|). You can use any ASCII value in the range E'\000' to E'\177', inclusive. You cannot use the same character for both the DELIMITER and NULL parameters. For more information, see Loading UTF-8 Format Data in the Administrator's Guide.

  • Hi Skwa, 

     

     

    Its Control A (^A) i think from the files which i recieved 

     

    I tried to use the Unicode for the same but not lucky to resolve 

     

    i used E\001 or U&'\0001' but unable to load the data 

     

    so used as below

     

    escape as E'A'
    delimiter as E'\^'
    record terminator E'\^B'

     

    it actually worked but doesn't know how much acurate it is ?

     

    can you please check once & help me to resolve. 

     

     

    Thanks 

    Santosh

     

  • Hi!

     

     

    For my opinion its a good choise for delimiter and record separator.

     

    • delimiter = <Ctrl+A> = '\001'
    • record terminator = <Ctrl+B> = '\002'
    • table definition: create table foo(id int, name varchar);
    • in vim 3 row looks:

     

    10000^AA17^B10001^AA18^B10003^AA19^B

     

     

    From vsql:

     

    dev=> copy foo from '/tmp/data.csv' direct delimiter e'\001' record terminator e'\002' abort on error;
     Rows Loaded
    -------------
               3
    (1 row)

    dev=> select * from foo;
      id   | name
    -------+------
     10000 | A17
     10001 | A18
     10003 | A19
    (3 rows)

     

    From terminal(bash):

     

    $ vsql -c "copy foo from '/tmp/data.csv' delimiter e'\001' record terminator e'\002' abort on error"
     Rows Loaded
    -------------
               3
    (1 row)

    $ vsql -c "select * from foo;"
      id   | name
    -------+------
     10000 | A17
     10001 | A18
     10003 | A19
     10000 | A17
     10001 | A18
     10003 | A19
    (6 rows)

     

     

    How to get same output in bash(with fetch/select):

     

    daniel@synapse:~$ vsql -F$'\001' -R$'\002' -Anqtc "select * from foo;"
    10000^AA17^B10001^AA18^B10003^AA19^B10000^AA17^B10001^AA18^B10003^AA19

     

    PS

     

    ASCII Control Codes:

    <Ctrl+A> = SOH = e'\001'

    <Ctrl+B> = STX = e'\002'

     

  • Hi , 

     

    Thanks but we are still unable to load the data with the ASCII values as Delimiter 

     

    the data gets into rejection tables and 0 rows loaded 

     

    is there any server settings to accept special ASCII values? 

     

     

    if i used the below copy command then the data got loaded but with few rejections but i think this is not the correct way to load the data 

     

    Query 

     

    COPY abc.Demo_new(
    col1,
    col2,
    col3,
    col4
    )
    FROM '/Demo.txt'
    REJECTED DATA AS TABLE abc.Demo_new_rejected
    skip 1
    delimiter as E'\^'
    record terminator E'\^B'
    --delimiter e'\001'
    --record terminator e'\002'
    direct;

     

     

    Can you please help me or any alternate solution to load the data

     

    Thanks 

    Santosh 

  • Hi!

     

    >> i think this is not the correct way

    No, by your description its isn't correct

     

    >> Can you please help me or any alternate solution to load the data

    Can you provide 3 rows of data(for example write 1KB of data to separate file) without any changes?

     

     

  • Hi, 

     

    PFA for the sample file (.csv)

     

    can you please check once & help us in loading the file. 

     

     

    Thanks 

    Santosh.

  • Hi!


    1. It's not <Ctrl+A> and <Ctrl+B>, its 2 chars:

        * delimiter <^> + <A> (2 chars)
        * record terminator <^> + <B> + <\r\n> (4 chars)

    2. Also its with BOM

    $ file sample.csv
    sample.csv: UTF-8 Unicode (with BOM) text, with CRLF line terminators

    Line terminator isn't a problem, but delimiter and BOM - it's a problem:

    1. remove BOM

    2. convert data to a single char as field separator

     

    ---

    On Linux it's very easy, but I don't know your environment.

     

  • Thank you very much 

     

    will check once & update 

     

     

    Thanks 

    Santosh

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file