We're Moving!

The Vertica Forum is moving to a new OpenText Analytics Database (Vertica) Community.

Join us there to post discussion topics, learn about

product releases, share tips, access the blog, and much more.

Create My New Community Account Now


COPY FROM LOCAL "Too many columns found" — Vertica Forum

COPY FROM LOCAL "Too many columns found"

Is there a way to inspect the content of a "COPY...FROM LOCAL..." statement? We want to see the corrupt data as seen on the database server.

Comments

  • Can you post a sample of your data ?
  • It's a fire hose--about 260,000 rows per second, Most are inserted without an error. We see a few dozen of these errors per minute, so we are looking for ways to investigate.
  • Is your data enclosed ? by any comma or string ? 
    What is you copy cmd look like ?

  • In database Catalog folder should be CopyErrorLogs directory.
    # list files and sort by modification time, oldest first
    $ ls -lrt
    In case of Client-Server load check error_messages table.
    Example
    CREATE TABLE public.stdin
    (
        id int,
        c char(1)
    );

    daniel@synapse ~ $ vsql -c "copy stdin from local stdin direct"
    Enter data to be copied followed by a newline.
    End with a backslash and a period on a line by itself.
    >> 1|a
    >> 2|b
    >> c|3
    >> 4|d
    >> CREATE TABLE public.stdin
    (
        id int,
        c char(1)
    );

    daniel@synapse ~ $ vsql -c "copy stdin from local stdin direct"
    Enter data to be copied followed by a newline.
    End with a backslash and a period on a line by itself.
    >> 1|a
    >> 2|b
    >> c|3
    >> 4|d
    >> \.
     Rows Loaded
    -------------
               3
    (1 row)

    daniel=> \x
    Expanded display is on.
    daniel=> select * from error_messages where message like 'Invalid%column%' order by event_timestamp desc;
    -[ RECORD 1 ]---+---------------------------------------------
    event_timestamp | 2014-05-20 21:45:23.745288+03
    node_name       | v_dev_node0001
    user_id         | 45035996273704962
    user_name       | daniel
    session_id      | synapse-3205:0x17b3
    request_id      | 0
    transaction_id  | 45035996273709962
    statement_id    | 1
    error_level     | ERROR
    error_code      | 67788930
    message         | Invalid integer format 'c' for column 1 (id)
    detail          |
    hint            |
    #46;
     Rows Loaded
    -------------
               3
    (1 row)

    daniel=> \x
    Expanded display is on.
    daniel=> select * from error_messages where message like 'Invalid%column%' order by event_timestamp desc;
    -[ RECORD 1 ]---+---------------------------------------------
    event_timestamp | 2014-05-20 21:45:23.745288+03
    node_name       | v_dev_node0001
    user_id         | 45035996273704962
    user_name       | daniel
    session_id      | synapse-3205:0x17b3
    request_id      | 0
    transaction_id  | 45035996273709962
    statement_id    | 1
    error_level     | ERROR
    error_code      | 67788930
    message         | Invalid integer format 'c' for column 1 (id)
    detail          |
    hint            |

  • The data is CSV. Each field is numeric.

    A code rollback resolved the issue. Fortunately cooperation won the day. But it'd be nice to have evidence to present for making a compelling argument.

    Thanks!
  • Jack Kidwell - way to troubleshoot loaded data.

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file