

Issue with ExternalSource — Vertica Forum

Issue with ExternalSource

Scenario 1  - remote file:

A remote server has a file with 15K rows.

A shell script connects to this remote server over ssh and cats the file to STDOUT.

Vertica has an external table defined like this (abbreviated, no details):

create external table ...
( columns matching the columns of the remote file )
as copy with source public.ExternalSource(cmd='./shell_to_cat_file')

 

Now, if I run select * from this external table, I get a random number of rows from the remote file: 267 rows on one run, 464 on the next, and so on, but never the full set of 15K rows.

 

If I create an external table that dumps the output from the remote server to a file, like this:

create external table ...
( columns matching the columns of the remote file )
as copy with source public.ExternalSource(cmd='./shell_to_cat_file > some_file.txt')

and then check how many rows were unloaded from the remote file to the local one through this external table, the file "some_file.txt" has all 15K rows from the remote file.
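The row counts above can also be cross-checked outside Vertica by running the same command string and counting the newline-terminated rows it writes to STDOUT. The following is only a minimal sketch in Python 3; the helper name `count_rows` is hypothetical and not part of Vertica or ExternalSource, and it assumes one row per line.

```python
import subprocess


def count_rows(cmd):
    """Run cmd through the shell and count newline characters on its STDOUT.

    Hypothetical helper for illustration. A final row without a trailing
    newline is not counted.
    """
    proc = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE)
    rows = 0
    # Read fixed-size chunks so a large output is never held in memory at once.
    for chunk in iter(lambda: proc.stdout.read(65536), b""):
        rows += chunk.count(b"\n")
    proc.wait()
    if proc.returncode != 0:
        raise RuntimeError("command exited with status %d" % proc.returncode)
    return rows
```

For example, `count_rows('./shell_to_cat_file')` could be compared with the number of rows that select * returns from the external table.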

 

Scenario 2 - local file:

If the 15K-row file is local rather than remote, and I create an external table like this:

create external table ...
( columns matching the columns of the file )
as copy with source public.ExternalSource(cmd='cat some_file.txt')

the result is also random: it never returns all 15K rows.

 

If I create an external table that reads the same local file without ExternalSource:

create external table ...
( columns matching the columns of the file )
as copy FROM 'some_file.txt';

the result of select * is accurate: all 15K rows are returned.

 

The question is why ExternalSource behaves differently in the scenarios above. This looks like a bug to me.

Otherwise, what am I missing?

 

BTW: Vertica 7.1, 1 node, licensed

 

Thanks,

Oleg

 

Comments

  • If I create a Python script to output the file content, it works just fine:

     

    Python snippet:

    #!/usr/bin/python
    # Python 2 syntax (print statement): writes the whole file to STDOUT

    fname='/var/tmp/new.txt'

    with open(fname, 'r') as fin:
        print fin.read()

    Table definition:

    create external table ..
    ….
    AS copy
    with source ExternalSource(cmd='/var/tmp/print_file.py')
    ;

     

    So I guess it can be solved with a workaround like this.

     

    ~ Oleg
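A variant of the workaround above that streams the file instead of reading it into memory in one piece might look like the sketch below. It assumes Python 3 is available on the node. The function name `stream_file` and the command-line argument are illustrative and not taken from the thread. Whether this behaves any differently under ExternalSource than the script above has not been verified here.

```python
#!/usr/bin/env python3
import shutil
import sys


def stream_file(path, out):
    """Copy the file at path to the binary stream out in chunks, then flush."""
    with open(path, "rb") as fin:
        shutil.copyfileobj(fin, out)
    # Flush explicitly so buffered data is written before the process exits.
    out.flush()


if __name__ == "__main__" and len(sys.argv) > 1:
    # Assumed usage: the file path is passed as the first argument.
    stream_file(sys.argv[1], sys.stdout.buffer)
```

Unlike the Python 2 `print` statement, this writes the bytes unchanged and does not append an extra newline after the last row.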

     
