The Vertica Forum recently got a makeover! Let us know what you think by filling out this short, anonymous survey.
Accessing path of loaded file from S3

Leave a Comment
Can't find what you're looking for? Search the Vertica Documentation, Knowledge Base, or Blog for more information.
Answers
I don't think I understand what you are actually asking for.
A COPY FROM S3 command looks, for example, like this:
COPY myschema.mytable
FROM 'S3://my-bucket/my-dir/my-file.csv.gz' GZIP
DELIMITER ',' ENCLOSED BY '"' DIRECT
Do you want to store the information of
'S3://my-bucket/my-dir/my-file.csv.gz'
or parts of it in a column?Cheers -
Yes, that's correct, I like to capture /my-dir/ specifically. I found the CURRENT_LOAD_SOURCE() function that can be referenced in COPY, however, this only returns 'my-file.csv.gz' and not the rest of full URI of the S3 path.
hmm ... then you can only find it again in the
query_requests
table ...So I have no better idea than write a script like so:
COPY myschema.mytable FROM 'S3://my-bucket/my-dir/my-file.csv.gz' GZIP DELIMITER ',' ENCLOSED BY '"' DIRECT;
SELECT REGEXP_SUBSTR(request,'from ''s3://my-bucket/([^/]+)/',1,1,'i',1) FROM query_requests
WHERE request_type='LOAD'
ORDER BY start_timestamp DESC LIMIT 1;
REGEXP_SUBSTR
my-dir
Using Regular Expressions to parse the newest COPY command found in
query_requests
, and to extract the part of the string that comes, searched in case-insensitive mode, after thefrom
keyword , and the string stubs3://my-bucket/
(we know the name of the bucket and only want the sub-directory), but before the next forward slash.