Accessing path of loaded file from S3

Is there a way for COPY to access the S3 bucket/folder of a loaded file and pass it back to a field?
This would be super useful in case of having to delete / reload a particular folder that has some issue with the data.
Thanks for any pointers.



  • marcothesane - Administrator

    I don't think I understand what you are actually asking for.
    A COPY FROM S3 command looks, for example, like this:
    COPY myschema.mytable
    FROM 'S3://my-bucket/my-dir/my-file.csv.gz' GZIP
    Do you want to store the information of 'S3://my-bucket/my-dir/my-file.csv.gz' or parts of it in a column?
    Cheers -

  • Yes, that's correct. I'd like to capture /my-dir/ specifically. I found the CURRENT_LOAD_SOURCE() function, which can be referenced in COPY; however, it only returns 'my-file.csv.gz' and not the rest of the full URI of the S3 path.
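
    For completeness, since CURRENT_LOAD_SOURCE() only yields the file name, the directory part could also be derived client-side before issuing the COPY. A minimal Python sketch (the URI and the helper name are illustrative, not from the thread):

    ```python
    from urllib.parse import urlparse
    import posixpath

    def s3_dir(uri):
        # Parse the S3 URI and return only the directory part of the key,
        # e.g. 's3://my-bucket/my-dir/my-file.csv.gz' -> 'my-dir'
        parsed = urlparse(uri)
        return posixpath.dirname(parsed.path).lstrip('/')

    print(s3_dir("s3://my-bucket/my-dir/my-file.csv.gz"))  # my-dir
    ```

    A wrapper script could then pass this value into the load, for example as a constant in an extra column of the COPY statement.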

  • marcothesane - Administrator

    Hmm ... then you can only find it again in the query_requests system table.
    So I have no better idea than to write a script like this:
    COPY myschema.mytable FROM 'S3://my-bucket/my-dir/my-file.csv.gz' GZIP DELIMITER ',' ENCLOSED BY '"' DIRECT;
    SELECT REGEXP_SUBSTR(request,'from ''s3://my-bucket/([^/]+)/',1,1,'i',1) FROM query_requests
    WHERE request_type='LOAD'
    ORDER BY start_timestamp DESC LIMIT 1;


  • marcothesane - Administrator

    This uses a regular expression to parse the newest COPY command found in query_requests and to extract the part of the string that comes after the FROM keyword and the string stub 's3://my-bucket/' (we know the name of the bucket and only want the sub-directory), but before the next forward slash. The search runs in case-insensitive mode.
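
    The same regular-expression logic can be sketched in Python, assuming the request text stored in query_requests looks like the COPY statement above (the sample string is illustrative):

    ```python
    import re

    # Hypothetical request text, as it might appear in the query_requests table
    request = ("COPY myschema.mytable FROM 'S3://my-bucket/my-dir/my-file.csv.gz' "
               "GZIP DELIMITER ',' ENCLOSED BY '\"' DIRECT;")

    # Mirrors REGEXP_SUBSTR(request, 'from ''s3://my-bucket/([^/]+)/', 1, 1, 'i', 1):
    # case-insensitive match after FROM 's3://my-bucket/, capturing up to the next slash
    m = re.search(r"from 's3://my-bucket/([^/]+)/", request, re.IGNORECASE)
    subdir = m.group(1) if m else None
    print(subdir)  # my-dir
    ```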
