We're Moving!

The Vertica Forum is moving to a new OpenText Analytics Database (Vertica) Community.

Join us there to post discussion topics, learn about

product releases, share tips, access the blog, and much more.

Create My New Community Account Now


s3EXPORT_PARTITION — Vertica Forum

s3EXPORT_PARTITION

relireli Vertica Customer

hiן all, can I do this in s3EXPORT_PARTITION, with a calculated value?

SELECT s3EXPORT_PARTITION(* USING PARAMETERS url='s3://db001/bystate.date')
   OVER (PARTITION by to_char(ts, 'YYYYMMDD')) from T;

when ts is a date-time value.

Tagged:

Answers

  • Jim_KnicelyJim_Knicely - Select Field - Administrator
    edited January 2021

    Sure!

    Quick example:

    dbadmin=> \d T
                                      List of Fields by Tables
     Schema | Table | Column |   Type    | Size | Default | Not Null | Primary Key | Foreign Key
    --------+-------+--------+-----------+------+---------+----------+-------------+-------------
     public | T     | ts     | timestamp |    8 |         | f        | f           |
     public | T     | c1     | int       |    8 |         | f        | f           |
    (2 rows)
    
    dbadmin=> SELECT * FROM T;
                 ts             | c1
    ----------------------------+----
     2021-01-03 18:44:00.795916 |  1
     2021-01-04 18:44:05.374714 |  2
     2021-01-05 18:44:08.886589 |  3
    (3 rows)
    
    dbadmin=> SELECT s3EXPORT_PARTITION(* USING PARAMETERS url='s3://bucket_name/test/bystate.date') OVER (PARTITION by to_char(ts, 'YYYYMMDD')) from T;
     rows |                         url
    ------+------------------------------------------------------
        1 | https://bucket_name/test/=20210103/bystate.39b4312b.date
        1 | https://bucket_name/test/=20210104/bystate.9e02d3ac.date
        1 | https://bucket_name/test/=20210105/bystate.e7c2c4c4.date
    (3 rows)
    
    dbadmin=> CREATE EXTERNAL TABLE t_ext (partition_column INT, ts TIMESTAMP, c1 INT) AS COPY FROM 's3://bucket_name/test/*/*';
    CREATE TABLE
    
    dbadmin=> SELECT * FROM t_ext ORDER BY 1;
     partition_column |             ts             | c1
    ------------------+----------------------------+----
             20210103 | 2021-01-03 18:44:00.795916 |  1
             20210104 | 2021-01-04 18:44:05.374714 |  2
             20210105 | 2021-01-05 18:44:08.886589 |  3
    (3 rows)
    
  • relireli Vertica Customer

    I run this query
    SELECT s3EXPORT_PARTITION(* USING PARAMETERS url='s3://bucket_name/test/bystate.date') OVER (PARTITION by to_char(ts, 'YYYYMMDD')) from T;
    and I get this error..

    my vertica version 10.0.1-6.

  • relireli Vertica Customer

    @reli said:
    I run this query
    SELECT s3EXPORT_PARTITION(* USING PARAMETERS url='s3://bucket_name/test/bystate.date') OVER (PARTITION by to_char(ts, 'YYYYMMDD')) from T;
    and I get this error..

    my vertica version 10.0.1-6.

    it parse a part of the data but not all the data.
    and in s3 the folder:

    how to do it like this ? date=20201205

  • Jim_KnicelyJim_Knicely - Select Field - Administrator

    Not sure what you mean by "it parse a part of the data but not all the data"...

    Can you run this query without error?

    SELECT to_char(ts, 'YYYYMMDD')) from T;

    For your question "how to do it like this ? date=20201205", you can use a sub-query and name the derived column that you are using as the partition...

    Example:

    dbadmin=> SELECT s3EXPORT_PARTITION(* USING PARAMETERS url='s3://bucket_name/test1/bystate.date') OVER (PARTITION BY date) FROM (SELECT TO_CHAR(ts, 'YYYYMMDD') AS date, * FROM T) foo;
     rows |                            url
    ------+-----------------------------------------------------------
        1 | https://bucket_name/test1/date=20210103/bystate.8d999fa5.date
        1 | https://bucket_name/test1/date=20210104/bystate.60394485.date
        1 | https://bucket_name/test1/date=20210105/bystate.7deec715.date
    (3 rows)
    
  • relireli Vertica Customer

    I will try to explain better,
    I run this query:
    SELECT s3EXPORT_PARTITION(* USING PARAMETERS url='s3://bucket_name/test/bystate.date') OVER (PARTITION by to_char(ts, 'YYYYMMDD'))
    i get this error

    the distinct value from the source table is

    in s3 I get these folders only:

    and there are files in the folders, but not all the folders that need to be.

  • Jim_KnicelyJim_Knicely - Select Field - Administrator

    Can you try running the export command from vsql?

  • relireli Vertica Customer

    I export command from vsql what you send,

    SELECT s3EXPORT_PARTITION(* USING PARAMETERS url='s3://bucket_name/test1/bystate.date') OVER (PARTITION BY date) FROM (SELECT TO_CHAR(ts, 'YYYYMMDD') AS date, * FROM T) foo;
    
    

    this is the result

    but I get the same error, and not all the data in the folder: :neutral:

    ERROR 5861: Error calling processPartition() in User Function s3export_partition at [/data/qb_workspaces/jenkins2/ReleaseBuilds/Hammermill/REL-10_0_1-x_hammermill/build/udx/supported/AWS/S3.cpp:799], error code: 0, message: Unknown exception: No response body. Response code: 4294967295

  • Jim_KnicelyJim_Knicely - Select Field - Administrator

    Hi,

    Hmm. Let's do some trouble shooting. First, can you post the DDL for the table?

    SELECT export_objects('','T');

    Once I have the DDL I will install your version of Vertica to test. What is the exact version you are running?

    SELECT version();

    Next, can you try running these?

    SELECT s3EXPORT_PARTITION(* USING PARAMETERS url='s3://bucket_name/test1/bystate.date') OVER (PARTITION BY date) FROM (SELECT TO_CHAR(ts, 'YYYYMMDD') AS date, * FROM T WHERE TO_CHAR(ts, 'YYYYMMDD') = '20201209') foo;
    
    SELECT s3EXPORT_PARTITION(* USING PARAMETERS url='s3://bucket_name/test1/bystate.date') OVER (PARTITION BY date) FROM (SELECT TO_CHAR(ts, 'YYYYMMDD') AS date, * FROM T WHERE TO_CHAR(ts, 'YYYYMMDD') = '20201210') foo;
    
    SELECT s3EXPORT_PARTITION(* USING PARAMETERS url='s3://bucket_name/test1/bystate.date') OVER (PARTITION BY date) FROM (SELECT TO_CHAR(ts, 'YYYYMMDD') AS date, * FROM T WHERE TO_CHAR(ts, 'YYYYMMDD') = '20201211) foo;
    
  • relireli Vertica Customer
    1. Attach file -DDL table.
    2. Version: Vertica Analytic Database v10.0.1-6
    3. s3EXPORT_PARTITION:
      • 20201209 : fails
      • 20201210:fails
      • 20201211 : ok
        I have no explanation for why some succeed and some do not. :neutral:
  • Jim_KnicelyJim_Knicely - Select Field - Administrator

    The DDL in your attached file is not Vertica DDL :) For example, it's referenceing a data type "numerivalue".

  • relireli Vertica Customer
    edited January 2021

    I add the new script in DDL Table, it is ok now?
    Thank you so much for all the help. o:)

  • relireli Vertica Customer

    I change the parameter chunksize and it's work!!
    SELECT s3EXPORT_PARTITION(* USING PARAMETERS url=''s3://bucket_name/test1/bystate.date',chunksize=20485760) OVER (PARTITION BY date) FROM (SELECT TO_CHAR(ts, 'YYYYMMDD') AS date, * FROM T WHERE TO_CHAR(ts, 'YYYYMMDD') = '20201211') foo;
    Thank you @Jim_Knicely :smiley:

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file