Get new records from dataview

ramik_allakhverdievramik_allakhverdiev Vertica Customer
edited September 2022 in General Discussion

Hello Dears,
Maybe someone can suggest me, here is the use case.
We are using ETL tool to pull out records from Vertica and push them to salesforce and looking for a solution on how to get only new records in Vertica which were recently added into table let's say 1 day back or during a week and get only those records not others. is there an option to do it via query?
Just to give an overview of data looks like

Maybe a newest created rows then pick them and return to our ETL for next updates to salesforce

Thank you in advance !

Tagged:

Best Answer

  • VValdarVValdar Vertica Employee Employee
    Answer ✓

    Hi ramik_allakhverdiev,

    Well that would be a basic filter:

    select *
      from MyTable
     where greatest(etl_created_at, etl_updated_at)::date >= current_date - 1
       and greatest(etl_created_at, etl_updated_at)::date <  current_date;
    

Answers

  • Shahzeb_ShafiShahzeb_Shafi Vertica Employee Employee

    Hello,

    To get the recent data fetched from the Vertica source, you can use a filter in the source query based on the etl_created_at or etl_updated_at columns.
    The filter criteria depends on the frequency of the workflow execution. For example, if the workflow executes daily you can use the filter etl_created_at >= sysdate -1

    You can modify the sysdate -1 criteria by getting it in different formats such as YYYY-MM-DD or DD-MM-YYYY

    Thanks
    Shahzeb

  • moshegmosheg Vertica Employee Administrator

    Lightweight partition copy increases performance by initially sharing the same storage between two tables.
    If you call COPY_PARTITIONS_TO_TABLE and the target table does not exist, the function creates the table automatically.
    See: https://www.vertica.com/docs/12.0.x/HTML/Content/Authoring/SQLReferenceManual/Functions/VerticaFunctions/PartitionManagement/COPY_PARTITIONS_TO_TABLE.htm

    create table fact01(reason_descr varchar(50), is_deleted_flg boolean, etl_created_at timestamp not null, etl_updated timestamp)
    order by etl_created_at
    segmented by hash(etl_created_at) all nodes
    partition by date(etl_created_at)
    ;
    CREATE TABLE
    copy fact01 from stdin delimiter ',' abort on error;
    select * from fact01 order by 3,4;
        reason_descr    | is_deleted_flg |       etl_created_at       |        etl_updated
    --------------------+----------------+----------------------------+----------------------------
     november_bot_farm  | f              | 2022-08-05 08:05:00.679696 | 2022-08-05 08:05:00.679696
     november_bot_farm  | f              | 2022-08-05 08:05:00.679696 | 2022-08-05 08:05:00.679698
     november_bot_farm  | f              | 2022-08-05 08:05:00.697696 | 2022-08-05 08:05:00.679696
     may_bot_farm       | f              | 2022-08-05 08:05:14.011098 | 2022-08-05 08:05:14.011098
     may_bot_farm       | f              | 2022-08-05 08:05:14.011098 | 2022-08-05 08:05:14.011098
     may_bot_farm       | f              | 2022-08-05 08:05:15.211978 | 2022-08-05 08:05:15.211978
     may_bot_farm       | f              | 2022-08-05 08:05:18.486118 | 2022-08-05 08:05:18.486118
     may_bot_farm       | f              | 2022-08-05 08:05:18.486118 | 2022-08-05 08:05:18.486118
     may_bot_farm       | f              | 2022-08-08 19:27:52.824    | 2022-08-08 19:27:52.824
     september_bot_farm | f              | 2022-09-01 08:05:00.679696 | 2022-09-01 08:05:00.679696
     bot_farm           | f              | 2022-09-01 14:05:26        | 2022-09-01 14:05:26
    (11 rows)
    
    select current_date() -1;
      ?column?
    ------------
     2022-09-01
    (1 row)
    
    select COPY_PARTITIONS_TO_TABLE (
              'fact01',
              current_date() -1,
              current_date() -1,
              'last_date_records',
              true);
                                  COPY_PARTITIONS_TO_TABLE
    ------------------------------------------------------------------------------------
     Created table last_date_records.
    1 distinct partition values copied at epoch 525.
    
    (1 row)
    
    select * from last_date_records;
        reason_descr    | is_deleted_flg |       etl_created_at       |        etl_updated
    --------------------+----------------+----------------------------+----------------------------
     september_bot_farm | f              | 2022-09-01 08:05:00.679696 | 2022-09-01 08:05:00.679696
     bot_farm           | f              | 2022-09-01 14:05:26        | 2022-09-01 14:05:26
    (2 rows)
    

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file