Get new records from dataview
Hello Dears,
Maybe someone can suggest me, here is the use case.
We are using ETL tool to pull out records from Vertica and push them to salesforce and looking for a solution on how to get only new records in Vertica which were recently added into table let's say 1 day back or during a week and get only those records not others. is there an option to do it via query?
Just to give an overview of data looks like
Maybe a newest created rows then pick them and return to our ETL for next updates to salesforce
Thank you in advance !
Tagged:
0
Best Answer
-
VValdar
Vertica Employee Employee
Hi ramik_allakhverdiev,
Well that would be a basic filter:
select * from MyTable where greatest(etl_created_at, etl_updated_at)::date >= current_date - 1 and greatest(etl_created_at, etl_updated_at)::date < current_date;
1
Answers
Hello,
To get the recent data fetched from the Vertica source, you can use a filter in the source query based on the etl_created_at or etl_updated_at columns.
The filter criteria depends on the frequency of the workflow execution. For example, if the workflow executes daily you can use the filter etl_created_at >= sysdate -1
You can modify the sysdate -1 criteria by getting it in different formats such as YYYY-MM-DD or DD-MM-YYYY
Thanks
Shahzeb
Lightweight partition copy increases performance by initially sharing the same storage between two tables.
If you call COPY_PARTITIONS_TO_TABLE and the target table does not exist, the function creates the table automatically.
See: https://www.vertica.com/docs/12.0.x/HTML/Content/Authoring/SQLReferenceManual/Functions/VerticaFunctions/PartitionManagement/COPY_PARTITIONS_TO_TABLE.htm
create table fact01(reason_descr varchar(50), is_deleted_flg boolean, etl_created_at timestamp not null, etl_updated timestamp) order by etl_created_at segmented by hash(etl_created_at) all nodes partition by date(etl_created_at) ; CREATE TABLE copy fact01 from stdin delimiter ',' abort on error; select * from fact01 order by 3,4; reason_descr | is_deleted_flg | etl_created_at | etl_updated --------------------+----------------+----------------------------+---------------------------- november_bot_farm | f | 2022-08-05 08:05:00.679696 | 2022-08-05 08:05:00.679696 november_bot_farm | f | 2022-08-05 08:05:00.679696 | 2022-08-05 08:05:00.679698 november_bot_farm | f | 2022-08-05 08:05:00.697696 | 2022-08-05 08:05:00.679696 may_bot_farm | f | 2022-08-05 08:05:14.011098 | 2022-08-05 08:05:14.011098 may_bot_farm | f | 2022-08-05 08:05:14.011098 | 2022-08-05 08:05:14.011098 may_bot_farm | f | 2022-08-05 08:05:15.211978 | 2022-08-05 08:05:15.211978 may_bot_farm | f | 2022-08-05 08:05:18.486118 | 2022-08-05 08:05:18.486118 may_bot_farm | f | 2022-08-05 08:05:18.486118 | 2022-08-05 08:05:18.486118 may_bot_farm | f | 2022-08-08 19:27:52.824 | 2022-08-08 19:27:52.824 september_bot_farm | f | 2022-09-01 08:05:00.679696 | 2022-09-01 08:05:00.679696 bot_farm | f | 2022-09-01 14:05:26 | 2022-09-01 14:05:26 (11 rows) select current_date() -1; ?column? ------------ 2022-09-01 (1 row) select COPY_PARTITIONS_TO_TABLE ( 'fact01', current_date() -1, current_date() -1, 'last_date_records', true); COPY_PARTITIONS_TO_TABLE ------------------------------------------------------------------------------------ Created table last_date_records. 1 distinct partition values copied at epoch 525. (1 row) select * from last_date_records; reason_descr | is_deleted_flg | etl_created_at | etl_updated --------------------+----------------+----------------------------+---------------------------- september_bot_farm | f | 2022-09-01 08:05:00.679696 | 2022-09-01 08:05:00.679696 bot_farm | f | 2022-09-01 14:05:26 | 2022-09-01 14:05:26 (2 rows)