Is there a way to set a KafkaSource end condition to load data until no new data arrives?
olivertwsit
Community Edition User
Hi,
My goal is to manually copy data into Vertica using a KafkaSource. Apparently, I can define an end condition so that the load stops when no new data arrives before a timeout period expires.
What is the timeout period and how do I set it?
With a KafkaSource, I can also COPY data into Vertica for a defined period of time. Is it possible to retrieve the Kafka message offset at which that COPY statement ended, so that the next round begins exactly there?
Answers
Yes. Right after your COPY ... KafkaSource statement, issue SELECT KafkaOffsets() OVER () and you will get the last consumed offsets per partition:
https://www.vertica.com/docs/12.0.x/HTML/Content/Authoring/KafkaIntegrationGuide/KafkaFunctions/KafkaOffsets.htm
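As a rough sketch of that sequence: the table, topic, and broker names below are placeholders, and the output column names are as I recall them from the KafkaOffsets page linked above, so please check them against your version.

```sql
-- Hypothetical names: web_table, web_hits, kafka01.example.com
-- stream format is 'topic|partition|start_offset'; -2 means "start of the stream"
COPY web_table
  SOURCE KafkaSource(stream='web_hits|0|-2',
                     brokers='kafka01.example.com:9092',
                     stop_on_eof=true)
  PARSER KafkaJSONParser();

-- Run this in the same session, immediately after the COPY above.
-- It reports one row per partition that the COPY read from.
SELECT ktopic, kpartition, startoffset, endoffset, msgcount, ending
FROM (SELECT KafkaOffsets() OVER ()) AS stats
ORDER BY kpartition;
```

You can then put the reported offset for each partition into the stream parameter of the next COPY. I would verify on a small test load whether the next start offset should be endoffset or endoffset + 1, since that depends on whether the reported value is inclusive.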
Regarding your timeout question, Vertica provides three ways to specify the termination condition. Please check the link below:
https://www.vertica.com/docs/12.0.x/HTML/Content/Authoring/KafkaIntegrationGuide/UsingCOPYwithKafka.htm#6
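For illustration, here is how I understand the end conditions to look in a COPY statement. This is only a sketch: the table, topic, and broker names are made up, and the linked page is the authority on the exact parameters and on which ones can be combined.

```sql
-- Hypothetical names: web_table, web_hits, kafka01.example.com

-- 1. Time limit: stop after the given duration, whether or not more data is waiting
COPY web_table
  SOURCE KafkaSource(stream='web_hits|0|-2',
                     brokers='kafka01.example.com:9092',
                     duration=INTERVAL '10 seconds')
  PARSER KafkaJSONParser();

-- 2. End of stream: stop once all currently available messages have been read
COPY web_table
  SOURCE KafkaSource(stream='web_hits|0|-2',
                     brokers='kafka01.example.com:9092',
                     stop_on_eof=true)
  PARSER KafkaJSONParser();

-- 3. End offset: a fourth field in the stream string,
--    'topic|partition|start_offset|end_offset'
COPY web_table
  SOURCE KafkaSource(stream='web_hits|0|0|1000',
                     brokers='kafka01.example.com:9092')
  PARSER KafkaJSONParser();
```

In each case the ending column returned by KafkaOffsets should tell you which condition actually stopped the load.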