[Kafka Streaming] Recover scheduler and start loading from newest offset first
[Deleted User]
Administrator
If the scheduler goes down for 2 hours, for example, is there a way to load the latest data first when it comes back? They would like to prioritize processing the newest data and then go back to read the older data. Any recommendations? Alternatively, is it possible to run 2 schedulers that read from different beginning and ending offsets? It seems this is NOT an option, but please confirm.
Question credit to Wang, Wei (HPSW Big Data Platform Presales)
At this time, this type of configuration is not possible. It is possible to start from specific offsets, which can be configured manually; however, I would not recommend this approach, because setting the offset back after collecting the recent data would likely cause data duplication.
Answer credit to Mark Fay
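To illustrate the duplication concern in the answer, here is a minimal sketch in plain Python. It is only a simulation of offset arithmetic on a single partition; it does not use any Vertica or Kafka API, and the names `load`, `last_committed`, and `catch_up_from` are hypothetical. It assumes a load reads forward from a start offset to the end of the log unless an end offset is given. The thread does not confirm that the scheduler supports such an end bound, so the bounded variant is shown only to make clear what would be needed to avoid duplicates.

```python
def load(log, start, end=None):
    """Simulate a load: return the offsets read from `start` up to `end` (exclusive).

    With end=None the load runs to the current end of the log, which is the
    assumed behavior after manually resetting a start offset.
    """
    stop = len(log) if end is None else end
    return list(range(start, stop))


# Hypothetical single-partition topic holding offsets 0..9.
log = [f"msg-{i}" for i in range(10)]

last_committed = 4  # scheduler went down; offset 4 is the next unread message
catch_up_from = 8   # manually chosen offset so the newest data loads first

# Step 1: load the newest data first (offsets 8 and 9).
loaded = load(log, catch_up_from)

# Step 2: set the offset back to pick up the older data. Without an end
# bound, the load runs forward past offset 8 and re-reads the newest data.
loaded += load(log, last_committed)

duplicates = sorted({o for o in loaded if loaded.count(o) > 1})
print("duplicated offsets:", duplicates)

# Hypothetical bounded backfill: stop where the first load began.
bounded = load(log, catch_up_from) + load(log, last_committed, end=catch_up_from)
print("bounded load has duplicates:", len(bounded) != len(set(bounded)))
```

In this simulation the unbounded rewind reads every offset from `catch_up_from` onward twice, which is the duplication the answer warns about. Avoiding it would require either an end bound on the backfill or deduplication after loading.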