kafka_config.kafka_offsets table has 200,000 rows after a week
Hi,
I have had a Vertica cluster connected to Kafka for one week and already have over 200,000 rows in the kafka_config.kafka_offsets table. It adds 3 rows every 10 seconds. Is that normal? I think this will cause me problems after a few months!
Thanks,
Drew
Comments
Any thoughts on this? I now have 1.6 million rows in kafka_config.kafka_offsets after just over a month.
Thanks,
Drew
Has anyone experienced this?
I tried to remove old entries from the kafka_config.kafka_offsets table:
delete from kafka_config.kafka_offsets where transaction_id < 45035996273705601
but ran into an error:
[Vertica][VJDBC](6460) ERROR: UPDATE/DELETE/MERGE a table with aggregate projections is not supported
No one else is having this issue?
Hello Drew
I too am seeing this (albeit in Vertica 8.1 - where this table is now called stream_microbatch_history). As I understand this, it records 1 row per source/target per microbatch. In my setup, that's 11 source/targets every 10 seconds (the default microbatch duration).
This (and other) tables are not being automatically cleared down (even in Vertica 8.x).
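For anyone who wants to measure the growth rate themselves, a quick check along these lines (just a sketch, assuming the 7.x table and column names kafka_config.kafka_offsets and batch_start) counts the rows added per day:
-- rows added per calendar day; expect roughly (number of source/targets) x (microbatches per day)
select date_part('day', batch_start) as day_of_month, count(*) as rows_added
from kafka_config.kafka_offsets
group by 1
order by 1;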
I will investigate/research and let you know what I find.
Regards
Mark
Delete will not work, as this table has a live aggregate projection.
This table is partitioned by the following expression. You may try running drop_partition on partitions that are old. You need to be very careful that you don't drop a partition that holds the current end_offset for a target schema, table, cluster, topic, and partition.
date_part('day', kafka_offsets.batch_start)
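As a rough sketch only (verify the DROP_PARTITION syntax for your Vertica release, and use the queries below first to confirm the partition is not required), dropping the day-3 partition would look something like:
-- drops all kafka_offsets rows whose batch_start falls on day 3 of the month;
-- only run this after confirming day 3 is not in the REQUIRED list
select drop_partition('kafka_config.kafka_offsets', 3);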
Use the following query to help identify partitions that are REQUIRED and can't be dropped.
select distinct partition from (
    select target_schema, target_table, kcluster, ktopic, kpartition,
           date_part('day', max(kafka_offsets.batch_start)) as partition
    from kafka_config.kafka_offsets
    group by 1,2,3,4,5
) as tab;
You may also use the following query to identify partitions that the Vertica Kafka integration needs and that should not be dropped.
select distinct date_part('day', batch_start) from kafka_config.kafka_offsets_topk;
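Putting the two checks together, something like the following (again only a sketch, using the same column names as above) should list partition keys that are candidates for drop_partition, because no source/target's latest batch_start and no kafka_offsets_topk row falls in them:
select distinct date_part('day', batch_start) as droppable_partition
from kafka_config.kafka_offsets
where date_part('day', batch_start) not in (
    select date_part('day', max(batch_start))
    from kafka_config.kafka_offsets
    group by target_schema, target_table, kcluster, ktopic, kpartition
)
and date_part('day', batch_start) not in (
    select distinct date_part('day', batch_start)
    from kafka_config.kafka_offsets_topk
);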
Thanks, @MAW -- It's nice to hear that someone else experiences this behavior. It's shocking to think it has been designed this way and continues to function in this manner in 8.x.
We have opened an enhancement request for Vertica to handle automatic pruning of these tables.