Vertica 7.2 Kafka not fetching data from some partitions
For example, from the log:
COPY "public"."events" SOURCE
KafkaSource(stream='events|0|1590085,events|1|1586923,events|2|-2',
brokers='1.2.3.4:9092', duration=interval '9872 milliseconds',
stop_on_eof=true, executionparallelism=1 ) PARSER KafkaJSONParser( )
REJECTED DATA AS TABLE public.kafka_rej6 DIRECT NO COMMIT
But:
/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list 127.0.0.1:9092 --topic events --time -1
events:0:1590085
events:1:1586923
events:2:1597176
As you can see, Vertica does not consume partition #2 at all. I tried setting --start-offset, without result. BTW, many other strange things have happened with the Kafka package, like nodes suddenly going down, but here I want to know why my partitions are not being consumed.
Yes, the kafka_rej6 rejection table is empty, and neither Kafka's server.log nor Vertica's dbLog says anything.
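To make the comparison between the COPY statement and the broker offsets explicit, here is a minimal Python sketch that parses both formats and lists the partitions for which the stream string carries no concrete start offset. The function names are my own, and treating a negative offset such as -2 as "no position recorded yet" is an assumption for illustration, not something confirmed by the Vertica documentation in this thread.

```python
def parse_stream(stream):
    """Parse a KafkaSource stream string: 'topic|partition|offset,...'."""
    offsets = {}
    for entry in stream.split(","):
        topic, partition, offset = entry.strip().split("|")
        offsets[(topic, int(partition))] = int(offset)
    return offsets


def parse_offset_shell(output):
    """Parse GetOffsetShell output lines: 'topic:partition:offset'."""
    offsets = {}
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        topic, partition, offset = line.rsplit(":", 2)
        offsets[(topic, int(partition))] = int(offset)
    return offsets


def stalled_partitions(stream, shell_output):
    """Return partitions known to the broker whose start offset in the
    stream string is missing or negative (assumed to be a special value)."""
    start = parse_stream(stream)
    end = parse_offset_shell(shell_output)
    stalled = []
    for key in sorted(end):
        start_offset = start.get(key)
        if start_offset is None or start_offset < 0:
            stalled.append(key)
    return stalled


# Values copied from the log and the GetOffsetShell output above.
STREAM = "events|0|1590085,events|1|1586923,events|2|-2"
BROKER_OUTPUT = """events:0:1590085
events:1:1586923
events:2:1597176"""

for topic, partition in stalled_partitions(STREAM, BROKER_OUTPUT):
    print(f"{topic} partition {partition} has no concrete start offset")
```

With the values from this post, only partition 2 is reported, which matches what the log shows.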
Now look:
COPY "public"."events" SOURCE KafkaSource(stream='events|2|-2',
brokers='1.2.3.4:9092', duration=interval '9872 milliseconds',
stop_on_eof=true, executionparallelism=1 ) PARSER KafkaJSONParser( )
REJECTED DATA AS TABLE public.kafka_rej6 DIRECT NO COMMIT;
Rows Loaded
-------------
204505
If I keep only one partition in the stream, Vertica successfully consumes the messages. WHY?
I'm very disappointed with the integration: I have spent almost two weeks on it without any result. The lack of logs and the silent running make it impossible to debug.
I also get these errors sometimes:
and this (in Kafka):
Also this:
All of this causes a sudden crash of one of the Vertica nodes.
As it turned out, it was my fault. One of my nodes simply could not resolve the broker host. The brokers are addressed by hostname, and that node tried to resolve the hostname without success. So I just added a record for my broker host to /etc/hosts.
I still can't explain the errors from my previous post.
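For anyone hitting the same thing, a quick way to rule this out is to run a resolution check on every Vertica node. This is only a sketch using Python's standard socket module; the function name and the broker hostname in the usage note are hypothetical, so substitute the hostnames your brokers actually advertise.

```python
import socket


def unresolved_brokers(brokers, resolve=socket.getaddrinfo):
    """Return (broker, error) pairs for 'host:port' entries that this
    machine cannot resolve. `resolve` is injectable so the check can be
    exercised without touching DNS."""
    failed = []
    for broker in brokers:
        host, _, port = broker.rpartition(":")
        try:
            resolve(host, int(port))
        except socket.gaierror as exc:
            failed.append((broker, str(exc)))
    return failed
```

Calling unresolved_brokers(["kafka-broker-1.example:9092"]) on each node should return an empty list everywhere. A node that returns a non-empty list needs a DNS fix or an /etc/hosts entry like the one I added.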
Hi Ravlio!
I see you have experience integrating Kafka with Vertica.
Did you have problems with KafkaLib?
Can you help me with https://community.dev.hpe.com/t5/Vertica-Forum/Library-with-name-KafkaLib-does-not-exist/m-p/233738#U233738 ?
When can I expect an answer, or where else should I go? My nodes keep crashing for questionable reasons:
The flow is about 100k messages/sec, which is not that much. I have tried everything: changing the duration frame, sending synthetic messages like {"id":1}, direct and trickle loads, and resource pool tuning. I have 3 nodes on Community Edition, free of any other load during testing.
It feels like Vertica's Kafka integration falls over at the slightest problem.
Hi!
Do you have a plan B? Do you plan to write your own Kafka producer?
I gave up and returned to the classic 'COPY FROM LOCAL DIRECT' form with the JSON parser. Not bad, but not as seamless as I wanted.
I'm sorry to hear about this, and sorry that we did not respond in a timely fashion. The issues you ran into are known and are currently our top-priority fixes. We hope to resolve them as soon as we can.
In the meantime, there is a workaround for the node failures. Run this command on your cluster:
alter database <dbname> set EnableCooperativeParse=0;
Thank you very much. Disabling EnableCooperativeParse helped. Four hours now without a node going down.