COPY from KAFKASOURCE does not return number of affected rows in ADO.NET driver
Hi,
I am using the COPY comand to bulk insert data from a KAFKASOURCE into a table. Using a jdbc driver I get the number of affected rows as a result, however, using the ExecuteNonQuery-command from the ADO.NET driver the return value is always -1. This is only the case when copying from a KAFKASOURCE not when copying FROM LOCAL for instance.
Is there a way to get the number of inserted rows when copying from a KAFKASOURCE ?
Best Answer
-
Bryan_H Vertica Employee Administrator
I've asked the client driver team to investigate this issue. It's recommended to open a support case for more detailed issue testing and updates.
I can think of two other ways to get the accepted row counts:
Immediately following the COPY, run SELECT GET_NUM_ACCEPTED_ROWS ();
Set a label on the COPY with the STREAM NAME parameter, then you can find metrics in the LOAD_STREAMS system table.
There's some more info and options at https://www.vertica.com/docs/12.0.x/HTML/Content/Authoring/DataLoad/MonitoringCopyLoadsMetrics.htm0
Answers
Try: select * from dc_roses_created order by time desc;
Thanks for the hint!
For me it is difficult to find the right line in 'dc_roses_created ', because I have parallel COPY statements and it is hard to identify the right one just using the timestamp. However, it can potentially be used as a workaround.
In any case, it does not solve the problem that the command itself does not return the number of affeted rows, as it should, which would be much easiear. Is that a bug and , if yes, when will it be fixed?
Good point, I will open a support case.
Im the meatime, I will use load metrics, as you suggest.
Thanks!
Just one update: " SELECT GET_NUM_ACCEPTED_ROWS ();" does also not seem to work using the ADO.NET client,
I get an error: VerticaException: [0A000] ERROR: No single-source bulk loads have been executed in this session.
However, I using COPY and the select-statement in the same session.
Another update: 'load_sources' and 'load_streams' are both empty after COPY. Maybe this also does not work using a KAFKASOURCE?
SELECT GET_NUM_ACCEPTED_ROWS() is not supported for COPY from KAFKASOURCE.
load_streams will only report COPY statements that last a while. It's likely your COPY statement doesn't run for long enough (with KafkaSource this is controlled by number of partitions, offsets, duration....). Note that if you want to know the number of Kafka messages processed by your COPY kafkasource statement, you can call the KafkaOffsets meta-function within the same session.