Loading AVRO file with nullable fields
Hello,
I am trying to load an AVRO file in Vertica 8.1. I've tried loading it two different ways:
- in a Flex table, I get empty string columns
- in an existing table with the correct schema, I get
COPY: Input record 1 has been rejected (Error: Row [1] failed to append value for column: [timestamp] with value: [].). Please see /tmp/vertica/000000000000.rejected, record 1 for the rejected record. This record was read from 000000000000
It looks like the parser is based on https://github.com/vertica/FlexTable/blob/master/src/AvroParser.cpp. Based on the code there, I think this is what is happening: my AVRO schema handles NULL values the usual AVRO way, with a union, but the parser doesn't support the UNION construct for that. It falls through a case construct that returns an empty string.
Example: { "type" : "record", "name" : "Root", "fields" : [ { "name" : "timestamp", "type" : [ "long", "null" ] } ] }
Has anyone solved this?
Answers
Hello,
We upgraded our avro-cpp library from 1.7.0 to 1.8.1 in the 8.1 release because the old avro-cpp has a bug in schema evolution, which we need for the schema-registry feature in 8.1.
The avro-cpp library fixed that problem in 1.7.7; however, it introduced a new bug in handling union types. When you have something like {"name":"timestamp", "type":["long","null"]}, you get an error/crash while the schema object is being constructed.
The Avro community has a fix for this, and we are working on a patch for it. However, even though the crash has been fixed, the fix changed how union types are handled. I don't expect our patch to make it into 8.1SP1, but we are still actively working on it and it should come out soon.
A possible workaround for now is to change the union type from "type":["long","null"] to "type":["null","long"].
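If the schema has many nullable fields, the reordering can be scripted. The following is only a minimal sketch in Python using the standard json module; the helper name null_first is made up for illustration and is not part of Vertica or Avro. It assumes "null" appears as a plain string in the union, and it does not adjust field defaults, which Avro expects to match the first branch of a union.

```python
import json


def null_first(schema):
    """Return a copy of an Avro schema with "null" moved to the front of every union.

    Hypothetical helper for illustration only.
    """
    if isinstance(schema, list):  # a JSON array in schema position is a union
        branches = [null_first(b) for b in schema]
        nulls = [b for b in branches if b == "null"]
        others = [b for b in branches if b != "null"]
        return nulls + others
    if isinstance(schema, dict):
        out = dict(schema)  # shallow copy so the input is left untouched
        kind = out.get("type")
        if kind == "record":
            out["fields"] = [
                {**f, "type": null_first(f["type"])} for f in out.get("fields", [])
            ]
        elif kind == "array":
            out["items"] = null_first(out["items"])
        elif kind == "map":
            out["values"] = null_first(out["values"])
        elif isinstance(kind, (dict, list)):
            out["type"] = null_first(kind)
        return out
    return schema  # primitive name or reference to a named type


schema = json.loads(
    '{"type":"record","name":"Root","fields":'
    '[{"name":"timestamp","type":["long","null"]}]}'
)
fixed = null_first(schema)
print(json.dumps(fixed, indent=2))
```

Note that an existing data file cannot simply have its schema swapped: Avro's binary encoding stores the index of the chosen union branch with each value, so the file has to be written again by the producer using the reordered schema.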
Thank you very much
Yang