Apportioned UDL issue
Hey Everyone -
We are trying to move toward using UDLs to handle our data loads. Right now we ingest a lot of "similar-to-CSV" data in various formats. A C# process transforms these files, in whatever form they arrive, into a standardized command that we then use Vertica to ingest (including using a nonsense Unicode character as a separator). We're doing a POC to see whether we can have Vertica do this processing work itself through C++ UDLs.
The problem is that we really want to use apportioning for performance reasons, but we are struggling to figure out how to provide the headers from the Source. Some of our files have the same headers, but we don't necessarily know which order those headers will come in, so we need to read the file and map the header columns to standardized output columns. There does not seem to be any way to pass the header information down from the File Apportioning Source class to the Parser.
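To make the mapping step concrete, here is a minimal standalone sketch of what we mean, assuming a single-character delimiter and no quoting or escaping. The names `splitLine`, `buildColumnMap`, and `reorderRow` are hypothetical helpers for illustration only; nothing here uses the Vertica SDK.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Split one delimited line into fields (assumption: no quoting or escaping).
std::vector<std::string> splitLine(const std::string& line, char delim) {
    std::vector<std::string> fields;
    std::string current;
    for (char c : line) {
        if (c == delim) {
            fields.push_back(current);
            current.clear();
        } else {
            current.push_back(c);
        }
    }
    fields.push_back(current);
    return fields;
}

// For each column in the file's header, find the index of the matching
// standardized output column, or -1 if the column is not one we keep.
std::vector<int> buildColumnMap(const std::vector<std::string>& fileHeader,
                                const std::vector<std::string>& standardColumns) {
    std::unordered_map<std::string, int> standardIndex;
    for (std::size_t i = 0; i < standardColumns.size(); ++i) {
        standardIndex[standardColumns[i]] = static_cast<int>(i);
    }
    std::vector<int> columnMap;
    columnMap.reserve(fileHeader.size());
    for (const auto& name : fileHeader) {
        auto it = standardIndex.find(name);
        columnMap.push_back(it == standardIndex.end() ? -1 : it->second);
    }
    return columnMap;
}

// Reorder one data row into standardized column order.
// Standard columns missing from the file are left as empty strings.
std::vector<std::string> reorderRow(const std::vector<std::string>& row,
                                    const std::vector<int>& columnMap,
                                    std::size_t standardCount) {
    assert(row.size() == columnMap.size());
    std::vector<std::string> out(standardCount);
    for (std::size_t i = 0; i < row.size(); ++i) {
        if (columnMap[i] >= 0) {
            out[static_cast<std::size_t>(columnMap[i])] = row[i];
        }
    }
    return out;
}
```

The point is that `columnMap` only exists once the header row has been read, and every portion's parser would need a copy of it to reorder its own rows.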
We're using the terminology from this GitHub repo: https://github.com/vertica/UDx-Examples/tree/master/Java-and-C++/ApportionLoadFunctions.
We have tried several things. What we thought would work was extending each portion by the size of the header data and then manually copying the headers into each portion. But that results in us not reading the correct number of bytes; it appears that the component doing the reading has no idea that some of the reads we're doing aren't actually going back to the file.
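To illustrate the byte accounting we think is going wrong, here is a rough sketch with made-up numbers. `Portion`, `splitBody`, and `emittedBytes` are hypothetical stand-ins, not the SDK's types, and the sketch ignores aligning portions to record boundaries.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a portion; not the Vertica SDK type.
struct Portion {
    std::size_t offset; // where this slice of the body starts in the file
    std::size_t size;   // bytes of this slice actually read from the file
};

// Split the body (everything after the header) into roughly equal portions.
std::vector<Portion> splitBody(std::size_t fileSize, std::size_t headerLen,
                               std::size_t count) {
    assert(count > 0 && headerLen <= fileSize);
    const std::size_t body = fileSize - headerLen;
    const std::size_t base = body / count;
    const std::size_t extra = body % count;
    std::vector<Portion> portions;
    std::size_t offset = headerLen;
    for (std::size_t i = 0; i < count; ++i) {
        const std::size_t size = base + (i < extra ? 1 : 0);
        portions.push_back({offset, size});
        offset += size;
    }
    return portions;
}

// Bytes handed downstream when the header is copied in front of a portion.
std::size_t emittedBytes(const Portion& p, std::size_t headerLen) {
    return headerLen + p.size;
}
```

With a 1000-byte file, a 20-byte header, and 4 portions, each portion reads 245 bytes from the file but hands 265 bytes downstream, so the four portions together emit 1060 bytes from a 1000-byte file. Whatever tracks the portion sizes only knows about the bytes that came from the file.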
Any thoughts or tips?