Best practice for schematizing semi-structured data in flex tables
We are using an automated process to periodically load batches of semi-structured data into a flex table. Is there any best practice for a way to transform this data into a more defined schema, post-load? For instance, let's say that the loaded data is deeply nested making queries on the flex table itself more difficult to construct. Here are the options that I can think of:
- Once the automated process has finished the load, make a call to Vertica to run an external procedure or user defined function that transforms the data.
- Don't transform the data at all - materialize expected columns and run queries against the single flex table.
- Attempt to add structure to the data pre-load, and forgo the need for using a flex table at all.