Export To Parquet - add option to set timezone to UTC
Hi,
I stumbled upon a knowledge article from this morning:
https://portal.microfocus.com/s/article/KM000037925?language=en_US
In short, Vertica always converts the TIMESTAMP datatype to the server timezone when exporting to Parquet.
That is not good. For one thing, the Vertica docs describe the TIMESTAMP datatype as being in UTC in several places. TIMESTAMP in UTC is very convenient and performs well. We use TIMESTAMP extensively, and exclusively as UTC.
Please file a new feature request: add an option to EXPORT TO PARQUET to set the TIMESTAMP datatype timezone to UTC.
Thank you
Sergey
Answers
Thank you for your feedback.
As described in the documentation, Vertica aligns with Hive standards when exporting to Parquet. The TIMESTAMP data type does not include timezone information, which is why it is not converted to UTC during export. To avoid timezone-related issues, we recommend using the TIMESTAMPTZ data type instead.
TIMESTAMPTZ is designed specifically for storing timezone-aware timestamps, making it the ideal choice for maintaining UTC consistency and avoiding potential problems when exporting data to Parquet.
Well.
Here is a very odd Vertica behaviour: if you export a table with a TIMESTAMP column to Parquet, then load the data back from Parquet into the same table (select * from the Parquet table), you get a different result in the TIMESTAMP column.
That is quite odd behaviour.
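To make the round trip above concrete, here is a minimal Python sketch of one way such a shift can arise. It is only a model and not Vertica's actual implementation: it assumes the writer treats a zone-less timestamp as server-local time and stores the matching UTC instant, while the reader takes the stored value as-is. The names `export_value` and `import_value` and the UTC-5 server timezone are hypothetical, chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical server timezone (UTC-5); an assumption for illustration only.
SERVER_TZ = timezone(timedelta(hours=-5))


def export_value(naive_ts: datetime, server_tz: timezone) -> datetime:
    """Model of a writer: treat the zone-less timestamp as server-local
    time and store the corresponding UTC instant."""
    return naive_ts.replace(tzinfo=server_tz).astimezone(timezone.utc)


def import_value(stored: datetime) -> datetime:
    """Model of a reader: take the stored UTC value as-is and drop the
    zone, without converting back to server-local time."""
    return stored.replace(tzinfo=None)


original = datetime(2024, 1, 15, 12, 0, 0)

# Round trip on a server that is not in UTC: the value comes back shifted.
round_tripped = import_value(export_value(original, SERVER_TZ))
print(original, "->", round_tripped, "shift:", round_tripped - original)

# Same round trip on a server set to UTC: the value is unchanged.
unchanged = import_value(export_value(original, timezone.utc))
print(original, "->", unchanged, "shift:", unchanged - original)
```

Under these assumptions the shift equals the server's UTC offset, so it only disappears when the server timezone is UTC.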
With thousands of tables using the TIMESTAMP datatype, this makes managing and working with Vertica unnecessarily harder, for no apparent reason.
Just another Vertica oddity to keep in mind.