to_date function returns wrong result
Hi,
I expect vertica returns error when converting to date from wrong/incomplete stringselect to_date('1230','yymmdd')
returns:2014-06-05
which doesn't make any sense. We expect incorrect date to fail to convert.
Same query in oracle correctly gives an error (not a valid month / input value not long enough for dateformat)
I found that it actually might be a feature rather than a bug,
https://www.vertica.com/blog/convert-a-string-to-a-date-including-unconventional-dates/
Is there a way how to disable this 'lenient '/unpredictable feature?
Tagged:
0
Answers
This is actually expected because we emulate Postgres-SQL. The calculation is that '1230' are assumed to be the year 'yy' and month 'mm' portions; '30' is an illegal month, so we convert by rolling forward 24 months leaving 6 as the remainder. I've no idea why it's day 5 though.
The same result is obtained from Vertica and Postgres engines:
Vertica (10.0.0-0)
dbadmin=> select to_date('1230','yymmdd');
to_date
2014-06-05
(1 row)
psql (9.2.24)
postgres=# select to_date('1230','yymmdd');
to_date
2014-06-05
(1 row)
However, I couldn't work out a way to suppress this behavior in Vertica. The "FM" and "FX" modifiers suggested don't appear to apply here. You could open a ticket if you have access to support, but it is likely quicker to validate inputs using the date-time functions native to the client (unless using vsql or manual entry).
Maybe add in your own constraints?
Example:
dbadmin=> SELECT to_date('1230','yymmdd'), to_date(CASE WHEN length('1230') < length('yymmdd') THEN NULL ELSE '1230' END, 'yymmdd'); to_date | to_date ------------+--------- 2014-06-05 | (1 row)thanks
i will open ticket i guess,
there should be a choice of strict enforcement, similar to type conversion with exclamation mark
Vertica was born from PostGreSQL and we inherited the TO_DATE functionality.
In versions of PostGreSQL 9.x and below, their documentation warned:
See: https://www.postgresql.org/docs/9.6/functions-formatting.html
We see this in Vertica:
dbadmin=> SELECT version(); version ------------------------------------- Vertica Analytic Database v10.0.0-0 (1 row) dbadmin=> SELECT to_date('20096040','YYYYMMDD'); to_date ------------ 2014-01-17 (1 row)But it looks like PostSreSQL fixed this issue in their verison 10 release and they removed the above warning from their documentation
See: https://www.postgresql.org/docs/10/release-10.html
In particular:
In Vertica 10 we still get this oddball result:
dbadmin=> SELECT version(); version ------------------------------------- Vertica Analytic Database v10.0.0-0 (1 row) dbadmin=> SELECT to_date('2009-06-40','YYYY-MM-DD'); to_date ------------ 2009-07-10 (1 row)That being said, please open a support case. Feel free to include the above info in the case description.
When you do, can you please email the case number? I will be sure to tie your case to the Jira ticket.
My email is james.knicely@microfocus.com
Thanks!
For the meantime can this help?
CREATE OR REPLACE FUNCTION my_to_date(x VARCHAR(30), dformat VARCHAR(30)) RETURN DATE AS BEGIN RETURN TO_DATE(CASE WHEN LENGTH(x) < LENGTH(dformat) THEN THROW_ERROR('to_date() reject out-of-range input:(' || x || ')') ELSE x END, dformat); END; SELECT MY_TO_DATE('1230','yymmdd'); vsql:create_function.sql:6: ERROR 7137: USER GENERATED ERROR: to_date() reject out-of-range input:(1230)Three years on, this is still an issue in Vertica 10.x but it was mentioned here that is has been resolved in postgresql. Any plans to change the to_date function to throw an error when the date string is not a valid date? (Example: Return error for to_date('20230431','YYYYMMDD') since April 31 is not a valid date instead of returning valid date 2023-05-01 and assuming you must have wanted the day after April 30.) Thanks!
There are no plans to fix this behavior currently, though this should be noted as a "known issue" in release notes or elsewhere in the documentation, so I will check for that. A more complex but more complete workaround would be a Python UDScalar "to_date_pg" and "to_timestamp_pg" using datetime.strptime, which shows the expected behavior, but uses different pattern syntax:
>>> datetime.strptime("20200431","%Y%m%d");
ValueError: day is out of range for month
>>> datetime.strptime("1230","%Y%m%d");
ValueError: time data '1230' does not match format '%Y%m%d'