Mac pyodbc string encoding issue

I'm using pyodbc and running into encoding issues with strings. After executing a query, I end up with strings that look like: u'p\x00w\x00e\x00c\x00w\x00'

If I decode the string as UTF-16, it looks correct. This only seems to happen when connecting from Macs; the Linux ODBC drivers seem to work correctly.

I've tried using different values for CHARSET (e.g. UTF8, UTF16) in the connection string, to no avail. My DriverManagerEncoding is set to UTF-32 in vertica.ini. Does anyone have any ideas what might be happening here? Thanks.

Using:
Python 2.7
OS X v10.8 (with iODBC)
Vertica v6.1.2



Comments

  • Can you post what a predefined string looks like?
    For example, the result of this query:
    => select 'abcdefghijklmnopqrstuvwxyz' as alphabet from dual;
    I will think about a workaround.

    PS

    u'p\x00w\x00e\x00c\x00w\x00' = pwecw ?

  • >>> cur.execute("""select 'abcdefghijklmnopqrstuvwxyz' as alphabet from dual;""")
    >>> rows = cur.fetchall()
    >>> rows[0][0]

    u'a\x00b\x00c\x00d\x00e\x00f\x00g\x00h\x00i\x00j\x00k\x00l\x00m\x00n\x00o\x00p\x00q\x00r\x00s\x00t\x00u\x00v\x00w\x00x\x00y\x00z\x00'


    And yes, 'pwecw' is correct; it's just some dummy data I was using during testing. Many thanks for the help.

  • Hi!

    I have a workaround for you, but I don't want to be part of this community (more exactly, this forum; the forum formatting drives me crazy), so if you want my help, please open a new topic at www.vertica-forums.com (my member nick is sKwa).

    To be honest: here you can get help from Vertica employees (i.e. get an official answer), while the forum mentioned above is unofficial user-to-user help only, but IMHO it is friendlier and more usable than this forum. (The employees are wonderful, but they do not do real projects.)


    PS
    To the community: I will not do this any more (ask people to move to another forum), but I started to help so I want to finish, just not here (the usability of this forum is beneath all criticism).


  • Well, not sure how to respond to that...

    vertica-forums.com is a great site; we're glad it's there too.  And we are a younger site, with fewer big users posting so far; fair criticism.  But Daniel hasn't explained his concerns about the forum's usability, so we haven't been able to help.  Others have posted their concerns as "Ideas" posts and we have tweaked the site accordingly.  Oh well; sounds like he's happy there now and still helping the community, that's why it's good to have multiple sites.

    In any case, it looks like there is in fact a PyODBC bug here.  A quick workaround would be to transcode the resulting strings:

    >>> import codecs
    >>> codecs.decode(u'a\x00b\x00c\x00d\x00e\x00f\x00g\x00h\x00i\x00j\x00k\x00l\x00m\x00n\x00o\x00p\x00q\x00r\x00s\x00t\x00u\x00v\x00w\x00x\x00y\x00z\x00', 'UTF-16')
    u'abcdefghijklmnopqrstuvwxyz'

    The underlying weirdness is that Apple's stock Python 2 is a "narrow" Unicode build, which stores strings internally as 2-byte (UTF-16) code units; most Linux builds that I know of are "wide" builds that use 4-byte (UCS-4) code units instead.  This periodically causes various problems with Apple's stock Python, and with builds intended to emulate it.

    You could keep playing around with the config files; try to get "UTF-8" and "UTF-16" in the right respective places.  Or post on the PyODBC bug; see if folks there have thoughts.

    You could also try an alternative Mac Python build, such as Homebrew's, though I'm not sure which internal string representation it uses.

    I unfortunately don't have a recent Mac (well, I do, but its motherboard is dead...) so I can't test these myself; I can just throw them out there as ideas...

    Adam
  • Thanks, Adam. Not sure what my plan is, since the code will need to run on both Mac and Linux (short of some really hacky if (mac) encode()-type logic), but I think I have the info I need.
  • Possibly the same problem that I had.

    See http://www.vertica-forums.com/viewtopic.php?f=35&t=1863&p=6174#p6174
    for the fix that worked for me.

    kesten
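
For anyone who needs the same code to run on both Mac and Linux, here is a minimal sketch of the transcoding workaround from Adam's comment that avoids an explicit platform check. It is not a pyodbc or Vertica API: the helper names fix_utf16_mojibake and fix_row are made up for illustration, and the sketch assumes the bad strings are little-endian UTF-16 bytes that were each read as one character, and that genuine values never contain NUL characters. Strings that already look correct are returned unchanged.

```python
def fix_utf16_mojibake(value):
    """Repair text whose UTF-16-LE bytes were each read as one character.

    Hypothetical helper, not part of pyodbc. Assumes little-endian data
    and that real values never contain NUL characters.
    """
    text_type = type(u"")  # unicode on Python 2, str on Python 3
    if not isinstance(value, text_type):
        return value  # numbers, None, byte strings, etc. pass through
    if u"\x00" not in value or len(value) % 2 != 0:
        return value  # looks like a normally decoded string
    try:
        # Map each character back to the single byte it came from,
        # then decode the byte pairs as UTF-16 little-endian.
        return value.encode("latin-1").decode("utf-16-le")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return value  # not the pattern we expected; leave it alone


def fix_row(row):
    """Apply the repair to every column of a fetched row."""
    return tuple(fix_utf16_mojibake(col) for col in row)


print(fix_utf16_mojibake(u"p\x00w\x00e\x00c\x00w\x00"))
print(fix_row((u"a\x00b\x00c\x00", 42, u"already fine")))
```

Each row from cur.fetchall() could be passed through fix_row before use. Because the check is a heuristic, fixing the driver or driver manager configuration (as in the linked vertica-forums.com post) is still preferable where possible.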
