pyodbc multithreading
Hello,
has anyone tried running a multithreaded python app using pyodbc ?
I found that the program below serializes at second db2.cursor call in either ODBC or pyodbc so that parallelism is lost . Any ideas where the culprit is ?
This is running on vertica 6.1.3 with 6.1.3 linux drivers on unixODBC 2.2.14
#!/usr/local/bin/python
import sys, io, os, datetime, pyodbc, threading, time
# db login info
dsn = 'VerticaDSN';password = '...';
def test():
db = pyodbc.connect("DSN=%s;PWD=%s" % (dsn, password));
cursor = db.cursor()
print "first query"
cursor.execute("create table temp.tn_foo2 as select * from ... limit 10000000;")
print "first query finished"
def test2():
print "running second query"
db2 = pyodbc.connect("DSN=%s;PWD=%s" % (dsn, password));
cursor2 = db2.cursor() cursor2.execute("select count(*) from ...;")
print "second finished"
print "starting thread"t=threading.Thread(target = test)
t.start()
time.sleep(1)
print "starting thread 2"t=threading.Thread(target = test2)
t.start()
Thanks,
has anyone tried running a multithreaded python app using pyodbc ?
I found that the program below serializes at second db2.cursor call in either ODBC or pyodbc so that parallelism is lost . Any ideas where the culprit is ?
This is running on vertica 6.1.3 with 6.1.3 linux drivers on unixODBC 2.2.14
#!/usr/local/bin/python
import sys, io, os, datetime, pyodbc, threading, time
# db login info
dsn = 'VerticaDSN';password = '...';
def test():
db = pyodbc.connect("DSN=%s;PWD=%s" % (dsn, password));
cursor = db.cursor()
print "first query"
cursor.execute("create table temp.tn_foo2 as select * from ... limit 10000000;")
print "first query finished"
def test2():
print "running second query"
db2 = pyodbc.connect("DSN=%s;PWD=%s" % (dsn, password));
cursor2 = db2.cursor() cursor2.execute("select count(*) from ...;")
print "second finished"
print "starting thread"t=threading.Thread(target = test)
t.start()
time.sleep(1)
print "starting thread 2"t=threading.Thread(target = test2)
t.start()
Thanks,
0
Comments
Read about limitations also. You are programming for Vertica, so read a limitation of Vertica
(or you will start to drive without to learn a rules? its possible, but you will get a lot of accidents).
You need connection per query and THREADING=1 defined in odbcint.ini
Vertica 7
https://my.vertica.com/docs/7.0.x/HTML/index.htm#Authoring/ProgrammersGuide/ClientODBC/ODBCFeatureSu...
The following features are not supported:
- More than one open statement per connection. For example you cannot execute a new statement while another statement has a result set open. If you need to execute multiple statements at once, open multiple database connections.
Vertica 6https://my.vertica.com/docs/6.1.x/HTML/index.htm#12795.htm
The following features are not supported:
I found some python library, that works with Vertica much better than pyodbc IMO.
You can try it https://github.com/uber/vertica-python
When pyodbc fails on PROFILE and INTERVAL values, this library not and it works much more stable.
Some example (that doesn't work with pyodbc): Actually it doesn't depends on psycopg2, but if you want some features like "parameter bindings", so it require a psycopg2.
Hope it will help you
(I succeed to run in parallel, but as subprocess and not thread)
I should have updated this thread as well earlier, because i was able to run multiple connections in parallel by mostly upgrading the unixODBC driver manager.
I upgraded the unixODBC driver manager to 2.3.2 and added in /usr/local/etc/odbcinst.ini a Threading attribute.
Regards,Colin