JDBC pools for dashboard sessions
Hi All!
Jdbc pools question:
How do I control the JDBC pools that are connected to my dashboard? How do I control it and don't let it be overloaded so the system can continue working at the best performance all the time?
I read in the documentation -
https://www.vertica.com/kb/Low-Latency-Optimizations/Content/BestPractices/Low-Latency-Optimizations.htm
In Application Configuration section:
"Use JDBC connection pools to reduce the overhead of repeatedly opening network connections between the client and the server. Opening a new connection for each request takes time and is costly. When a request comes in, Vertica assigns the request to a pre-existing connection in the pool. If there are no free connections, Vertica creates a new connection. When the request completes, the connection returns to the pool, awaiting another request. Using JDBC connection pools eliminates the time it takes to authenticate a connection every time a new request is made. This lowers the latency of the query because it reduces the amount of time it takes to connect"
How do I do that exactly ? I was looking in the documentation but it was not clear enough....
Thank you so much for helping me!
Keren
Answers
Maybe this is helpful?
Using a Pooling Data Source
If you have the ability to change your JDBC driver settings you need to set EnableRoutableQueries to true so that any pool values you configure will be taken into account.
Setting EnableRoutableQueries will mean that everything will go to a single node so depending on the volume and type of queries your dashboard will send it might be better just to do a connection and get the load balancing.
https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/ConnectingToVertica/ClientJDBC/KVApi/kvApiOverview.htm
Our reports are done with Power BI using direct query and we’ve no way to influence any settings in Microsoft’s gateway connector so we spent a lot of time on query budgeting to try and ensure that queries initiate and don’t just queue up.
https://www.vertica.com/blog/do-you-need-to-put-your-query-on-a-budgetba-p236830/
You should work on configuring your resource pool regardless of whether or not you’re using a connection pool
Regards
Dave
@kfruchtman,
Are you using a particular tool to create your dashboards? Some third party tools provide the option to enable and configure pooling in the GUI. For example, Pentaho BA provides the option when creating a connection to Vertica (Pentaho also uses JDBC):
https://www.vertica.com/kb/PentahoPDI_CG/Content/Partner/PentahoPDI_CG.htm
https://www.vertica.com/kb/Vertica-Integration-with-Pentaho-Business-Analytics/Content/Partner/Vertica-Integration-with-Pentaho-Business-Analytics.htm
The JDBC properties to modify the maximum number of connections in the pool, are described in the documentation at the end of the page: https://vertica.com/docs/10.1.x/HTML/Content/Authoring/ConnectingToVertica/ClientJDBC/JDBCConnectionProperties.htm
You should use caution before making changes based on the available resources in your environment, refer to the following topics in the documentation:
System limits: https://www.vertica.com/docs/10.1.x/HTML/Content/Authoring/SQLReferenceManual/SystemLimits.htm
Managing sessions: https://www.vertica.com/docs/10.1.x/HTML/Content/Authoring/AdministratorsGuide/ManageWorkloads/ManagingSessions.htm
Hope this is helpful.
Norma
Unless you're working with incredibly fast queries or you have a really slow network, I wouldn't bother with connection pooling.
If your queries take many seconds to run and your connection times are only a small percentage of that, the benefit you'll get will be tiny. You'll be better off making sure that your projections are optimal, resource pool is well configured etc.
You can use the attached script to check your connection/initiate time, edit the variables as required. In an infinite loop it wakes up every 5 seconds to make a load balanced connection and outputs the time it took in milliseconds. The production cluster I work on isn't particularly special and we rarely see anything above 2ms.
Sorry, in the script the connect should be after the setting of starttime
i.e.
STARTTIME=
date '+%Y%m%d%H%M%S%3N'
connect=
$VERTICA_VSQL -C -h $VERTICA_HOST -U $VERTICA_USER -w $VERTICA_PASS -d $VERTICA_DB -p $VERTICA_PORT -c "SELECT local_node_name();"
echo $connect
ENDTIME=
date '+%Y%m%d%H%M%S%3N'