Wednesday 13 February 2019

Connection pooling | Database performance | pgBouncer | DB server side vs app server side

In this iteration a database connection pooling solution is implemented. This enables the application to reuse database connections, thus improving performance at the database end. The pooling software used is pgBouncer.

> To compare the load test reports for pgBouncer on the DB hosts vs pgBouncer on the web application server, scroll down to the end of this post.

The content of this post is discussed in the following video for those who prefer that to reading.



The changes on the database side are shown in the application architecture below. Instead of connecting directly to the database, applications connect to pgBouncer.
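From the application's point of view the only change is the connection target. A minimal sketch, assuming standard PostgreSQL connection URLs and pgBouncer's default listen port of 6432 (the host names here are placeholders, not the ones used in this setup):

```
# Before: the application connects straight to PostgreSQL (default port 5432)
postgresql://userA@dbhost:5432/databaseA

# After: the application connects to pgBouncer (default port 6432), which holds
# the pooled server connections to PostgreSQL on dbhost:5432
postgresql://userA@pgbouncer-host:6432/databaseA
```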

1) Setting up a connection consumes server resources and time. It requires a round trip to the database, authenticating the user and checking user privileges, among other steps, before the query can be executed. Some client-side frameworks hold on to a connection for a period of time and then release it, i.e. the connection is closed after use. This means that a new request with the same credentials, even from the same client, has to go through the entire process again. This can be avoided by returning used connections to a pool and handing out connections from that pool of already-established, active connections.

2) Each database server has a configurable limit on concurrent connections. For PostgreSQL this is specified using max_connections in /var/lib/pgsql/data/postgresql.conf and has a default value of 100. Beyond this number connections are refused.

3) The load on the database can increase when a) the user base grows, resulting in increased activity, b) new application features are added, or c) the application scales out with load-balanced web application servers to meet user load.

Whatever the reason, an application that used to see a maximum of 60 concurrent connections to the database may now generate 120, and some of those extra connections end up refused at the database end. Pooling solutions also have a maximum client connection limit; the objective is to configure a set number of database connections and use them efficiently. Say we have configured the database to allow 150 concurrent connections. The server may have one or more databases for different users, for example database A and database B for userA and userB respectively, with database A seeing more hits than database B. Of the total connections allowed by the database, say 10 are needed for backend database administration and the rest are free for the application servers. On pgBouncer we can set the maximum allowed client connections to, say, 280 (greater than the database maximum) and configure a pool for each user + database pair: userA + database A with a pool size of 90, and userB + database B with a pool size of 50. The pools thus regulate how many connections actually reach the database; a sketch of such a configuration is shown below.
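A minimal pgbouncer.ini sketch for the pools described above, using the example numbers from this paragraph (280 client connections at pgBouncer, pools of 90 and 50 against a database allowing 150). The host name, auth file path and the remaining settings are assumptions for illustration, not the actual configuration used here:

```ini
; pgbouncer.ini -- illustrative sketch only

[databases]
; one pool per user + database pair
databaseA = host=dbhost port=5432 dbname=databaseA user=userA pool_size=90
databaseB = host=dbhost port=5432 dbname=databaseB user=userB pool_size=50

[pgbouncer]
listen_addr = *
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt

; clients allowed to connect to pgBouncer; may exceed the database's own limit
max_client_conn = 280

; 90 + 50 = 140 server connections stays under PostgreSQL's max_connections = 150,
; leaving roughly 10 connections free for backend administration
pool_mode = session
```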

Adding a pooling solution is an overhead. An immediate increase in throughput is not evident until the application hits a situation where there is contention for database connections. In that situation, instead of being rejected by the database and failing there, the request is fulfilled by the next available connection in the pool. The application is configured to minimise database accesses using caching, so throughput may not be a direct indicator of performance gains, and contention can only be simulated by reducing the cache timeouts, which is not ideal. Finally, if the maximum number of client connections at pgBouncer is exceeded, connections are still refused.

The question of putting pgBouncer on the application server hosts or on the database hosts

This depends on what is needed and what works out in practice. For a small application with multiple databases we can host pgBouncer on the application server and tangibly save the round-trip time to the DB host. With many application servers, however, we may configure a large number of pooled connections at the source end and still end up with a deluge at the database end. It is like the club bouncer stopping you at your doorstep because the club's headcount has been reached, versus you travelling to the club and getting stopped there. A rough sketch of the two options follows.
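In configuration terms the difference between the two placements is only where pgBouncer listens and which host the application and the pools point at. A rough sketch with placeholder host names (two alternative pgbouncer.ini fragments, not one file):

```ini
; Option 1: pgBouncer on the DB host
; (the application connects remotely to dbhost:6432; pgBouncer reaches PostgreSQL locally)
[databases]
databaseA = host=127.0.0.1 port=5432 dbname=databaseA

; Option 2: pgBouncer on each web application host
; (the application connects to 127.0.0.1:6432; pgBouncer makes the remote connections)
[databases]
databaseA = host=dbhost port=5432 dbname=databaseA
```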

Aggregate reports for a load test with two runs are shown below: one with pgBouncer on the DB hosts and a second with pgBouncer on the web server hosts. Both tests take 3-5 minutes to finish.

Notice that with pgBouncer on the web server hosts, throughput increased by ~21.4% and errors dropped to near 0%. This is due to the time saved by not having to set up a connection to a remote host each time one is needed; only a localhost connection is made. However, the web server host now runs the pgBouncer pools and keeps hundreds of pooled connections ready to be used. As already mentioned, multiple web servers with pools like this can deluge the database host.

Load test results with pgBouncer on DB hosts
 
Load test results with pgBouncer on web server hosts