Yesterday I spent the entire day getting the following amazing state-of-the-art not-ever-done-before feature to work:
- Executing a SQL statement from my program
Because, as everyone knows, I don’t suffer from NIHS, I used standard object-relational mapping software Hibernate, with a standard programming language Java, using the standard web-application server Tomcat, and now I am using the standard “connection pooling” software C3P0 (which I didn’t know I needed to execute a SQL statement, see below..)
The program is, in fact, already completed, and is nearly deployed. On the test server it works fine and even on the (future) live server it worked fine. But the customer noticed that if one installed it one day, the next day it didn’t work. I’ve had such symptoms many times before, so I know immediately what was going on:
- MySQL drops a connection after 8 hours (configurable)
- The software is used during the day, but isn’t used during the night, therefore the connection times out in the night
- Therefore in the morning, the program one installed the day before no longer works
Perhaps I exaggerated the simplicity above of what I was really trying to achieve. It should really be expressed as the following:
- Executing a SQL statement from my program, even if a long time has passed since the last one was executed
But that amounts to the same thing in my opinion! It isn’t rocket science! (But in fact is, see below..)
A obvious non-solution is to increase the “connection drop after” time on the MySQL server from 8 hours to e.g. “2 weeks” (“wait_timeout” in “mysql.cnf”). But software has got to be capable of reconnecting after a connection drops. The database server may need to be reset, it may crash, it may suffer hardware failure, etc. If, every time one restarts one particular service, one has to restart a thousand dependent services (maybe some Java, some Perl, some PHP, some robots, ..) and then maybe restart services which are dependent on them – that’s a maintenance nightmare. So the software has to be altered to be able to handle connection drops automatically, by reconnecting. Once the software has been so altered, one no longer needs to alter the “wait_timeout” on the server.
The error was:
org.hibernate.util.JDBCExceptionReporter: The last packet successfully received from the server was 56697 seconds ago. The last packet sent successfully to the server was 56697 seconds ago, which is longer than the server configured value of ‘wait_timeout’. You should consider either expiring and/or testing connection validity before use in your application, increasing the server configured values for client timeouts, or using the Connector/J connection property ‘autoReconnect=true’ to avoid this problem.
Quite a helpful error message, don’t you think? But
- I’m not going to increase “wait_timeout” as discussed above,
- “testing validity” in the application – well I was using standard software Hibernate which should take care of this sort of thing automatically, but evidently wasn’t
- and we were already using ?autoReconnect=true in the JDBC URL (this evidently wasn’t working).
I figured I really needed to get to the bottom of this. Googling just showed (many) people with the same problem, but no solutions. The only way to get to the bottom of software is to read the source. (It has been the way to resolve issues of simple things simply not working in MySQL before.)
I stopped looking in the MySQL source for why “autoReconnect=true” didn’t work when I saw the following text in the source describing the autoReconnect parameter:
The use of this feature is not recommended, because it has side effects related to session state and data consistency
I have no idea what particular side-effects are meant here? I guess that’s left as an exercise for the reader, to test their imagination.
And anyway, I figure that a reconnect-facility belongs in the “application” (Hibernate in my case) as opposed to in database-vendor specific code. I mean the exactly the same logic would be necessary if one were connecting to PostgreSQL or Oracle, so it doesn’t make sense to build it in to the database driver.
So then I looked in the Hibernate code. To cut a long story short, the basic connection mechanism of Hibernate (as specified in all the introductory books and websites, which is probably how most people learn Hibernate) doesn’t support reconnecting, one has to use H3C0 connection pool (which itself didn’t always support reconnecting)
(I don’t want to use container/Tomcat-managed connections, as I have some command-line batch jobs, and I don’t want to use different code for the batch jobs than the web application. Although another company defined Servlets which were “batch job servlets”, and these were executed via a “wget” entered into crontab – to get the user of container-managed connections – but this seems a too-complex solution to my taste..)
But once one’s used H3C0, the default behavior seems to be that to process a request, if the connection is dead then the user sees and error – but at least it reconnects for the next request. I suppose one error is better than infinite errors, but still not as good as zero errors. It turns out one needs the option testConnectionOnCheckout - which the documentation doesn’t recommend because testing the connection before a request might lead to lower performance. Surely the software firstly has to work, only secondly does it have to work fast.
So, to summarize, to get a connection to “work” (which I define as including handling dropped connections by reconnecting without error): In “hibernate.cfg.xml”:
<!-- hibernate.cfg.xml -->
<!-- no "connection.pool_size" entry! -->
Then create a file “c3p0.properties” which must be in the root of the classpath (i.e. no way to override it for particular parts of the application):
Amazing, that that stuff doesn’t just work out of the box. Programming the solution myself in Uboot took, I think, 1 line, and I’m sure it’s not more in WebTek either.
That was an amazing amount of effort and research to get the simplest thing to work. Now if only this project had been paid by the hour…..
[Update 28 May 2009] More Java hate today. Starting a new application, deployed it, and it didn’t work. In the morning, the application was down. Reason: The new project used Hibernate 3.3, and upgrade from 3.2 to 3.3 requires the “connection.provider_class” property to be set. Previously the presence of “c3p0.max_size” was enough.