Archive for the ‘Broken’ Category

Upgrade to Lenny, everything down :(

Sunday, February 21st, 2010

How annoying, I upgraded from Debian Etch (Apache 2.2.3-4) to Debian Lenny (Apache 2.2.9), and then my Subversion Server (over HTTPS) gave the following error when surfed to from Firefox, which worked fine before:

An error occurred during a connection to svn.example.com.
SSL received a record that exceeded the maximum permissible length.
(Error code: ssl_error_rx_record_too_long)

What does that mean!? There’s not a great deal of info on the web.

Fundamentally, the first thing to work out is that the error message means (or meant, in my case at least) that plain HTTP was being transmitted over the HTTPS port, i.e. it wasn’t valid HTTPS at all, hence the protocol error. This could be confirmed by surfing to http://…:443/ (i.e. not https://) and seeing that the content (the Subversion server in my case) was correct.
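The check described above comes down to looking at the first few bytes the server sends back on port 443. Here is a minimal Java sketch of that idea, assuming you have already captured those bytes somehow; the class and method names are made up for illustration. A TLS record starts with a content-type byte (0x16 for a handshake, 0x15 for an alert) followed by the major version byte 0x03, whereas a plain HTTP reply starts with readable text.

```java
import java.nio.charset.StandardCharsets;

class FirstBytes {
    /** Rough guess at what a server is speaking, based on the first bytes of its reply. */
    static String classify(byte[] head) {
        // TLS record header: content type (0x16 handshake, 0x15 alert), then major version 0x03.
        if (head.length >= 2 && (head[0] == 0x16 || head[0] == 0x15) && head[1] == 0x03) {
            return "tls";
        }
        // Plain HTTP: either a status line or (for some error pages) raw markup.
        String text = new String(head, StandardCharsets.US_ASCII);
        if (text.startsWith("HTTP/") || text.startsWith("<")) {
            return "plain-http";
        }
        return "unknown";
    }

    public static void main(String[] args) {
        byte[] tlsReply = {0x16, 0x03, 0x01, 0x00, 0x31};
        byte[] httpReply = "HTTP/1.1 400 Bad Request".getBytes(StandardCharsets.US_ASCII);
        System.out.println(classify(tlsReply));
        System.out.println(classify(httpReply));
    }
}
```

If the reply on the HTTPS port classifies as plain HTTP, the browser is being handed text where it expects a TLS record, which would fit the “record too long” complaint.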

The question was why? I had a bunch of sites in the “sites-enabled” directory, and another one of them (not my Subversion site!) had a

<VirtualHost *>

whereas it should have been

<VirtualHost *:80>

i.e. the port was missing. I’m not quite sure why it had that effect, as the request to the Subversion HTTPS URL did deliver the Subversion content, just not over HTTPS any more. But perhaps without the :80, it decided all ports should be subject to NameVirtualHost, and as that’s not possible with HTTPS, switched HTTPS off for all ports and all sites?

Nightmare ….

See also: http://stackoverflow.com/questions/119336/ssl-error-rx-record-too-long-and-apache-ssl

PHP infinite recursion

Wednesday, November 18th, 2009

What can I say? How about “toy language”?

$ php -r 'function foo() { foo(); } foo();'
Segmentation fault

I’m not saying that infinite recursion is a good idea, but during development it can happen by accident, and I don’t expect such a simple error to crash the PHP interpreter! (Also it took me about 20 minutes to debug this problem, as I had no idea where it happened, nor indeed what the problem was..)

PHP 5.2.6 on Linux 2.6.26 Debian

Java: Always explicitly specify which XML parser to use

Tuesday, September 15th, 2009

There is the following design error in Java (at least in Servlets):

  1. A server may serve multiple applications; each application may use different libraries or even different versions of the same library, “side by side”.
  2. XML parsers, transformers (XSLT), etc., have a standard interface, and there may be different implementations of this interface from different vendors, open-source projects, etc.
  3. Which XML parser, transformer, etc. is actually used depends on a global system variable.

And it’s point 3 that’s the problem really. Points 1 and 2 are debatable: they certainly bring advantages, but they certainly bring complexity too.

I just had the problem that one of my web applications stopped working, but only intermittently. Restarting the server led to everything being OK, but later things would not be OK. I do hate environments where everything appears to work, yet in fact doesn’t. I mean how do you know when you’re “done” in such an environment? (Or how do you even know you are in such an environment?)

The bug was caused by:

  1. Application one used the default XML parser, and didn’t have any extra JARs (libraries) for reading XML
  2. Application two required a special XML parser, set the global variable so it would be used, and included the JARs necessary for the special XML parser

So when a request came to application 1, after a request had come to application 2, then the system would try to instantiate the special XML parser within application 1 (specified in the global variable set by application 2), but wouldn’t find it, as it wasn’t deployed in application 1 (and applications can’t use one another’s libraries, due to feature #1).

This seems obvious when one describes it, but looking at the logs, on a live server, with the system down and the clock ticking? – Far from obvious.

So now, I assert, every time you want to create an XML parser, do the following:

If you require a special XML library, use:

System.setProperty("javax.xml.parsers.DocumentBuilderFactory",
    "org.apache.xerces.jaxp.DocumentBuilderFactoryImpl");
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
...

If you require the standard XML library, use:

// Remove any override another application may have set
System.clearProperty("javax.xml.parsers.DocumentBuilderFactory");
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
...

There is also the possibility to pass a parameter to DocumentBuilderFactory to specify which XML parser technology to use. That’s a good option too, as it wouldn’t “corrupt” this global variable (“system property”). However I think one should be defensive, and always delete the global variable if one wishes to use the standard XML parser, and therefore it doesn’t matter if this global variable gets corrupted or not.
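As a sketch of the “pass a parameter” route: since Java 6 there is an overload DocumentBuilderFactory.newInstance(String, ClassLoader) that names the implementation directly, and since Java 9 there is DocumentBuilderFactory.newDefaultInstance(), which returns the JDK’s built-in implementation without consulting the system property. Neither touches the global variable. The class below is illustrative only; its name, its helper methods and the bogus class name com.example.NoSuchFactory are assumptions, not anything from the applications described here.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.FactoryConfigurationError;

class ParserSelection {
    static final String KEY = "javax.xml.parsers.DocumentBuilderFactory";

    /** The JDK's built-in parser factory, ignoring the system property (Java 9+). */
    static DocumentBuilderFactory builtIn() {
        return DocumentBuilderFactory.newDefaultInstance();
    }

    /** A specific implementation by class name, loaded from this application's own JARs (Java 6+). */
    static DocumentBuilderFactory named(String factoryClassName) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        return DocumentBuilderFactory.newInstance(factoryClassName, loader);
    }

    /** Simulates "application two" setting the property to a class this application cannot load. */
    static boolean plainLookupBreaks(String unavailableClassName) {
        String previous = System.getProperty(KEY);
        System.setProperty(KEY, unavailableClassName);
        try {
            DocumentBuilderFactory.newInstance();   // relies on the global property
            return false;
        } catch (FactoryConfigurationError e) {
            return true;
        } finally {
            // Restore whatever was there before, so the demo has no lasting side effect.
            if (previous == null) {
                System.clearProperty(KEY);
            } else {
                System.setProperty(KEY, previous);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(plainLookupBreaks("com.example.NoSuchFactory"));
        System.out.println(builtIn().getClass().getName());
    }
}
```

The same pair of methods exists on TransformerFactory, so the idea should carry over to the other factories.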

Never do the following:

DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();

This simply relies on whichever XML parser is currently set in the global variable. You have no way to guarantee that some other application running on the same server won’t set the global variable to use an XML parser you don’t have installed in your application. Even if you have control of the server and all applications, you don’t know what software you’ll be writing in the future. (In this case I installed a new application to a server which’d been running fine for 1 year, but due to setting the global variable, the old application broke..)

The same applies for all those other “factory” situations such as TransformerFactory.newInstance() etc.

This all feels quite inelegant to me, and has just cost me a lot of time, and it’s not as if I’m new to programming Java. I am wondering if there is a better way to approach it? Or is Java just broken in this particular respect?

P.S. This is not the only thing that went wrong with the old application today. I upgraded from Java 5 to Java 6 and suddenly some XML was not compliant against a schema according to Java – I had hit this error.

Never close PHP class files with the “?>” tag

Friday, August 21st, 2009

When developing PHP, a front-end PHP file will include other files: classes, utilities, etc.

When writing those class files, one also needs to use the <?php tag at the start of the file, otherwise PHP will simply take the text and output it unchanged to the browser. (PHP’s assumption is that it sits in a web page, with probably more markup than code, so by default characters in the source code get copied one-to-one to the browser and the <?php?> tags are necessary to introduce PHP to the “exceptional circumstance” that one might actually want to program some PHP.)

If one must open the class source file with <?php then it would seem to make aesthetic sense to close it with ?>. However, there are no negative side-effects if one does not close the tag, plus one very negative side-effect if one does close it.

We performed a minor release a while ago, after which the display of generated PDF files no longer worked. Yet the minor release had nothing to do with the section of code that produced PDFs. What sort of weird action-at-a-distance could possibly be happening here?

The reason was that one class file in the minor release had a blank line after the ?> tag. This was impossible to spot in the text editor. The blank line was printed to the browser, which was also invisible in nearly all of the site, as HTML ignores blank lines. PDFs probably do as well (I haven’t checked) but the problem wasn’t with the content. As HTTP response content is streamed to the browser (as opposed to being collected first and then sent to the browser at the end of the request), HTTP headers can only be set before the first byte of output has been produced by the software. As the blank line in the class source file constituted content, and the source file was (necessarily) parsed before the code could be executed, the HTTP header “Content-Type: application/pdf” couldn’t be sent, and various errors about headers not being able to be sent, combined with the binary source of the PDF, arrived at the user’s screen.

So given there are no disadvantages, and one particularly weird source of bugs can be removed, I would say one should never end PHP files with ?>.
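For an existing code base, the stray output described above can be hunted down mechanically. This is a rough Java sketch of such a check, under loud assumptions: the file contains a single PHP block, the last “?>” in the file really is a closing tag (not part of a string or comment), and the class and method names are invented for this example. It relies on the fact that PHP swallows one newline directly after the closing tag, so only what comes after that newline is output.

```java
class PhpCloseTagCheck {
    /**
     * Returns the text a PHP file would emit after its final "?>".
     * An empty result means nothing leaks from the end of the file.
     */
    static String trailingOutput(String source) {
        int close = source.lastIndexOf("?>");
        if (close < 0) {
            return "";                      // no closing tag: nothing after it to leak
        }
        String rest = source.substring(close + 2);
        // PHP treats a single newline immediately after "?>" as part of the tag.
        if (rest.startsWith("\r\n")) {
            rest = rest.substring(2);
        } else if (rest.startsWith("\n")) {
            rest = rest.substring(1);
        }
        return rest;
    }

    public static void main(String[] args) {
        String harmless = "<?php class A {}\n?>\n";
        String leaking = "<?php class A {}\n?>\n\n";
        System.out.println(trailingOutput(harmless).isEmpty());
        System.out.println(trailingOutput(leaking).length());
    }
}
```

A file with no closing tag at all trivially passes, which is the whole point of the recommendation.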

My favourite Hibernate error

Wednesday, June 10th, 2009

… is this one. I’ve wasted many an hour searching for the cause of this. And it’s one you’re likely to run into pretty quickly when you try to write your first Hibernate configuration file.

The XML

<one-to-many type="OtherClass"/>

delivers the error

Error parsing XML: Attribute "type" must be declared for element type "one-to-many".

This looks like a perfectly self-explanatory error, however looking at the file, the element does have a “type” attribute. What should one do?

Thinking about it, I only just introduced the “type” attribute to the <one-to-many> element in my config. What happens if I change the attribute name to “fsdjkfdk”?

<one-to-many fsdjkfdk="OtherClass"/>

The error is now:

Error parsing XML: Attribute "fsdjkfdk" must be declared for element type "one-to-many".

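What the error actually means is that the attribute is not declared in the DTD for that element, i.e. it must not be used there, as opposed to must. (The attribute that <one-to-many> accepts for the target class is “class”.)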
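The behaviour is easy to reproduce outside Hibernate with any validating parser. The sketch below uses a toy DTD written for this example (it is not Hibernate’s real mapping DTD, and the class name is invented) in which one-to-many declares only a “class” attribute. Any other attribute name produces a validation error of the “must be declared” kind; the exact wording depends on the parser and locale.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class DtdAttributeDemo {
    // Toy DTD for illustration only; NOT Hibernate's real mapping DTD.
    static final String DOCTYPE =
        "<!DOCTYPE set [\n"
        + "  <!ELEMENT set (one-to-many)>\n"
        + "  <!ELEMENT one-to-many EMPTY>\n"
        + "  <!ATTLIST one-to-many class CDATA #IMPLIED>\n"
        + "]>\n";

    /** Validates the body against the toy DTD and returns the validation error messages. */
    static List<String> validate(String body) {
        List<String> errors = new ArrayList<>();
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newDefaultInstance();
            dbf.setValidating(true);        // DTD validation is off by default
            DocumentBuilder builder = dbf.newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    errors.add(e.getMessage());   // collect instead of aborting
                }
            });
            builder.parse(new InputSource(new StringReader(DOCTYPE + body)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return errors;
    }

    public static void main(String[] args) {
        System.out.println(validate("<set><one-to-many type=\"OtherClass\"/></set>"));
        System.out.println(validate("<set><one-to-many class=\"OtherClass\"/></set>"));
    }
}
```

The first call reports an error about the undeclared attribute, the second reports none.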

It’s amusing to read even people on the Hibernate team get confused by this error, and can’t find a solution.

(Hibernate 3.3.1 – the most current version – although I encountered this error within the first hour of ever using Hibernate in Q1/2006.)

Per-CPU performance statistics are useless

Wednesday, April 1st, 2009

Windows, Linux and OS X offer the ability to view the utilization of each CPU/core in the system. This is completely useless. On all these operating systems, tasks get switched from one core to another on a regular basis. (I don’t know why this happens, but I suppose there is no reason for it not to happen.)

Here is my CPU-bound single-threaded program running on a dual-core computer.

I suppose all one can really say is that if one has N cores and the average CPU% usage (over all cores) is approximately 100/N then probably one is running a program which can’t take advantage of multiple cores.

I would rather replace the current “CPU usage history, per core” multiple graphs with:

  • One graph, showing a history of the average over all CPUs (visually the same as if one had a 1-core CPU).
  • I would then add horizontal marker lines: if one had 4 cores, I would add 4 equally spaced marker lines. If the usage reached a marker line (e.g. 25% for the first line), one would probably be running the equivalent of 1 single-threaded program.
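The arithmetic behind the marker lines is simple enough to sketch. In the snippet below the class and method names are made up for illustration; it computes where the lines would sit for N cores, and how many “fully busy single-threaded programs” a given average usage corresponds to.

```java
class CoreMarkers {
    /** Marker positions in percent: 100/N, 200/N, ..., 100. */
    static double[] markerLines(int cores) {
        double[] lines = new double[cores];
        for (int i = 1; i <= cores; i++) {
            lines[i - 1] = 100.0 * i / cores;
        }
        return lines;
    }

    /** Average usage over all cores, expressed as a number of fully busy cores. */
    static double busyCoreEquivalents(double averagePercent, int cores) {
        return averagePercent * cores / 100.0;
    }

    public static void main(String[] args) {
        for (double line : markerLines(4)) {
            System.out.println(line);
        }
        // 50% average on a dual-core machine: one core's worth of work.
        System.out.println(busyCoreEquivalents(50.0, 2));
    }
}
```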

I mean it’s not a brilliant solution but I reckon it would be more meaningful than the way the information is currently displayed.

Automatic reconnect from Hibernate to MySQL

Friday, October 24th, 2008

Yesterday I spent the entire day getting the following amazing state-of-the-art not-ever-done-before feature to work:

  • Executing a SQL statement from my program

Because, as everyone knows, I don’t suffer from NIHS, I used standard object-relational mapping software Hibernate, with a standard programming language Java, using the standard web-application server Tomcat, and now I am using the standard “connection pooling” software C3P0 (which I didn’t know I needed to execute a SQL statement, see below..)

The program is, in fact, already completed, and is nearly deployed. On the test server it worked fine and even on the (future) live server it worked fine. But the customer noticed that if one installed it one day, the next day it didn’t work. I’ve had such symptoms many times before, so I knew immediately what was going on:

  • MySQL drops a connection after 8 hours (configurable)
  • The software is used during the day, but isn’t used during the night, therefore the connection times out in the night
  • Therefore in the morning, the program one installed the day before no longer works

Perhaps I exaggerated the simplicity above of what I was really trying to achieve. It should really be expressed as the following:

  • Executing a SQL statement from my program, even if a long time has passed since the last one was executed

But that amounts to the same thing in my opinion! It isn’t rocket science! (But in fact is, see below..)

An obvious non-solution is to increase the “connection drop after” time on the MySQL server from 8 hours to e.g. “2 weeks” (“wait_timeout” in “my.cnf”). But software has got to be capable of reconnecting after a connection drops. The database server may need to be reset, it may crash, it may suffer hardware failure, etc. If, every time one restarts one particular service, one has to restart a thousand dependent services (maybe some Java, some Perl, some PHP, some robots, ..) and then maybe restart services which are dependent on them, that’s a maintenance nightmare. So the software has to be altered to be able to handle connection drops automatically, by reconnecting. Once the software has been so altered, one no longer needs to alter the “wait_timeout” on the server.

The error was:

org.hibernate.util.JDBCExceptionReporter: The last packet successfully received from the server was 56697 seconds ago. The last packet sent successfully to the server was 56697 seconds ago, which  is longer than the server configured value of ‘wait_timeout’. You should consider either expiring and/or testing connection validity before use in your application, increasing the server configured values for client timeouts, or using the Connector/J connection property ‘autoReconnect=true’ to avoid this problem.

Quite a helpful error message, don’t you think? But

  • I’m not going to increase “wait_timeout” as discussed above,
  • “testing validity” in the application – well I was using standard software Hibernate which should take care of this sort of thing automatically, but evidently wasn’t
  • and we were already using ?autoReconnect=true in the JDBC URL (this evidently wasn’t working).

I figured I really needed to get to the bottom of this. Googling just showed (many) people with the same problem, but no solutions. The only way to get to the bottom of software is to read the source. (It has been the way to resolve issues of simple things simply not working in MySQL before.)

I stopped looking in the MySQL source for why “autoReconnect=true” didn’t work when I saw the following text in the source describing the autoReconnect parameter:

The use of this feature is not recommended, because it has side effects related to session state and data consistency

I have no idea what particular side-effects are meant here? I guess that’s left as an exercise for the reader, to test their imagination.

And anyway, I figure that a reconnect facility belongs in the “application” (Hibernate in my case) as opposed to in database-vendor-specific code. I mean exactly the same logic would be necessary if one were connecting to PostgreSQL or Oracle, so it doesn’t make sense to build it into the database driver.

So then I looked in the Hibernate code. To cut a long story short, the basic connection mechanism of Hibernate (as specified in all the introductory books and websites, which is probably how most people learn Hibernate) doesn’t support reconnecting; one has to use the C3P0 connection pool (which itself didn’t always support reconnecting).

(I don’t want to use container/Tomcat-managed connections, as I have some command-line robots which do some work, and I don’t want to use different code for the robots than for the web application. Another company defined Servlets which did “robot work”, and the robot was just a “wget” entered into Tomcat, to get the use of container-managed connections, but this seems too complex a solution for my taste.)

But once one’s used C3P0, the default behavior seems to be that, when processing a request, if the connection is dead then the user sees an error, but at least it reconnects for the next request. I suppose one error is better than infinite errors, but still not as good as zero errors. It turns out one needs the option testConnectionOnCheckout, which the documentation doesn’t recommend because testing the connection before a request might lead to lower performance. Surely the software firstly has to work, only secondly does it have to work fast.
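Conceptually, “test on checkout” is a small idea: before handing out a connection, check that it is still alive, and replace it if not. The following Java sketch shows that idea in isolation. It is not C3P0’s implementation; the class name, the single-connection design and the use of a generic type instead of a real JDBC connection are simplifications made for this example. With real JDBC, the liveness test could be something like java.sql.Connection.isValid(int).

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

class CheckedConnectionSource<C> {
    private final Supplier<C> connect;    // opens a fresh connection
    private final Predicate<C> isAlive;   // cheap liveness test
    private C current;
    private int connectCount;

    CheckedConnectionSource(Supplier<C> connect, Predicate<C> isAlive) {
        this.connect = connect;
        this.isAlive = isAlive;
    }

    /** Hands out the cached connection, reconnecting first if it has died. */
    synchronized C checkout() {
        if (current == null || !isAlive.test(current)) {
            current = connect.get();
            connectCount++;
        }
        return current;
    }

    synchronized int connectCount() {
        return connectCount;
    }

    public static void main(String[] args) {
        // A one-element boolean array stands in for a connection: true means "still open".
        CheckedConnectionSource<boolean[]> source =
            new CheckedConnectionSource<>(() -> new boolean[] {true}, c -> c[0]);
        boolean[] first = source.checkout();
        first[0] = false;                 // simulate the server dropping it overnight
        boolean[] second = source.checkout();
        System.out.println(first != second);
        System.out.println(source.connectCount());
    }
}
```

The cost is one extra liveness check per checkout, which is the performance trade-off the documentation warns about.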

So, to summarize, to get a connection to “work” (which I define as including handling dropped connections by reconnecting without error): In “hibernate.cfg.xml”:

<!-- hibernate.cfg.xml -->
<property name="c3p0.min_size">5</property>
<property name="c3p0.max_size">20</property>
<property name="c3p0.timeout">1800</property>
<property name="c3p0.max_statements">50</property>
<property name="connection.provider_class">
   org.hibernate.connection.C3P0ConnectionProvider</property>
<!-- no "connection.pool_size" entry! -->

Then create a file “c3p0.properties” which must be in the root of the classpath (i.e. no way to override it for particular parts of the application):

# c3p0.properties
c3p0.testConnectionOnCheckout=true

Amazing, that that stuff doesn’t just work out of the box. Programming the solution myself in Uboot took, I think, 1 line, and I’m sure it’s not more in WebTek either.

That was an amazing amount of effort and research to get the simplest thing to work. Now if only this project had been paid by the hour…..

[Update 28 May 2009] More Java hate today. Started a new application, deployed it, and it didn’t work: in the morning, the application was down. Reason: the new project used Hibernate 3.3, and the upgrade from 3.2 to 3.3 requires the “connection.provider_class” property to be set. Previously the presence of “c3p0.max_size” was enough.

mysqli_affected_rows

Wednesday, October 8th, 2008

Recently I programmed the following screen in PHP:

  • The user logs in
  • The user has a subscription
  • The subscription has a number of states (“terminate”, “auto-extend”, ..)
  • There is a screen allowing the user to change this state
  • The screen is a set of radio buttons – each radio button relates to one state
  • The user clicks on the radio-button representing the state they wish, clicks “ok”, and the new state gets saved to the database

Not rocket science eh? Well, unbelievably my implementation of the above had a bug. How on earth was that possible?

The bug was the following: if you changed the state, everything worked fine. But if you chose the same state as was already selected, an Exception was thrown.

Initially I suspected a simple coding mistake. When I looked at the code, everything looked right. I had used the following “algorithm”:

  • Update the “subscription” row using SQL
  • Check the result of the SQL statement, that exactly 1 row was updated (in case e.g. id referenced a non-existing subscription, which would be an error)

I used the PHP function mysqli_affected_rows for that and unbelievably that has the following functionality: it only returns the number of changed rows i.e. the number of rows:

  • Matching the where clause, and
  • Currently having values different to those values being written to the row.

I can’t imagine a case where one would want to know that. I couldn’t find any function to return the number of rows matching, regardless of whether the values were changed or not. (The older version mysql_affected_rows exhibits identical behaviour.)

So I had to write the following function:

/**
 * Returns the number of rows which matched the WHERE
 * clause on the last UPDATE statement. This is not the
 * same as mysqli_affected_rows, which only returns the
 * number of changed rows.
 */
public static function DbUpdatedRows() {
    $link = self::DbGetLink();  // mysqli object
    // After an UPDATE, mysqli_info returns e.g.
    // "Rows matched: 1  Changed: 0  Warnings: 0"
    $info = mysqli_info($link);
    if (preg_match('/Rows matched: (\d+) +Changed/',
            $info, $matches)) {
        return (int) $matches[1];
    }
    throw new Exception("DbUpdatedRows called although ".
        "it doesn't look like an UPDATE was the ".
        "last statement: mysqli_info returned '$info'");
}

I’ve just checked, and in InnoDB inside a transaction, it’s good to see that (as with Oracle) write-locks are indeed placed on all matched rows not just updated rows.

And don’t get me started on using DB-specific function calls (i.e. functions named mysql_x) as opposed to using a DB-abstraction layer like DBI in Perl, JDBC in Java, etc. Nor why I’m using PHP or MySQL in the first place.

Crazy algorithm for displaying text size value

Friday, July 18th, 2008

Anyone who has written any graphics or text manipulation software will know the following problem:

  • Each character or object has a particular style, for example the size of the writing, the thickness of the stroke, etc.
  • There is some user interface element e.g. a dialog or rollup where the user can view and edit the attribute of the currently selected object
  • The user may select multiple objects

And there we have it: the user selects two pieces of text, one is 12pt, the other 18pt, and opens the “font size” dialog. What size is it to display? There are a number of solutions to this problem, none particularly elegant:

  • Display one of the sizes, i.e. 12pt or 18pt. (It doesn’t matter if the program uses an “intelligent” algorithm to decide which size to display—the largest number, the smallest number, the value of the leftmost character, the value of the character the user selected first or last—the user won’t work out which algorithm has been used!)
  • Display either a gray box or the text “multiple selected”. Microsoft Word does this, and I suppose it’s an acceptable solution. The user is clearly informed that the value of all the characters is neither 12pt nor 18pt.

However, amazingly enough, I came across a new solution in the graphics program Inkscape. And it’s worse than any of the above! It has a certain elegance about it, and for sure this led the developers to think it’s a good idea, but it’s certainly completely useless in practice.

The solution used by Inkscape is to display the average of the values. So if you have a character at 18pt and a character at 12pt, then display 15pt in the dialog. This is really confusing, as none of the characters you’ve selected actually are 15pt.

So if e.g. you have an 18pt character followed by lots of 12pt characters, as you shift-right and select more and more of the 12pt characters, the size displayed in the dialog slowly becomes less, displaying most of the time various fractional values, tending towards 12pt.

In fact, that’s wrong. It’s even more interesting than that. In writing this post I tried it out to check what was really happening. It takes the average of the sizes of constant-sized spans of characters. For example, if you have an 18pt char and a 12pt char, then it displays 15pt; if you have 18pt then 12pt then 18pt, it displays 16pt (three spans); if you have 18pt then 18pt then 12pt, it displays 15pt, as there is one span of 18pt and one span of 12pt, the average of 18pt and 12pt being 15pt.
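The behaviour observed above can be written down in a few lines. This Java sketch is a reconstruction from the three observations in this post, not Inkscape’s actual code; the class and method names are invented. It collapses each run of equal sizes into one span and averages one value per span.

```java
class SpanAverage {
    /** Average of the sizes of constant-sized spans, given one size per character. */
    static double displayedSize(double[] sizesPerCharacter) {
        if (sizesPerCharacter.length == 0) {
            throw new IllegalArgumentException("empty selection");
        }
        double sum = 0;
        int spans = 0;
        for (int i = 0; i < sizesPerCharacter.length; i++) {
            // A new span starts at the first character and wherever the size changes.
            if (i == 0 || sizesPerCharacter[i] != sizesPerCharacter[i - 1]) {
                sum += sizesPerCharacter[i];
                spans++;
            }
        }
        return sum / spans;
    }

    public static void main(String[] args) {
        System.out.println(displayedSize(new double[] {18, 12}));      // 15.0
        System.out.println(displayedSize(new double[] {18, 12, 18}));  // 16.0
        System.out.println(displayedSize(new double[] {18, 18, 12}));  // 15.0
    }
}
```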

It’s crazy stuff. It took me about 20 minutes to work out that this algorithm was what was being used. With my wedding tomorrow, one would imagine I should be concentrating on things other than font size display algorithms at the moment!

Too much data for the browser window? Solution: iframe

Wednesday, April 16th, 2008

Recently I was asked to put a webpage online which contained some navigation and some content. There was more content than could fit the browser on an average screen. So the content looked basically like this:

But then I was sent a new version which was cleverer:

  • The content was stored in an iframe
  • There was Javascript which dynamically altered the size of the iframe to be the size of the monitor resolution (i.e. independent of current browser size)

The result of this complexity was that the iframe nearly—but not quite—fitted into the browser’s content area (assuming you had your browser maximised).

Because the iframe didn’t sit quite at the top of the window, this meant:

  • The bottom “scroll down” arrow button of the inner iframe (the one you need to actually see more content) was, by default, off the bottom of the screen
  • As there was content beyond the bottom of the screen (the “scroll down” button of the iframe) the main browser window also displayed a scroll bar
  • To scroll to see more data, you need the inner iframe. But instinctively one reaches for the right-most scrollbar, as that’s normally what you need to see more data, e.g. in a browser, Word document, etc.
  • It only got worse if you don’t use the browser maximised.

I mean I don’t know what aspect of “the browser will display a scroll bar if there’s more data than will fit in the window, without needing Javascript and iframes” they didn’t understand.

I suppose the navigation will always be shown, but I think people are used to navigation scrolling with the content these days. And if the objective was to keep the navigation on the screen, one could have used a normal frame, which would have required no Javascript, and still resulted in only one scrollbar, in the place where the user expects it.

The whole “scrollbar within a scrollbar” concept (which unfortunately is pretty much mandatory if you have a text area within a webpage) is really so nasty.