Never close PHP class files with the “?>” tag

When developing PHP, a front-end PHP file will include other files: classes, utilities, etc.

When writing those class files, one also needs to use the <?php tag at the start of the file, otherwise PHP will simply take the text and output it unchanged to the browser. (PHP’s assumption is that it sits in a web page, with probably more markup than code, so by default characters in the source code get copied one-to-one to the browser and the <?php?> tags are necessary to introduce PHP to the “exceptional circumstance” that one might actually want to program some PHP.)

If one must open the class source file with <?php then it would seem to make aesthetic sense to close it with ?>. However, there are no negative side-effects if one does not close the tag, plus one very negative side-effect if one does close it.

We performed a minor release a while ago, after which the display of generated PDF files no longer worked. Yet the minor release had nothing to do with the section of code that produced PDFs. What sort of weird action-at-a-distance could possibly be happening here?

The reason was that one class file in the minor release had a blank line after the ?> tag. This was impossible to spot in the text editor. The blank line was printed to the browser, which was also invisible in nearly all of the site, as HTML ignores blank lines. PDFs probably do as well (I haven’t checked) but the problem wasn’t with the content.

As HTTP response content is streamed to the browser (as opposed to being collected first and then sent at the end of the request), HTTP headers can only be set before the first byte of output has been produced. As the blank line in the class source file constituted content, and the source file was (necessarily) loaded before the code that used it could be executed, the HTTP header “Content-Type: application/pdf” couldn’t be sent. Instead, various errors about headers that could not be sent, combined with the binary source of the PDF, arrived on the user’s screen.
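The mechanism can be modelled in a few lines. This is only a toy sketch in Python, not PHP and not any real server’s API: StreamedResponse, include_class_file and serve_pdf are made-up names, and the only behaviour modelled is “headers go out with the first byte of output”.

```python
class HeadersAlreadySent(Exception):
    """Raised when a header is set after body output has started."""


class StreamedResponse:
    """Toy model of a response whose body is streamed as it is produced."""

    def __init__(self):
        self.headers = {"Content-Type": "text/html"}  # default for a web page
        self.wire = []              # everything sent to the browser, in order
        self.headers_sent = False

    def set_header(self, name, value):
        if self.headers_sent:
            raise HeadersAlreadySent(f"cannot set {name}: output already started")
        self.headers[name] = value

    def write(self, data):
        if not data:
            return
        # The first byte of body output forces the headers out first.
        if not self.headers_sent:
            self.wire.append(dict(self.headers))
            self.headers_sent = True
        self.wire.append(data)


def include_class_file(response, trailing_output=""):
    """Stand-in for including a class file: text after '?>' becomes output."""
    response.write(trailing_output)


def serve_pdf(response, trailing_output):
    """Include a class file, then try to send a PDF (hypothetical flow)."""
    include_class_file(response, trailing_output)
    try:
        response.set_header("Content-Type", "application/pdf")
        outcome = "ok"
    except HeadersAlreadySent:
        outcome = "headers already sent"
    response.write(b"%PDF-1.4 ...")
    return outcome
```

Calling serve_pdf(StreamedResponse(), "") sets the header normally, whereas serve_pdf(StreamedResponse(), "\n") hits the “headers already sent” branch, because the single newline has already pushed the default headers out.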

So given there are no disadvantages, and one particularly weird source of bugs can be removed, I would say one should never end PHP files with ?>.

Wasted 3 hours of my life

Working at Easyname, I heard from our support department that strange things were happening with PayPal payments – either they seemed not to be appearing on our site, or there were errors in the logfiles, and the users were complaining.

This is obviously a “drop everything” kind of bug – no matter the fact I was deep in the middle of doing some other software development (or even if I’d been working at another customer), if stuff to do with money doesn’t work, everything else must be stopped and that must be looked at with the highest priority.

What I did:

  • Look at the logfiles, there was an “unknown error” there (not very helpful!)
  • Set up a developer account for PayPal, get our software on my dev system to work with it
  • Test the PayPal IPN protocol using their tools and my newly set-up development environment
  • Change the software to produce more logging (rather than “unknown error” – although to my knowledge, that log had never been printed before, in the year or so the system was online)
  • Release the version with more debugging online
  • Communicate with various people I’m working with (management, support, ops, ..)
  • Test the new version by making a payment
  • See that it still didn’t work

What I should have done:

  • Do a search using Twitter.
bluecrowbar I’m seeing problems with PayPal IPN (Instant Payment Notification). The notification is currently not instant. 13 minutes ago from Twitterrific
mystore Paypal IPN payments recived, they were just 4 hours late ;) 27 minutes ago from web
mystore @staleedstrom Check @chomasia ’s This page has details of the IPN issue with PayPal http://bit.ly/3yJ1w about 3 hours ago from web
tonydalian2009 Paypal IPN doesnt work by now. sigh~~~ about 4 hours ago from web
travelfish What part of the word “Instant” in IPN doesn’t payPal understand? about 4 hours ago from TwitterFox

What can I say? Looking at logfiles is such a legacy approach to bug solving.

Starting Jetty: FAILED

It is literally 23:19 on a Sunday and I’ve been working through the weekend to get a release out of some software I’m working on.

The Java application webserver (Jetty) was taking a long time to restart each time I made a change, so for some reason I thought I’d experiment with some new command-line options. Probably not the right time to do that.

Normally I would type

$ sudo /etc/init.d/jetty6 restart
Stopping Jetty: OK
Starting Jetty: OK

and everything would be good. I tried typing

$ sudo /etc/init.d/jetty6 supervise

Then some stuff happened that I didn’t really understand. Rather than try and work out what it did I tried to restart it again using the old restart mechanism

$ sudo /etc/init.d/jetty6 restart
...
Starting Jetty: FAILED

OK, that was basically all that happened: it just wrote FAILED. How helpful! There was no info in the logfile. I searched Google but didn’t come up with anything.

One reboot and about half an hour later, after looking into /etc/init.d/jetty6 with vi, randomly making changes and printing more stuff out, I found that the “supervise” command had evidently run Jetty “as me” and not as the “jetty” user, leaving behind a log file that “jetty” didn’t own. So when the normal “restart” command came along and tried to run the program as “jetty”, there were files it couldn’t write to.

Solution:

$ sudo chown jetty /var/log/jetty6/2009_07_12.stderrout.log

My favourite Hibernate error

… is this one. I’ve wasted many an hour searching for the cause of this. And it’s one you’re likely to run into pretty quickly when you try to write your first Hibernate configuration file.

The XML

<one-to-many type="OtherClass"/>

delivers the error

Error parsing XML: Attribute "type" must be declared for element type "one-to-many".

This looks like a perfectly self-explanatory error. However, looking at the file, the element does have a “type” attribute. What should one do?

Thinking about it, I only just introduced the “type” attribute to the <one-to-many> element in my config. What happens if I change the attribute name to “fsdjkfdk”?

<one-to-many fsdjkfdk="OtherClass"/>

The error is now:

Error parsing XML: Attribute "fsdjkfdk" must be declared for element type "one-to-many".

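What the error actually means is that the attribute is not declared in the DTD for that element, so it must not be present. It does not mean that a required attribute is missing. (In the Hibernate mapping DTD, <one-to-many> takes a “class” attribute, not “type”.)

It’s amusing to read that even people on the Hibernate team get confused by this error, and can’t find a solution.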

(Hibernate 3.3.1 – the most current version – although I encountered this error within the first hour of ever using Hibernate in Q1/2006.)

Optical device

I saw a cool device today. One of my customers manages data centers and they have a new data center and need to connect it to their old ones, so they’ve bought/rented some “dark fiber” between their data centers and have fiber multiplexers at each end so that lots of fiber-optical devices at each end can all talk to one another over this single dark fiber. (Or something like that – I don’t know too much about networking equipment!)

So basically this multiplexer looks like any other switch thing, a long horizontal device with a bunch of identical ports on the back, each with a number etc, and one “master” port which connects to the dark fiber. (only the optical sockets look slightly different from e.g. RJ45 sockets..)

But then I noticed that this device only had the identical data ports and the master port on the back. I couldn’t find a power socket; I looked at all of its sides, and was confused for a while. Then I worked out why I couldn’t see the power socket: the device was purely optical and didn’t need power!

Imagine that: in a data center, among a bunch of hard-core networking and computing equipment, the last thing one expects is a device that doesn’t need power!

Order of function parameters

Functions take parameters. What order should these parameters be in?

Perhaps a bit of a ridiculous question, given that it clearly doesn’t matter.

void writeUser(Connection c, User u) { ... }
void writeUser(User u, Connection c) { ... }

If one writes either of these functions, they will both work, and any performance difference between the two would be an extreme micro-optimization which wouldn’t be platform-independent (e.g. on RISC OS the first few parameters to a C function were passed in registers and the rest went on the stack, unless one used a “pass by value” structure, so it might have an impact on performance… but I digress).

However, certain languages support partial application, where passing only some of the parameters to a function produces a function taking only the remaining parameters. For example, in SML (syntax examples; BTW, a sad sign of the times when Googling for SML yields “Did you mean: XML”):

fun add a b = a + b;     (* curried: takes its parameters one at a time *)
add 1 2;                 (* evaluates to 3 *)

val successor = add 1;   (* generates function taking 1 parameter *)
successor 2;             (* evaluates to 3 *)

So in those languages, it makes sense to place the most constant parameters as far to the left as possible. (The syntax only allows the left-most parameters to be partially applied.)

For example a function to write a user to a database connection might be defined as taking two parameters, a database connection and the user to be written, and partial application could be used as follows:

Definition          Result of partial application
writeUser db user   Function which writes any user to a fixed database connection
writeUser user db   Function which writes the particular user to any database connection

Clearly the first partial-application function is much more useful than the second.
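In a language without built-in currying the same idea can still be tried out. Here is a small sketch using Python’s functools.partial; write_user and the “main-db” string are made-up stand-ins for a real connection and a real write, nothing more.

```python
from functools import partial


def write_user(db, user):
    """Pretend to write a user; returns a description instead of doing I/O."""
    return f"wrote {user} to {db}"


# Fix the left-most (most constant) parameter: one connection, any user.
write_to_main = partial(write_user, "main-db")

print(write_to_main("alice"))
print(write_to_main("bob"))
```

Python can also bind parameters by keyword, so there the left-most rule is a convention rather than something the syntax enforces, but putting the connection first still gives the more useful partial function.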

Even in languages which don’t support partial application I’ve noticed a convention of functions being defined this way, probably because at least some (of the best?) code is written by people with experience of functional programming languages. I like this convention, and adhere to it myself.

MySQL “lock tables” does an implicit commit

The MySQL “lock tables” (and “unlock tables”) command has a nasty side-effect: it implicitly commits the current transaction.

This caused a bug in production code. A normally irrelevant temporary error, which should have caused a complete rollback, only rolled back to the last “unlock tables” command due to its implicit commit, and thus left the database in an inconsistent state. When the request was retried, the software found the database in a state it wasn’t expecting and couldn’t process the request, even though the temporary error had by then gone away. So the request got retried indefinitely, and a lot of orphan invoices were created in the database.

That wouldn’t have been a problem in itself, but these unused invoices consumed invoice numbers, and Austrian law dictates that invoice numbers must be sequential, i.e. the numbering system shouldn’t have huge holes in it. Or, as in our case, a number of huge holes: the system was up and running for other requests during this bug (a feature, that one faulty request shouldn’t take the whole system down), so other invoices did get correctly generated at the time, and their invoice numbers were then small islands in the sea of unused/invalid invoice numbers produced by the bug.

I wish this had been more obviously documented (MySQL 5.1 lock tables documentation). It is there in the bullet points (and the comments!), but it’s hardly well emphasized, and I missed it.

The trouble is that the “lock tables” command comes from the stone-age of MySQL before it supported transactions at all.

This has been “fixed” in MySQL 6.0 (MySQL 6.0 changelog), via the introduction of the new options “in share mode” / “in exclusive mode”.

In the meantime, if you need to lock a table (which one needs to do in order to ensure table-wide sequential numbering, as a primary key isn’t designed for that) I would recommend creating a table with a name like “mutex” and creating a row for each type of table-lock you might like to acquire, and then “select for update” the row corresponding to your table. This creates an in-transaction row-level lock, doesn’t do an implicit commit, and the lock will be automatically released when the transaction ends.
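A minimal sketch of that recommendation, under several assumptions that are mine and not taken from the production code: a Python DB-API cursor from a MySQL driver that uses %s placeholders, an InnoDB table (row locks need a transactional engine), and a hypothetical “mutex” table with a single “name” column holding one row per lock.

```python
def acquire_table_lock(cursor, lock_name):
    """Take the row lock named lock_name inside the current transaction.

    The table and column names ("mutex", "name") are hypothetical. The
    caller must already be inside a transaction; the lock is released
    when that transaction commits or rolls back.
    """
    # Blocks until any other transaction holding this row has ended.
    cursor.execute(
        "SELECT name FROM mutex WHERE name = %s FOR UPDATE",
        (lock_name,),
    )
    if cursor.fetchone() is None:
        # No such row means nothing was locked, so fail loudly.
        raise LookupError(f"no mutex row named {lock_name!r}")
```

The idea would be to call this first thing in the transaction that allocates an invoice number, so that two requests can never work out the next number at the same time.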

Java really delivers “write once, run anywhere”

Java’s slogan was “write once, run anywhere”. They received a certain amount of criticism but I have to say that compared to other programming languages it’s really true. You can use it for:

  • Background jobs (without user-interface)
  • Server-side web applications (many web servers & web frameworks available)
  • In the web browser (applets)
  • In the web browser (translation from Java to JavaScript by GWT)
  • On the desktop (using platform-independent Swing, or with native Apple UI using Cocoa)
  • On some mobile phones (J2ME, Google Android), although I haven’t tested how well that really works

Writing the previous version of a certain application website in Perl, there was no easy way to give the customer a “tool” to test out new versions of the configuration file. These files would normally be installed on the server, were multiple megabytes in size, and the Perl would parse and use them. For testing, it was not ideal to have to upload potential new files to the test server, due to their size.

The new version is in Java and also takes a configuration file, but I have written a Swing (desktop) tool which simply allows the tester to select a new potential configuration file from their local hard disk, and the desktop tool reuses all the processing logic 1:1 that the web server in production would use.

That wouldn’t have been possible with the old version of the logic written in Perl. (I know there are windowing libraries for languages like Perl but they are hardly as easy to deploy – i.e. install on a Windows workstation – as a Java application – simply double-click the .jar file once Java is installed)

I am writing the web front-end for the new version in GWT so I can reuse certain (mainly user validation) code between the web browser (giving the user instant feedback in case of errors) and the web server (necessary for security in case someone bypasses the client and sends HTTP requests directly). I can also simply pass Java objects between the web server and the web client, without having to worry about how that transfer works (JSON, XML, etc.).

Other mainstream candidates for languages which run in multiple places:

  • Objective-C is not too bad, running on Apple desktops, Apple iPhones, and on the web server via WebObjects (does the current version of WO still use Objective-C or is it Java-only these days?) – but not in the browser
  • JavaScript runs in the web browser and on the server, and no doubt in mobile browsers (but not on desktops as far as I know)
  • Perhaps C#? Certainly good desktop, web server integration, no doubt IE/Windows integration via ActiveX

www.aaa-plus.com

Thanks to the Internet Archive I found one of the first (or one of the only?) websites I designed, i.e. I did the graphics, did the “implementation” using a tool which produced HTML, etc. (Although the company logo etc. already existed, and I didn’t write the texts.)

This was for AAA+ which was the first company I worked for in my life, which was also the company I worked for just after coming to Vienna, so has a lot of associations of newness in my life.

And I think it doesn’t look too bad :) www.aaa-plus.com (Internet Archive)

Per-CPU performance statistics are useless

Windows, Linux and OS X offer the ability to view the utilization of each CPU/core in the system. This is completely useless. On all these operating systems, tasks get switched from one core to another on a regular basis. (I don’t know why this happens, but I suppose there is no reason for it not to happen.)

Here is my CPU-bound single-threaded program running on a dual-core computer.

I suppose all one can really say is that if one has N cores and the average CPU% usage (over all cores) is approximately 100/N then probably one is running a program which can’t take advantage of multiple cores.

I would rather replace the current “CPU usage history, per core” multiple graphs with:

  • One graph, showing a history of the average over all CPUs (visually the same as if one had a 1-core CPU).
  • I would then add horizontal marker lines: if one had 4 cores, I would add 4 equally spaced marker lines. This would show that if the usage reached a marker line (e.g. 25% for the first line) then one is probably running the equivalent of 1 single-threaded program.

I mean it’s not a brilliant solution but I reckon it would be more meaningful than the way the information is currently displayed.
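The arithmetic behind the marker lines is simple enough to sketch. The function names below are my own, and the input is assumed to be the average usage over all cores as a percentage.

```python
def marker_lines(cores):
    """Average-usage percentages at which 1, 2, ... cores are fully busy."""
    return [100.0 * k / cores for k in range(1, cores + 1)]


def busy_core_equivalents(average_percent, cores):
    """How many fully busy cores a given average usage corresponds to."""
    return average_percent * cores / 100.0


print(marker_lines(4))               # [25.0, 50.0, 75.0, 100.0]
print(busy_core_equivalents(50, 2))  # 1.0
```

So on a dual-core machine an average of about 50% corresponds to one busy core, which is just what a CPU-bound single-threaded program produces, however the scheduler moves it between cores.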