Code generation? Don’t generate to Java

I tried to write a program using Java; it all seemed to be going well but then I hit a ridiculous limit. Java cannot be used for this type of problem. I have now completely re-written it in a different programming language, and that works fine.

Be aware of this limit. I was unaware of it when I started this project. But it makes Java completely unsuitable for a whole class of problem.

From time to time my customer supplies me with a config file which specifies a certain algorithm. When the user enters data, this algorithm must be applied. The algorithm is complex, so performance is an issue.

The solution I chose was to generate code to execute the algorithm, based on the information in the config file. This is a valid computer-science approach, and is used for similar problems. For example, language parsers are often expressed as a grammar, and code to parse documents in the grammar is generated. JSPs are turned into Java classes which are then compiled and executed. WebTek pre-compiles HTML templates containing macros into code which produces the resulting HTML when executed.

However, don’t try this in Java, unless you are only working with small problems. A single method in Java can only be 64KB in size, once compiled.
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4262078

This means JSPs can only be of a certain length, and parsers can only parse languages of a certain complexity. If WebTek were written in Java, templates could only be of a certain length and complexity, and so on. Do you want to place such restrictions on the software you produce?

My specific problem involves simulating one million variations to a particular solution. How can I fit that into 64K?

  • That is 0.06 bytes per solution variation; yet the simulation of a single variation involves many lines of code (i.e. in total compiling to more than 0.06 bytes!).
  • I could put each variation into its own method, and have a big method which calls them all—but a method call takes more than 0.06 bytes!
  • I could have a hierarchy of methods: one main method which calls, say, 100 sub-methods, each of which calls 100 sub-sub-methods, which finally call the methods for the individual variations.

It’s not even possible to know in advance how many bytes a method will compile to! So, as the complexity of the simulation of a variation is expressed in the config file, I would essentially have to use a “trial and error” approach: generate a method, compile it, and if I get the error concerning the 64KB limit, split the problem up into slightly smaller methods, try the compilation again, repeat, etc. (And the Java compiler is not even very fast.)
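For illustration, here is a minimal sketch of what generating such a hierarchy of methods might look like. It is not the generator from this project: the class name, the dispatch_ method names and the fan-out value are all made up for the example, and the fan-out is exactly the number one would have to find by trial and error, since the compiled size of each method is not known in advance.

```java
import java.util.ArrayList;
import java.util.List;

public class DispatchTreeGenerator {

    /**
     * Appends Java source for dispatcher methods to 'out'. Each dispatcher
     * calls at most 'fanOut' methods from the level below; levels are added
     * until a single root method remains. Returns the root method's name.
     * (Hypothetical helper, for illustration only.)
     */
    public static String emitDispatchers(List<String> leafMethods, int fanOut, StringBuilder out) {
        if (leafMethods.isEmpty() || fanOut < 2) {
            throw new IllegalArgumentException("need at least one leaf and a fan-out of 2 or more");
        }
        List<String> level = leafMethods;
        int depth = 0;
        while (level.size() > 1) {
            List<String> parents = new ArrayList<String>();
            for (int start = 0; start < level.size(); start += fanOut) {
                String name = "dispatch_" + depth + "_" + parents.size();
                int end = Math.min(start + fanOut, level.size());
                out.append("static void ").append(name).append("() {\n");
                for (String callee : level.subList(start, end)) {
                    out.append("    ").append(callee).append("();\n");
                }
                out.append("}\n");
                parents.add(name);
            }
            level = parents;   // the dispatchers just written become the next level's callees
            depth++;
        }
        return level.get(0);
    }

    public static void main(String[] args) {
        // Ten leaf methods stand in for the one million variations.
        List<String> leaves = new ArrayList<String>();
        for (int i = 0; i < 10; i++) {
            leaves.add("variation" + i);
        }
        StringBuilder source = new StringBuilder();
        String root = emitDispatchers(leaves, 3, source);
        System.out.println(source);
        System.out.println("// entry point: " + root + "()");
    }
}
```

Even with this in place, nothing guarantees that a leaf method or a dispatcher stays under the limit, and the class file format also caps the number of methods per class, so a real generator would have to split its output across classes as well.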

This is all so wrong! This is complexity, which isn’t solving the customer’s problem. This complexity costs me time (and thus my customer money), complexity leads to bugs and difficulty of maintenance, etc.

So I have changed the language. Rather than generate Java, I generate C and compile it using the GNU gcc compiler. From the GNU coding standards:

Avoid arbitrary limits on the length or number of any data structure, including file names, lines, files, and symbols, by allocating all data structures dynamically.

This is a good standard! I like it. All programs should be written with this in mind. Your program may well be online in 10 or 20 years, and the hardware may well have changed: a 64KB limit may seem reasonable one year but is a real limitation 10 or 20 years later in software which would otherwise still be useful.

So, if you are solving this type of problem, don’t use Java.

P.S. On a separate project I used a similar approach using Perl, and that worked out fine too.

Learnt two new smileys today

Thanks to Nessus!

\o/ – arms raised in the air – success gesture
<3  – heart

PHP infinite recursion

What can I say? How about “toy language”?

$ php -r 'function foo() { foo(); } foo();'
Segmentation fault

I’m not saying that infinite recursion is a good idea, but during development it can happen by accident, and I don’t expect such a simple error to crash the PHP interpreter! (Also it took me about 20 minutes to debug this problem, as I had no idea where it happened, nor indeed what the problem was.)

PHP 5.2.6 on Linux 2.6.26 Debian

“Go” programming language

Before everyone gets too excited about the “Go” language, let’s not forget it lacks:

Generics are necessary to express concepts like List<List<MyType>>. If you are into static typing, then you need to be able to express those concepts. (And if you are not, then a statically typed language like Go is not for you anyway.)
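To make the List<List<MyType>> point concrete, here is a small sketch in Java, a language which does have generics. MyType is only a placeholder class, and the method and class names are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

public class NestedGenerics {

    /** Placeholder element type, standing in for the MyType mentioned in the text. */
    public static final class MyType {
        final String name;
        public MyType(String name) { this.name = name; }
    }

    /** The signature alone states what the nested lists contain. */
    public static List<String> allNames(List<List<MyType>> groups) {
        List<String> names = new ArrayList<String>();
        for (List<MyType> group : groups) {
            for (MyType item : group) {
                names.add(item.name);   // no cast: the compiler knows item is a MyType
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<List<MyType>> groups = new ArrayList<List<MyType>>();
        List<MyType> first = new ArrayList<MyType>();
        first.add(new MyType("a"));
        first.add(new MyType("b"));
        groups.add(first);
        // groups.add(new ArrayList<String>());  // rejected by the compiler
        System.out.println(allNames(groups));
    }
}
```

Without generics the same model has to be written with untyped containers and casts, which is the loss of information meant here.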

Obviously it has good features as well, such as its concurrency constructs and type inference, but without those features above, I can’t see how you can do any useful modeling in the language; or rather how you can express a useful model in the language without losing a lot of information.

http://golang.org/doc/go_lang_faq.html

Perhaps those features will be added in future versions.

3 changes price from £10 to £1250 then charge me to cancel

Where do I begin to describe how much I hate three (mobile operator UK) today?

(I had a mobile internet subscription / device from them.)

Firstly – and this is not new – the program they supply with the computer to show you how many MB you’ve used (and thus how much you’ve got left before they charge you) is “inaccurate” (= underestimates the amount you’ve used); some support person from three told me that. But this program has a massive window that comes up, with the 3 logo plastered all over it – I mean it’s not exactly obvious that this program is inaccurate. Why don’t they just put a huge warning up saying “this data is inaccurate”?

The product is something like 1000 MB surfing for £10 a month. The main point, for me, being that you can use it in certain foreign countries, including Austria, and the price is the same as if you used it in the UK. This is why I bought it – so I could use it from both the UK and Austria. This was called “roam like home”.

Well – that’s no longer the case. I noticed a bill (thankfully only £15), called them, tried to cancel, they told me I’d have to pay a cancellation fee as I’m still in the contract period. I mean this is so totally outrageous.

So I spent 45 minutes on the phone, and failed to get them to drop the early cancellation fee (which admittedly is only 20 pounds). They say they sent me an SMS informing me of the changes, giving me 30 days in which to cancel. As I hadn’t cancelled within that time, I had to pay the fee.

However, a UMTS modem for a computer, which I only use very rarely, simply isn’t switched on when it’s not in use. (Normal SMS expire if they aren’t delivered after a certain period of time (10 days?) but they assured me this was a “special” SMS which didn’t expire. And I have often had problems with SMS when roaming, but they assured me this couldn’t have been the case.) Even if that had been the case, if the device is off for those 30 days, I wouldn’t receive the SMS in time. And anyway, I didn’t receive the SMS at all (but they claim I “must have done”).

And they also said they posted it on their website – nice one, as if I surf to three.co.uk every day to hang out – it’s the new Facebook, so much fun to be had there….

They also say they are not obliged to provide this service, it’s not in the terms and conditions. I knew they’d say this. But there were massive advertisements for this feature when I went into the shop and bought it – I mean many huge person-size signs advertising the service. The guy in the shop told me this was the product, he didn’t say “this is the product until further notice”. I even have an email from 3 confirming this is the case, again with no mention of this being something which could expire or change:

Austria is a 3 like home network. If you use data in this country then it’ll be first deducted from your allowance provided you are latched unto 3 network.

Normally out of allowance charges in Austria is £3.00 per MB however, as Austria is a 3 like home country if you exceed your allowance and are roaming to 3 network, you’ll be charged 10 pence per MB.

I mean I understand services can change over their lifetime, and prices can change, etc. But I mean this is no minor service change:

  • Before, when I bought it, I could use 1000 MB in Austria for £10
  • Now, 1 MB in Austria costs £1.25, so 1000 MB would be £1,250

So I spoke to them, to all their “managers” etc, on the phone, and they told me there’s nothing they can do. I didn’t cancel it in time; they understand my frustration but their hands are tied.

They’ve fundamentally changed the service – making it useless to me – and now charge me for cancelling it

Thankfully it didn’t cost me much - £15 in extra MB used, and £20 cancellation fee. But that’s mainly due to the wisdom I acquired by being charged €2500 by another mobile operator. This guy got charged £500 due to this change by 3, for example.

I hate mobile operators. Three is now added to my list of companies I’ll never buy anything from again, the other being T-Mobile.

I mean this is capitalism – the operators are happy when they can charge people a few thousand Euros because they didn’t understand they were roaming, or didn’t understand the T&C had changed. As a consumer one is powerless. Capitalism shouldn’t have come to this.

Java: Always explicitly specify which XML parser to use

There is the following design error in Java (at least in Servlets):

  1. A server may serve multiple applications; each application may use different libraries or even different versions of the same library, “side by side”.
  2. XML parsers, transformers (XSLT), etc., have a standard interface, and there may be different implementations of this interface from different vendors, open-source projects, etc.
  3. Which XML parser, transformer etc. is actually used depends on a global system variable.

And it’s point 3 that’s the problem really. Points 1 and 2 are debatable: they certainly bring advantages, but they certainly bring complexity too.

I just had the problem that one of my web applications stopped working, but only intermittently. Restarting the server led to everything being OK, but later things would not be OK. I do hate environments where everything appears to work, yet in fact doesn’t. I mean how do you know when you’re “done” in such an environment? (Or how do you even know you are in such an environment?)

The bug was caused by:

  1. Application one used the default XML parser, and didn’t have any extra JARs (libraries) for reading XML
  2. Application two required a special XML parser, set the global variable so it would be used, and included the JARs necessary for the special XML parser

So when a request came to application 1, after a request had come to application 2, then the system would try to instantiate the special XML parser within application 1 (specified in the global variable set by application 2), but wouldn’t find it, as it wasn’t deployed in application 1 (and applications can’t use one another’s libraries, due to feature #1).

This seems obvious when one describes it, but looking at the logs, on a live server, with the system down and the clock ticking? – Far from obvious.

So now, I assert, every time you want to create an XML parser, do the following:

If you require a special XML library, use:

System.setProperty("javax.xml.parsers.DocumentBuilderFactory",
    "org.apache.xerces.jaxp.DocumentBuilderFactoryImpl");
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
...

If you require the standard XML library, use:

// Remove any override another application may have set, so the default parser is used.
// System.getProperties() returns the live property set, so removing the key is enough.
System.clearProperty("javax.xml.parsers.DocumentBuilderFactory");
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
...

There is also the possibility to pass a parameter to DocumentBuilderFactory to specify which XML parser technology to use. That’s a good option too, as it wouldn’t “corrupt” this global variable (“system property”). However I think one should be defensive, and always delete the global variable if one wishes to use the standard XML parser, and therefore it doesn’t matter if this global variable gets corrupted or not.
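A rough sketch of that parameter-passing option, under some assumptions: it uses the two-argument DocumentBuilderFactory.newInstance(String, ClassLoader), which exists from Java 6 onwards, and it names the parser bundled with Sun/Oracle and OpenJDK builds, whose class name may differ on other JVMs. The class ExplicitParserFactory and the bogus com.example.MissingFactory value are invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class ExplicitParserFactory {

    // Assumption: the JDK's built-in implementation on Sun/Oracle and OpenJDK builds.
    public static final String JDK_FACTORY =
            "com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl";

    /** Names the implementation directly, so the global system property is never consulted. */
    public static DocumentBuilderFactory newJdkFactory() {
        // A null class loader means the current thread's context class loader is used.
        return DocumentBuilderFactory.newInstance(JDK_FACTORY, null);
    }

    public static void main(String[] args) throws Exception {
        // Simulate another application pointing the global property at a parser
        // that is not deployed in this application.
        System.setProperty("javax.xml.parsers.DocumentBuilderFactory",
                "com.example.MissingFactory");

        DocumentBuilder builder = newJdkFactory().newDocumentBuilder();
        Document doc = builder.parse(
                new ByteArrayInputStream("<a><b/></a>".getBytes(StandardCharsets.UTF_8)));
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```

The same call with "org.apache.xerces.jaxp.DocumentBuilderFactoryImpl" would select the special parser, provided its JARs are deployed with the application. Unlike the snippets above, this neither reads nor writes the shared system property, so it cannot break another application on the same server.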

Never do the following:

DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();

This simply relies on whichever XML parser is currently set in the global variable. You have no way to guarantee that some other application running on the same server won’t set the global variable to use an XML parser you don’t have installed in your application. Even if you have control of the server and all applications, you don’t know what software you’ll be writing in the future. (In this case I installed a new application to a server which’d been running fine for 1 year, but due to setting the global variable, the old application broke.)

The same applies for all those other “factory” situations such as TransformerFactory.newInstance() etc.

This all feels quite inelegant to me, and has just cost me a lot of time, and it’s not as if I’m so new to programming Java. I am wondering if there is a better way to approach it? Or is Java just broken in this particular respect?

P.S. This is not the only thing that went wrong with the old application today. I upgraded from Java 5 to Java 6 and suddenly some XML was not compliant against a schema according to Java – I had hit this error.

ssh to easyname account

We have just released a feature on www.easyname.eu where I work – it’s an automated domain-registration and web-hosting service – that you can “ssh” (log in) to your easyname filespace account. I think that’s cool – I don’t know of many web hosting companies where that feature is offered!

(I didn’t program the feature – neither the front-end nor the back-end – but I just released it, so I would have been responsible for any problems – although there weren’t any….)

Never close PHP class files with the “?>” tag

When developing PHP, a front-end PHP file will include other files: classes, utilities, etc.

When writing those class files, one also needs to use the <?php tag at the start of the file, otherwise PHP will simply take the text and output it unchanged to the browser. (PHP’s assumption is that it sits in a web page, with probably more markup than code, so by default characters in the source code get copied one-to-one to the browser and the <?php?> tags are necessary to introduce PHP to the “exceptional circumstance” that one might actually want to program some PHP.)

If one must open the class source file with <?php then it would seem to make aesthetic sense to close it with ?>. However, there are no negative side-effects if one does not close the tag, plus one very negative side-effect if one does close it.

We performed a minor release a while ago, after which the display of generated PDF files no longer worked. Yet the minor release had nothing to do with the section of code that produced PDFs. What sort of weird action-at-a-distance could possibly be happening here?

The reason was that one class file in the minor release had a blank line after the ?> tag. This was impossible to spot in the text editor. The blank line was printed to the browser, which was also invisible in nearly all of the site, as HTML ignores blank lines. PDFs probably do as well (I haven’t checked) but the problem wasn’t with the content. As HTTP response content is streamed to the browser (as opposed to being collected first and then sent to the browser at the end of the request), HTTP headers can only be set before the first byte of output has been produced by the software. As the blank line in the class source file constituted content, and the source file was (necessarily) parsed before the code could be executed, the HTTP header “Content-Type: application/pdf” couldn’t be sent, and various errors about headers not being able to be sent, combined with the binary source of the PDF, arrived at the user’s screen.

So given there are no disadvantages, and one particularly weird source of bugs can be removed, I would say one should never end PHP files with ?>.

Wasted 3 hours of my life

Working at Easyname, I heard from our support department that strange things were happening with PayPal payments – either they seemed not to be appearing on our site, or there were errors in the logfiles, and the users were complaining.

This is obviously a “drop everything” kind of bug – no matter the fact I was deep in the middle of doing some other software development (or even if I’d been working at another customer), if stuff to do with money doesn’t work, everything else must be stopped and that must be looked at with the highest priority.

What I did:

  • Look at the logfiles, there was an “unknown error” there (not very helpful!)
  • Set up a developer account for PayPal, get our software on my dev system to work with it
  • Test the PayPal IPN protocol using their tools and my newly set-up development environment
  • Change the software to produce more logging (rather than “unknown error” – although to my knowledge, that log had never been printed before, in the year or so the system was online)
  • Release the version with more debugging online
  • Communicate with various people I’m working with (management, support, ops, ..)
  • Test the new version by making a payment
  • See that it still didn’t work

What I should have done:

  • Do a search using twitter.
bluecrowbar I’m seeing problems with PayPal IPN (Instant Payment Notification). The notification is currently not instant. 13 minutes ago from Twitterrific
mystore Paypal IPN payments recived, they were just 4 hours late ;) 27 minutes ago from web
mystore @staleedstrom Check @chomasia ‘s This page has details of the IPN issue with PayPal http://bit.ly/3yJ1w about 3 hours ago from web
tonydalian2009 Paypal IPN doesnt work by now. sigh~~~ about 4 hours ago from web
travelfish What part of the word “Instant” in IPN doesn’t payPal understand? about 4 hours ago from TwitterFox

What can I say? Looking at logfiles is such a legacy approach to bug solving.

Starting Jetty: FAILED

It is literally 23:19 on a Sunday and I’ve been working through the weekend to get a release out of some software I’m working on.

The Java application webserver (Jetty) was taking a long time to restart each time I made a change, so for some reason I thought I’d experiment with some new command-line options. Probably not the right time to do that.

Normally I would type

$ sudo  /etc/init.d/jetty6 restart
Stopping Jetty: OK
Starting Jetty: OK

and everything would be good. I tried typing

$ sudo /etc/init.d/jetty6 supervise

Then some stuff happened that I didn’t really understand. Rather than try and work out what it did I tried to restart it again using the old restart mechanism

$ sudo  /etc/init.d/jetty6 restart
...
Starting Jetty: FAILED

OK I mean that was basically what was going on, it just wrote FAILED. How helpful! There was no info in the logfile. I searched Google but didn’t come up with anything.

A reboot later, and about half an hour of looking into /etc/init.d/jetty6 with vi and randomly making changes and printing more stuff out yielded the fact that the “supervise” command had evidently run Jetty “as me” and not as the “jetty” user. So when the normal “restart” command came along and tried to run the program as “jetty” then there were files it couldn’t write to.

Solution:

$ sudo chown jetty /var/log/jetty6/2009_07_12.stderrout.log