Archive for the ‘Coding’ Category

Java 5 enums can be compared with ==

Thursday, September 6th, 2007

Java enum instances are singletons. Sun doesn’t seem to document this clearly (at least I found it difficult to find), but it’s the case.

What this means is that it’s possible to compare enumerated types by identity, which is cool for readability. (And it means that the switch statement works.)

You don’t have to write this:

if (PurchaseState.complete.equals(anItem.getPurchaseState())) { ...

You can write:

if (anItem.getPurchaseState() == PurchaseState.complete) { ...

This is documented here in the “discussion” section.
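Here is a minimal sketch of the idea. It assumes a hypothetical Item class wrapped around the PurchaseState enum from the snippets above; only “complete” appears in the original, so the “pending” constant is made up for illustration. It shows the identity comparison, the switch statement, and that valueOf hands back the very same instance rather than a copy.

```java
// Hypothetical stand-ins modelled on the snippets above; "pending" is an invented constant.
enum PurchaseState { pending, complete }

class Item {
    private final PurchaseState purchaseState;

    Item(PurchaseState purchaseState) { this.purchaseState = purchaseState; }

    PurchaseState getPurchaseState() { return purchaseState; }
}

class EnumIdentityDemo {
    // Identity comparison: fine because each enum constant exists exactly once.
    static boolean isComplete(Item item) {
        return item.getPurchaseState() == PurchaseState.complete;
    }

    // switch on an enum value works on the same basis.
    static String describe(Item item) {
        switch (item.getPurchaseState()) {
            case complete: return "done";
            case pending:  return "waiting";
            default:       return "unknown";
        }
    }

    // Looking a constant up by name returns the same object, not a new one.
    static boolean lookupReturnsSameInstance() {
        return PurchaseState.valueOf("complete") == PurchaseState.complete;
    }

    public static void main(String[] args) {
        Item item = new Item(PurchaseState.complete);
        System.out.println(isComplete(item));
        System.out.println(describe(item));
        System.out.println(lookupReturnsSameInstance());
    }
}
```

The == form also simply yields false if getPurchaseState() returns null, although a switch on a null value would throw a NullPointerException.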

Insert documentation here

Tuesday, August 7th, 2007

Ah I really hate opening code and seeing the following

/**
* Insert class or interface description here.
*/

This is created by the IDE’s helpful “create new class” (and similar) menu options.

I wish people would actually write documentation. Even a single sentence to describe what the class is modeling would be helpful if it’s not obvious from the name. Or object invariants (e.g. boughtCount <= offeredCount).
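As a rough illustration of what I mean, here is a sketch of a class whose comment states what it models and what its invariant is. The class name Offer and its methods are invented for this example; only the boughtCount <= offeredCount invariant comes from the text above.

```java
/**
 * Models a limited offer of which some units have already been bought.
 *
 * <p>Invariant: {@code 0 <= boughtCount <= offeredCount}.
 */
class Offer {
    private final int offeredCount;
    private int boughtCount;

    /** Creates an offer of the given size with nothing bought yet. */
    Offer(int offeredCount) {
        if (offeredCount < 0) {
            throw new IllegalArgumentException("offeredCount must be >= 0");
        }
        this.offeredCount = offeredCount;
    }

    /** Records one purchase; refuses to break the invariant. */
    void buy() {
        if (boughtCount >= offeredCount) {
            throw new IllegalStateException("sold out");
        }
        boughtCount++;
    }

    int getBoughtCount() { return boughtCount; }

    int getOfferedCount() { return offeredCount; }
}
```

Two sentences of comment, and the reader knows what the counters mean and why buy() can throw.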

To find a class without documentation is annoying. But to see such an IDE-generated phrase is a slap in the face!

Copying the contents of one directory into another: not as easy as it looks!

Tuesday, July 24th, 2007

Task: You want to copy the contents of one directory into another existing directory. On Linux.

I.e. if the source directory is “x” and the destination directory is the already-existing directory “y”, if there are files “x/1” and “x/2” then files “y/1” and “y/2” should be created. If “x” is an empty directory then no files in “y” should be created.

Now, this is not as easy as it sounds.

cp -r x y

This command copies “x” into “y” meaning that the resulting files end up being “y/x/1” etc.

cp -r x/* y

This copies files like “x/1” into files “y/1” correctly, but if “x” is an empty directory, you get an error saying that the file “x/*” cannot be found.

Surely this should be easy! I even considered firstly deleting the directory y, and then copying x as y.

rm -rf y
cp -r x y

This is rather inelegant as you have to set the permissions on “y” again if they’re non-standard, and it throws away anything that was already in “y”.

I came up with the following solution.

cd x
cp -r --parents . ../y

This copies the “current directory” and all its children (i.e. all files) into “y”. The “--parents” option recreates the hierarchy leading to the source inside the destination: if you copy “a/b/c” into “d” with it, you get “d/a/b/c” as opposed to the “d/c” you would normally get.

In my case the “hierarchy” is just “.” so it copies e.g. “./1” into “y/./1” i.e. “y/1”.
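For comparison, the same “contents of x into the existing y” behaviour can be sketched in Java. This is not what I used, just an illustration. It assumes java.nio.file (Java 7 and later) plus Files.walk (Java 8), and the class and method names are invented. Unlike cp, it makes no attempt to preserve permissions or to treat symbolic links specially.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.stream.Stream;

class CopyContents {
    /** Copies everything inside sourceDir into targetDir, without creating targetDir/sourceDir. */
    static void copyContents(Path sourceDir, Path targetDir) throws IOException {
        // Files.walk visits sourceDir itself first, then its descendants, parents before children.
        try (Stream<Path> paths = Files.walk(sourceDir)) {
            for (Path source : (Iterable<Path>) paths::iterator) {
                // "x/sub/2" relative to "x" is "sub/2", which is then resolved against "y".
                Path target = targetDir.resolve(sourceDir.relativize(source));
                if (Files.isDirectory(source)) {
                    Files.createDirectories(target); // no-op for directories that already exist
                } else {
                    Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: CopyContents <sourceDir> <existingTargetDir>");
            return;
        }
        copyContents(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

An empty source directory is not a special case here: the walk visits only the directory itself, so nothing is created in the target.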

Java: List<X> or X[] ?

Thursday, July 12th, 2007

Since the creation of Java 1.5, one’s been able to parametrize classes using generics, with a syntax similar to C++ templates.

Before Java 1.5, I would always return simple list data structures as arrays. This was

  • Type-safe (e.g. User[] as opposed to List; with the former one knows what’s in the collection, with the latter one doesn’t)
  • One could find out the length of the collection with array.length (in contrast to C arrays)

But since Java 1.5, one has a choice. One could use the Java collections framework, now supporting generics, or still use arrays.

Perhaps it’s because I don’t like change, but I would still advocate using arrays as opposed to Lists:

  • The generic information is thrown away at compile-time, so a List<X> and List<Y> look the same at run-time, whereas X[] and Y[] do not. Introspection, and getting exceptions at the time of a wrong array cast, and not later, are the benefits here.
  • You can easily create an array declaratively. int[] x = new int[] { 1, 2 }; You can’t do the same with the collections framework.
  • I’m sure arrays are faster.
  • Arrays are also simpler. I think one should, given two solutions to the same problem, nearly always take the simpler unless there’s a clear benefit of the more complex solution (which I don’t see in this case)
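The first point, about generic information being thrown away, can be made concrete with a small sketch. The class and method names are invented for illustration. It shows that a List<String> and a List<Integer> share one run-time class while String[] and Integer[] do not, and that an array rejects a wrong element at the moment it is stored, whereas a list only fails later when the element is read back.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // Both lists are plain ArrayList objects at run-time.
    static boolean sameListClass() {
        List<String> strings = new ArrayList<String>();
        List<Integer> numbers = new ArrayList<Integer>();
        return strings.getClass() == numbers.getClass();
    }

    // Arrays keep their element type at run-time.
    static boolean sameArrayClass() {
        Object strings = new String[0];
        Object numbers = new Integer[0];
        return strings.getClass() == numbers.getClass();
    }

    // Storing the wrong type into an array fails immediately.
    static boolean arrayRejectsWrongElement() {
        Object[] objects = new String[1];
        try {
            objects[0] = Integer.valueOf(1);
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    // Through a raw reference the list accepts the wrong type; the error surfaces on read.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static boolean listFailsOnlyOnRead() {
        List<String> strings = new ArrayList<String>();
        List raw = strings;
        raw.add(Integer.valueOf(1)); // no exception here
        try {
            String first = strings.get(0); // the compiler-inserted cast fails here
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(sameListClass());
        System.out.println(sameArrayClass());
        System.out.println(arrayRejectsWrongElement());
        System.out.println(listFailsOnlyOnRead());
    }
}
```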

Maybe the points about speed and simplicity aren’t really relevant these days. But collections are still the sort of thing which one accesses in inner loops. Consider the ways in which X[] is faster than List<X>:

  • To iterate over the collection, with List<X> you need to create an Iterator. This also happens if you use the “foreach” construct.
  • To get a particular element, or to get the length, you have to call methods, like aList.get(3).

It has been said that using Iterators is preferable to using a for loop over array indexes, for software design reasons. This may be the case in some special situations, but I really don’t think it’s an advantage in common usage.

  • One can iterate over an array or a collection with the Java 1.5 “foreach” construct, so in this case the source code looks the same.
  • The code “for (i=0; i<array.length; i++)”, i.e. non-iterator code, is not really difficult to write or difficult to read.
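To put the three loop forms side by side, here is a small sketch; the class and method names are invented for illustration. The “foreach” source is the same for an array and a List, but for the List it goes through an Iterator (and here also unboxes each Integer), while the indexed loop touches the array directly.

```java
import java.util.ArrayList;
import java.util.List;

class IterationDemo {
    // "foreach" over an array: compiled to a plain indexed loop.
    static int sumArray(int[] values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // "foreach" over a List: same source shape, but an Iterator is created behind the scenes.
    static int sumList(List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // The non-iterator form, written out by hand.
    static int sumArrayIndexed(int[] values) {
        int sum = 0;
        for (int i = 0; i < values.length; i++) {
            sum += values[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] array = new int[] { 1, 2, 3 };
        List<Integer> list = new ArrayList<Integer>();
        list.add(1);
        list.add(2);
        list.add(3);
        System.out.println(sumArray(array));
        System.out.println(sumList(list));
        System.out.println(sumArrayIndexed(array));
    }
}
```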

Hibernate / Boolean Fields / MySQL 5.0

Wednesday, July 4th, 2007

There’s a problem persisting boolean fields using Hibernate 3.2.2 to MySQL 5.0 if you let Hibernate generate your schema and leave it to choose the column types in its default way. It works fine on MySQL 4.1, and it doesn’t matter whether you use boolean (primitive) or Boolean (object) types for the fields.

Take a class such as:

public class MyObject {
   protected boolean myField;
   public boolean getMyField() { return myField; }
   public void setMyField(boolean x) { myField = x; }
}

and a Hibernate mapping such as:

<property name="myField" column="my_field" not-null="true" />

and allow Hibernate to generate the schema on startup, e.g. by writing the following in the “hibernate.cfg.xml” file:

<property name="hbm2ddl.auto">create</property>

Against MySQL 4.1 this all works, and the column has the data type tinyint(1). But in MySQL 5.0 the data type is bit(1) (which seems logical enough) but Hibernate then throws the following unhelpful exception upon every insert:

could not insert: [com.company.MyObject]
org.hibernate.exception.DataException: could not insert: [com.company.MyObject]
at org.hibernate.exception.SQLStateConverter.convert(SQLStateConverter.java:77)
at org.hibernate.exception.JDBCExceptionHelper.convert(JDBCExceptionHelper.java:43)
....
Caused by: java.sql.SQLException: Data too long for column 'my_field' at row 1
at ....

The solution is to change the Hibernate mapping for the field to this:

<property name="myField" not-null="true" >
   <column sql-type="BOOLEAN" not-null="true" name="my_field" />
</property>

Then the field is generated as tinyint(1) and then it all works fine again.

Generate Javadoc HTML only for public members

Monday, July 2nd, 2007

In Java there are four protection levels which members (fields and methods) can have:

  1. Private
  2. Protected
  3. Package-level
  4. Public

Any member can have Javadoc (including private members).

But when one generates the Javadoc, which protection levels should be included?

Generated Javadoc is used by humans. These humans are probably not you. And thus are probably clients of your classes, either within or outside of your organization. It’s possible, although unlikely, that they may be able to access package-level members. It’s possible they may need to subclass your class, although in (nearly) all cases I can conceive of, they won’t do that without looking at your source code.

Javadoc should be simple to understand. There’s simply a lot of potentially documentable stuff going on in a class, which is capable of reducing simplicity: for example, setters which only Hibernate needs to see (private), or which only the factories in your package need to see (package-level).

Javadoc should therefore be generated only for public members. That’s what Sun’s JDK docs do as well (for example you don’t see any protected or private stuff here). An additional benefit for simplicity is that, if this is the only level for which the Javadoc is being generated, it doesn’t even state the protection level in the summary, so you see “int getX()” in the method list as opposed to “public int getX()”.

This can be achieved with the “-public” option to the Javadoc generation program. In Netbeans 5.5, right-click on the project in the “projects” tab, select the menu item “properties”, go to the “documentation” entry under the “build” entry, and enter “-public” in the “additional javadoc options” field.

Making progress with introduction of unit tests to Uboot

Monday, May 14th, 2007

The old uboot code had, amazingly enough, 21k lines of unit tests. But they were not useful unit tests, as one had to run each program individually, and they each had a bunch of (different) prerequisites, such as account_id 3 existing and having an empty inbox, and so on. And with the older tests, their output would be a bunch of print statements (e.g. insert message; print count of messages), and one would have to compare the printed output with the expected results (which weren’t documented anywhere).

I am converting them to PerlUnit (which is a clone of JUnit) so that we can automatically and easily run as many tests as possible before each release. This is an incredibly productive task, as I don’t even need to write new unit tests (and think about testing strategy), I’m just converting the lines to a format enabling them to be convenient to run!

So far 3.6k lines in 86 test functions in 33 test classes :)

$ ./test.pl
...................................................
...................................
Time: 48 wallclock secs ( 8.65 usr  0.56 sys +  0.02 cusr  0.25 csys =  9.48 CPU)

OK (86 tests)

Transferring some hex. Sometimes gets replaced by string "INF". Why?

Thursday, May 10th, 2007

This was never going to work out. Data transfer interface. Our side in Perl and their side in PHP. Both scripting languages (bad) and not even the same scripting language (incompatible badness).

Over the data transfer interface, we are transferring users. Including a code to enable them to unsubscribe from an email newsletter. The first 7 characters of the code identify the users (digits) and the rest of the code is a hex string containing some security information.

All works great. But some users can’t use the code? It turns out on the destination system they have “INF” in the field instead of the code.

It turns out that some of these users have e.g. 1234567 to identify the user, and e.g. 123e1234567 as their hex code. That makes the security code “1234567123e1234567”. And that “looks like” a floating point number to Perl. But quite a big one. Almost as big as Infinity in fact, so might as well call it that.
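The same overflow is easy to reproduce outside Perl. The sketch below uses Java’s own double parser, not the Perl SOAP library, so it only illustrates the mechanism; the class and method names are invented, and the second code is a made-up value containing “a” instead of “e”.

```java
class LooksLikeANumber {
    /** True if Java's double parser accepts the text, i.e. it "looks like" a number. */
    static boolean looksNumeric(String text) {
        try {
            Double.parseDouble(text);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    /** What comes out when a numeric-looking code is parsed as a double and printed again. */
    static String roundTripAsDouble(String code) {
        return String.valueOf(Double.parseDouble(code));
    }

    public static void main(String[] args) {
        String unlucky = "1234567" + "123e1234567";  // user id + hex code from the text above
        String harmless = "1234567" + "123a1234567"; // invented code without an exponent marker

        System.out.println(looksNumeric(unlucky));       // true
        System.out.println(roundTripAsDouble(unlucky));  // Infinity
        System.out.println(looksNumeric(harmless));      // false
    }
}
```

The exponent 1234567 is far beyond what a double can hold, so the parsed value overflows to infinity and the original digits are gone.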

I hardly think the flexibility we “won” through every data instance having its own type based on what its data “looks like” compensates for the anger of a segment of our users not being able to unsubscribe from their newsletter, or the extra expense to the company of the time to debug this problem (which was then an urgent problem, as it was only discovered after the system went live, as it only affected 0.6% of our users).

P.S. my solution was to put a space in front of the code, which is taken off by the receiving system, so the data always “looks like” a string. But I wouldn’t like to guarantee that what “looks like” a string won’t change with the next version of the Perl SOAP client libraries we are using.

Class names repeating information stated in the package name

Sunday, May 6th, 2007

Classes in modern programming languages can be arranged in hierarchies, e.g. a Perl class might be called “Uboot::Message::Mail” or a Java class “com.uboot.message.Mail”.

In some programming languages (e.g. Perl) one always refers to the class by its full name (such as “Uboot::Message::Mail”) and never by its leaf name (e.g. “Mail”). For example:

use Uboot::Message::Mail;
my $mail = Uboot::Message::Mail->new();
print "it's a mail" if ($mail->isa("Uboot::Message::Mail"));

In other languages (e.g. Java) one almost always refers to classes via their leaf-name, such as:

import com.uboot.message.Mail;
class MyClass {
   public static void main(String[] args) {
      Mail mail = new Mail();
      if (mail instanceof Mail) System.out.println("it's a mail");
   }
}

For those languages such as Perl, which require using the class’ full path at all times, it’s not necessary to repeat information in the leaf name that has been specified already in the path. For example, a class to model an entry in a Uboot address book might be in a directory called “Uboot/ABook” in which case the entry class can be called “Uboot::ABook::Entry”.

But in Java, you don’t want to have a class called “Entry” because, as soon as the “import” statement scrolls out of sight, you’ll not know if your instance, helpfully statically typed to be an “Entry”, is an address book entry, a guestbook entry, a blog entry, or any other conceivable type of entry. In that case the class needs to be called something like “com.uboot.abook.ABookEntry”.

Class names like “Uboot::ABook::ABookEntry” or “Uboot::Monitoring::MonitoringResult” are (only in languages such as Perl) needlessly redundant and long.

perl / switch statement: Cool Limitation

Wednesday, May 2nd, 2007

Look at the documentation for the Perl switch statement. Look down the bottom at the “limitations” section. Look at the last limitation.