Archive for the ‘Coding’ Category

Parameter ordering: Destinations on the left

Saturday, October 18th, 2014

Most (all?) programming languages write assignment with the destination variable on the left hand side, i.e.

a = b+c;

Therefore most (all?) programmers are familiar with the destination for an operation being on the left hand side. Therefore, I think, when designing functions or commands doing copying or assignment, the result should also be on the left hand side.

How to send email programatically from Java to easyname’s SMTP server

Thursday, August 21st, 2014

If you want to send email then it’s best to do it over the email server that is the authoritative one for the sender email address.

If you are using easyname and want to send email from a Java program from an easyname-controlled email address via easyname’s email servers, this is how you do it.

Update: Thanks to AndiT. RobinK, DavidZ for pointing out that a better way to do this would be to install a local mail server (MTA) e.g. postfix, send the email from Java to the MTA, and have the MTA send the email to easyname. If easyname experiences a problem, the email will be queued by the MTA. Configuring an MTA is outside the scope of this article :)

The “from” address must exist at easyname. The username and password in the following code is an “email box” name and password from the same easyname account.

Thanks to this article!

String sender = "";
String emailBoxName = "12345mail2";
String emailBoxPassword = "foo";

Properties props = new Properties();
props.put("", "");
props.put("mail.smtp.socketFactory.port", "465");
props.put("mail.smtp.socketFactory.class", "");
props.put("mail.smtp.auth", "true");
props.put("mail.smtp.port", "465");

Session session = Session.getDefaultInstance(props,
  new javax.mail.Authenticator() {
    protected PasswordAuthentication getPasswordAuthentication() {
      return new PasswordAuthentication(emailBoxName, emailBoxPassword);
MimeMessage msg = new MimeMessage(session);
msg.setFrom(new InternetAddress(sender));
msg.addRecipient(RecipientType.TO, ...);


Recognizing URLs within plain text, and displaying them as clickable links in HTML, in Wicket

Wednesday, March 12th, 2014

I have just, out of necessity for a customer project, written code which takes user-entered plain text, and creates out of that HTML with URLs marked up as clickable links.

Although marking up links in user-entered text is standard functionality, Stack Overflow would have you believe that it’s not something that should not be attempted, as it cannot be done perfectly. This is technically correct, however, users are accustomed to software which does a best-effort attempt, and customers are accustomed to take delivery of software meeting users expectations.

The software I have written is available as open-source, either as a Java class with the method encodeLinksToHtml which takes some plain text and returns safe HTML with clickable links, or as a component in the Wicket web framework called MultilineLabelWithClickableLinks.

Finding links within text is not as easy at it seems

Users may enter with/without protocol (http://). Domains may or may not have www at the start. There may or may not be a trailing slash. There may or may not be information after the URL. Having a whitelist of acceptable domain endings such as “.com” is a bad idea as the list is large and subject to change over time. Punctuation after links should not be included (for example “see”, with a trailing dot which is not part of the URL)

The software matches “foo://”, where:

  • Protocol is optional
  • Domain must contain at least one dot
  • Last part is optional and can contain anything apart from space and trailing punctuation (= part of the sentence in which the link is embedded)

Quotes are not allowed because we don’t want <a href=”foo”> to have foo containing quotes (XSS).

Making links clickable is not as easy as it seems


  • Conversion from plain text to HTML requires that entities such as “&” get replaced by “&amp;”.
  • Links such as “” need to get replaced by “<a href=’’>;b</a>”. (“&” in URL needs to stay “&” in the href, but needs to become “&amp;” in the visible text part)


  • One cannot firstly replace entities and then markup links, as the links should contain unescaped “&” as opposed to “&amp;”.
  • One cannot firstly encode links and then replace entities as the angle brackets in the link’s “<a href..” would get replaced by “&lt;a href…” which the browser would not understand.

Therefore, the replacement of HTML entities, and the replacement of links, must be done in a single (complicated) pass, rather than two (simple) passes.

Jetty doesn’t show errors on web application start-up

Monday, December 23rd, 2013

From a certain version of the “jetty” package in Debian Linux, if the web application didn’t start up (servlet init() throws an Exception), this error wasn’t logged anywhere. The solution is to install the libjetty-extra package.

sudo apt-get install libjetty-extra

It took some amount of experimentation to find the solution. I don’t know why you’d ever want to not log errors; i.e. why the logging of errors is an “extra”.

Jetty is a Java web server, similar to Tomcat, it’s my server of choice. It’s simple, doesn’t seem to do much apart from run WARs, and (apart from this issue) I’ve rarely had any problems with it.

Wrapping IDs in objects

Thursday, December 5th, 2013

In Java, I like to wrap ID values in objects, rather than just passing them around the code as their native “int” or “long” or “String” values.

The reasons are twofold why using an object is better:

  • Code becomes more readable, for example foo(LoginId x) is more readable than foo(long x). (Although perhaps neither foo nor x are good names, so perhaps the example over-exaggerates this improvement.)
  • The compiler can do more checking. If you pass a job advert’s id (as a long) to a function expecting a login id (as a long), the compiler cannot warn you of your mistake. This becomes particularly relevant if you have a function taking two IDs and you pass them the wrong way around.

Things to consider when writing such an “ID object”

  • Do not allow the ID contained within the object to be null. Having “LoginId x” where x is not null, but the value contained within x is null, makes no sense. (For example, use primitive types if dealing with numerical IDs, as they cannot be null.)
  • If the ID is a string, don’t allow this string to be empty; same as above.
  • Implement equals and hashCode methods so that these IDs can be used within Sets, or as keys within Maps, or as keys in Wicket drop-downs, or wherever.
  • Make them serializable.
  • Make the ID attribute a “public final” attribute. That means useless getter methods can be avoided, from the object itself and from client code.
  • The object should have a single constructor which takes the value and sets it in the attribute.
  • Implement toString so that the debugger can display the object usefully.

Don’t use constants for table and column names when writing SQL statements

Friday, June 14th, 2013

I was always in two minds about using constants for table and column names when writing SELECT queries. Now I’ve concluded that constants are definitely bad, and should not be used. Here’s why.

The topic of discussion is difference is between writing

sql = "SELECT * FROM " + TABLE_NAME + " WHERE ..."


sql = "SELECT * FROM my_table WHERE ..."

There are the following consequences from this choice as far as I can see:

  • A compiler like Java can tell you if you misspell a variable name (e.g. tableName) but not if you misspell “myTable” in the middle of a string (Win for constants, if you’re using a static typing language)
  • You can rename the values easier: just change and rename the constant. But how often do just table names or columns get renamed? Normally when I change the database I am implementing a new feature, and everything has to be changed anyway. (Marginal win for constants)
  • The layout of the query is shorter and easier to read if constants are not used. (Win for not using constants)

To avoid having this choice in the first place:

  • One could use an ORM, but at least in Java, e.g. HQL in Hibernate still has a string for column names, and table names if you’re doing joins, so the problem is still there.
  • Using a system like LINQ in .NET which allows you to specify queries in a way the compiler understands, not just a string. (But can it do everything SQL can do including vendor-specific things?)
  • Being able to extend the language with other languages such as SQL and regular expressions. This is a fantasy of mine and a friend, hasn’t happened yet. This would work by the compiler working in conjunction with the database engine to assert that the query is valid at compile time (and possibly even creating an db-specific internal parsed representation right there and then.)

Compare the following two pieces of code and I think the choice will become obvious. Both pieces of code come from the current code-base I’m working on, neither have I written myself.

sb.append("SELECT ").append(RecruiterRole.TABLE_NAME).append(".*,");
sb.append(" FROM ").append(RecruiterRole.TABLE_NAME);
sb.append(" WHERE ");
sb.append(RecruiterRole.COLUMN_COMPANY_ID).append(" = ?");
sb.append(" AND ");


sql.append(" SELECT * ");
sql.append(" FROM application");
sql.append(" WHERE job_advert_id IN (");
sql.append("   SELECT job_advert_id");
sql.append("   FROM share");
sql.append("   WHERE talent_scout_login_id = ?)");
sql.append(" AND potential_applicant_identity_id NOT IN (");
sql.append("   SELECT potential_applicant_identity_id");
sql.append("   FROM positive_endorsement");
sql.append("   WHERE talent_scout_login_id = ?)");
sql.append(" AND company_id = ?");
sql.append(" AND share_talent_scout_login_id = ?");
sql.append(" ORDER BY datetime_utc DESC");

Here are the reasons why the second code is more readable:

  • Because table/column names are inline, the code reads easier
  • Indenting is used for sub-selects
  • Each condition, order-by is on its own line (e.g. “AND company_id=?”)
  • Keywords uppercase, column and table names lowercase

The danger of the above code is less that errors in spelling will only be detected at run-time and not at compile-time, but that the query does the wrong thing (while appearing to do the right thing). For example, an error I saw recently (which obviously did not make the live system!) was that users could see data not only from themselves but from all users because the “WHERE login_id=?” had been forgotten. But to the untrained eye, or a user on the test system with only a few users, the query appeared to work.

In this case, it’s a clear win for readability, over compile-time checking of a mistake which will is unlikely to happen and will be identified at run-time.

Are Java enum values instances or classes?

Saturday, April 13th, 2013

That depends on if the enum values provide methods which differ from one another.

The following code produces just one class file, “Network.class”. “facebook” and “linkedIn” are just instances of the Network Java class.

public enum Network {

  public void printName() { System.out.println(getName()); }

But the following code produces one class file for each value, named “Network$1.class” etc., as well as one class file for the abstract superclass, “Network.class”.

public enum Network {
  facebook {
    public Client newClient() { return new FacebookClient(); }
  linkedIn {
    public Client newClient() { return new LinkedInClient(); }

  public abstract Client newClient();

“facebook” and “linkedIn” are in fact different Java classes now.

Having a constructor taking parameters, and initializing each value of the enum by calling this constructor with values, is not sufficient to force the generation of individual classes per value.

Just because they are different classes in this situation doesn’t automatically mean you can do everything you’d expect to be able to do with a class. You can’t test for class membership using “instanceof” for example (not that this would be very useful for an enum, as there is only one instance of every value).

Generating XML programmatically? Don’t use CDATA

Wednesday, March 20th, 2013

If you want to represent characters, you want to represent any possible sequence of characters. In an XML file, escaping < with &lt; gives you a way to represent any character. Using CDATA doesn’t give you a way to do that.

In an XML document there are two ways to represent character data. Either

  1. You just write the characters in the XML file (in which case you need to escape characters that look like XML tags i.e. < with &lt; etc.), or
  2. You use CDATA, no longer having to care about replacing < with &lt; etc. Ends as soon as the first ]]> is encountered, which is unlikely to appear in your characters.

These two syntaxes result in an identical XML document. An XML parser must consider XML files identical, which differ only by if CDATA is used, or not. In both cases, there is text within a tag.

CDATA has a (somewhat strange) syntax like:

<my-tag><![CDATA[My characters]]></my-tag>

From its naming, CDATA (“Character DATA”) might seem like exactly what you need to represent character data. Combine that with the fact that characters such as < don’t need to be escaped.

However, in fact it’s exactly the opposite of what you need. Whereas, by not using CDATA, it’s possible to escape the characters that mustn’t appear (i.e. replace < with &lt;) with CDATA there is no way to escape the characters that mustn’t appear (i.e. ]]>).

It might seem “unlikely” that ]]> is actually going to appear in the character data your user wishes to represent. (This is actually irrelevant, as software should work all the time, not just be “unlikely” not to work.) However, even this “unlikelihood” is misleading. No matter what sequence of characters XML had chosen to end CDATA, as soon as you represent data in itself (e.g. send an XML document in a tag), this sequence will appear in the data. So if you’re working with XML (which you are), this will happen more often than you think.

CDATA is just a convenience mechanism for writing XML files by hand. If you’re writing files by hand, you know which characters appear in your data, so you know whether you can use CDATA safely or not. If you’re creating a program to write XML files, you don’t know what the user’s data will contain, so you can’t use CDATA.

Don’t names classes AbstractThing or ISerializable

Monday, February 25th, 2013

Naming a class AbstractThing violates the Liskov Substitution Principle, and causes client code to read wrong. Call a class a name which you’d be happy to call any of the objects belong to the class.

If you want to borrow your friend’s phone, and you’re not sure what brand it is, how many times have you asked “Excuse me mate, can I borrow your abstract phone? Cheers.”

Object-oriented modeling and programming allows certain classes of objects to specialize other general classes of objects. For example, both a Rectangle and Circle are special types of of Shape.

Every Rectangle is a Shape. Everywhere you can use a Shape, you can use a Rectangle. Code that deals with a Rectangle (but which can also deal with Circles) might look like:

Shape s = .....;

Using an object (e.g. a Rectangle) anywhere you can use a generalization of it (e.g. a Shape) is an important part of the object-oriented concept, and known as the Liskov Substitution Principle.

It’s also obvious: what sort of sentence or logic would make statements about shapes, but then not be applicable to rectangles?

If you name the generalization an AbstractShape, this principle is violated. A Rectangle isn’t an AbstractShape. If anything it’s a “Concrete Shape”! A rectangle isn’t abstract (in the sense of “I don’t know what type of shape this is. Could be a rectangle, could be anything else.”). Code using AbstractShape then reads wrong:

AbstractShape s = new Rectangle(...);

The same is true of interfaces. What do all objects which are serializable have in common? They are Serializable. Naming the interface ISerializable would lead to code like:

ISerializable s = new String("hello")

But s refers to a particular instance; a particular thing. There’s nothing abstract or “just an interface” about s. s is a String, which is a specialization of the more general type of thing called a Serializable. The line of code, including s’s type, must reflect what s actually is. It’s a serializable.

Further, what if you change an interface to an abstract class, or a concrete class to an abstract class, or vice-versa? The client code must be updated. While isn’t too much of a pain with refactoring tools, it still leads to the suspicion that irrelevant information about the tools used to construct the implementation are leaking out to client code.

A class of a set of objects is named to capture the common aspects of those objects. If all objects are rectangles, call their class Rectangle. Naming a superclass, abstract or not, is no different. What do all the object modeled have in common? Some are rectangles, some are circles, but they are all Shapes.

The objects being classified into classes are not AbstractShapes, BaseShapes, IShapes, or similar.

Always use a “default” clause in a “switch” statement

Tuesday, February 19th, 2013

I was told that it’s good practice to include a default clause in all switch statements. I recently remembered this advice but can’t remember what the justification was.

Things can and do go wrong. Values will not be what you expect, and so on. To quote the old computer science motto: If it can go wrong, it will.

Not wanting to include a default clause implies you are confident that you know the set of possible values. If that’s the case, then a value which isn’t one of them indicates an error somewhere. And if that’s the case, you’d certainly want to be informed about it.

switch (myVar) {
   case 1: ......; break;
   case 2: ......; break;
   default: throw new RuntimeException("unreachable: myVar" + myVar);

There’s no reason to include more information than just the value of any variables which are known only at run-time. Anything which is visible from the source code, you don’t need to include in the text, as you’ll be looking at the source code to debug the error. (The exception stack-trace will include the line number.)

Originally from