Avoid adding “drive-by formatting changes” to commits

Recently, when reviewing code, I saw a commit resembling the following:

- if (x) foo();
- if (y) bar();
+ if (y) {
+    bar();
+ }

It’s not easy to see that of the two “if” statements, only one was actually deleted. The other had its formatting changed, but was otherwise not altered.

To prevent this, go through the “diff” before committing, and revert any changes which don’t alter the functionality of the code.

Committing not only the things you meant to change, but a bunch of other changes that don’t change the code’s functionality, has the following negative consequences:

  • The “blame” commands of the VCS will now show this new commit as the “last changer” of the “if (y)” line above. If you’re trying to find out why that code was written, this formatting change isn’t the commit you’re interested in. (And the commit message “removed X” won’t help you understand why the “if (y)” line is there.)
  • Conflicts. We love feature branches. If two people make such drive-by changes, in inconsistent ways, in different feature branches, then the merge will produce a conflict which somebody has to spend valuable time and energy fixing.
  • The diff is more difficult to read than it needs to be. In the above commit, it takes longer to see that only the first line has been removed. Code reviews are important.

I think there are the following reasons why such changes might get introduced in the first place:

  • The developer felt like they wanted to make the code “nicer”. It’s a good intention, but leave that for special “code cleanup” commits. Each commit should have one purpose.
  • IDEs often auto-format code. In that case, the developer might not even realize they were making the formatting change. (But a “diff” before commit will show it to you and allow you to revert it.)
  • Java IDEs implement “organize imports” differently: IntelliJ lays out the “import” statements differently to Eclipse. So after a trivial change to a piece of code, “organize imports” will completely change the import section, for no functional difference. If you see such needless import changes in the diff before commit, revert them.
  • Some text editors chop off trailing spaces in a line. This is a noble cause, but, again, it makes it difficult to see what the programmer has actually changed in the commit. In the most pathological case I’ve seen a commit with hundreds of files altered, for only a single line of actual change. This is going to be a nightmare to review and a nightmare to merge. Turn this feature of your editor off.
  • Tabs vs Spaces inconsistency. IDEs often change the file to their preferred settings. You can’t even see these changes! Again the “diff” tool before commit will make you aware of them.

Business and Work 2014

I went to a lecture recently and the CEO of Runtastic spoke. He said: you don’t “win or lose”, you “win or learn”. 2014 was a year of learning for me. Some of it was good, but some of it was definitely in the “you win or learn” sense of learning.

Collecting on debts

Many customers thought they would just order software from me and worry about paying for it later. (The word “later” in the sense of tomorrow in “Alice in Wonderland”, i.e. not that distinguishable from “never”.)

Here are two approaches I discovered which worked well:

  1. I developed a website for one customer, and she refused to pay, on the grounds she didn’t need the website any more. I took her to KSV (debt collection agency) and they did manage to collect the entire amount. It took 6-9 months but was pretty painless. I didn’t have a contract with her, but I had emails saying she was happy with the work, and this whole process only kicked off after 3 months of non-payment. I understand that, in Austria, if you don’t dispute an invoice within one month the invoice is considered “accepted”. So because she “accepted” the invoice, she had to pay.
  2. With one other customer, who “definitely” wanted to pay according to him, but “later”, I finally stumbled upon the following approach which was successful. Building off his assertion he “definitely” wanted to pay, I asked him until when he wanted to pay. He said a year thence. I said “OK, take your entire debt, divide by 12, what you’re saying is, you’re happy to pay this on a monthly basis for the next 12 months?”. He couldn’t really say no.

Debts are going well, I hardly have any debt owed to me left. (And even if the customers were to stop repaying, at least I’ve collected a significant portion of the debt, which I wouldn’t have done had I not acted.)

Q1/Q2: Merging then un-merging (catastrophe)

I went into 2014 with two employees (MartinL, DavidZ) and not a great deal of work. And the work I did have, was for customers who were enthusiastic about the word “payment” only when combined with the word “later”.

I acquired a new customer, who were a general software development house, partly owned by their main customer, which was basically run by the same people. This main customer was developing Java Enterprise software, with a huge stack of server-VMs, Maven, ORMs, Spring, C# Windows thick-clients, Lua mobile client, and so on. Not surprisingly, despite having modest functional requirements, it (a) was large and complex software, and, as a consequence, (b) didn’t work. They needed our help. So far so good.

The software development house suggested merging their company with mine. They had 4-5 people, 2-3 customers (incl. this main customer with the Java software). I had 2 people, 3-4 customers. If we merged our resources, all customers would have access to a larger team, and all employees would have access to a larger pool of customers. In principle this sounded good to me.

They insisted I become salaried at their company. The new company would continue to have the name of their old company before the merge. I suppose one might describe these as “red flags” in retrospect, that this wasn’t as much of a “merge” as had been originally proposed.

Not only was it a takeover (not a merge), they really were only interested in us working for their main customer. They weren’t interested in my customers (who I’d taken to the company in good faith), nor any employees of mine who were working for my customers (who I’d taken to the company in good faith). The boss even said that MartinL had “produced no value” (because he had only been working hard for months satisfying the customers I’d brought to the “merge”.)

I didn’t negotiate hard for them to pay me for the “merge” because I felt if they hadn’t paid, they hadn’t actually “bought anything”. This meant, from my perspective, I could exit any time I wanted and take all customers and employees who wanted to come with me, effectively un-merging. So that’s what I did.

Q3: Employed at CCA

I worked for “Control Center Apps” from August for a few months, which is a company run by a friend of mine. I developed a small JavaScript demo, and worked on-site at Sepura (a CCA customer).

I used jQuery for the JavaScript demo. Both jQuery and JavaScript were somewhat new to me. jQuery seems so easy to use, and is so well documented, that using it was really a pleasure. JavaScript less so, but I suppose it was about time that I really understood it better.

For that project I initially tried to get into AngularJS myself, but without much success. There were a lot of examples on the website, which were nice, and if I typed them in exactly as they were, they worked, but with examples alone it’s difficult to progress to changing the example into your own application. Some of the examples also had HTML as strings inside JavaScript source, which didn’t strike me as particularly nice, and put me off it.

Q4: Running my business again

I decided to leave CCA and focus on running my own business again. I am still on good terms with CCA and do work for them still, but self-employed and with my team.

I did some training for the first time this year. Three days of Perl and one day of MySQL. Was an interesting experience. Training institute gave me no indication of the experience of the participants. It turns out the three participants were experienced sysadmins. I used my laptop connected to a projector and let them guide me through the areas they wanted to learn about.

I have been working in cooperation with SebastianK since Q4 and he has developed software for CCA based on AngularJS as a Single-Page App (SPA), communicating with a Java back-end. The project isn’t delivered yet, but is going well. This is in contrast to our normal stack of Wicket in Java (not SPA).

I have a philosophy of, for each new project, using the stack that seems appropriate for the project, i.e. I don’t have the policy of using the same stack for all projects. If all teams must use the same stack, then the advantage is code-reuse and knowledge-reuse. The disadvantage is, that as the company grows, it gets stuck in a stack which is perhaps not appropriate for the future, as the world outside the company has a habit of changing and moving forward. The more projects use the same stack, the harder it is to change. At some point you just get stuck. We avoid that.

Other than that, MartinL and I supported our old customers (mobilreport, firstbird, Offer Ready), we did a few small website projects (using outsourced designers), and I did a few pieces of consultancy (mainly database and Java software optimization).


Other than a certain unavoidable quantity of “learning” which is no doubt ahead of us all, I have the following objectives for my company in the coming year:

  • Build up more of a brand, so that people can recognize us (and it gets a bit annoying to refer to Adrian Smith in offers, which has two meanings, my company including all employees, and me as a person e.g. working on-site)
  • Build up more of a website, so that people who don’t already know us can find information on me and perhaps decide to work with us
  • Not merge with anyone: remain running my business
  • Make sure that our customers are satisfied
  • Make sure that employees consider my company a great place to work
  • Acquire more customers, acquire more employees

Parameter ordering: Destinations on the left

Most (all?) programming languages write assignment with the destination variable on the left hand side, i.e.

a = b+c;

Therefore most (all?) programmers are familiar with the destination for an operation being on the left hand side. Therefore, I think, when designing functions or commands doing copying or assignment, the result should also be on the left hand side.
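
As a rough illustration of this idea, here is a minimal Java sketch. The class name DestinationFirst and the method copyInto are hypothetical names invented for this example. Note that Java's own System.arraycopy takes the source first, so the wrapper simply swaps the order to mirror an assignment.

```java
import java.util.Arrays;

public class DestinationFirst {

    // Hypothetical helper: the destination is the first parameter,
    // mirroring the shape of an assignment "destination = source".
    public static void copyInto(int[] destination, int[] source) {
        // System.arraycopy itself expects (source, srcPos, dest, destPos, length).
        System.arraycopy(source, 0, destination, 0, source.length);
    }

    public static void main(String[] args) {
        int[] source = {1, 2, 3};
        int[] destination = new int[3];

        // Reads left to right like: destination = source
        copyInto(destination, source);

        System.out.println(Arrays.toString(destination));
    }
}
```

A call such as copyInto(destination, source) then reads in the same direction as the assignment a = b+c above.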

How to send email programmatically from Java to easyname’s SMTP server

If you want to send email then it’s best to do it over the email server that is the authoritative one for the sender email address.

If you are using easyname and want to send email from a Java program from an easyname-controlled email address via easyname’s email servers, this is how you do it.

Update: Thanks to AndiT, RobinK, DavidZ for pointing out that a better way to do this would be to install a local mail server (MTA) e.g. postfix, send the email from Java to the MTA, and have the MTA send the email to easyname. If easyname experiences a problem, the email will be queued by the MTA. Configuring an MTA is outside the scope of this article :)

The “from” address must exist at easyname. The username and password in the following code are an “email box” name and password from the same easyname account.

Thanks to this article! http://www.mkyong.com/java/javamail-api-sending-email-via-gmail-smtp-example/

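// Imports (top of the file)
import java.util.Properties;
import javax.mail.Message.RecipientType;
import javax.mail.PasswordAuthentication;
import javax.mail.Session;
import javax.mail.internet.InternetAddress;
import javax.mail.internet.MimeMessage;

// Inside a method that may throw MessagingException.
// Placeholder values: use your own easyname address and email box credentials.
final String sender = "foo@myeasynamedomain.com";
final String emailBoxName = "12345mail2";
final String emailBoxPassword = "foo";

// Connect to easyname's SMTP server over SSL on port 465, with authentication
Properties props = new Properties();
props.put("mail.smtp.host", "smtp.easyname.eu");
props.put("mail.smtp.socketFactory.port", "465");
props.put("mail.smtp.socketFactory.class", "javax.net.ssl.SSLSocketFactory");
props.put("mail.smtp.auth", "true");
props.put("mail.smtp.port", "465");

// The authenticator supplies the email box name and password on demand
Session session = Session.getDefaultInstance(props,
  new javax.mail.Authenticator() {
    @Override
    protected PasswordAuthentication getPasswordAuthentication() {
      return new PasswordAuthentication(emailBoxName, emailBoxPassword);
    }
  });

MimeMessage msg = new MimeMessage(session);
msg.setFrom(new InternetAddress(sender));
msg.addRecipient(RecipientType.TO, ...);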


Forward Agents

At one company I worked at, we had “customer care agents” who would answer requests from users, e.g. emails and telephone calls.

I always thought the name was a bit strange. I suspected it had been created by someone who wasn’t a native English speaker, or perhaps I am just disconnected from the world of business terminology. (I am aware that James Bond is a “secret agent” but I had never heard of a “customer-care agent”.)

Anyway, one day, over a few beers, colleagues and I were discussing our company structure; perhaps impolitely, but at least partially accurately.

  • We understood what the people at the very top of the organization did: they made the decisions that were then implemented by the rest of us; they were the visionaries who created the company.
  • And we understood what the people at the very bottom of the organization did: they implemented the decisions; for example, in the case of our customer care agents, explained those decisions to the customers.

But, the managers in the middle of the organization? We (only-half)-jokingly considered the workflow thus:

  • Their job was to take e.g. emails from the people at the very top, and forward them onto their team for implementation. If this “team” was itself composed of managers, they too would then forward the instructions on to their team, until the instruction reached someone at the bottom of the organization who could actually execute the instruction.
  • Once the individuals at the bottom of the organization had announced the successful completion of a work package, they would email their team leaders, who would forward the emails to their managers, who would forward them on to the people at the very top of the organization.

I mean, I have often received emails from a manager which are simply a forward header followed by an actual email containing instructions from someone who needs something done. Now that I manage people myself, I have often just forwarded things on.

It was thus decided, over that beer, to henceforth refer to those, whose job it is to simply forward information up and down the company hierarchy without adding any value of their own, as “forward agents”.

I suppose my point is, not only that this amusing term was coined (at least it was amusing to us), but I wonder what the world would be like if people just communicated directly? I try and adopt this attitude in my company to maximize efficiency and reduce costs, which are paid by the customer eventually.

  • Every time I receive such a piece of forwarded information, I wonder, could I have received this information directly?
  • Every time I forward something on, I wonder, would it be better to just connect the two individuals so they can bypass me?

“Sexy girls”

One of the main projects my company works on is analyzing mobile phone bills for companies – a company gives out mobile phones to all its employees, then they get a huge bill every month, often delivered by post and printed on (perhaps literally) many reams of paper.

We analyze all such files electronically. That’s what we do.

I was analyzing a new file format the other day, using live data from one customer. One service used by an employee is called “sexy girls”. Great stuff. I’m going to go out on a limb here, and assert that was some non-work-related stuff.

Outlandish newlines

A file recently turned up from an external partner. Our software was having problems parsing it. I opened it up in TextPad (text editor for Windows) and everything looked fine. Obviously I consider our software to be perfect so I was a little perplexed as to how this file could be causing problems…

It turned out there were newline issues. Opening the file in a hex editor revealed the file used \n\n\r for newlines, i.e. 3 bytes.

I am amazed by (at least) the following facts:

  1. The fact that a file uses such newlines.
  2. The fact that any text editor can just open files using this newline scheme.
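
For what it's worth, a parser can be made tolerant of such a file. The following is only a sketch of one possible approach, not what our software does: it assumes the three-byte sequence (line feed, line feed, carriage return) should count as a single line break, and the class name LenientLines and method splitLines are made up for this example.

```java
import java.util.regex.Pattern;

public class LenientLines {

    // Alternatives are tried left to right, so the unusual three-byte
    // sequence is listed first and is consumed as ONE line break.
    // Assumption: "\n\n\r" never means "blank line followed by a stray CR".
    private static final Pattern LINE_BREAK =
        Pattern.compile("\n\n\r|\r\n|\n|\r");

    public static String[] splitLines(String text) {
        return LINE_BREAK.split(text);
    }

    public static void main(String[] args) {
        String odd = "first\n\n\rsecond\n\n\rthird";
        for (String line : splitLines(odd)) {
            System.out.println(line);
        }
    }
}
```

The trade-off is that a genuine empty line followed by a carriage return would be swallowed, so this only makes sense when the partner's format is known.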

My PC’s door

Today I wanted to use some software I have on CD on a new computer. Obviously modern computers don’t have CD drives. One “mini PC” I have at one of my offices, however, does. My plan was, today, as I’m at this office, to use the CD drive to create an ISO of the disk. (It’s my computer, doesn’t belong to an employer.)

However, for some time, the “door” of my little mini PC has been stuck. It worked fine at my old office sektor5 (= I could open it to get to the CD drive) but it never worked at the new company hiQ. I’ve tried it a few times (to get to the reset button) but could never get it open, it seems stuck (in the case of the reset button I just turned it off and on then, using the main button which is outside the door).

Today, I decided, it’s only a PC, it’s only plastic. If it really doesn’t open then, well, I might as well force it and break the door if necessary. There’s no point having a nice-looking PC door, if it doesn’t work, and you can’t actually use the equipment inside.

So, basically, today I tried really hard to persuade it to open without breaking it, but it just wouldn’t open. So I forced it, now the plastic broke, now I can use the CD drive, so in a way everything’s good.

I now realize that the problem was I was trying to open the door the wrong way; the hinge was on the other side. If only it’d occurred to me to try and open it the other way, I would still have a beautiful door, nothing would have been broken. I don’t know why this just didn’t occur to me at all.

Now my PC looks like shit with a broken door hanging off it, and all completely needlessly.

That makes me sad.

Recognizing URLs within plain text, and displaying them as clickable links in HTML, in Wicket

I have just, out of necessity for a customer project, written code which takes user-entered plain text, and creates out of that HTML with URLs marked up as clickable links.

Although marking up links in user-entered text is standard functionality, Stack Overflow would have you believe that it’s something that should not be attempted, as it cannot be done perfectly. This is technically correct; however, users are accustomed to software which makes a best-effort attempt, and customers are accustomed to taking delivery of software meeting users’ expectations.

The software I have written is available as open-source, either as a Java class with the method encodeLinksToHtml which takes some plain text and returns safe HTML with clickable links, or as a component in the Wicket web framework called MultilineLabelWithClickableLinks.

Finding links within text is not as easy as it seems

Users may enter links with or without a protocol (http://). Domains may or may not have www at the start. There may or may not be a trailing slash. There may or may not be information after the URL. Having a whitelist of acceptable domain endings such as “.com” is a bad idea as the list is large and subject to change over time. Punctuation after links should not be included (for example “see foo.com.”, with a trailing dot which is not part of the URL).

The software matches “foo://foo.foo/foo”, where:

  • Protocol is optional
  • Domain must contain at least one dot
  • Last part is optional and can contain anything apart from space and trailing punctuation (= part of the sentence in which the link is embedded)

Quotes are not allowed because we don’t want <a href=”foo”> to have foo containing quotes (XSS).
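
To make these rules concrete, here is a minimal sketch of a regular expression along those lines. It is not the pattern used in the actual library: the class name LinkFinder, the method findLinks and the exact character classes are assumptions made for illustration. Being best-effort, it will also match things like “e.g” or “45.6”, since those contain a dot too.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LinkFinder {

    // Illustrative pattern only:
    //   optional protocol     foo://
    //   domain with >= 1 dot  foo.foo
    //   optional path         /foo  (no whitespace, quotes or angle brackets,
    //                                and not ending in sentence punctuation)
    private static final Pattern LINK = Pattern.compile(
        "(?:[a-zA-Z][a-zA-Z0-9+.-]*://)?"
        + "[A-Za-z0-9-]+(?:\\.[A-Za-z0-9-]+)+"
        + "(?:/(?:[^\\s\"'<>]*[^\\s\"'<>.,;:!?)])?)?");

    public static List<String> findLinks(String plainText) {
        List<String> links = new ArrayList<>();
        Matcher m = LINK.matcher(plainText);
        while (m.find()) {
            links.add(m.group());
        }
        return links;
    }

    public static void main(String[] args) {
        // The trailing dot belongs to the sentence, not to the link.
        System.out.println(findLinks("see foo.com."));
        System.out.println(findLinks("go to http://www.example.com/a?b=1, now"));
    }
}
```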

Making links clickable is not as easy as it seems


  • Conversion from plain text to HTML requires that entities such as “&” get replaced by “&amp;”.
  • Links such as “foo.com/a&b” need to get replaced by “<a href=’foo.com/a&b’>foo.com/a&amp;b</a>”. (“&” in URL needs to stay “&” in the href, but needs to become “&amp;” in the visible text part)


  • One cannot firstly replace entities and then markup links, as the links should contain unescaped “&” as opposed to “&amp;”.
  • One cannot firstly encode links and then replace entities as the angle brackets in the link’s “<a href..” would get replaced by “&lt;a href…” which the browser would not understand.

Therefore, the replacement of HTML entities, and the replacement of links, must be done in a single (complicated) pass, rather than two (simple) passes.
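
The single pass can be sketched roughly as follows. This is a simplified illustration, not the code of encodeLinksToHtml itself: the class name SinglePassLinkEncoder, the method toHtml and the link pattern are assumptions for this example. Text between links is entity-escaped, while each link is emitted with the raw URL in the href and the escaped URL as the visible text, as described above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SinglePassLinkEncoder {

    // Illustrative link pattern (an assumption): optional protocol, a domain
    // with a dot, optional path without whitespace, quotes or angle brackets.
    private static final Pattern LINK = Pattern.compile(
        "(?:[a-zA-Z][a-zA-Z0-9+.-]*://)?"
        + "[A-Za-z0-9-]+(?:\\.[A-Za-z0-9-]+)+"
        + "(?:/(?:[^\\s\"'<>]*[^\\s\"'<>.,;:!?)])?)?");

    // Replaces the characters that have a special meaning in HTML.
    static String escape(String plain) {
        StringBuilder out = new StringBuilder();
        for (char c : plain.toCharArray()) {
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    public static String toHtml(String plain) {
        StringBuilder html = new StringBuilder();
        Matcher m = LINK.matcher(plain);
        int last = 0;
        while (m.find()) {
            // Ordinary text before the link: escape entities.
            html.append(escape(plain.substring(last, m.start())));
            String url = m.group();
            // The pattern excludes quotes and angle brackets, so the raw URL
            // cannot break out of the single-quoted href attribute.
            html.append("<a href='").append(url).append("'>")
                .append(escape(url)).append("</a>");
            last = m.end();
        }
        html.append(escape(plain.substring(last)));
        return html.toString();
    }

    public static void main(String[] args) {
        System.out.println(toHtml("a<b see foo.com/a&b."));
    }
}
```

A real implementation would probably also prepend a protocol to links entered without one, so that the browser does not treat the href as a relative path.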

2 of 30

How would you diagnose the following bug?

  • A number of checkboxes representing user interests (Football? Music?); user can select/deselect their interests.
  • Software has worked well for years in production.
  • Suddenly intermittent reports start coming in that sometimes some checkboxes get unchecked by themselves.
  • You test live, you test on the test server, all is good.
    But the reports keep on trickling in, leading you to suspect that it isn’t just user foolishness (e.g. not understanding how to use a checkbox, which I wouldn’t normally put past users..)

This happened at Uboot around 2005.

I’ve forgotten how it came to pass that we fixed the bug. But the bug was this. Perhaps it sounds obvious once explained, but it was anything other than obvious at the time, especially given the fact it was completely unreproducible.

The software was coded in Perl5, and I hadn’t coded this particular screen myself. But, I had no reason to suspect that anything was wrong with the code, as, as I say, it had worked well live for years.

In Perl, “everything is a string”, except that, in fact, that’s not true at all. “Everything appears to be a string, but might not be” describes the situation better. If you have a string “foo” then it’s stored as a string. If you have a string like “45” then it’s stored as an integer internally and converted to/from a string as needed. If you have a string like “45.6” then it’s stored as a double internally and converted to/from a string as needed. (Or it might be more complex than that, I’m not sure; see perlnumber.)

Supposedly this makes everything “easier” if you treat everything as a string. But I have no anger towards the junior developer who coded this screen who believed, as things appeared to be strings, things actually were strings. I mean, why wouldn’t you think that?

The checkboxes were stored in the software as a huge bit field. Perhaps this wasn’t the best representation, as that doesn’t scale (if you want to store 100 interests, you’re going to have to change the approach). But, that was the representation that had been chosen and, as I said, this had been online and worked well for years.

At some point, someone had decided to augment our 64-bit servers with some 32-bit servers. So you can imagine the rest. 2 out of 30 servers were 32-bit, 2 out of 30 clicks went to those servers, our dev server was the older 64-bit server, all our software had been developed on the old 64-bit servers. So for 2 out of 30 clicks, all interests apart from 32 of them got lost.

Lesson learned (or not, as the lesson would have to be learnt by the programming language community): If things appear to be strings, make sure they actually are strings. Or, make sure it’s obvious that they’re not strings.

Had the software been written in Java, this couldn’t have happened as, independent of machine word size (32-bit or 64-bit) each Java numeric data type has a defined width, and is guaranteed to behave identically on any JVM implementation.
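
To illustrate the effect, here is a small Java sketch. It is not the original Perl code: the class BitFieldWidth and its methods are hypothetical, and the mask merely simulates what a 32-bit machine word would have retained of a wider bit field.

```java
public class BitFieldWidth {

    // Sets the bit for the interest with the given index (0..63).
    public static long withInterest(long interests, int index) {
        return interests | (1L << index);
    }

    // Tests whether the bit for the given interest is set.
    public static boolean hasInterest(long interests, int index) {
        return (interests & (1L << index)) != 0;
    }

    // Simulates storing the bit field in a 32-bit word: only bits 0..31 survive.
    public static long truncateTo32Bits(long interests) {
        return interests & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        // A Java long is always 64 bits wide, regardless of the machine.
        long interests = withInterest(withInterest(0L, 3), 40);
        long truncated = truncateTo32Bits(interests);

        System.out.println(hasInterest(interests, 40)); // true: bit 40 fits in 64 bits
        System.out.println(hasInterest(truncated, 40)); // false: bit 40 was cut off
        System.out.println(hasInterest(truncated, 3));  // true: low bits survive
    }
}
```

In Java the truncation has to be written explicitly, which is the point: the width of the type is visible in the source rather than depending on the server.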