Let’s rewrite this in Java!

4 January 2013 — in Project Management, Software Architecture

I've heard the phrase "Let's rewrite this in Java!" uttered in various meetings at various companies at various times in my career. Often by managers. All such projects to completely re-write the company's software in a new language or framework inevitably end in disaster. Why is this? I must confess I don't really know.

I suppose it doesn’t have to be Java, although it often is. Java is the classic. For example, a big company comes along and sees, oh my god, the entire code-base is written in a scripting language!! and feels the need to “clean up” the source code. Without further thought, the decision is taken to write everything. I think at least such companies inevitably choose to transition to Java.

It’s the same with frameworks. Oh my god, this is written with web framework X which is no longer trendy, or with no framework at all!!! The decision is taken to “introduce” a new front-end framework, often trendy, which necessitates a re-write of at least the entire front-end, if not the entire system.

Consequences

By “end in disaster,” I mean, mainly, that the system is never completed, and never goes online. At some point the code just gets deleted. (Or stored in some backup system, or on a branch of the VCS, where nobody ever looks at it, which are basically equivalent, if one judges the success of software by it going online and the benefit it brings to its users, and/or the money it brings to its sponsoring company, etc)

There is a danger to re-writes, beyond just costing the sponsoring organization money, without any benefit. In a small team it seems too much to concentrate on two projects at once (the old and the new). And the tasks associated with the new (conception, programming afresh) are much more fun than the tasks associated with the old (bug fixing, performance issues), so the programmers tend to prefer them. They also feel the work on the new system is a better investment of their time, as that work will be online a year from now, in contrast to work on the old system. With weak technical management programmers may do what they prefer as opposed to what is important. But if this takes a year or two, what about existing customers? The live system can just stagnate, while their competitors add new features and progress. I think in this case it’s important to have people assigned to the old project, and people assigned to the new. It might suck to be the guy assigned to the old project but what can you do? The work has to be done.

Reasons for failure

Perhaps it’s just a nature of re-writing software, as described by Joel on Software. But I think it’s more than that. I’ve seen re-writes or refactoring projects be successful. Classically the re-write of the Mac operating system was very successful, and the re-writing of the Netscape browser into what became Mozilla Firefox. The re-writing of Windows into Windows NT. At Uboot we re-wrote many pieces of software and they all worked out fine, and were much better for having been re-written.

I think it could have something to do with how the project starts out. What is trying to be achieved? I think if the project starts out with the objective “we need these completely new features” and then it turns out that a re-write is necessary, as the current code-base can’t support these new features, then such a re-write can be successful. If it starts out with the only objective being “we need to re-write it”, for example because the programming language is considered wrong, or the software design is considered wrong, i.e. things which are more faults of the old system than benefits of the new, then things will start to get into trouble.

I think if the objective is simply “we need a re-write” then everyone will chime in with all the problems the old system had and how they can be fixed. For example the software design can be improved (for its own sake). Newer frameworks can be used (for their own sake, or some vague reason such as they are more modern.) Radical new features, which have been on the roadmap for a while, but never really seemed to be totally necessary, can be introduced. Perhaps this is just too much at once, combined with re-writing a system which might have had years of work already put into it without all those new features? But I’m not sure it’s just that.

I also notice that, while the old system is still online, such re-writes do not have any feeling of urgency. (“The old system is online anyway. It’s important that we get it right this time, even if that takes longer.”) Perhaps that too is bad – urgency is often a good motivator for simplicity, and simplicity an attribute of systems with desirable properties such as ease of maintain, having fewer bugs, fewer scalability problems. Related: Perhaps people feel more allowance to act as an architecture astronaut; after all, the objective is to have an architecture, not to have features for customers.

Perhaps it’s because if there are no clearly expressed objectives beyond “we need a re-write” then the other objectives which haven’t been clearly stated (such as which new features to add) can change over time? Half way through the project a new framework is released and is decided to be used, on the grounds, well, we’re doing a re-write so we should use the most modern framework available. Or the same with new requirements which crop up during the project. If one introduces a new framework or feature sufficiently often, and these introductions require re-writing part of the already written re-write, then perhaps the re-write will never get a chance to be finished.

Perhaps it’s that people have a problem conceiving of a new system (which, on the day it’s released, must already have lots of features and scale to lots of users)? Perhaps people find it easier to go one feature at a time, and scale the existing system when necessary? Which is how an initial system often starts out.

Perhaps the projects fail because they take too long? One the one side you want to have people on the old team producing new features for the customers and keeping up with the competition. On the other side you don’t want the new project to work against a moving target and keep on re-writing bits of the re-write. These are contradictory objectives. Perhaps the trick is to get the re-write out the door in 3-6 months, before the old system has had a chance to change too much? Which is a problem if the system is beyond a certain size, or the re-write too ambitious.

Perhaps implementing a system in a language and/or framework with which the development team, and perhaps also architects, are not familiar, is a really bad idea too? This will inevitably lead to mistakes being made. Perhaps that can be mitigated by having an external consultant review the designs, perform design, do code reviews, or by using a language/framework with which senior members of the team are already familiar.

Conclusion

I don’t really know why, when I hear the words “Let’s rewrite it in X!” (where X is nearly always Java!), I just know the project is going to cost the company 1-2 years and most of the development team, get thrown away at the end, bring no benefit to either the users or the development team, and cause the live product to stagnate behind the competitors for this time.

And I don’t really know what you can do about it. Not re-writing things is clearly not acceptable either (would you like your primary system to be the latest incarnations of Netscape 4 on Windows ME?). I think you need to have clear objectives (which features do you need that you can’t implement on the old system, which technology do you want to move to and why?), keep separate teams for old and new, have at least one person with years of experience in the new technology, and make sure the new vision gets wrapped up within 3-6 months to avoid the live system having acquired too many new features.

Shameless self-promotion: I work as a software architect and have been involved in many successful re-writes. We re-wrote the UCP Media Album from Perl into Java J2EE and deployed it an an American mobile operator. For Offer Ready we’ve completely re-written our calculation engine. Uboot was in a constant state of flux e.g. introducing payments into messaging, etc. At easyname I oversaw the re-write of many key features such as the way money was handled. If you are hitting a wall with your current system and want to know your options, contact me.

P.S. I recently created a nerdy privacy-respecting tool called When Will I Run Out Of Money? It's available for free if you want to check it out.