Archive for the ‘Software Design’ Category

Email Boxes need to be stored in DB, but also call IMAP, APIs, etc.

Wednesday, June 6th, 2007

I find myself often modelling the situation that there are rows in the database (e.g. “email boxes” for a user), and these rows represent things that exist elsewhere as well (e.g. IMAP accounts to back up these email boxes). There can be multiple ways of accessing these external resources, e.g. to delete an email box one does an deletes files on some server, to find out how much space is used there is an http-based protocol. And in the case of creation and deletion (and changing of password) these operations should not be done synchronously from the web interface, but are queued. This is not a contrived example, I am programming exactly this right now. All of the above are givens.

To not just stuff all the various API clients and other functionality into one huge class, there needs to be different objects representing:

  • The “email box” row in the database (and a persistence mechanism)
  • A “filesystem” object to represent operations on the filesystem such as “delete email box”. This object knows the directory layout used. This object can be shared between other objects which need to perform filesystem operations, such as a filestore accessible via FTP accounts (in this case). It’s convenient to program all these filesystem operations in one object.
  • A client for the HTTP-based protocol, to find out the box’s used size. In this case the protocol can do other functions, such as finding the space used in the FTP filestore. Again, it’s convenient to put all these operations in one class: one can create private methods to connect to the server, or for common API requirements such as response parsing which will be the same for all the commands, etc.
  • Persistable Queue objects, and QueueProcessor objects representing the programs or tasks to change the password, create/delete the boxes, etc.
  • Some Facade object to simply access to all the above?

Once one has come up with this objects in the system, there are a number of possibilities for how to combine them. E.g.

  • When one asks the HTTP protocol client object to find out the space used for a box, should one pass the parameter (of which box) as a Box object, or the name of the box and password as a String?
  • Should an application program (e.g. web interface) instanciate and use the HTTP protocol client object directly, to find the space used? Or should it call a method on the Box object, which calls the HTTP protocol object? Should both possibilities be available?

On the one hand, to simplify all objects, it would make sense to have the application program talk to the HTTP protocol object, and not to have this code in the Box object at all. And to always pass Box object, as this encourages strict typing.

However, I have found time and time again that the following solution works best:

  • Not have multiple ways of performing the same action.
  • Have a main “Box” object, which acts as a Facade. This represents a particular box. (i.e. not a BoxService stateless facade object, which each time takes a BoxId as a parameter to every function.)
  • Optionally have other objects to delegate to, concerning persistence of the box and its attributes to the database (although I prefer not)
  • A Box object knows the life cycle of a Box, and knows when to write things to queues etc. This will also need to be exposed in its interface (e.g. addCreationRequestToQueue) and explained in the class Javadoc. If this lifecycle changes (e.g. queue introduced for a certain operation) the interface will change and clients will have to be updated. But that’s OK, as probably there will be a requirement in the front-end to display “performing…” as long as the operation is in the queue. So lots will have to change if you change the life cycle.
  • This object also knows how to perform the operations which are normally queued, e.g. “delete”, in terms of simply calling the “filesystem” object. It may also need to update some internal flags to note that the filesystem no longer exists. These methods are normally only called from QueueProcessor objects, but are also handy to call from JUnit test scripts (e.g. in case of “create”), to put the system in some state that is necessary for further tests. The QueueProcessor does not do much, apart from just call the methods on the Box to perform the operation.
  • Applications call Box for all its requests and never call Filesystem. That way if the implementation changes (no longer direct “rm” but now over the HTTP API) the application does not need to change (note that such changes are ones which do not affect the life cycle of the Box, or introduce extra states such as “in queue but not done yet”). But more importantly I just think it’s a lot more readable to say “Box b = getBox(); b.getUsedSizeBytes(); b.deleteFromFilesystem()”.
  • The individual objects such as the “filesystem” object take Strings not Boxes as parameters. This makes those classes marginally simpler. More importantly one doesn’t feel right when there’s a two-way dependency, i.e. Box needs Filesystem (to call it to implement “delete” methods) and Filesystem needs Box (in its method signatures). And the only place that the Filesystem is going to be called is from Box instance methods, and the Box has all the information such as username, password, and any other information, within its instance variables.

When is a software project done?

Tuesday, May 29th, 2007

A software project is defined, for the purposes of this blog entry, as a set of people working to produce a new software system, or to modify an existing software system.

The result (exit condition) of a software project is a set of artifacts and other assertions:

  1. Document (or wiki etc) describing what the software should do, i.e. requirements. This will include subtle details, about what the system does, that will not immediately be obvious by looking at the front-end, or reading software design documentation. This should be a complete description, which is useful for the future, not just a “delta” from the last version.
  2. Software architecture documentation, in words. Simply looking at 1,000 Javadocs will not enable a new team member to understand the system. Documentation should also include which other options were evaluated and not chosen, and why, to avoid future teams considering the same things.
  3. (Obviously) the source code for the software. Including the front-end, back-end, any HTML, etc.
  4. Unit test scripts for all back-end classes needing them.
  5. Front-end tests. Either a document (simple statements such as “Click on Submit without enough money on account. See error message”), or configuration of a front-end testing program.
  6. Performance tests done and the software to perform them, if appropriate.
  7. Configuration (or creation) of a monitoring system to monitor the system once it’s live, if it’s a service (e.g. web site).
  8. Administration system for customer care, if it’s a service.
  9. Management reporting. Especially just after a system goes live, management are always very curious about key statistics, such as number of users, number of items sold etc. That needs to be analyzed in advance and the system in place when the system goes live.
  10. Class diagrams
  11. Javadoc to describe the purpose of individual classes and methods (where this is not obvious from the names). For scripting languages: parameter and return types (as this cannot be deduced from the source code).
  12. If this is not the first version of a system, migration concept including scripts to install software, migrate schema, filesystems containing user data, and anything else.
  13. System uses appropriately international character set such as UTF-8. (This is not particularly modern, the WinNT team decided to do this in 1988.) Java does this out of the box, but it’s more than just the programming language. This includes any database, any data stored in flat-files, any APIs (within the system or to/from external systems), and so on.
  14. All of the above under version control
  15. Not only the software installed on a live system, but also the existence of test and staging systems. If one uses the live systems for testing, then, once one’s gone live, one has no way to fix bugs in a testing environment. And bugs will happen, and they need to be fixed fast, so one had better have thought of this in advance.
  16. Bug tracking, or wiki system, or some way that the team is trained and rehearsed in using, to track and assign errors as they occur.
  17. Understood and tested data backup and recovery process. (What happens if the live DB crashes? Better have thought about recovery before that happens.)
  18. The team must sleep e.g. 2 days before a release. After a release (bug fixing) is the most stressful time of a project and where the team must be at its most alert (as fixing is time-critical). It’s important to sleep beforehand, and not e.g. work 7 days a week then in the evening finally release, then go to bed. (You can be certain that 1 hour after you’ve gone to sleep the site will be offline due to some problems, and you weren’t there to fix them.)

Memorable URLs

Thursday, May 24th, 2007

One thing I have to say I really like about uboot is that I can always remember the URLs to the various places in my nickpage. I can just type them into an IM conversation and don’t even have to click on them to make sure I got them right. (I don’t have to go to the nickpage, copy-paste the URL)

I appreciate uboot isn’t the only website to have URLs which one can remember, but it’s something that Uboot’s really got right.

gettext is so broken

Friday, May 18th, 2007

Working on a PHP project recently, there was the requirement for text localization. The standard way to do this in PHP is to use the standard way to do this in C, which is gettext.

I’ve worked with various translation systems, including one I built myself for uboot, involving a hierarchy of languages going from most specific to most “international”, and with each string having a hierarchical id such as “myprogram.errors.disk-full”.

Java Properties files are simple but also work well (simplicity being a positive thing in this case). The lines are key-value pairs, and using a convention such as “myprogram.errors.disk-full” the key is almost as good as if it actually were a key hierarchy. The file is in Latin1 but Unicode characters can be used via an escape syntax, and there are many editors where one can just type Unicode text and which take care of the escaping.

So I was looking forward to using gettext. This format was created by GNU, the creators of GCC (a highly respected program). gettext is itself well respected and authors of systems such as PHP have chosen it as their localization system.

But alas, it is broken in so many ways.

(1) The file format. Whereas Java’s file format is to have lines such as “key=value”, gettext’s “.po” format (where did that extension come from?) has two lines for every string, like

msgid “key”
msgstr “value”

As one inevitably places a blank line between one key-value pair and the next, the file is immediately 3 times as long as a Java properties file storing the same information. And what if you want to have double-quotes within your string?

(2) Compilation (for performance reasons). I work with scripting languages, where there is no compiler. This can be a good or a bad thing; but independent of that, it is a fact. However the editable “.po” files of gettext have to be converted into binary “.mo” files before they work. Thus I have to introduce a compilation step into my otherwise compilation-free edit-and-that’s-it test environment.

In fact I don’t understand this compilation requirement at all. According to the gettext manual, gettext was developed in 1994. Surely computers were fast enough back then to parse the gettext format, store the whole lot in a hash?

And what I further don’t understand is how/if GNU programs were localized before then. I suppose they just weren’t.

(3) What about Unicode? I have no idea how to introduce Unicode characters into the editable “.po” files of gettext. The manual doesn’t help me. Supporting only 8-bit characters, and assuming/hoping that the encoding of the “.po” file is the same as the encoding that the user is using in viewing the output of your program, is simply a terrible solution. Microsoft designed Windows NT to use Unicode internally in 1988. Java uses only Unicode since its inception in 1991.

Unbelievably there is a reason given for not using Unicode.

However, we don’t recommend this approach for all POT files in all packages, because this would force translators to use PO files in UTF-8 encoding, which is - in the current state of software (as of 2003) - a major hassle for translators using GNU Emacs or XEmacs with po-mode.

(4) Using natural language keys. The “best practices” usage of gettext have English texts as the keys. This is supported by the utility tool “xgettext” which extracts strings automatically from your source.

This sounds nice, but I don’t like having English-text (or, in our case, German text) as the keys for translation files. If the text is e.g. “Click here for more info” and then the new style guideline for the site becomes “More Information”, then you end up having

// mypage.php
echo gettext(”Click here for more info”); // prints “More Information” # mypage.po
msgid “Click here for more info”
msgstr “More Information”

I dunno, that’s just confusing for me. I’d much rather have a text-neutral key such as “more-info”.

Update: This article also shows why you can’t use English-langauge text as translation keys.

(5) Referencing usages from the translation file. The “xgettext” utility writes lines such as the following into the “.po” file

#: mypage.php:47
msgid “Click here for more info”

msgstr “Click here for more info”

I don’t in any way like having the source file name and line number in the translation files. In principle it looks like it helps you to find the usage of a particular string, but in fact:

  1. It is not hard to find all the usages of the key “myprog.error.disk-full”. That string is hardly going to appear in a non-translation context by accident. A recursive search will tell you where its usages are.
  2. What if I change “mypage.php”? (which is pretty likely). For example inserting some lines before line 47. Then the information is not only irrelevant, but in addition wrong.

It is a principle of mine that not only should databases be normalized, but software source also. Every piece of information should be in exactly one place. And that place is where it’s technically needed (in this case, in the PHP file, as otherwise the string wouldn’t get displayed). As that’s (the only place) where it’ll get updated.

(6) Parameters. We all need strings such as “The file ‘$FILE’ has been successfully deleted”. It seems that the standard way to do this in gettext is to use sprintf-type placeholders (e.g. “%s”). However as soon as you have more than one of those, and you translate the string into French, you’ll find you need the parameters the other way around. Oops. That didn’t work. So gettext is only suitable a) for Western European languages (due to character set constraints) and b) only for the subset of those languages which have grammars where placeholders will be needed in the same order.

The first thing I did was write a wrapper around gettext to accept $0, $1 style parameters, so one could swap their order on a per-translated-string basis. (Although $FILE named parameters might have been better; but that would have made the calling code longer.)

So nice one, they managed to invent, for the purposes of translation, a system which has a file format more difficult to use than a simple key-value pair, yet offering no advantages. It can’t handle Unicode. Good work.

GUI Programming: Always perform network requests asynchronously

Monday, April 23rd, 2007

Why does one feel ones so much more in control, when using Firefox, than Internet Explorer?

When you select a slow link in Internet Explorer, the whole program hangs for about 1-2 seconds. Firefox doesn’t. Although 1-2 seconds is hardly a large % of ones life, it makes a big difference to the experience one has when using Firefox.

Recently I wrote something similar to an IM client (written in Java). It sits on the Windows tray. You can log in, and open a window where you can do various things. The data stored on a website (communication over XML-RPC).

MSN Messenger has one tray-icon for the user being logged out, a different one for the user being logged in, and amazingly (I thought), a third one for during the time the program spends communicating with the servers to log the user on.

In my system, log on is just one single XML-RPC call, with all the necessary data returned in the response. This was a design goal, to never have more than one client-server request to represent a particular user action.

The back-end to this XML-RPC call is a simple perl script which uses a few objects to represent things such as Users. These objects are simple enough, they just make a few SELECTs against our super-fast database. So I thought, as any request to our back-end takes say max 0.2 seconds, I needn’t make that asynchronous to the UI of the Java program. And I certainly don’t need a separate icon to display during that time!

If I’d ever stated that decision out loud, I would have heard myself saying it, and realized what a nonsense that is.

  • While it may only take 0.2 seconds on the server, there’s latency to consider, i.e. the time for the packets to flow from the client to the server and back again.
  • One can’t take into account how slow the user’s network connection might be.
  • There may well be more than one request from client to server, multiplying the latency. Just because there is one XML-RPC request doesn’t mean there are no other requests going on underneath, for example DNS lookup of the hostname to connect to.
  • If there is a queue of HTTP requests in Apache, waiting for the FCGI to answer the XML-RPC request, then the time the HTTP request to wait in the queue will also be added to the duration of the call perceived by the user.
  • What if there is a server-problem, and all requests take 2 seconds? A design not tolerant of things going wrong is a bad design.
  • Even 0.2 seconds is noticeable in a front-end.
  • Programming asynchronously in Java is not difficult. So it need not be avoided.

So now, every time I log on using that program, the dialog to log in opens, one clicks connect, and … wait … until the success or failure response is shown. And in that time, the program is just dead. It doesn’t even redraw its windows. It may only be 0.5 seconds, but you notice it.

Lesson - it may sound obvious - but it’s still worth stating: In a GUI Program (Windows, Mac OS X, etc.), any user interaction over a network, must be performed asynchronously (i.e. in a thread or in a separate process).

Releasing working code

Tuesday, April 17th, 2007

I spend a lot of my time getting annoyed by errors in other people’s software (e.g. Windows). Errors which, when you see them, you wonder how on earth they could have been overlooked. But recently I released of a piece of software which contained a major bug (it was only a small mistake, but the consequences were big).

So I set about thinking, what sequences of actions lead, in my experience, to software which works? A lot of these are obvious, yet I’ve found myself often enough not following them, due to time or pressure reasons. And the result: is stuff which doesn’t work.

(1) Unit test scripts: Make them easy to run. For one product I work on a lot, there’s are a whole bunch of test scripts, testing all sorts of classes. In fact there are over 21k lines of unit tests! This is a good thing. But sometimes the person running them has to compare the value printed by the program with the expected value (i.e. has to know the expected value, not easy 2 years after the program was written). And not all classes are tested at all. But there are still a good few which do good tests and print “ok” if the result is correct. This is good, but it’s so much work to run them all. The solution is to chain them altogether, as is easy to do with JUnit, and create one simple command or click to test them all. If it’s simple and convenient and creates value, people will do it.

Also, having a framework into which to put tests - for example, having a convention that a class called “X” has a test class called “XTest”, and that methods like “operationY” on “X” have a method “testOperationY” in “XTest” - encourages people not to be scared to write tests. (But forcing people to write tests, e.g. one test per method, is a waste of time. Not every method needs a test.)

(2) Know what the important features are. Most websites really have many many features. It’s impossible to test them all, without restricting oneself to 6 month release cycles and 1 month test phases. But there are usually a bunch of features would would be show-stoppers if they didn’t work. Can a new user register? Can they upload a photo? (For a photo website). Can they send an SMS (For an SMS website). Write these show-stopping features down. Before the release, go through and test them on the pre-production server. After the release, test them again on the live system. Writing them down helps one not to forget the ones one can’t be bothered to test.

It doesn’t matter if this list is long. Maybe there really are a ton of features which simply cannot not work. Then you’d better have tested them all.

(3) For unimportant operations, ignore failure. Recently I wrote a program which writes a ZIP file. As a small extra feature, if a file in the ZIP hasn’t changed since the last time the program ran, the timestamp of the file in the new ZIP file should be the same as in the old one. This isn’t a very important feature, but it’s there. Once, when it ran, there was a file I/O problem reading the old file, and the program aborted. But this isn’t an important enough feature to abort execution: the program should have continued, and just given all files in the new ZIP a new timestamp.

Consider this when writing all code: if this goes wrong, does it matter? If not, when something goes wrong (any Throwable), log the exception and continue. You’ll kick yourself if failure of one part doesn’t matter, yet it brings down the whole program.

(4) Restart everything. You’ve only change one small piece of code, why incur the cost of restarting all Apaches and all robots? Well, software’s strange, and any change, however localized, can break any functionality. Any programmer knows this to be true. If you don’t restart everything, how will you everything still works? How will you test it?

(5) Look at the log files after releasing. Even if, after a restart of the live servers, everything seems fine, what are the users seeing? They’re testing different paths than you. If you log uncaught Exceptions, take a look at the log file before the release, and again after the restart after the release, and see if there are more errors. For example, SQL errors which weren’t there beforehand. This could alert you to a problem you’ve overlooked.

(6) Static checks are good. Programming languages such as Smalltalk and LISP popularized the notion that it’s cool to do everything, such as method lookup, at runtime. “It’s gives you more flexibility.” While this is certainly true, there are a lot of errors which you’ll then only find at runtime. (The same is true of SQL strings in program code: You will only know if you’ve misspelled a column name in the SQL when you run the particular piece of code.) This is not helpful to minimize your errors. I appreciate that taking code online which hasn’t even been run once is hardly a good idea, but I’ve seen it happen often enough.

Java and Hibernate are a good combination in this respect. If the Java program compiles then you know you’ve got all your variable names, function names, type-casts and Exception checking right. If the Hibernate program starts then you know the classes map to existing tables correctly. (But HQL, represented as strings within your program, are bad again, as you could make a spelling mistake, and it will only cause an exception when the particular code is executed.)

If one has to have SQL strings in the program, and thus an error in it will only be detected once the code path is executed, maybe a prepare of the statement can be placed in a static constructor of the class? That way at least when the class is loaded (at the start of the program’s execution, most likely) one will find out about the problem.

(7) Be aggressive about cleaning old code. The more code there is, the more complex a system is to understand. If one has a new chat system, why is code which communicates with the old chat system still there? What if that code relies on classes which you’re about to change? What if it communicates with an old chat server and the results aren’t displayed anywhere any more, and then that old chat server goes away? The motto “clean code that works” does not involve having 100k lines of old junk around, which no one understands, no one wants to take the time to learn (as it’s no longer relevant), and will break randomly.

(8) Compile all of the program. It’s obvious, before one releases a Java program, one does a “clean all” then a compile, just to check that one hasn’t changed a class and forgotten to recompile a client of it, which will result in a runtime error, e.g. a MethodNotFoundException. Why doesn’t one do the same in scripting languages? Admittedly scripting language compilers don’t check as much, but they still check some things (e.g. syntactic correctness). One “unit test” should be to go through every program file - every library, every CGI, every PHP page, and do a compile check on it.

(9) Release emails. It’s easier to delete an email which one’s not interested in, than to find out information from an email one didn’t receive and doesn’t know was ever sent. If a service like a website suddenly breaks, it’s important to fix it as soon as possible, and that probably means contacting the person who caused it to break. Before (not after) a release, write an email to all concerned - operations engineers, software developers, support agents, managers - and let them know that a change is going live.

(10) Be contactable. There’s nothing worse, for creating a perception of negativity, than when someone’s made an error, and you can’t contact them. Make sure mobiles are on loud. If you’re not reading email for some reason, make sure you’ve told everyone in advance, and set up an auto-responder. Want to be contacted less? Make fewer errors.

(11) Monitoring. For each robot and front-end program, one needs to define what acceptable conditions are and what not. E.g. what logs must be written by the correctly-running program, and which logs must not be written. Monitor them. This takes quite a lot of effort, a) the monitoring software b) defining what are acceptable and unacceptable conditions c) tuning the software to actually produce logs which are usefully monitorable. But it’s necessary. If it’s not done, there will be errors written to the logs and nobody will see them.

We don't need these users - let's move them to an "archive" table!

Thursday, March 29th, 2007

For one of the customers I currently work for, when we first designed the platform in Q1/2000, there was the "account" table, there we stored our users. There were always various pressures to move "inactive" users to a separate "archive" table. I was always against this decision.

In Q4/2005, during a period of my absence, it was decided to implement this decision. A bunch of users were to be deleted, but "not quite", in case we needed their data again. Their data was to be moved from the "account" table to an "account_archive" table.

This was really the worst decision ever made. I said that before, and now I see the consequences. I want anyone who considers such an operation good, to understand the consequences. So I list them here.

  • More and more, bosses and business people require we do operations on "all" users, which includes the "account_archive" table. This generally involves a "union" of both tables.
  • Now I have to create a real-time data interface to a slave system. This also including archived users. That means I have a "who has changed" table (input queue for the process exporting changed users to the slave system). This table references account_ids, but I can't create an FK from this table to "account", as sometimes an "account_id" references "account" and sometimes "account_archive".
  • There are classes which model a User, and this uses the "account" table as the underlying table. This enables me to build logic functions on the User class, and this has been done. However, at the time the class was built, there was only "account", so I can't use this class to model users who are stored in the "account_archive" table. (I'm not going to extend the User object to include the "account_archive" table, that will make this critical code too complex)
  • Now I have to allow users to "unsubscribe" from a newsletter, and "archive" people can receive newsletters, if they elected to receive them while they were active. Again, I can't use the User objects to do that. So I have to just program in plain SQL in an fcgi (or create a second class MaybeArchivedUser to model a user which could be in either table, and then duplicate some instance methods - that's what I chose to do).
  • It was suggested "maybe we archived the wrong users". But it's nearly impossible to re-create them as the schema is different, and some information has not been kept on in "account_archive". Their nicknames, which are unique amongst active users (but not amongst archive users) might have been reused in the meantime.

It would have been necessary to decide on one of the two courses of action:

  • We actually will never need these users again: we delete them
  • We might or do need them in the future. In which case we set a special "status" in the "account" table. They can't log in. But we can build User objects. We can re-enable them if necessary. We can even let them log in to some mini-platform where they can do a few things such as delete themselves or request their reactivation.

It has been said that removing users from account "increases performance". However:

  • It's more probable that two accounts will be read from the same disk block, after a defragmentation has occurred (did any defragmentation run? I don't think so)
  • If there are half the number of users, that's one less binary-index level. If there are 2M users that's 21 index branches. If 1M users that's 20 index branches. Hardly a big saving.
  • Although backup (and recovery) no doubt became quicker
  • This hardly makes up for the other disadvantages

Splitting a table up into two tables, for "performance", or whatever reason, is never a good thing to do. Add a status flag.

MySQL's "enum" datatype is a good thing

Friday, March 23rd, 2007

I’ve often had discussions with people about whether the “enum” type in MySQL is a good thing or not. Basically there are two ways to use your database:

  1. As an unstructured bunch of “stuff” to store whatever the software needs to persist. Such databases use lots of “blob” data with serialized objects (it’s easy to program), tables with multiple functions (”object” table with “type” column), few constraints, and so on.
  2. As a representation of the data the program is trying to model. Such databases have meaningful column names, two different types of things are two different tables. Adding constraints is easy.

There are good arguments for both. As one is programming in a programming language, whatever is most convenient for that programming language (e.g. serialize a bunch of objects to a blob) means less effort, less code: which means less cost and less bugs. Those are all good things.

But for some reason I’ve always been the 2nd type. I like to look at a database and understand what data is being modeled. It creates a certain self-documentation which can often be lacking from complicated software. Constraints can be added which acts like functional assertions (functional in the sense that they involve no state: you say that this value must be between 1 and 10 and then it is that way. You don’t have to program any “path” or state to check that).

That an item can be in exactly one a distinct set of states is a fact of life in all types of domain modeling:

  • What state is this invoice in? Is it “paid”, “open”, “paid+processed”?
  • Is this item deleted or not? (Or marked temporarily suspended, pending administrator checking the content)
  • Is this photo public, or does it belong to a user?

To explicitly work with enumerations is to acknowledge their existence within the domain, and to create a more 1:1 mapping between the domain and its implementation. That’s what all software should be about, modeling. Ideally a program is nothing more than a model and a few rules for how that model transitions between one consistent state and another.

If you program databases like #2 above, like I do, then enumerations really do make the data model richer.

  • The database maps to the domain easier (thereby documenting the domain, in case this is not done elsewhere)
  • The database maps to a domain-consistent implementation in the programming language easier
  • The database implicitly then has a constraint, as you can’t set the column to be some value which makes no sense for the domain (and therefore the program)

So enums in DBs are a good thing, and should be used wherever possible. Oracle should implement them just like MySQL does. And enums should be used in programming languages too. Why enums appeared only since Java 1.5 I don’t know. And the fact that php, perl etc do not support them, is yet another reason not to use those programming languages to try and attempt to build any software system whose function it is to model a domain.

It can be said that then it makes it “more difficult” to add a new state, as now you have to “change the database as well”. But if the model changes, and the database reflects the model, then that’s a good thing, not a bad thing. And it isn’t even much effort: If you say that invoices can be in a new state, then there’ll be a lot of programming work to support that change (UI, billing logic, robots, test scripts). The “alter table” statement is no work at all proportionally, and you’ll also be thankful for every extra constraint the database or programming language can offer you (which such a change feels very scary in a Perl program, but not so much in a Java program).

(Enums are supported in PostgreSQL since v8.3)

Concurrency control using Oracle’s “Select for Update”

Monday, March 19th, 2007

There are times when one need to prevent a certain “critical section” of code from being executed by more than one process on the same object at the same time. For example, if the requirements state that a user cannot have two subscriptions of the same type active at the same time. So to enable a subscription, if one does “a) check if user can have the subscription, b) enable the subscription”, one need to make sure that there aren’t multiple processes doing that simultaneously.

Oracle (and thus MySQL InnoDB and PostgreSQL) support row-level write locks. This means that (in contrast to other databases) if one database connection, in a transaction, writes to a row (but has not commited yet) then other connections can freely read the row: they get the version which was most recently commited, which was the version before this transaction. Only if a second connection tries to write to the same row, will the connection wait for the first connection to perform a commit or rollback.

The “select for update” statement acts like a “write” above. All rows which are returned from the select to a first connection are not being written to by any other connection, and any other connection which later attempts to write (or “select for update”) the rows will wait until the first transaction has commited or rolled back. Locking rows using “select for update” is thus an appropriate mechanism to implement locking.

As “select for update” still can only lock existing rows, one needs to decide what the appropriate row(s) are to lock for a particular operation. Remember that the rows are only locked to other connections using “select for update” or writing to the row: anyone can still read the rows. If one wishes to exclude the possibilty of a user having two subscriptions of the same type simultaneously, one cannot lock the “zero rows” the user must have in the subscriptions table beforehand, as only existing rows can be locked.

It thus makes sense to lock the “user” row representing the user (or some other table containing exactly one row per user).

Therefore, for certain critical operations, such as adding a new subscription line for a user, I shall perform the following operations:

  1. Open transaction
  2. “Select account_id from account where account_id=?”
  3. Does the user already have a subscriptions row of this type currently active?
  4. If not, create the new row for the user
  5. Commit

Note that with other database engines, simply reading the row within a transaction may be enough for it to acquire an exclusive lock. Or worse, for the entire page (Informix?) or entire table (MySQL MyISAM?) to acquire an exclusive lock.

And even if it’s just the row being locked (best case), other reads (e.g. a user just wants to view some data) will cause the reading process to block, increasing the number of processes waiting from just those needing to wait due to execution of a critical section, to all processes accessing the table.

Store currency amounts as number of cents, pence, etc.

Thursday, March 8th, 2007

The way numbers are stored in C and thus in PHP, Perl etc., and also Java, is with a binary system. So if you have 2 it's "10" in binary, 5 is "101" and so on. And the same is true for fractions. So a half is "0.1" and a quarter is "0.01". That means that just like "one third" is not exactly representable in our decimal system (0.3333) numbers like "one hundredth" which are easy to represent in decimal cannot be represented exactly in binary.

Oracle and the MySQL "numeric" type stores data as decimal. Meaning if you store "a third" to two decimal places, they get stored as "0.33". And if you try and add 3 "one thirds" together you get 0.99 not 1.00.

So those Oracle/MySQL data types, using decimal, are good for representing money, as you can exactly store "one hundredth". And adding 100 "one hundredths" gives you 1.0 exactly. However that doesn't help one much, as all programming language in common use today only support a binary floating-point representation - which can't store "one hundredth" exactly.

This is more of a philosophical issue than a piratical one. Because even if one does add 100 "one hundredths" together, one gets a result like 0.9999 and if one tries to print that to two decimal places, then rounding will take place, and "1.00" will be displayed, i.e. the right answer. So it's not really a practical problem.

However, the solution to the problem is easy to implement: just store a whole number of cents, pence, etc. So there's no reason to accept any inaccuracy when it comes to storing monetary amounts.

This was the way my Mother programmed, when she had to deal with pounds, shillings and pence in the old UK pre-decimalization monetary system: she simply stored the number of pence as an integer. If it was good enough for her, it's good enough for us now. After all, software development doesn't change that much over the generations.