Archive for the ‘Software Design’ Category

Programming with unique constraints

Friday, June 13th, 2008

If you’re using a remote database system, your application doesn’t have access to all the data at any point in time. (I.e. you just load and save the rows you’re interested in in a particular transaction). Therefore if you want to do some database-wide operation, you need to ask the database to do it.

When you want to enforce uniqueness, for example across a whole table (for example a document name needs to be unique), or across a particular part of a table (for example each user must have documents with unique names), you need to do that in the database.

There is only one acceptable way to do this with a SQL database:

  1. Insert a new row
  2. If the insert succeeds, there wasn’t a row there before, and now there is
  3. If the insert throws a unique constraint violation, the row is already there
  4. If you want to update the row (i.e. an “insert or update” operation to maintain a “lazy singleton” in the database), you can update the row with safety after the unique constraint violation, as you can be certain the row is already there.

The following methods are all not acceptable:

  • Do a “select” to find out how many rows there are. If there aren’t any, do an “insert”. However someone may have inserted a row between your “select” and your “insert”.
  • Do an “update” and if the database says that 0 rows have been updated, do an “insert”. Again, someone may have inserted a row between your “update” and the “insert”.
  • Do a “select for update” statement (Oracle, Postgres, InnoDB) to check that there aren’t any rows while creating a lock, and then do an insert. However that statement only locks the rows it returns, so if it doesn’t return any rows, it doesn’t create any locks, so you still can’t be certain that no one has inserted a row between the “select” and the “insert”.
  • Lock the whole table and do one of the above. This works, but it means that all write access is “serialized” i.e. happens after one another. Any other operation, writing something completely irrelevant, will now also have to wait until the end of your transaction, whereas it shouldn’t have. This reduces concurrency.

The way I program this is the following. On the “insert” statement, I catch the database error, and see if it’s a “unique constraint violation”-type error. If it is, I throw an (unchecked) Exception. The calling code can catch that and do something with it (or not, if the statement should not generate such an error, in which case it will propogate to the main loop like any other database exception). I have had the pleasure of introducing this to easyname.eu and now also the pleasure of introducing this to the WebTek framework.

It is extremely frustrating working with big complex frameworks, whose usage is much more complex than just writing SQL manually, and not being able to do the above properly.

  • Hibernate clearly states in its documentation that if any database error occurs, the Session (main object managing all persistence) must be destroyed as its state will be out-of-sync with the database. But there is no way other than the above to do this sort of check (as far as I know).
  • OptimalJ generates code that, if a database error occurs, sets the transaction to rollback. Including any other work you may have done.

I mean checking unique constraints is something every database application needs, so the fact it’s not supported by major frameworks is just unbelievable.

Oracle, Nulls and the empty string

Thursday, June 12th, 2008

Oracle has an unusual feature, which attracts it a lot of criticism. If you try to insert the empty string into a column marked “not null”, you get an error. The empty string is treated the same as “null” by Oracle.

This is different to programming languages (and indeed other databases, at least MySQL), which means one has to be careful not to make a mistake when using a programming language to talk to the database. That it’s different from other systems is the main reason for the criticism.

However, how many times have you wanted the following to be allowed in your data schema, for example on a “first name” column:

  • Writing nulls is not allowed
  • Writing the empty string is allowed

I just wrote a program and the front-end framework helpfully noticed that the field was “not null” in the database and gave the user an error in the front-end if the field was empty. However when I altered the code slightly, it no longer gave an error. Because the field in the program was the empty string, and not null.

However, I assert, when dealing with data, checking only for not null is not useful; you also want to check that the string contains some data in that case.

I have now updated the framework, so that if the field is marked “not null”, then the error is presented to the front-end not only if the variable in the program is “null”, but also if it is the empty string.

(Note: I am not advocating e.g. Java lose the distinction between ==null and .isEmpty(): for some reason this is a useful distinction when doing programming and data manipulation—such as null indicating that a variable isn’t initialized yet—I just don’t think it’s a useful distinction in a system solely designed to model persistent data.)

Encapsulation or public attributes: but nothing inbetween

Monday, May 19th, 2008

This post asked the question:

Whenever a class in my model contains a collection which requires that particular care be taken with its items, there’s an internal debate regarding how to expose it to other classes. And with this, there are two major schools: one, the paranoia-based approach which doesn’t allow external code to touch the collection’s internal items and two, the trusting approach which just returns the collection for everyone to deal with.

What are your thoughts on the matter? What do you use, when and why?

Definitely paranoia.

It may make certain things more difficult or require more code, but one of the key cornerstones of object-oriented programming is encapsulation and not exposing your internal data in a way that means that others can break it.

If one wants to go for the trust approach – sure it’s easier – but if easiness is the objective one can write a C program and just declare a struct. Then anyone can access the data anyway they want and there’s absolutely no code to write (not even getters and setters). But the world moved away from that model towards encapsulation as with N lines of code (or Nk LOC or NM LOC), without encapsulation, any of those can be responsible for the creation of inconsistent data.

Explicit vs Implicit data typing

Thursday, April 10th, 2008

I was reading this article about how certain data gets messed up when one imports it into Excel (certain data looks like a date and thus gets converted into one), and it reminded me of a problem I had when transferring data over an XML protocol from Perl (the SOAP library was inspecting the hex data I was transferring, but a small percentage of hex numbers look like “123e123″, which looked like a floating point number to the library)

I think both problems are actually the same problem. It can be traced back to the necessity to make exactly one of the following two decisions when creating data-processing systems:

  1. Either you try and work out what the datatype of a piece of data is by looking at e.g. the data’s string representation. E.g. this data is “abcd” so it’s a string, this data is “123″ so it’s a number.
  2. Or you explicitly store and state, external to the data, its data type. E.g. store not only that the data is “123″, but that it’s a number.

Option 1 seems attractive as it’s simpler as you only need to store one piece of data. It also feels more normalized, as one piece of data is generally better than two (e.g. what if they are inconsistent, e.g. data is “abcjzh” and type is “number”?)

But the trouble is you option 1 doesn’t work (see above.)

But it gets worse. Option 1 seems to work, yet does not actually work in all case (and you want your software to work in all cases). That’s more dangerous that if it simply and clearly didn’t work.

The authors of the SOAP library in my example presumably believed their software worked. And I believed my software, built on top of the SOAP library, worked. It worked in my unit tests and when I tested it by clicking-through the front-end. Only 0.6% of users had a code with a hex string that looked like an exponent, so it’s understandable that I just didn’t hit it when testing. But with e.g. 2M users in the database, some of my users will hit it. And that means that the software I released didn’t work (working meaning working 100% for everyone.)

But I like my software to work. The way I achieve that is to avoid errors which are difficult to detect. Making errors is human; if they are easy to catch, one can spot them and then correct them.

XML in the database

Monday, March 3rd, 2008

As I do a lot of programming using open source databases, I couldn’t help but notice that inclusion of XML in the database seems to be a hot topic. Postgres 8.1 (just released) and MySQL 5.1 (not released yet) will allow XML documents to be stored as the value of cells, and allow those documents to be manipulated and inspected. (Oracle has had this feature since 2001)

I generally don’t like XML (as Robin put it: most formats are designed to be either conveniently readable by humans, or by computer. XML is conveniently readable by neither.)

So I especially wouldn’t like to store XML in the database. However, there are some times when one would naturally want to store some hierarchical data and the relational model doesn’t really allow that conveniently. Compared to:

  • Storing e.g. Java serialized objects as a BLOB in the database (with no chance of ever doing a “select” against that data, or migrating that data using a SQL migration script);
  • Or storing direct XML without manipulation functions (slightly more readable, but still not programatically manipulatable using SQL)

I suppose using the XML features provided in the database is better than either of the above solutions.

The following table contains documentation copied from the documentation or the changelog of the database in question:

New in Postgres 8.3
(4th Feb 2008)
“XML Support: New XML data type fully supports the SQL/XML specification of ANSI SQL:2003, including well-formedness checks, type-safe operations, SQL/XML publishing and XPath queries. Version 8.3 also includes additional functions for XML data export.”
New in MySQL 5.1
(In development)
“XML functions with XPath support. ExtractValue() returns the content of a fragment of XML matching a given XPath expression. UpdateXML() replaces the element selected from a fragment of XML by an XPath expression supplied by the user with a second XML fragment (also user-supplied), and returns the modified XML. See Section 11.10, ‘XML Functions’.”
New in Oracle 9.0.1
(Released 2001)
“The introduction of Oracle XML DB and the XMLType datatype provides new techniques that facilitate the persistence of XML content in the database. These techniques include the ability to store XML documents in an XMLType column or table, or in Oracle XML DB Repository. Storing XML as an XMLType column or table makes Oracle Database aware that the content is XML. This allows the database to: Perform XML-specific validations, operations, and optimizations on the XML content; Facilitate highly efficient processing of XML content by Oracle XML DB.”
New in MS SQL Server 2005
(Released 2005)
“SQL Server 2005 introduces a native data type called XML. A user can create a table that has one or more columns of type XML in addition to relational columns. … Together with a large set of functions, embedded XQuery and data modification languages provide rich support for manipulating XML data.”

Which log levels to use when?

Tuesday, February 5th, 2008

I’m sure there are a lot of opinions in the world about which log levels to use for which errors. Log levels in the sense of if a text destined for a logfile should be prefixed with “Info”, “Warning”, “Error” etc.

There is even great debate about which log levels there should be. E.g. should there be a “Debug” level or a “Trace” level? Or both? In which case what’s the difference?

I tended to ignore those debates—for me a log statement was simply a log statement—but recently I was deploying an application, and various people had worked on it and used log levels inconsistently. Hardly a surprise as I’d never set any guidelines on how they should be used, due to aforestated ignoring. So I started to think about how I would have wanted the log levels to have been in the application I was at that moment deploying.

Firstly, what does a log level actually influence?

  • Different texts are stated in the log file. The only differences this makes to anyone is that operations teams, not too familiar with the software, tend to understandable freak out when one has lots of ERRORs in the log file; whereas they tend to freak out less with lots of INFOs.
  • You can do monitoring based on log files. E.g. one can create assertions which can be monitored, such as “INFOs may be in the log file, but ERRORs may not.”
  • You can set a log level and state that errors below a certain level will not be displayed, e.g. “ERRORs should be displayed but WARNINGs should not”. So the log level of an error may influence whether it gets logged at all.

And how would those differences influence ones decision about which level to make a particular log?

I came up with the following scheme for me:

TRACE For things that should happen, but one doesn’t want to see live. (For example SQL statement logs, etc.)
INFO For things that should happen, but one does want to log live. (For example, “writing file to /x/y.pdf…”) It’s important to state IDs and paths in log files, for if something goes wrong, one needs to be able to work out exactly what the robot did with what rows and objects. And if the robot crashes, a log, written before an action is done, such as “Writing PDF file…” can help to identify what part of the process caused the error.
WARNING   For things that shouldn’t happen, but where the user sees the correct result. For example files that should always exist, but if they don’t, they can be automatically recreated. If those happen live, one might want to look at them. But as long as the user sees the right thing, one can look at them in ones own time.
ERROR For things that shouldn’t happen, and where the user sees an incorrect result. Including “fatal” errors.

And in that case, one would monitor ERRORs (or maybe WARNINGs if one was otherwise bored), and would set the log level to TRACE on all test servers and INFO on all live servers.

The Internet: It’ll get slower before it gets faster

Thursday, January 31st, 2008

For the 3 weeks I was in the UK recently I used a UMTS modem (i.e. like a 3G phone) to surf the web and do all my work. Going round to my friend Robin’s house, who also works in IT, he does all his surfing through a cable from his phone to his computer: i.e. also UMTS.

At least in the UK, this is extremely popular. Also in Asia it makes a lot of sense; they have excellent high-speed mobile phone networks there and all ones preconceptions about the Asians having the latest handset devices: I can confirm first-hand that they’re all true.

As we all know and have been experiencing since about 2000, more and more phones are going to get more powerful and have larger screens. Full browsers will (and do) run on them. They will also be UMTS devices.

And for those people who don’t surf via UMTS, nearly everyone I know surfs at home using WLAN. A lot of offices use WLAN too. And obviously all the surfing at airports, coffee houses, hotels, conferences etc. all goes on via WLAN.

UMTS and WLAN have high bandwidth, but they have extremely high latency compared with a cable connection. That means that although the bytes flow fast once they’ve started, it takes a long time for the first byte to arrive.

I am quite proud of the fact that when I designed the “Uboot Joe” software (Windows software which ran on the user’s PC, sat on the notification area by the clock, and communicated with Uboot) I took this into account. Every action you do with the Joe is at most one client-server round trip. For example to view all the thumbnails in a folder, there is a single request from Joe to the server like “get all data in folder_id” and the return structure is a) information about the folder, b) information on all the photos within the folder, and c) all the binary JPEG data for the thumbnails of all those images. You can try using the Uboot Joe on a UMTS link, and it works faster than any website.

Contrast this design with HTML. The first response from the server contains <img src=xx> tags and only once that has been received can the browser make the further requests necessary to retrieve the images. If the first bytes of every response take a long time to arrive, then the user experiences that “long time” twice before they see the data they requested; first to get the HTML page then second to get the images.

In fact it’s worse. If a page has 50 embedded images, it doesn’t open up 50 concurrent connections to the server (for good reason). Instead it opens e.g. 4 connections. Which means that e.g. image number 5 has to wait for the “long time” of fetching image 1 to complete. (Some sites try to get around this by having lots of servers with different names e.g. img341.domain.com and distributing the images over these servers.)

And it’s even worse than that. Even if the application only does one round-trip to the server, the underlying protocols might do more round-trips, for example firstly to contact the DNS server to get the IP address for the domain name used in the URL; and then secondly to request the data from the server.

In addition to this being a problem with UMTS and WLAN, one also has to take into account that the Internet is global. When I’m in Macau accessing European servers I get a round trip of about 300ms. So if one adds three “long times” to an otherwise extremely fast request—easily done—one has added a whole second on to the time the user has to wait. And Jakob Nielsen says that after 1 second in total, users start to lose focus on what they’re doing.

So to design applications in this age, one needs to be aware of the number of serial server round-trips (i.e. the number of times you need to ask the server for something, and only once it’s been delivered, must you ask the server for something else).

For example:

  1. An HTML page which contains an external CSS file, and this CSS file contains URLs to images.
  2. Pages with many images. The browser only requests a few files from the same server at once, so again the response to image number 1 must be finished before the request for image number 3 can begin.
  3. Javascript software which does multiple serial calls to the server, e.g. “get session token for username/password” then “get info to display on page for session token”.
  4. A form which submits data to a piece of software. The software does something but instead of returning a result page, returns a redirection command to a “real” result page. Often done to allow one to hit “refresh” safely on the result page, or make the URL of the result page look nicer.

GWT is excellent in this regard. It has the ability to download lots of those small icon-size images in one request (it makes one big image on the server and chops them up again on the client) and it makes you explicitly aware of the number of server round trips by forcing you to define interfaces for client-server interactions – as opposed to some automated scheme where you write code and the framework decides when to insert client-server round-trips. (Wicket makes client-server round trips easy with AjaxLink; my fear is it might be too easy, and one might do them too often, and lose the overview of how many are happening).

Pre-caching is a good idea too. E.g. if you are a photo viewer application, with a photo shown full screen with a “next” button, it makes sense to load the image on the “next” page even before the user’s clicked on it. That download won’t interfere with the rest of the activity the user is doing, as the bandwidth is not the bottleneck, just the time between starting the download and the bytes starting to arrive at the client. (Although one can’t download too much without the user noticing, as some people pay per MB!)

But the most important point, I think, is: these days, one must test ones web applications on a high latency connection. Generally speaking, historically I have tended to develop locally (everything installed on my laptop), or I develop in an office with a network cable and high-speed Internet and a link to the data center where the test server sits—and the office is in the same country as the server. Maybe this sounds strange, but I think one should develop web applications while using a UMTS card.

Web software front-end test cases

Wednesday, January 16th, 2008

Recently I was working on a website which was developed in PHP without a web framework. A lot of things were programmed manually which would normally be taken care of by a web framework (for example things like: in case of an error, the HTML fields on the form are re-populated on the response page).

So I came up with an extensive set of test cases, and made sure I used them all on every field on every page.

These are the tests, in no particular order.

  1. On a form with radio buttons and text fields next to the radio buttons, clicking on the radio button should position the text cursor in the text field (as typing there is the next thing you’re always going to want to do).
  2. Clicking on the text field by a radio button should select the radio button.
  3. Checkboxes and radio buttons which have text beside them must have the text in a <label> so that clicking the text selects the checkbox or radio button.
  4. On a <form>, pressing return must do the same as clicking “OK”.
  5. Pressing the TAB key on the keyboard should progress from one field to the next in a reasonable order.
  6. Every action where something is done (e.g. delete an FTP account) should have a confirmation text on the result page like “ftp account XYZ deleted”. I think it’s important to include the name of the object being acted upon in the message.
  7. All forms should use “post” and “get” appropriately. I mean various people have various strong views about when to use one and when to use the other. But for me the difference is the browser’s “are you sure you want to repost?” message when you click refresh. Do you want it? If so use “post”, otherwise use “get”. Also bookmarkability.
  8. Is there a reasonably small amount of HTML generated? E.g. in uboot to display the address book page requires about 1MB of HTML to be downloaded to the browser (not including any external CSS, graphics etc.) That’s too much.
  9. The back button should work everywhere. (E.g. Uboot multimedia gallery: click on a picture, click back, you’re at page 1 of the gallery rather than the page you were on.)
  10. Strings to long? Then “…” should be displayed, to avoid breaking the layout.
  11. If breadcrumbs are used, (i.e. main page > hosting > ftp accounts > ftp account ‘x’) then they must work. Ideally they should contain data such “ftp account ‘x’”.
  12. While loading a page on a slow connection, is the text readable before the background image loads? E.g. white text on a black background image may not be readable before the image loads. There should be an alternative flat-colour background of a similar colour. (Thanks Helge for pointing this out a few years ago!)
  13. While loading the page, does the page move around a lot due to width=x height=x attributes missing on an <img> tag? It’s annoying when you start to read the text on the page and then it moves a few seconds later just because some logo has finally loaded.
  14. Every time there is an error in the form and the page is reloaded with the error: is the original data still in the form?
  15. In all places where the user may enter free text: Do weird non-Latin1 characters work correctly, both entering them and displaying them? Do ” ‘ work correctly? Are < > & displayed correctly? Is all of this written to the database correctly? (Sometimes the browser sends &#123; style code to the server. If this isn’t escaped on the display code then the character may appear to have been processed correctly. But you don’t want &#123; strings in your database data.)
  16. Type in long values into text fields. Values should be neither truncated nor should an internal error be produced (default behaviour if you’re using MySQL or Oracle respectively)
  17. If there are any id=nn type parameters in the URL, then adding or subtracting 1 from the URL should not allow you to view other people’s content.
  18. Is the <title> tag set usefully? This is necessary when you use the down-arrow by the “back” button on the browser, to determine how far back you want to go. A whole lot of options such as “MyWebsite”, “MyWebsite”, “MyWebsite” is not very helpful.
  19. Test in a high-latency environment. Such as over a UMTS connection. If there are a lot of redirects, or images referenced from CSS referenced from HTML, or Javascript making AJAX calls, processing the result then making more calls, then the page will be slow, but work fine over a LAN connection.
  20. Test in an unreliable environment, e.g. where packets get lost. Google Spreadsheets, when you type in a value, lets you continue editing the page while it’s sending the value to the server. But if there’s an error or timeout with the sending you see an error, and the cell is reverted to its previous value. You simply have to type in the same data all over again. Instead of that it should remember your data, and display an warning “can’t connect to server right now; retrying…”
  21. If you type in URLs in capitals or mixed case: do they still work? (Thanks again Helge!)
  22. AJAX progress indicators: Is it the case that you’ve written an AJAX site, and when the user clicks an action, absolutely no visual feedback is given to the user that he’s clicked? You need to have some kind of feedback, e.g. the “loading…” of gmail.
  23. Are the buttons large enough to be easily clicked on? E.g. confirmation page with “Yes”, “No” options displayed as links in a tiny font. They’re hard to click!
  24. Do the colours work even if you view your laptop at a weird angle? A site I was using recently used a white background (what a surprise..) and to highlight the tool you had selected used a light-grey background. Worked fine on their monitors I’m sure, but at the airport with your laptop on your lap, they’re difficult to see.
  25. Does the site work, or at least fail gracefully on old browsers? An error message immediately is preferable to allowing the user to type in lots of text then lose it when the user presses “OK”, due to some browser incompatibility issue (e.g. MediaWiki on Safari 3 beta for Windows)
  26. What about small screens? My parents computer uses 800×600 and I use my Laptop in 1064×600 normally. In Google Reader I can hardly see any feeds at that size. The whole screen is taken up with toolbars, menus, etc.
  27. Is the session timeout compatible with the company’s policy? E.g. do you really need the user to log in again after 10 minutes, i.e. when he just had to nip off for a meeting?
  28. If the session expires, what happens to the user’s data? Composing a long email and clicking “send” only to receive the response page “please log in again!”, and losing the email, is wrong.
  29. If there is a possibility to log in on every screen (e.g. “logged out” at the top-right of the screen, or like on uboot), then logging in should take you back to the screen where you were. Because that’s what the user would want.
  30. If you are logged out and go to a particular screen e.g. via URL or bookmark sent while a user was logged in, do you get useful information? A redirect to the homepage is also OK, but “general error” is not.
  31. If you go to a page with a form, is the text cursor already in the first field? Or do you have to reach for the mouse and click on the first field in order to use the form on the page?
  32. Does the site look good on both LCD monitors and conventional monitors? Some colour combinations (e.g. dark green on a light green background) are perfectly readable and look nice on LCDs, but are completely unreadable on conventional monitors.
  33. If a browser window is open, and a user is logged in, and from another browser (or directly in the database) that user’s password is changed, the user deleted, or disabled, is the first browser immediately logged out?

Unit testing and configuration files

Sunday, September 9th, 2007

I used to think of a function as something which would convert some input value into some output value (potentially with some side-effects). And thus unit testing a function would involve passing particular inputs into the function and checking that the results were as expected (potentially setting up some database rows or something to test that the side-effects were executed properly).

But sometimes a function relies on a particular piece of global configuration. That’s an input to the function too. For example the tax rate.

public int calculateVat(int cents) {
    double vat = config.getDouble("vatRate");
    return (int) Math.round(cents * vat);
}

Initially I would just test the function with the current settings of the config file.

// VAT is currently 20% in Austria
assertEquals(20, obj.calculateVat(100));

However that’s obviously not a great solution as that will break when the config file changes. And after all, configuration files are there to extract the things that likely will change from the otherwise often very long but hopefully reasonably static domain logic.

So the solution I use now is to extend such configuration accessing classes with methods such as “setValueForTesting”. The “forTesting” part of the name indicates clearly its purpose is for test programs only.

config.setDoubleForTesting("vatRate", 0.2);
assertEquals(20, obj.calculateVat(100));

That code feels much better. There are actually two advantages:

  1. Obviously the test code will not break if the config file changes.
  2. But also there is more locality. Everything you need to understand about that test is there in the test program’s source file, in two easy-to-read lines.

But this approach feels somewhat unorthodox. How do other people do it?

3-dimensional photo organization

Monday, September 3rd, 2007

I have just viewed some photos on Facebook. They were of a friend's trip to Malaysia.

  1. Facebook has a limit of 60 photos per album; meaning you have to split photos up into albums with names like "Malaysia 1", "Malaysia 2" etc if you want to upload more than 60 photos in total.
  2. Each album, as is current practice in web design, is divided into pages with "page next" buttons to get to the next page.
  3. Each page of each album, as was introduced with windowing systems, has a scroll bar (vertical only, unless one makes the window really small)

OK now fundamentally a set of photos from a holiday are one-dimensional. I can think of many ways to lay out photos but I'm sure these three dimensions would not be the dimensions I would choose.

The scroll bar is quite a good device. It was well thought through. It was specifically developed to solve the problem of "you have more data than can fit on the screen". You can move slowly up or down using the arrows at the end which are deliberately easy to understand even for novices unfamiliar with windowing systems. You can see how far down the available data you are. You can drag the bar with your hand/mouse to move either fast or slow in a natural motion.

I have heard that some web novices find "next page" easier to use than using the scroll bar. But this wouldn't be the case if there were no "next page" links. And knowing how to use scroll bars is non-optional, if you want to use any other system other than photo browsing websites. For example when using the compose interface of an email website, there is no "next page" button once you've typed text equal in length to the size of the window the user interface designers assume you are using.

Scroll bars are so much better than "next page" links, and even if they weren't, displaying 1-dimensional data using 1 data navigation tool is better than displaying 1-dimensional data using 3 different navigation tools.