Archive for the ‘Broken’ Category

Too much data for the browser window? Solution: iframe

Wednesday, April 16th, 2008

Recently I was asked to put a webpage online which contained some navigation and some content. There was more content than could fit the browser on an average screen. So the content looked basically like this:

But then I was sent a new version which was cleverer:

  • The content was stored in an iframe
  • There was Javascript which dynamically altered the size of the iframe to be the size of the monitor resolution (i.e. independent of current browser size)

The result of this complexity was that the iframe nearly—but not quite—fitted into the browser’s content area (assuming you had your browser maximised).

Because the iframe didn’t sit quite at the top of the window, this meant:

  • The bottom “scroll down” arrow button of the inner iframe (the one you need to actually see more content) was, by default, off the bottom of the screen
  • As there was content beyond the bottom of the screen (the “scroll down” button of the iframe) the main browser window also displayed a scroll bar
  • To scroll to see more data, you need the scrollbar of the inner iframe. But instinctively one reaches for the right-most scrollbar, as that’s normally what you need to see more data, e.g. in a browser, Word document, etc.
  • It only got worse if you don’t use the browser maximised.

I mean I don’t know what aspect of “the browser will display a scroll bar if there’s more data than will fit in the window, without needing Javascript and iframes” they didn’t understand.

I suppose the navigation will always be shown, but I think people are used to navigation scrolling with the content these days. And if the objective was to keep the navigation on the screen, one could have used a normal frame, which would have required no Javascript, and still resulted in only one scrollbar, in the place where the user expects it.

The whole “scrollbar within a scrollbar” concept (which unfortunately is pretty much mandatory if you have a text area within a webpage) is really so nasty.

Programming Languages: Is newer always better?

Wednesday, March 26th, 2008

I constantly hear the belief that modern programming languages and environments are better than older programming languages. More productive, easier to use, and so on. It would stand to reason: nobody would make a new programming language with worse features than an already existing programming language. Or would they?

Everyone seems to think that this is fact. But surprisingly it’s not. There are many features in older programming languages which are not present in today’s languages. I predict these features will be re-invented by the next generation of programming languages authors, and everyone will think they are geniuses for having come up with these ideas. But at the same time those new languages will omit most of the good points of today’s languages. This cycle can go on forever.

It’s like the cycle that tends to take place between “the network” and “the standalone computer”.

  • Central - IBM used to make mainframe computers, which one would access from terminals, i.e. central computing power, distributed usage.
  • Local - But those computers were slow because they were remote. Then e.g. Sun invented the “workstation”. The PC then followed. Local power to everyone.
  • Central - Then the web happened. Suddenly everything was remote again. “All you need is a browser!”. No local software installation nightmare. (Perhaps) independence from the single operating system vendor.
  • Local - And now “using the web offline” is back in fashion. So that’ll be local computing again then.

A few facts, for those who think there was no programming before Javascript and the web:

  • 1957 - Fortran released: expressions, variables, loops, subroutines
  • 1959 - LISP released: treating functions as data, enabling higher-order programming
  • 1967 - Simula 67 released: Object-oriented programming

Consider the following:

  • Variable Bounds. Ada, developed for the American military with a high emphasis on program correctness, allows one to define bounds on variables. For example “array with index between 1 and 100” or “0 and 10” or number “not more than 5”. Most variables, in reality, have allowed ranges. Why not express that in the program? It’s more self-documenting, and it allows the run-time, and to an extent the compiler, to check the constraints. Isn’t minimization of bugs something that affects not just the military?
  • Strict typing. If you know an object being passed to a function is a “User”, it’s no good being passed an “Email Address”. The sets of operations those objects can perform are completely different, so even if the programming language is “advanced” enough to be able to accept the parameter, the first method call to the object will fail. Why not express that and let the compiler check it? C++ can do it (since 1983), so let’s use that, not Perl, which can’t. Recently I read an article making a joke about casting everything to a string, but in reality that’s the default behaviour (in fact the only behaviour) of all scripting languages.
  • Knowing what’s going on. In C, it’s well defined what “0” means or what the string “abc” in a program means, and so on. Ask a C programmer if 0==NULL and ask a PHP programmer if 0==null, and see a) their reaction times and b) whether they’re correct. The C programmer will answer fast and be correct; the PHP programmer will not. Who do you think writes programs with fewer subtle bugs?
  • Enumerated types. Is a user “active”, “disabled”, “inactive”? Having such options is common to all domains. C has been able to define an enumerated type since ANSI C (1989) and Lisp since 1959. Why did Java have to wait until Java 5.0 (in 2004), and why do we have to create unreadable programs with languages like Ruby which can’t do them at all? For example, what does the function error_log("user not found", 2) do in PHP? What does the 2 mean?
  • No compiler. Every byte in an interpreted language costs time to interpret. So it makes sense to have short variable names, fewer comments, for run-time efficiency. Is this the sort of programming style one should be encouraging?
  • No linker. You can build big libraries in a linked language, and only those functions used by the program (or used by the functions used by the program) will be included in the final executable. In Java, PHP etc, all the code you use is available all the time, taking up memory. I am often criticized for writing “too many libraries”, or code being “too object-oriented” in scripting languages, which is a fair criticism, as that code will run slower. However is it really an improvement to remove this function-pruning feature, which means bad programming practices will produce more efficient code?
  • Multiple compile errors. Why do modern programming languages such as PHP only tell you the first error in your program, then abort? This is laziness on the part of the compiler writer. Old compilers tell you all the errors in your program, so you can correct them all, without having to correct one, retry, correct next one, retry, and so on.
  • Formatted strings. There is nothing wrong with the format concept behind C’s “sprintf” command, originating from 1972. You can print numbers, strings, specify precision, field length and so on. (Apart from the inability to reorder parameters.) Why did C++ introduce the “<<" notation? (At least you can still use printf in C++). Why is this re-invented, worse, in .net? Why did Java have to wait until Java 5.0 to get this feature? Why do we have to reinvent the wheel (worse) all the time?
  • Auto-creation of variables. When programming languages like C were created, the authors made the decision that it was an error to use a variable without declaring it. This caught all sorts of errors such as misspellings of variables. Why have these decisions been forgotten, and every scripting language allows you to just use variables without declaring them? This means hours of searching for bugs when you simply misspell a variable name, something that’s going to happen to everyone at some point. We’re only human and we have to take that into account.

The above is a list of things that have got worse over the last two decades, i.e. they haven’t merely failed to improve by staying the same; they have actually got worse.
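To make two of the points above concrete, here is a minimal sketch of “enumerated types” and “variable bounds”. It is written in Python, which is not one of the languages discussed above, and the names UserStatus, BoundedInt and can_log_in are made up for illustration. Note that Python only checks the bound at run time, and the type hints are not enforced by the interpreter itself, so this shows the idea rather than the compile-time checking that Ada or C++ would give you.

```python
from enum import Enum


class UserStatus(Enum):
    """The three user states mentioned in the list (hypothetical type name)."""
    ACTIVE = "active"
    DISABLED = "disabled"
    INACTIVE = "inactive"


class BoundedInt:
    """An integer that must lie within low..high; checked at run time only."""

    def __init__(self, value: int, low: int, high: int) -> None:
        if not low <= value <= high:
            raise ValueError(f"{value} is outside the range {low}..{high}")
        self.value = value
        self.low = low
        self.high = high


def can_log_in(status: UserStatus) -> bool:
    # The call site reads can_log_in(UserStatus.ACTIVE) rather than
    # can_log_in(2), so nobody has to guess what the argument means.
    return status is UserStatus.ACTIVE
```

A call such as BoundedInt(11, 0, 10) raises a ValueError immediately, at the point where the bad value is created, instead of causing a subtle bug somewhere else later.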

Two great comments about Windows

Monday, March 10th, 2008

Two great comments from people about their experiences with Windows. I can really sympathize with these people!

From http://blog.seattlepi.nwsource.com/microsoft/archives/132891.asp#102626

Yesterday, it started an update I didn’t even want to initiate on shut-down so I had to walk to my car & drive off with this stupid laptop running in my hand to get to work. Is MS trying to get me fired?

From http://blog.seattlepi.nwsource.com/microsoft/archives/132891.asp#102708

I’ve come very close to picking up my Vista computer and throwing it out the window on several occasions. I got so tired of it constantly running something in the background and me not being able to stop it, that I punched the machine the other day. Yes, I know, not terribly intelligent, but man does Vista frustrate me. It hangs all the time, FOR ABSOLUTELY NO REASON. I could go on, let’s just say that when using Vista you feel like MS just didn’t really care whether or not it works.

Monitor troubles

Wednesday, February 27th, 2008

The monitor I am using at work suddenly went white. I.e. every pixel went white; not black, and not blue. A white screen of death, as it were.

I was in the middle of programming an algorithm requiring some degree of thought, so I wasn’t really up for being interrupted by unreliable technology.

The monitor is a 22″ wide-screen flat monitor. My initial suspicion obviously lay with Windows.

For some reason, some instinct told me to turn the monitor off and on. I pressed the on/off button but amazingly nothing happened, and the “on” light stayed on. The on/off button is hidden in some inconvenient place, and one can’t see it unless one gets up and peers round the side of the monitor. It has no real tactile feedback when pressed, so initially I assumed, when the button did nothing, that I’d pressed it wrongly.

But getting up and looking at the button and pressing it a few more times confirmed that it was in fact doing nothing.

Again, some instinct told me to hold the button down for 5 seconds. Amazingly, that did turn the monitor off.

I had managed to crash my monitor!

To quote my friend Robin:

As life goes on, one owns more and more devices which need rebooting.

Copy/paste between Excel and MSN

Wednesday, January 30th, 2008

Not even the simplest things work with computers these days.

I have an Excel sheet and I want to copy a value into an MSN conversation. On Windows. Notice the vendor of all these products.

  1. I copy the cell and, in the middle of the sentence I’m typing into MSN, I press Ctrl-V. MSN hangs for about 5 seconds. Then a notification is sent to the other party that I want to transfer a file, something .tmp.gif, i.e. a bitmap image of the value I’m trying to paste into my sentence. Of course the other party hasn’t seen the first part of my sentence yet, as it’s still in the message composer and I haven’t pressed Return yet, so this file transfer request would come as a bit of a surprise to the other party.
  2. No problem - I can double-click the cell in Excel to edit it, and copy the characters from the cell, as opposed to copying the cell itself. However, the cell is a formula. That means that when I edit the cell I get text such as =D1+E9 as opposed to the numeric result value which I wanted to paste into the MSN conversation.

So what is the solution? As far as I can see I have to have both windows open side-by-side on the screen, and type the value that Excel is displaying into the MSN window….

Random unreproducible Java error of the day

Monday, January 21st, 2008

I mean I’m really kind of of the opinion that Java Servlets, at least when using Tomcat and the other open source tools, don’t work. I mean surely it can’t be difficult to implement a Servlet container or logging framework!

I just tried to start Tomcat and it refused to start because of the following error:

log4j:ERROR Error occured while converting date.
java.lang.NullPointerException
  at java.lang.System.arraycopy(Native Method)
  at java.lang.AbstractStringBuilder.getChars
  at java.lang.StringBuffer.getChars
  at org.apache.log4j.helpers.ISO8601DateFormat.format
  at java.text.DateFormat.format
  ...
  at org.apache.log4j.Category.log
  at org.apache.commons.logging.impl.Log4JLogger.error
  ...
  at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt
  at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run
  at java.lang.Thread.run
log4j:ERROR Error occured while converting date.

So I just hit “start” again, and this time it starts without error.

And people trust their mission-critical server architecture to this stuff!

New Year, New Blog

Tuesday, January 8th, 2008

I shall be blogging here henceforth. I have moved all my old articles over from my previous uboot.com blog.

There were multiple reasons for the move:

  • It’s important to use the software you write, to experience its successes and limitations. I am a contractor for uboot and have been using their blog; however I am also a contractor for easyname and in December we took the hosting features online we’d been developing in 2006. I’m glad to report they work just great!
  • I wanted more control over the design. The text was small at the uboot blog and didn’t invite reading.
  • I have discovered hierarchical categories! So one can view just my software design posts for example. They are cool. Did uboot have them? If so, I never found them.
  • I wanted a facility for seeing new comments without checking all posts over all pages, and comparing the current number of comments on that post with my memory of the previous number of comments.
  • Trackback wasn’t fully working with uboot. Although I suppose in the time it took me to set up my own blog, I could have repaired the uboot feature!

It wasn’t as easy as I had hoped to set up this blog. I imagined just FTPing over a Wordpress installation and that would be it. While I don’t want to sound ungrateful to the open-source programmers who made both the blogging software and the hosting software possible, there are a sufficient number of small problems - both in implementation and in architecture - with the internet, web servers, web protocols, and in every piece of software, as to make the seemingly simple process of installing some blogging software annoying and time-consuming. I am writing this at the end of the second day of full-time work creating this blog.

  • My plan to import my own data was to import the RSS feed from the old Blog. However, that RSS import software had three separate bugs. I have fixed these problems now in the source, and will submit them in due course.
    • The RSS importer didn’t use an XML parser, but instead regular expressions. Thus it required an <item> tag to look exactly like “<item>”; whereas the uboot RSS feed includes attributes of the item tag, i.e. <item x="y">. So it didn’t match and simply reported that it had “successfully” imported 0 posts.
    • Newlines in the HTML content were turned into <br> tags. My HTML content had a lot of newlines (that’s how the gmail WYSIWYG editor produces the content). These are ignored by the browser, so shouldn’t be turned into <br>s which are not ignored by the browser. I solved the problem by replacing all newlines in the HTML with spaces, before the <br> conversion happened.
    • HTML escaping was being performed needlessly on the article titles. The titles were already in HTML in the RSS file. So “&quot;” text was introduced into the user-visible titles. I know the RSS feed is correct in this regard as it renders correctly on Google Reader, Bloglines, etc. I have removed this conversion.
  • Alas I lacked sufficient knowledge of CSS so getting the style correct was a great pain. Yet I didn’t have particularly ambitious style requirements, as any viewer of the new blog can confirm.
    • One problem that took ages was the removal of some two-coloured vertical lines. Were they images? Clever CSS borders? I couldn’t find the border commands in the CSS file but also couldn’t find any image commands. Nor were they referenced from the HTML file. Finally I checked the images directory and found an image; then full-text searched all files. I found the image referenced in a <style> element at the <head> of one of the HTML files.
    • Various IE7 problems. I even had to insert a “if browser=IE” Javascript in one place.
  • All embedded image references and intra-blog links had to be changed. (I couldn’t even just leave them pointing to the old blog, as they were relative links, i.e. <img src="/x/y.jpg"> so didn’t work after the new content had been imported.)
  • I created a “.htaccess” file to password-protect the website while it was under development. Later I deleted the file. However Wordpress had written some rules into the file (without it being obvious to me) so that URLs like “/post-name” would be mapped to the correct PHP files. So after I deleted the “.htaccess” file to give everyone access, the blog no longer worked (it took me some time before I discovered that, as the URL “/” still worked; so it was not obvious which action had led to the pages stopping working).
    • Let us not forget that the syntax of “.htaccess” and “.htpasswd” files is far from obvious in the first place (But thankfully my hoster has a tool to write these files - actually I wrote that piece of software!)
  • I tested the RSS feeds from the new blog in Google Reader just at the moment when the .htaccess file was broken. Thus, Google Reader cached a non-working version of the page with 0 posts. And as Google Reader shares that cache between its users, I knew that anyone trying to subscribe to the feed would see the same thing. It’s fixed now though (by time).
  • I’m sure there were more problems but I can’t remember. I should have written them down as I was working; after all, the probability of me not writing a blog entry about the difficulties of installing the new blog software was clearly not particularly high.
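The first importer bug in the list above (matching the literal text “<item>” instead of parsing the XML) can be sketched as follows. This is not the importer’s actual code, which was PHP; it is a small Python illustration with a made-up two-item feed, and the function names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# A made-up feed whose <item> tags carry an attribute, as the uboot feed did.
FEED = """<rss version="2.0"><channel>
<item x="y"><title>First post</title></item>
<item x="y"><title>Second post</title></item>
</channel></rss>"""


def count_items_with_regex(feed: str) -> int:
    # Matches only the exact text "<item>", so any attribute defeats it.
    return len(re.findall(r"<item>", feed))


def count_items_with_parser(feed: str) -> int:
    # An XML parser finds the element whatever attributes it carries.
    root = ET.fromstring(feed)
    return len(root.findall("./channel/item"))
```

With this feed the regular-expression version finds no items at all, while the parser version finds both, which is the “successfully imported 0 posts” behaviour in miniature.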

So essentially had I not been deeply familiar with PHP, HTML, Javascript, .htaccess files, FTP, XML and (to an extent) CSS, I would not have made it. This is not something I would recommend for “Aunt Tillie”.

Anyway, now it’s done, and as I now maintain this software in contrast to before, I look forward to also having to fix it when it breaks randomly in the future (as inevitably software always does).

3-dimensional photo organization

Monday, September 3rd, 2007

I have just viewed some photos on Facebook. They were of a friend's trip to Malaysia.

  1. Facebook has a limit of 60 photos per album; meaning you have to split photos up into albums with names like "Malaysia 1", "Malaysia 2" etc if you want to upload more than 60 photos in total.
  2. Each album, as is current practice in web design, is divided into pages with "page next" buttons to get to the next page.
  3. Each page of each album, as was introduced with windowing systems, has a scroll bar (vertical only, unless one makes the window really small)

OK now fundamentally a set of photos from a holiday are one-dimensional. I can think of many ways to lay out photos but I'm sure these three dimensions would not be the dimensions I would choose.

The scroll bar is quite a good device. It was well thought through. It was specifically developed to solve the problem of "you have more data than can fit on the screen". You can move slowly up or down using the arrows at the end which are deliberately easy to understand even for novices unfamiliar with windowing systems. You can see how far down the available data you are. You can drag the bar with your hand/mouse to move either fast or slow in a natural motion.

I have heard that some web novices find "next page" easier to use than using the scroll bar. But this wouldn't be the case if there were no "next page" links. And knowing how to use scroll bars is non-optional, if you want to use any system other than photo browsing websites. For example when using the compose interface of an email website, there is no "next page" button once you've typed text equal in length to the size of the window the user interface designers assume you are using.

Scroll bars are so much better than "next page" links, and even if they weren't, displaying 1-dimensional data using 1 data navigation tool is better than displaying 1-dimensional data using 3 different navigation tools.

Paper jams

Friday, June 15th, 2007

Why does my printer always assert it has a paper jam? Why do other (personal) printers actually have paper jams the entire time?

Most cheap lasers, and now cheap inkjets (the one I have at home in Macau) seem not to be able to handle paper correctly. More expensive lasers (like at offices) and more expensive inkjets (the one I have at home in Vienna) seem not to have this problem.

In fact with the ink jet printers, I must observe that the printers are from the same manufacturer and are essentially the same printer (this was not by accident). The difference is that the design isn't as nice on the cheap one, it feels cheaper when you open the lid, and it has a single digit LCD display, whereas the expensive one has a colour pixel LCD display which has error messages in a language of the user's choice. But the print quality is the same (according to the specifications and in reality). And the software one installs on one's PC is the same.

The paper jam isn't even really a paper jam. After printing about 1 or 2 sheets, it claims to have a paper jam (although everything is physically fine), and instructs you to press the "ok" button. Once the "ok" button is pressed, it continues printing. I mean this paper jam is essentially a software paper jam:

  If product = cheap Then
    If (rand mod 2) = 0 Then
      Call paper_jam
    End If
  End If

I used to have a dot-matrix printer with a tractor feed. I could buy A4 paper with holes on the side. I could print A4 from the software. After it had printed I could separate the pages and tear off the holes left and right and there would be perfect pages of A4. It never had paper jams (how could it?)

Surely tractor feeds are a better solution? Why has the world chosen to have paper jams instead?

gettext is so broken

Friday, May 18th, 2007

Working on a PHP project recently, there was the requirement for text localization. The standard way to do this in PHP is to use the standard way to do this in C, which is gettext.

I’ve worked with various translation systems, including one I built myself for uboot, involving a hierarchy of languages going from most specific to most “international”, and with each string having a hierarchical id such as “myprogram.errors.disk-full”.

Java Properties files are simple but also work well (simplicity being a positive thing in this case). The lines are key-value pairs, and using a convention such as “myprogram.errors.disk-full” the key is almost as good as if it actually were a key hierarchy. The file is in Latin1 but Unicode characters can be used via an escape syntax, and there are many editors where one can just type Unicode text and which take care of the escaping.
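As a rough illustration of why this format is so easy to work with, here is a minimal Python sketch that reads “key=value” lines, decodes the \uXXXX escapes, and treats the dotted keys as a hierarchy. It handles only a small subset of the real Java format (no line continuations, no “:” separators), and the sample entries and function names are made up for illustration.

```python
import re
from typing import Dict, List

# Hypothetical entries; only the dotted key convention comes from the text.
SAMPLE = r"""
# a comment line
myprogram.errors.disk-full=Disk full
myprogram.errors.not-found=File not found
myprogram.menu.quit=Schlie\u00dfen
"""


def load_properties(text: str) -> Dict[str, str]:
    """Parse 'key=value' lines into a dict (small subset of the format)."""
    entries: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(("#", "!")):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        # Turn each \uXXXX escape into the character it stands for.
        value = re.sub(r"\\u([0-9a-fA-F]{4})",
                       lambda m: chr(int(m.group(1), 16)),
                       value.strip())
        entries[key.strip()] = value
    return entries


def keys_under(entries: Dict[str, str], prefix: str) -> List[str]:
    """Treat dotted keys as a hierarchy: list every key below a prefix."""
    return sorted(k for k in entries if k.startswith(prefix + "."))
```

Calling keys_under(load_properties(SAMPLE), "myprogram.errors") returns just the two error keys, which is what I mean by the convention being almost as good as a real key hierarchy.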

So I was looking forward to using gettext. This format was created by GNU, the creators of GCC (a highly respected program). gettext is itself well respected and authors of systems such as PHP have chosen it as their localization system.

But alas, it is broken in so many ways.

(1) The file format. Whereas Java’s file format is to have lines such as “key=value”, gettext’s “.po” format (where did that extension come from?) has two lines for every string, like

msgid "key"
msgstr "value"

As one inevitably places a blank line between one key-value pair and the next, the file is immediately 3 times as long as a Java properties file storing the same information. And what if you want to have double-quotes within your string?

(2) Compilation (for performance reasons). I work with scripting languages, where there is no compiler. This can be a good or a bad thing; but independent of that, it is a fact. However the editable “.po” files of gettext have to be converted into binary “.mo” files before they work. Thus I have to introduce a compilation step into my otherwise compilation-free edit-and-that’s-it test environment.

In fact I don’t understand this compilation requirement at all. According to the gettext manual, gettext was developed in 1994. Surely computers were fast enough back then to parse the gettext format, store the whole lot in a hash?
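To show how little work “parse the format and store the whole lot in a hash” involves, here is a minimal Python sketch. It is not how gettext itself works: it handles only single-line msgid/msgstr pairs and a few C-style backslash escapes (which is how PO strings carry double quotes), and it ignores comments, multi-line strings and plural forms. The sample entries and function names are made up.

```python
import re
from typing import Dict, Optional

# Hypothetical .po content, including a string with escaped double quotes.
PO_TEXT = r'''
msgid "more-info"
msgstr "More Information"

msgid "quoted"
msgstr "Press \"OK\" to continue"
'''

_ESCAPES = {"n": "\n", "t": "\t", '"': '"', "\\": "\\"}


def unquote(token: str) -> str:
    """Strip the surrounding double quotes and resolve backslash escapes."""
    body = token.strip()[1:-1]
    return re.sub(r"\\(.)",
                  lambda m: _ESCAPES.get(m.group(1), m.group(1)),
                  body)


def parse_po(text: str) -> Dict[str, str]:
    """Read msgid/msgstr pairs straight into a dict (the 'hash')."""
    table: Dict[str, str] = {}
    msgid: Optional[str] = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("msgid "):
            msgid = unquote(line[len("msgid "):])
        elif line.startswith("msgstr ") and msgid is not None:
            table[msgid] = unquote(line[len("msgstr "):])
            msgid = None
    return table
```

A lookup is then just parse_po(PO_TEXT)["more-info"], with no separate compilation step between editing the file and seeing the result.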

And what I further don’t understand is how/if GNU programs were localized before then. I suppose they just weren’t.

(3) What about Unicode? I have no idea how to introduce Unicode characters into the editable “.po” files of gettext. The manual doesn’t help me. Supporting only 8-bit characters, and assuming/hoping that the encoding of the “.po” file is the same as the encoding that the user is using in viewing the output of your program, is simply a terrible solution. Microsoft designed Windows NT to use Unicode internally in 1988. Java has used only Unicode since its inception in 1991.

Unbelievably there is a reason given for not using Unicode.

However, we don’t recommend this approach for all POT files in all packages, because this would force translators to use PO files in UTF-8 encoding, which is - in the current state of software (as of 2003) - a major hassle for translators using GNU Emacs or XEmacs with po-mode.

(4) Using natural language keys. The “best practices” usage of gettext has English texts as the keys. This is supported by the utility tool “xgettext” which extracts strings automatically from your source.

This sounds nice, but I don’t like having English-text (or, in our case, German text) as the keys for translation files. If the text is e.g. “Click here for more info” and then the new style guideline for the site becomes “More Information”, then you end up having

// mypage.php
echo gettext("Click here for more info"); // prints "More Information"

# mypage.po
msgid "Click here for more info"
msgstr "More Information"

I dunno, that’s just confusing for me. I’d much rather have a text-neutral key such as “more-info”.

Update: This article also shows why you can’t use English-language text as translation keys.

(5) Referencing usages from the translation file. The “xgettext” utility writes lines such as the following into the “.po” file

#: mypage.php:47
msgid "Click here for more info"
msgstr "Click here for more info"

I don’t in any way like having the source file name and line number in the translation files. In principle it looks like it helps you to find the usage of a particular string, but in fact:

  1. It is not hard to find all the usages of the key “myprog.error.disk-full”. That string is hardly going to appear in a non-translation context by accident. A recursive search will tell you where its usages are.
  2. What if I change “mypage.php”? (which is pretty likely). For example inserting some lines before line 47. Then the information is not only irrelevant, but in addition wrong.

It is a principle of mine that not only should databases be normalized, but software source also. Every piece of information should be in exactly one place. And that place is where it’s technically needed (in this case, in the PHP file, as otherwise the string wouldn’t get displayed), as that’s the only place where it’ll get updated.

(6) Parameters. We all need strings such as “The file ‘$FILE’ has been successfully deleted”. It seems that the standard way to do this in gettext is to use sprintf-type placeholders (e.g. “%s”). However as soon as you have more than one of those, and you translate the string into French, you’ll find you need the parameters the other way around. Oops. That didn’t work. So gettext is only suitable a) for Western European languages (due to character set constraints) and b) only for the subset of those languages which have grammars where placeholders will be needed in the same order.

The first thing I did was write a wrapper around gettext to accept $0, $1 style parameters, so one could swap their order on a per-translated-string basis. (Although $FILE named parameters might have been better; but that would have made the calling code longer.)
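The idea behind that wrapper can be sketched like this. It is not the original code (which was PHP and sat on top of gettext); it is a minimal Python illustration in which the function name format_message and the example strings are made up. Each translated string decides for itself in which order $0 and $1 appear.

```python
import re


def format_message(template: str, *args: object) -> str:
    """Replace $0, $1, ... in template with the positional arguments.

    The numbers refer to argument positions, so a translation is free to
    use them in a different order from the original string.
    """
    def substitute(match: "re.Match[str]") -> str:
        index = int(match.group(1))
        return str(args[index])

    return re.sub(r"\$(\d+)", substitute, template)


# Illustrative strings only: the same two arguments, in a different order.
ENGLISH = "Copied $0 to $1"
GERMAN = "Nach $1 wurde $0 kopiert"
```

The calling code passes the arguments in one fixed order, e.g. format_message(GERMAN, "a.txt", "backup"), and the translator reorders them simply by writing $1 before $0 in the translated string.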

So nice one, they managed to invent, for the purposes of translation, a system which has a file format more difficult to use than a simple key-value pair, yet offering no advantages. It can’t handle Unicode. Good work.