Archive for the ‘Perl’ Category

2 of 30

Monday, January 27th, 2014

How would you diagnose the following bug?

  • A number of checkboxes representing user interests (Football? Music?); user can select/deselect their interests.
  • Software has worked well for years in production.
  • Suddenly intermittent reports start coming in that sometimes some checkboxes get unchecked by themselves.
  • You test live, you test on the test server, all is good.
    But the reports keep trickling in, leading you to suspect that it isn’t just user foolishness (e.g. not understanding how to use a checkbox, which I wouldn’t normally put past users).

This happened at Uboot around 2005.

I’ve forgotten how we came to fix the bug, but here is what it was. Perhaps it sounds obvious once explained, but it was anything but obvious at the time, especially given that it was completely unreproducible.

The software was written in Perl 5, and I hadn’t coded this particular screen myself. But I had no reason to suspect that anything was wrong with the code since, as I say, it had worked well live for years.

In Perl, “everything is a string”, except that, in fact, that’s not true at all. “Everything appears to be a string, but might not be” describes the situation better. If you have a string “foo” then it’s stored as a string. If you have a string like “45” then it’s stored as an integer internally and converted to/from a string as needed. If you have a string like “45.6” then it’s stored as a double internally and converted to/from a string as needed. (Or it might be more complex than that, I’m not sure; see perlnumber.)
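A quick way to see what “looks like” a number to Perl is looks_like_number from the core Scalar::Util module. The following is only a sketch using the example values from above; it shows how Perl classifies a value, not how the value is stored internally.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Example values taken from the paragraph above
for my $value ("foo", "45", "45.6") {
    my $verdict = looks_like_number($value)
        ? "looks like a number"
        : "is just a string";
    print "'$value' $verdict\n";
}

# In numeric context Perl converts the strings on the fly
my $sum = "45" + "45.6";
print "sum = $sum\n";
```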

Supposedly this makes everything “easier” if you treat everything as a string. But I have no anger towards the junior developer who coded this screen and who believed that, as things appeared to be strings, things actually were strings. I mean, why wouldn’t you think that?

The checkboxes were stored in the software as a huge bit field. Perhaps this wasn’t the best representation, as that doesn’t scale (if you want to store 100 interests, you’re going to have to change the approach). But, that was the representation that had been chosen and, as I said, this had been online and worked well for years.

At some point, someone had decided to augment our 64-bit servers with some 32-bit servers. So you can imagine the rest. 2 out of 30 servers were 32-bit, so 2 out of 30 clicks went to those servers. Our dev server was one of the older 64-bit servers, and all our software had been developed on the old 64-bit servers. So for 2 out of 30 clicks, all interests apart from 32 of them got lost.
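The effect can be sketched on a single 64-bit machine by masking the bit field down to its low 32 bits. This is only an illustration under assumptions: the interest names, bit positions and helper functions are invented here and are not the original Uboot code, and the script assumes it runs on a perl with 64-bit integers.

```perl
use strict;
use warnings;
use Config;

# Hypothetical interests and bit positions, invented for this illustration
my %BIT = (football => 0, music => 5, chess => 40);

# Set one bit per selected interest
sub pack_interests {
    my @names = @_;
    my $field = 0;
    $field |= 1 << $BIT{$_} for @names;
    return $field;
}

# Returns 1 if the bit for the given interest is set, otherwise 0
sub has_interest {
    my ($field, $name) = @_;
    return ($field >> $BIT{$name}) & 1;
}

print "this perl has $Config{ivsize}-byte integers\n";

my $field = pack_interests(qw(football chess));

# Keep only the low 32 bits, which is all a 32-bit integer can hold
my $truncated = $field & 0xFFFFFFFF;

for my $name (qw(football chess)) {
    printf "%-8s before: %d  after: %d\n",
        $name, has_interest($field, $name), has_interest($truncated, $name);
}
```

Any interest stored above bit 31 silently disappears after the mask, while the lower ones survive, which matches the “some checkboxes get unchecked by themselves” reports.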

Lesson learned (or not, as the lesson would have to be learnt by the programming language community): If things appear to be strings, make sure they actually are strings. Or, make sure it’s obvious that they’re not strings.

Had the software been written in Java, this couldn’t have happened as, independent of machine word size (32-bit or 64-bit) each Java numeric data type has a defined width, and is guaranteed to behave identically on any JVM implementation.

Underscores in numbers

Thursday, April 26th, 2012

One feature I miss from my Perl days, which I’ve never seen in another language*. I am writing a test loop and I want it to run 10k times. So I write, in Java:

for (int i = 0; i < 10000; i++) { ... }

Nothing surprising about that. But in Perl I could write:

for (my $i = 0; $i < 10_000; $i++) { ... }

You can put underscores anywhere when writing number literals, and they will get ignored by Perl. So you can make such large numbers a lot more readable.

It isn’t a huge feature, and it could easily be implemented by other languages such as Java, only… it isn’t. The fact that it would be easy to implement only adds to the annoyance created by the lack of the feature itself.

However, I have one criticism of this feature, and that’s that you can put the underscores anywhere. Therefore it’s possible to write 10_0000, which looks a lot like 10k. I think I would limit the use of underscores to every 3 digits. Otherwise the feature (increased readability) can have exactly the opposite effect.
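A small sketch of both cases (the variable names are made up for the example):

```perl
use strict;
use warnings;

my $ten_thousand = 10_000;     # same value as 10000
my $misleading   = 10_0000;    # looks like 10k, but is ten times larger

print "10_000  is $ten_thousand\n";
print "10_0000 is $misleading\n";
```

Note that this applies to number literals in source code; a string such as "10_000" that is converted to a number at runtime does not get the same treatment.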

* I’m sure it exists though, don’t cuss me up for this statement, I’ve only programmed C, C++, Perl, PHP, Java, Objective-C, Javascript for longer periods of time!

Klingon programming proverb

Tuesday, February 17th, 2009

From the autodie manual:

It is better to die() than to return() in failure.
  — Klingon programming proverb.

(“die” in Perl is like throwing an exception.)

(via $foo magazin)
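To make the comparison with exceptions concrete, here is a minimal sketch: die plays the role of “throw”, and an eval block plays the role of “try”, with the error ending up in $@. The parse_port function is hypothetical and exists only for this example.

```perl
use strict;
use warnings;

# Hypothetical function that dies instead of returning a failure value
sub parse_port {
    my ($text) = @_;
    die "not a port number: $text\n"
        unless $text =~ /^\d+$/ && $text <= 65535;
    return $text + 0;
}

# eval {} catches the die; the message is left in $@
my $port  = eval { parse_port("80x") };
my $error = $@;

print defined $port ? "port = $port\n" : "failed: $error";
```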

“Annotating” things mid-unit-test

Monday, February 9th, 2009

I use JUnit, PHPUnit and Perl’s Test::Unit. Perl’s version has a very cool feature. You can call annotate(string) at any point during the test, and this text will get recorded, and only output if the test fails.

sub test_foo {
   my $self = shift;
   my $value = setup_something();
   $self->annotate("value = $value \n");
   $self->assert_equals(12, do_something($value));
}

So you can just log arbitrary stuff during the test, and it will be output when you need it (when the test fails), but won’t clutter up the test output (when the test succeeds).
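The idea behind this is simple enough to sketch in a few lines: buffer the notes, and only show them if the test dies. The run_test function below is a hypothetical mini-runner written for this illustration; it is not how Test::Unit is implemented.

```perl
use strict;
use warnings;

# Hypothetical mini-runner, invented here for illustration.
# Runs one test, buffering notes and returning them only if the test dies.
sub run_test {
    my ($name, $test) = @_;
    my @notes;
    my $annotate = sub { push @notes, @_ };
    my $passed = eval { $test->($annotate); 1 };
    return "ok $name\n" if $passed;
    return "FAILED $name: $@" . join('', map { "  note: $_" } @notes);
}

my $pass = run_test('passes', sub {
    my ($annotate) = @_;
    $annotate->("value = 12\n");      # recorded, but never shown
});

my $fail = run_test('fails', sub {
    my ($annotate) = @_;
    $annotate->("value = 13\n");      # shown, because the test dies
    die "expected 12, got 13\n";
});

print $pass, $fail;
```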

I wish this was available from the PHP and Java unit testing frameworks!

Making progress with introduction of unit tests to Uboot

Monday, May 14th, 2007

The old uboot code had, amazingly enough, 21k lines of unit tests. But they were not useful unit tests, as one had to run each program individually, and they each had a bunch of (different) prerequisites, such as account_id 3 existing and having an empty inbox, and so on. And with the older tests, their output would be a bunch of print statements (e.g. insert message; print count of messages), and one would have to compare the printed output with the expected results (which weren’t documented anywhere).

I am converting them to PerlUnit (which is a clone of JUnit) so that we can automatically and easily run as many tests as possible before each release. This is an incredibly productive task, as I don’t even need to write new unit tests (and think about testing strategy); I’m just converting the existing lines to a format that makes them convenient to run!

So far 3.6k lines in 86 test functions in 33 test classes :)

$ ./test.pl
...................................................
...................................
Time: 48 wallclock secs ( 8.65 usr  0.56 sys +  0.02 cusr  0.25 csys =  9.48 CPU)

OK (86 tests)

Transferring some hex. Sometimes gets replaced by string “INF”. Why?

Thursday, May 10th, 2007

Update: See Explicit vs Implicit Data Typing.

This was never going to work out. Data transfer interface. Our side in Perl and their side in PHP. Both scripting languages (bad) and not even the same scripting language (incompatible badness).

Over the data transfer interface, we are transferring users. Including a code to enable them to unsubscribe from an email newsletter. The first 7 characters of the code identify the users (digits) and the rest of the code is a hex string containing some security information.

All works great. But some users can’t use the code? It turns out on the destination system they have “INF” in the field instead of the code.

It turns out that some of these users have e.g. 1234567 to identify the user, and e.g. 123e1234567 as their hex code. That makes the security code “1234567123e1234567”. And that “looks like” a floating point number to Perl. But quite a big one. Almost as big as Infinity in fact, so might as well call it that.
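The conversion is easy to reproduce in plain Perl, using the example code from above. This sketch only shows what Perl itself does when the string is used as a number; where exactly in our transfer stack that happened is not shown here.

```perl
use strict;
use warnings;

my $code = "1234567" . "123e1234567";   # user id followed by the hex string

# Used as a number, the string is parsed as 1234567123 times 10^1234567
my $as_number = $code + 0;
my $infinity  = 9**9**9;                # overflows to infinity on IEEE doubles

print "as string: $code\n";
print "as number: $as_number\n";
print "infinite?  ", ($as_number == $infinity ? "yes" : "no"), "\n";
```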

I hardly think the flexibility we “won” through every data instance having its own type based on what its data “looks like” compensates for the anger of a segment of our users not being able to unsubscribe from their newsletter, or for the extra expense to the company of the time to debug this problem (which was by then an urgent problem, as it was only discovered after the system went live, since it only affected 0.6% of our users).

P.S. my solution was to put a space in front of the code, which is taken off by the receiving system, so the data always “looks like” a string. But I wouldn’t like to guarantee that what “looks like” a string won’t change with the next version of the Perl SOAP client libraries we are using.
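The workaround can be sketched as follows. Note the assumptions: guess_type is a hypothetical stand-in for whatever type detection the SOAP library performs (its real rules are not shown in this post), and here it is simply modelled as a pair of anchored patterns.

```perl
use strict;
use warnings;

# Hypothetical type guesser, invented for illustration; it stands in for
# the library's own rules, which are not shown in this post.
sub guess_type {
    my ($value) = @_;
    return 'float' if $value =~ /^[-+]?\d+(?:\.\d+)?[eE][-+]?\d+$/;
    return 'int'   if $value =~ /^[-+]?\d+$/;
    return 'string';
}

my $code = "1234567123e1234567";

# Sender: the leading space stops the value from matching a numeric pattern
my $wire = " $code";

# Receiver: strip the padding again
(my $received = $wire) =~ s/^ //;

print "plain code guessed as:  ", guess_type($code), "\n";
print "padded code guessed as: ", guess_type($wire), "\n";
print "received: $received\n";
```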

perl switch statement limitation

Wednesday, May 2nd, 2007

Look at the documentation for the Perl switch statement. Look down the bottom at the “limitations” section. Look at the last limitation.

If your source file is longer then 1 million characters and you have a switch statement that crosses the 1 million (or 2 million, etc.) character boundary you will get mysterious errors. The workaround is to use smaller source files.

Using UTF-8 and Unicode data with Perl MIME::Lite

Tuesday, February 27th, 2007

MIME::Lite predates Perl 5.8 which supports Unicode and UTF-8. But it’s easy to get MIME::Lite to work with Unicode bodies and subjects.

To attach a plain text part to a message, with a string which contains Unicode characters, use:

use Encode qw(encode);
# Convert the character string to UTF-8 bytes before handing it over
$msg->attach(
   Type => 'text/plain; charset=UTF-8',
   Data => encode("utf8", $utf8string),
);

To set the subject of a mail from a string containing Unicode characters, use:

use Encode qw(encode);
use MIME::Base64;
# Wrap the UTF-8 bytes in a base64 "encoded word" for the header
my $msg = MIME::Lite->new(
   ...
   Subject =>   "=?UTF-8?B?" .
      encode_base64(encode("utf8", $subj), "") . "?=",
   ...
);

Note that the above methods also work if the strings do not contain Unicode characters, or do not have the UTF-8 flag set.
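The subject recipe can be tried on its own, without MIME::Lite, using only core modules. The encode_subject helper name is made up for this sketch, and the round trip at the end is just a sanity check.

```perl
use strict;
use warnings;
use Encode qw(encode decode);
use MIME::Base64 qw(encode_base64 decode_base64);

# Hypothetical helper name; wraps the Subject recipe shown above
sub encode_subject {
    my ($text) = @_;
    return "=?UTF-8?B?" . encode_base64(encode("utf8", $text), "") . "?=";
}

my $subj   = "Gr\x{fc}\x{df}e";    # a German greeting, written with escapes
my $header = encode_subject($subj);
print "$header\n";

# Decode again to check that nothing was lost on the way
my ($payload) = $header =~ /^=\?UTF-8\?B\?(.*)\?=$/;
my $decoded   = decode("utf8", decode_base64($payload));
print "round trip ok\n" if $decoded eq $subj;
```

One limitation of this sketch: it always produces a single encoded word, so very long subjects would need to be split into several encoded words, which it does not attempt.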

It would be better to change MIME::Lite such that subject and data strings are accepted and the above code happens inside MIME::Lite. I’ve filed a bug report. It was rejected.