<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>

<channel>
	<title>Databases and Life</title>
	<atom:link href="http://www.databasesandlife.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.databasesandlife.com</link>
	<description>Adrian Smith's blog</description>
	<pubDate>Wed, 02 Jul 2008 12:38:03 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.5</generator>
	<language>en</language>
			<item>
		<title>Programming with unique constraints</title>
		<link>http://www.databasesandlife.com/unique-constraints/</link>
		<comments>http://www.databasesandlife.com/unique-constraints/#comments</comments>
		<pubDate>Fri, 13 Jun 2008 08:30:47 +0000</pubDate>
		<dc:creator>adrian</dc:creator>
		
		<category><![CDATA[Databases]]></category>

		<category><![CDATA[Software Design]]></category>

		<guid isPermaLink="false">http://www.databasesandlife.com/?p=305</guid>
		<description><![CDATA[If you&#8217;re using a remote database system, your  application doesn&#8217;t have access to all the data at any point in time. (I.e. you  just load and save the rows you&#8217;re interested in in a particular transaction).  Therefore if you want to do some database-wide operation, you need to ask the database to [...]]]></description>
			<content:encoded><![CDATA[<p>If you&#8217;re using a remote database system, your  application doesn&#8217;t have access to all the data at any point in time. (I.e. you  just load and save the rows you&#8217;re interested in in a particular transaction).  Therefore if you want to do some database-wide operation, you need to ask the database to do it.</p>
<p>When you want to  enforce uniqueness, for example across a whole table (for example a document  name needs to be unique), or across a particular part of a table (for example  each user must have documents with unique names), you need to do that in the  database.</p>
<p>There is only one  acceptable way to do this with a SQL database:</p>
<ol class=tight>
<li>Insert a new  row</li>
<li>If the insert  succeeds, there wasn&#8217;t a row there before, and now there is</li>
<li>If the insert throws  a unique constraint violation, the row is already there</li>
<li>If you want to update  the row (i.e. an &#8220;insert or update&#8221; operation to maintain a &#8220;lazy singleton&#8221; in  the database), you can update the row with safety after the unique constraint  violation, as you can be certain the row is already  there.</li>
</ol>
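<p>The steps above can be sketched as follows. This is a minimal illustration using Python and an in-memory SQLite database, not the code of any particular framework; the table document, its columns and the function insert_or_update are assumptions made up for the example. Note that sqlite3.IntegrityError also covers other constraint types (e.g. NOT NULL), so real code would want to check that the error really is a unique constraint violation.</p>

```python
import sqlite3


def insert_or_update(conn, user_id, name, body):
    """Insert the document; if the unique constraint fires, update instead."""
    try:
        conn.execute(
            "INSERT INTO document (user_id, name, body) VALUES (?, ?, ?)",
            (user_id, name, body),
        )
        return "inserted"
    except sqlite3.IntegrityError:
        # The violation tells us the row is already there, so update it.
        conn.execute(
            "UPDATE document SET body = ? WHERE user_id = ? AND name = ?",
            (body, user_id, name),
        )
        return "updated"


# Hypothetical schema: document names must be unique per user.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE document ("
    " user_id INTEGER NOT NULL,"
    " name TEXT NOT NULL,"
    " body TEXT NOT NULL,"
    " UNIQUE (user_id, name))"
)

first = insert_or_update(conn, 1, "report", "draft")
second = insert_or_update(conn, 1, "report", "final")
```

<p>The database, not the application, decides whether the row exists, so there is no window between a check and a write in which another transaction can slip in.</p>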
<p>The following  methods are all not acceptable:</p>
<ul>
<li>Do a &#8220;select&#8221; to find  out how many rows there are. If there aren&#8217;t any, do an &#8220;insert&#8221;. However  someone may have inserted a row between your &#8220;select&#8221; and your  &#8220;insert&#8221;.</li>
<li>Do an &#8220;update&#8221; and if  the database says that 0 rows have been updated, do an &#8220;insert&#8221;. Again, someone  may have inserted a row between your &#8220;update&#8221; and the  &#8220;insert&#8221;.</li>
<li>Do a &#8220;select for  update&#8221; statement (Oracle, Postgres, InnoDB) to check that there aren&#8217;t any rows while creating a lock, and then do an insert. However that  statement only locks the rows it returns, so if it doesn&#8217;t return any rows, it doesn&#8217;t create any locks, so you  still can&#8217;t be certain that no one has inserted a row between the &#8220;select&#8221; and  the &#8220;insert&#8221;.</li>
<li>Lock the whole table and do one of the above. This works, but it means that  all write access is &#8220;serialized&#8221;, i.e. writes happen one after another. Any other  operation, even one writing something completely unrelated, will now also have to wait until the end of your transaction, although it shouldn&#8217;t have to. This reduces concurrency.</li>
</ul>
<p>The way I program  this is the following. On the &#8220;insert&#8221; statement, I catch the database error,  and see if it&#8217;s a &#8220;unique constraint violation&#8221;-type error. If it is, I throw an  (unchecked) Exception. The calling code can catch that and do something with it  (or not, if the statement should not generate such an error, in which case it will propagate to the main loop like any other database exception). I have had the pleasure of introducing this to <a href="http://www.easyname.eu/">easyname.eu</a> and now also the pleasure of <a href="http://max.xaok.org/weblog/Unique%20constraints%20and%20the%20alreadyexists%20error">introducing</a> this to the <a href="http://max.xaok.org/webtek">WebTek</a> framework.</p>
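<p>A rough sketch of that error translation, again in Python with SQLite rather than in WebTek itself: the names AlreadyExistsError and insert_row and the table account are hypothetical, and matching on the error message text is an assumption that holds for SQLite but would look different for other databases and drivers.</p>

```python
import sqlite3


class AlreadyExistsError(RuntimeError):
    """Hypothetical application error for a unique constraint violation."""


def insert_row(conn, sql, params):
    """Run an insert, translating unique violations into AlreadyExistsError."""
    try:
        conn.execute(sql, params)
    except sqlite3.IntegrityError as error:
        # SQLite reports unique violations as "UNIQUE constraint failed: ...".
        if "UNIQUE constraint failed" in str(error):
            raise AlreadyExistsError(str(error)) from error
        raise  # any other database error propagates unchanged


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (email TEXT NOT NULL UNIQUE)")
insert_row(conn, "INSERT INTO account (email) VALUES (?)", ("a@example.com",))

try:
    insert_row(conn, "INSERT INTO account (email) VALUES (?)", ("a@example.com",))
    outcome = "inserted"
except AlreadyExistsError:
    outcome = "already exists"
```

<p>Callers that expect duplicates catch AlreadyExistsError; callers that do not simply let it travel up to the main loop with every other database error.</p>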
<p>It is extremely  frustrating working with big complex frameworks, whose usage is much more  complex than just writing SQL manually, and not being able to do the above  properly.</p>
<ul>
<li><a href="http://www.hibernate.org/">Hibernate</a> clearly states in its  documentation that if any database error occurs, the Session (main object  managing all persistence) <a href="http://www.hibernate.org/hib_docs/v3/api/org/hibernate/Session.html">must  be destroyed</a> as its state will be out-of-sync with the database. But there  is no way other than the above to do this sort of check (as far as I know).</li>
<li><a href="http://www.compuware.com/pressroom/news/2006/6152_ENG_HTML.htm">OptimalJ</a> generates code that, if a database error occurs, sets the transaction to  rollback. Including any other work you may have done.</li>
</ul>
<p>I mean checking  unique constraints is something every database application needs, so the fact  it&#8217;s not supported by major frameworks is just  unbelievable.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.databasesandlife.com/unique-constraints/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Oracle, Nulls and the empty string</title>
		<link>http://www.databasesandlife.com/oracle-nulls-and-the-empty-string/</link>
		<comments>http://www.databasesandlife.com/oracle-nulls-and-the-empty-string/#comments</comments>
		<pubDate>Thu, 12 Jun 2008 10:53:18 +0000</pubDate>
		<dc:creator>adrian</dc:creator>
		
		<category><![CDATA[Databases]]></category>

		<category><![CDATA[Software Design]]></category>

		<guid isPermaLink="false">http://www.databasesandlife.com/?p=304</guid>
		<description><![CDATA[Oracle has an  unusual feature, which attracts it a lot of criticism. If you try to insert the  empty string into a column marked &#8220;not null&#8221;, you get an error. The empty string  is treated the same as &#8220;null&#8221; by Oracle.
This is different to  programming languages (and indeed other databases, at [...]]]></description>
			<content:encoded><![CDATA[<p>Oracle has an  unusual feature, which attracts it a lot of criticism. If you try to insert the  empty string into a column marked &#8220;not null&#8221;, you get an error. The empty string  is treated the same as &#8220;null&#8221; by Oracle.</p>
<p>This is different to  programming languages (and indeed other databases, at least MySQL), which means  one has to be careful not to make a mistake when using a programming language to  talk to the database. That it&#8217;s different from other systems is the main reason  for the criticism.</p>
<p>However, how many  times have you wanted the following to be allowed in your data schema, for  example on a &#8220;first name&#8221; column:</p>
<ul class=tight>
<li>Writing nulls  <strong>is not </strong>allowed</li>
<li>Writing the empty  string <strong>is </strong>allowed</li>
</ul>
<p>I just wrote a  program and the front-end <a href="http://max.xaok.org/webtek">framework</a> helpfully noticed that the  field was &#8220;not null&#8221; in the database and gave the user an error in the front-end  if the field was empty. However when I altered the code slightly, it no longer  gave an error. Because the field in the program was the empty string, and not  null.</p>
<p>However, I assert,  when dealing with data, checking <em>only</em> for not null is not useful; you  also want to check that the string contains some data in that  case.</p>
<p>I have now updated the framework, so that if the field is marked &#8220;not null&#8221;, then the error is presented to the front-end not only if the variable in the program is &#8220;null&#8221;, but also if it is the empty string.</p>
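<p>The check described above might look roughly like this. It is only a sketch in Python, not the framework&#8217;s actual code; the function missing_required_fields and the field names are invented for the example.</p>

```python
def missing_required_fields(values, not_null_columns):
    """Return the required fields whose value is None or the empty string."""
    return [
        column
        for column in not_null_columns
        if values.get(column) is None or values.get(column) == ""
    ]


# Hypothetical form submission: first_name was left empty by the user.
form = {"first_name": "", "last_name": "Smith", "nickname": None}
errors = missing_required_fields(form, ["first_name", "last_name"])
```

<p>Here first_name is reported even though it is the empty string and not null, while the optional nickname is ignored.</p>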
<p>(Note: I am not  advocating e.g. Java lose the distinction between ==null and .isEmpty(): for  some reason this is a useful distinction when doing programming and data  manipulation&#8212;such as null indicating that a variable isn&#8217;t initialized yet&#8212;I  just don&#8217;t think it&#8217;s a useful distinction in a system solely designed to model  persistent data.)</p>
]]></content:encoded>
			<wfw:commentRss>http://www.databasesandlife.com/oracle-nulls-and-the-empty-string/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Using fixed-width fonts for data entry fields</title>
		<link>http://www.databasesandlife.com/fixed-width-fonts-for-data-entry/</link>
		<comments>http://www.databasesandlife.com/fixed-width-fonts-for-data-entry/#comments</comments>
		<pubDate>Wed, 04 Jun 2008 16:54:39 +0000</pubDate>
		<dc:creator>adrian</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.databasesandlife.com/?p=303</guid>
		<description><![CDATA[I&#8217;m sure that Joel on Software is well read: his views on  user interfaces are certainly well worth reading.
But there&#8217;s a point  hidden down at the end of Chapter  7 which is well worth repeating. And that&#8217;s that using a wide fixed-width  font for text entry is much better than a [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m sure that <a href="http://www.joelonsoftware.com/">Joel on Software</a> is well read: his <a href="http://www.joelonsoftware.com/uibook/chapters/fog0000000057.html">views on  user interfaces</a> are certainly well worth reading.</p>
<p>But there&#8217;s a point  hidden down at the end of <a href="http://www.joelonsoftware.com/uibook/chapters/fog0000000063.html">Chapter  7</a> which is well worth repeating. And that&#8217;s that using a wide fixed-width  font for text entry is much better than a thin proportionally-spaced  font.</p>
<p>Recently I noticed that the invitation panel on Google Spreadsheets, allowing you to type in email  addresses of people to invite to collaboratively edit the document, used a  fixed-width font. And conveniently they didn&#8217;t alter their word processing  program&#8217;s similar facility, making it easy to compare the two.</p>
<p>It really does just  feel so much nicer to enter text using a fixed-width font. So all projects I&#8217;m  doing now, I make sure I use a fixed-width font for data entry. And I don&#8217;t even  think it has to look ugly!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.databasesandlife.com/fixed-width-fonts-for-data-entry/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Encapsulation or public attributes: but nothing inbetween</title>
		<link>http://www.databasesandlife.com/encapsulation-or-public-attributes-but-nothing-inbetween/</link>
		<comments>http://www.databasesandlife.com/encapsulation-or-public-attributes-but-nothing-inbetween/#comments</comments>
		<pubDate>Mon, 19 May 2008 09:18:34 +0000</pubDate>
		<dc:creator>adrian</dc:creator>
		
		<category><![CDATA[Software Design]]></category>

		<guid isPermaLink="false">http://www.databasesandlife.com/?p=301</guid>
		<description><![CDATA[This post asked the question:
Whenever a class in my model contains a collection which requires that particular care be taken with its items, there’s an internal debate regarding how to expose it to other classes. And with this, there are two major schools: one, the paranoia-based approach which doesn’t allow external code to touch the [...]]]></description>
			<content:encoded><![CDATA[<p>This <a href="http://chaoticjava.com/posts/exposing-collections-paranoia-vs-trust-approaches/">post</a> asked the question:</p>
<blockquote><p>Whenever a class in my model contains a collection which requires that particular care be taken with its items, there’s an internal debate regarding how to expose it to other classes. And with this, there are two major schools: one, the paranoia-based approach which doesn’t allow external code to touch the collection’s internal items and two, the trusting approach which just returns the collection for everyone to deal with.</p>
<p>What are your thoughts on the matter? What do you use, when and why?</p></blockquote>
<p>Definitely paranoia.</p>
<p>It may make certain things more difficult or require more code, but one of the key cornerstones of object-oriented programming is encapsulation and not exposing your internal data in a way that means that others can break it.</p>
<p>If one wants to go for the trust approach - sure it&#8217;s easier - but if easiness is the objective one can write a C program and just declare a struct. Then anyone can access the data any way they want and there&#8217;s absolutely no code to write (not even getters and setters). But the world moved away from that model towards encapsulation because, without encapsulation, any of the N lines of code (or Nk LOC or NM LOC) in a program can be responsible for the creation of inconsistent data.</p>
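<p>As an illustration of the paranoid approach, here is a small Python sketch. The class Playlist and its rules (no empty titles, no duplicates) are made up for the example; the point is only that every change to the collection goes through code that can enforce the rules.</p>

```python
class Playlist:
    """Owns its tracks; callers can read them but only change them via methods."""

    def __init__(self):
        self._tracks = []

    def add_track(self, title):
        # The only way to add a track, so the rules are always enforced.
        if not title:
            raise ValueError("a track needs a title")
        if title in self._tracks:
            raise ValueError("duplicate track: " + title)
        self._tracks.append(title)

    @property
    def tracks(self):
        # Hand out an immutable copy so callers cannot bypass add_track().
        return tuple(self._tracks)


playlist = Playlist()
playlist.add_track("Intro")
snapshot = playlist.tracks
```

<p>With the trusting approach the list itself would be returned, and any line of calling code could append an empty or duplicate title without the class noticing.</p>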
]]></content:encoded>
			<wfw:commentRss>http://www.databasesandlife.com/encapsulation-or-public-attributes-but-nothing-inbetween/feed/</wfw:commentRss>
		</item>
		<item>
		<title>The next few months</title>
		<link>http://www.databasesandlife.com/2008-05-the-next-few-months/</link>
		<comments>http://www.databasesandlife.com/2008-05-the-next-few-months/#comments</comments>
		<pubDate>Tue, 06 May 2008 07:23:00 +0000</pubDate>
		<dc:creator>adrian</dc:creator>
		
		<category><![CDATA[Life]]></category>

		<guid isPermaLink="false">http://www.databasesandlife.com/?p=300</guid>
		<description><![CDATA[I&#8217;ve been feeling fairly ill recently. I don&#8217;t know what it is, and I&#8217;ve been to see various doctors about it. The effects are being constantly so tired that I&#8217;m pretty much unable to concentrate on anything. As a consequence I&#8217;ve not done much work, and not done much of anything else either.
Plan for the [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been feeling fairly ill recently. I don&#8217;t know what it is, and I&#8217;ve been to see various doctors about it. The effects are being constantly so tired that I&#8217;m pretty much unable to concentrate on anything. As a consequence I&#8217;ve not done much work, and not done much of anything else either.</p>
<p>Plan for the next few months:</p>
<ul class=tight>
<li><strong>8th May - 12th May: </strong>break in Thailand</li>
<li><strong>13th May - 20th May: </strong>back to Macau: Christina and I are packing her belongings to move to Europe</li>
<li><strong>20th May - approx 1st June: </strong>in London with my parents. Various visa things have still to be organized.</li>
<li><strong>Whole of June: </strong>in Vienna</li>
<li><strong>Approx 1st July - 19th July: </strong>more preparations for the wedding</li>
<li><strong>20th July - 28th July: </strong>honeymoon Maldives</li>
<li><strong>Start of August onwards: </strong>Vienna</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.databasesandlife.com/2008-05-the-next-few-months/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Too much data for the browser window? Solution: iframe</title>
		<link>http://www.databasesandlife.com/too-much-data-for-the-browser-window-solution-iframe/</link>
		<comments>http://www.databasesandlife.com/too-much-data-for-the-browser-window-solution-iframe/#comments</comments>
		<pubDate>Wed, 16 Apr 2008 09:33:01 +0000</pubDate>
		<dc:creator>adrian</dc:creator>
		
		<category><![CDATA[Broken]]></category>

		<guid isPermaLink="false">http://www.databasesandlife.com/?p=299</guid>
		<description><![CDATA[Recently I was asked to put a webpage online which contained some navigation and some content. There was more content than could fit the browser on an average screen. So the content looked basically like this:

But then I was sent a new version which was cleverer,

The content was stored in an iframe
There was Javascript which [...]]]></description>
			<content:encoded><![CDATA[<p>Recently I was asked to put a <span id="misp_compose_1" class="hm">webpage</span> online which contained some navigation and some content. There was more content than could fit the browser on an average screen. So the content looked basically like this:</p>
<p><img src="http://www.databasesandlife.com/blog-attachments/20080416-scrollbar-in-scrollbar-orig.png" alt="" width="159" height="126" /></p>
<p>But then I was sent a new version which was cleverer:</p>
<ul class="tight">
<li>The content was stored in an <span id="misp_compose_2" class="hm">iframe</span></li>
<li>There was Javascript which dynamically altered the size of the <span id="misp_compose_3" class="hm">iframe</span> to be the size of the monitor resolution (i.e. independent of current browser size)</li>
</ul>
<p>The result of this complexity was that the <span id="misp_compose_4" class="hm">iframe</span> nearly&#8212;but not quite&#8212;fitted into the browser&#8217;s content area (assuming you had your browser maximised).</p>
<p><img src="http://www.databasesandlife.com/blog-attachments/20080416-scrollbar-in-scrollbar-new.png" alt="" width="160" height="126" /></p>
<p>Because the <span id="misp_compose_5" class="hm">iframe</span> didn&#8217;t sit quite at the top of the window, this meant:</p>
<ul class="tight">
<li>The bottom &#8220;scroll down&#8221; arrow button of the inner <span id="misp_compose_6" class="hm">iframe</span> (the one you need to actually see more content) was, by default, off the bottom of the screen</li>
<li>As there was content beyond the bottom of the screen (the &#8220;scroll down&#8221; button of the <span id="misp_compose_7" class="hm">iframe</span>) the main browser window also displayed a scroll bar</li>
<li>To scroll to see more data, you need the inner <span id="misp_compose_8" class="hm">iframe</span>. But instinctively one reaches for the right-most <span id="misp_compose_10" class="hm">scrollbar</span>, as that&#8217;s normally what you need to see more data, e.g. in a browser, Word document, etc.</li>
<li>It only got worse if you didn&#8217;t use the browser maximised.</li>
</ul>
<p>I mean I don&#8217;t know what aspect of &#8220;the browser will display a scroll bar if there&#8217;s more data than will fit in the window, without needing Javascript and <span id="misp_compose_11" class="hm">iframes</span>&#8221; they didn&#8217;t understand.</p>
<p>I suppose the navigation will always be shown, but I think people are used to navigation scrolling with the content these days. And if the objective was to keep the navigation on the screen, one could have used a normal frame, which would have required no Javascript, and still resulted in only one <span id="misp_compose_12" class="hm">scrollbar</span>, in the place where the user expects it.</p>
<p>The whole &#8220;<span id="misp_compose_13" class="hm">scrollbar</span> within a <span id="misp_compose_14" class="hm">scrollbar</span>&#8221; concept (which unfortunately is pretty much mandatory if you have a text area within a <span id="misp_compose_15" class="hm">webpage</span>) is really so nasty.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.databasesandlife.com/too-much-data-for-the-browser-window-solution-iframe/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Explicit vs Implicit data typing</title>
		<link>http://www.databasesandlife.com/explicit-vs-implicit-data-typing/</link>
		<comments>http://www.databasesandlife.com/explicit-vs-implicit-data-typing/#comments</comments>
		<pubDate>Thu, 10 Apr 2008 04:37:26 +0000</pubDate>
		<dc:creator>adrian</dc:creator>
		
		<category><![CDATA[Software Design]]></category>

		<guid isPermaLink="false">http://www.databasesandlife.com/?p=298</guid>
		<description><![CDATA[I was reading this article about how certain data gets messed up when one imports it into Excel (certain data looks like a date and thus gets converted into one), and it reminded me of a problem I had when transferring data over an XML protocol from Perl (the SOAP library was inspecting the hex [...]]]></description>
			<content:encoded><![CDATA[<p>I was reading <a href="http://www.theregister.co.uk/2004/07/16/excel_vanishing_dna/">this article</a> about how certain data gets messed up when one imports it into Excel (certain data looks like a date and thus gets converted into one), and it reminded me of a <a href="http://www.databasesandlife.com/transfering-some-hex-sometimes-gets-replaced-by-string-inf-why">problem</a> I had when transferring data over an XML protocol from Perl (the SOAP library was inspecting the hex data I was transferring, but a small percentage of hex numbers look like &#8220;123e123&#8243;, which looked like a floating point number to the library)</p>
<p>I think both problems are actually the same problem. It can be traced back to the necessity to make exactly one of the following two decisions when creating data-processing systems:</p>
<ol class="tight">
<li>Either you try and <span style="font-weight: bold;">work out</span> what the datatype of a piece of data is by looking at e.g. the data&#8217;s string representation. E.g. this data is &#8220;abcd&#8221; so it&#8217;s a string, this data is &#8220;123&#8221; so it&#8217;s a number.</li>
<li>Or you <span style="font-weight: bold;">explicitly</span> store and state, external to the data, its data type. E.g. store not only that the data is &#8220;123&#8221;, but that it&#8217;s a number.</li>
</ol>
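<p>The two options can be contrasted in a few lines of Python. This is only a sketch; guess_type is a made-up stand-in for the kind of inspection the SOAP library was doing, not its actual logic.</p>

```python
def guess_type(text):
    """Option 1: infer the type from the string representation."""
    try:
        float(text)
        return "number"
    except ValueError:
        return "string"


# Hex codes are strings, but some of them happen to parse as floats.
codes = ["abcd", "7f3a", "123", "123e123"]
guessed = {code: guess_type(code) for code in codes}

# Option 2: the type is stored next to the data, so nothing is guessed.
explicit = [("123e123", "hex-string"), ("123", "number")]
declared = {value: declared_type for value, declared_type in explicit}
```

<p>With option 1 the hex code 123e123 is classified as a number, exactly like the genuine number 123; with option 2 it keeps the type it was given.</p>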
<p>Option 1 seems attractive as it&#8217;s simpler as you only need to store one piece of data. It also feels more normalized, as one piece of data is generally better than two (e.g. what if they are inconsistent, e.g. data is &#8220;abcjzh&#8221; and type is &#8220;number&#8221;?)</p>
<p>But the trouble is that option 1 doesn&#8217;t work (see above).</p>
<p>But it gets worse. Option 1 <span style="font-style: italic;">seems </span>to work, yet does <span style="font-style: italic;">not </span>actually work in all cases (and you want your software to work in all cases). That&#8217;s more dangerous than if it simply and clearly didn&#8217;t work.</p>
<p>The authors of the SOAP library in my example presumably believed their software worked. And I believed my software, built on top of the SOAP library, worked. It worked in my unit tests and when I tested it by clicking-through the front-end. Only 0.6% of users had a code with a hex string that looked like an exponent, so it&#8217;s understandable that I just didn&#8217;t hit it when testing. But with e.g. 2M users in the database, some of my users will hit it. And that means that the software I released didn&#8217;t work (working meaning working 100% for everyone.)</p>
<p>But I like my software to work. The way I achieve that is to avoid errors which are difficult to detect. Making errors is human; if they are easy to catch, one can spot them and then correct them.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.databasesandlife.com/explicit-vs-implicit-data-typing/feed/</wfw:commentRss>
		</item>
		<item>
		<title>UK Fiancée Visa application successful</title>
		<link>http://www.databasesandlife.com/uk-fiancee-visa-application-successful/</link>
		<comments>http://www.databasesandlife.com/uk-fiancee-visa-application-successful/#comments</comments>
		<pubDate>Wed, 09 Apr 2008 07:46:47 +0000</pubDate>
		<dc:creator>adrian</dc:creator>
		
		<category><![CDATA[Life]]></category>

		<guid isPermaLink="false">http://www.databasesandlife.com/?p=297</guid>
		<description><![CDATA[On Monday we went over to Hong Kong to make the application for the Fiancee Visa, for Christina to enter the UK to get married. She is allowed to enter the UK anyway for tourism, but to get married the visa is required. The visa lasts 6 months, does not entitle Christina to work in [...]]]></description>
			<content:encoded><![CDATA[<p>On Monday we went over to Hong Kong to make the application for the Fiancee Visa, for Christina to enter the UK to get married. She is allowed to enter the UK anyway for tourism, but to get married the visa is required. The visa lasts 6 months, does not entitle Christina to work in the UK, and after marriage the next visa (Spouse Visa) has to be applied for (although we will do this for Austria instead of the UK).</p>
<p>I had prepared <a href="http://www.databasesandlife.com/documents/">lots of documents</a> for the application process; however by the time we went over there this set of documents had grown to at least two or three times that amount.</p>
<p>There was a DVD playing on repeat on the TV screens in the waiting room, in which an actress playing an applicant says &#8220;wow, I didn&#8217;t know it would be that fast and easy!&#8221;. I thought this was extremely ironic.</p>
<p>But in fact, contrary to our expectations, it was indeed both fast and easy (not including preparation of the documents). They called us 45 minutes after we left the building to tell us Christina&#8217;s passport containing the visa could be picked up.</p>
<p>We had prepared all the documents with the originals and copies collated (as they said they needed to take both away, and would give us the copies back). I had thus prepared a large spreadsheet listing all the documents that I wanted back (e.g. original bank statements). I thought merely sorting all this stuff out to give back to us would take half a day! But when we got the documents back they were all collated just as we&#8217;d given them to them, i.e. they hadn&#8217;t even taken the copies out. I suppose they didn&#8217;t look at the documents that much, or even at all?</p>
<p>The only things they took out were the copy of the passports (incl. stamps of our entry/exit to/from Macau/Europe) and my covering letter explaining my financial and employment situation, our plan to live with my parents initially when in the UK etc.</p>
<p>So maybe we could have spared most of the effort of the preparation of the documents? (Or maybe not, maybe a precondition of granting the visa was a certain mandatory documentary weight?)</p>
<p>Anyway, it&#8217;s good news. We have to make an application in the UK to &#8220;give notice&#8221;. But I&#8217;ve had chats with the office that do that, they imply it&#8217;s easy, and that the Fiancée Visa is the difficult one. So let&#8217;s hope that&#8217;s true.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.databasesandlife.com/uk-fiancee-visa-application-successful/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Programming Languages: Is newer always better? (Part 2)</title>
		<link>http://www.databasesandlife.com/programming-languages-is-newer-always-better-part-2/</link>
		<comments>http://www.databasesandlife.com/programming-languages-is-newer-always-better-part-2/#comments</comments>
		<pubDate>Fri, 28 Mar 2008 07:49:29 +0000</pubDate>
		<dc:creator>adrian</dc:creator>
		
		<category><![CDATA[Coding]]></category>

		<guid isPermaLink="false">http://www.databasesandlife.com/programming-languages-is-newer-always-better-part-2/</guid>
		<description><![CDATA[Let me respond to some of the comments left at &#8220;Programming Languages: Is newer always better?&#8221;
First up, Knowing what&#8217;s going on:
This is a terrible example. You are really arguing that PHP programmers don&#8217;t know how their language works while C programmers do. This is a horribly wrong-headed assertion. How about I counter your straw man [...]]]></description>
			<content:encoded><![CDATA[<p>Let me respond to some of the comments left at &#8220;<a href="http://www.databasesandlife.com/programming-languages-is-newer-always-better/">Programming Languages: Is newer always better?</a>&#8221;</p>
<p>First up, <span style="font-weight: bold;">Knowing what&#8217;s going on:</span></p>
<blockquote><p>This is a terrible example. You are really arguing that PHP programmers don&#8217;t know how their language works while C programmers do. This is a horribly wrong-headed assertion. How about I counter your straw man with one of my own. I know plenty of new (as of the last 5 years) C programmers who have no idea that 0 is equivalent to NULL.</p></blockquote>
<p>Yeah you&#8217;re right, this point is probably untrue.</p>
<p>At the time I wrote it I was getting frustrated with PHP programmers who didn&#8217;t know the difference between == and ===. I still have the feeling that Java and C books tend to concentrate firstly on the fundamentals of the available data types and operations, whereas introductions to PHP tend to focus on just writing code that looks OK and seems to do the right thing (an attitude which leads one to write programs with subtle bugs).</p>
<p>But, having thought about that a bit more, that probably has more to do with my exposure to books written for people who can already program, vs articles about PHP on the web. And probably really does have nothing to do with the language whatsoever.</p>
<p><span style="font-weight: bold;">Strict typing</span></p>
<blockquote><p>You want the compiler to check that a method can only receive an object of type SomeObject while I want any method to be able to receive any object as long as it responds to (or has the same interface) as SomeObject.</p></blockquote>
<p>I used to think this way for quite a time, when I was programming Objective-C: that it was cool to write code which took any object as long as it responded to a certain set of methods. And that asserting an object must be of a particular class or respond to a particular interface made my code less flexible and reusable.</p>
<p>However after a time, looking at both my own Objective-C code and that written by colleagues, you would see methods like saveToDb:anObject. That method assumed that the parameter anObject responded to certain methods (by virtue of the method&#8217;s body calling those methods on its parameter), yet this was not documented in the method&#8217;s prototype (although it could have been placed in a comment had the programmer decided to), and could not be checked at compile-time. It gets worse when anObject is simply passed to some other function, so you have to open that in the editor to determine what type of object you can pass there. And you&#8217;re out of luck if you don&#8217;t have the source code. And even if you do document the type in a comment, you can&#8217;t build an IDE where you can just click on the type and it opens the definition, immediately listing its methods and documentation.</p>
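<p>To make the contrast concrete, here is a sketch in Python rather than Objective-C. The names Persistable, save_to_db and User are invented for the example. Python does not enforce type hints at runtime, so the assumption is that a separate static checker is run over the code; the point is that the requirement is written in the signature instead of being implied by the method body.</p>

```python
from typing import Protocol


class Persistable(Protocol):
    """Hypothetical interface: what save_to_db() needs from its argument."""

    def table_name(self) -> str: ...

    def to_row(self) -> dict: ...


def save_to_db(obj: Persistable) -> str:
    # The signature states the requirement; no need to read the body.
    return "INSERT INTO " + obj.table_name() + " " + repr(obj.to_row())


class User:
    def __init__(self, name):
        self.name = name

    def table_name(self) -> str:
        return "users"

    def to_row(self) -> dict:
        return {"name": self.name}


statement = save_to_db(User("adrian"))
```

<p>A reader, an IDE or a type checker can follow Persistable to see exactly which methods are needed, which is what the undocumented anObject parameter did not allow.</p>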
<blockquote><p>C, Fortran, C++, Java and Pascal require static definitions and suffer greatly for it. C++ (again) and Java (again) have templates/generics to fake this kind of feature and suffer horribly for it.</p></blockquote>
<p>I have to agree that what really has improved in modern languages and runtimes (post concerning improvements in the future) is that the runtime knows what type of object a reference points to. Using void* in C is nasty.</p>
<blockquote><p>No, Perl isn’t strictly typed and can’t do what you’re saying. But once again, you can check things. You can validate that an Object is a particular class or descendant of a particular class. As with the variable bounds, you can validate your data.</p></blockquote>
<p>This is true, you can do that. But it doesn&#8217;t happen at compile-time (which means if you didn&#8217;t unit test or click-through that code path, you don&#8217;t see the error), and other programmers may choose not to even put the acceptable ranges or types in comments, and then you&#8217;ve got code which takes $x and then you&#8217;re really stuck. (Although I suppose if you work with programmers who don&#8217;t like to make readable code, you&#8217;re stuck no matter what language they&#8217;re programming in; I mean you can make unreadable code in any language.)</p>
<p><span style="font-weight: bold;">Enumerated types</span></p>
<blockquote><p>This is a great feature modern day languages have though maybe it isn&#8217;t called &#8220;enumerated type.&#8221; Ruby has symbols so you can say your types are :hot, :warm, :lukewarm, :cold. These symbols mean the same thing everywhere. To use your PHP example in Ruby, how about error_log(&quot;user not found&quot;, :user_not_found). In this example, you don&#8217;t know the languages you are criticizing.</p></blockquote>
<p>Well, it&#8217;s great that Ruby has such a feature, but Perl and PHP still do not. If they did, PHP wouldn&#8217;t have defined its error_log function that way. So when I&#8217;m programming in those two languages (which I do a lot, alas) I am forced to write less readable code. (Even after defining constants, I can still pass <span style="font-style: italic;">gender_male</span> with a value of 3 to a function expecting a <span style="font-style: italic;">state</span> where 3 means the user has been deleted, and it won&#8217;t even exit with an error, let alone give me a compile error: it will simply do the wrong thing.)</p>
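<p>A minimal Java sketch of the mix-up described above, and of how an enumerated type rules it out. The names Gender, State and describe are hypothetical, chosen only for illustration; they are not taken from PHP, Perl or any real library.</p>

```java
public class EnumSketch {
    // Hypothetical enumerated types, for illustration only.
    public enum Gender { MALE, FEMALE }
    public enum State { ACTIVE, DISABLED, DELETED }

    // Accepts only a State; the compiler rejects any other type.
    public static String describe(State state) {
        switch (state) {
            case ACTIVE:
                return "user is active";
            case DISABLED:
                return "user is disabled";
            default:
                return "user has been deleted";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(State.DELETED)); // prints "user has been deleted"

        // The next line does not compile (Gender cannot be converted to State),
        // so the mistake is caught before the program ever runs:
        // describe(Gender.MALE);
    }
}
```

<p>With plain integer constants, a gender and a state could both simply be the number 3, and nothing would stop one being passed where the other is expected. Here the commented-out call is rejected at compile-time.</p>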
<p><span style="font-weight: bold;">No Compiler</span></p>
<blockquote><p>Please point me to a modern language that is slower with longer variable and method names. Ruby, Perl, Python, OCaml and Erlang all &#8220;compile&#8221; the code to an intermediate form (bytecodes) and then execute those.</p></blockquote>
<blockquote><p>What? Are you suggesting that a comment in a procedure is parsed every time the procedure executes? I don’t know a single interpreted language implementation that would do that. The only exception are calls to &#8220;eval&#8221; or similar functions.&nbsp;</p></blockquote>
<p>As Perl, PHP etc. all take plain-text files as their input, it follows that they have to process these files, byte by byte. Agreed, the better ones parse the source to an intermediate form where e.g. execution of loops will not be slower for longer variable names or a more complex programming style, but they still have to take the hit once, during the conversion from the text form to the intermediate form.</p>
<p>I have experienced this first hand. <a href="http://www.uboot.com/">Uboot</a>  has about 350k lines of code (which is not unreasonable, the system provides mail, sms, photo galleries, blogs, subscriptions, and many more features, some of which are not active any more.) That takes about 4 CPU-seconds to convert to intermediate code (maybe faster these days, that was about 2 years ago). On each server we have about 30 instances of that code running. That means when we restart a webserver, it&#8217;s down for about 2 minutes. It does 2 minutes of useless work!</p>
<p>I have been told often enough, since working at Uboot, that I use the language wrong, that my programming is too &#8220;Java style&#8221;. The solution, I&#8217;m told by experienced Perl web developers, is simply not to write 350k lines of reusable library code, but instead write a simple large script with all the code rolled together. It starts faster, runs faster, and consumes less memory. And I&#8217;ve tried it: on some performance-critical sections I have indeed manually copy-pasted sections of code together to form one simple script, and it really does compile and run orders of magnitude faster.</p>
<p>I&#8217;ve essentially manually done what I would like a compiler to do. But that&#8217;s not the way I want to program. I do not want to be rewarded at runtime for bad programming practice!</p>
<blockquote><p>*Every* language bears this cost because they *all* have to parse the code at some point to either turn it into bytes or machine code.</p></blockquote>
<p>That is very true, but some languages do this on your build machine, not on your production machines when you start the service.</p>
<p>Also, doing this on your build machine means you can perform more expensive optimizations, as you don&#8217;t have to worry about how long those optimizations take, which you do if the compiling means your service starts slower.</p>
<p><span style="font-weight: bold;">No linker</span></p>
<blockquote><p>Your argument here is about memory footprint. This is a total non-starter on any modern operating system that does demand paging. If huge sections of your ruby/perl/python/whatever library are not used, the OS will never page them into RAM.</p></blockquote>
<p>This depends where you wish to deploy to. For sure, on a web-server, this doesn&#8217;t matter.</p>
<p>On Uboot I wrote the &#8220;Uboot Joe&#8221;, which is a program you can download to your Windows computer. I made the mistake of writing it in Java. To distribute it, I bundled the whole JVM (as most users won&#8217;t have one), which includes all sorts of things I never used, along with XML-RPC libraries (which no doubt include methods I never used) and my own code. The entire bundle came to 15MB. Our users had to download that just to get a program sitting on the tray, connecting to the Uboot servers, and popping up a few notifications. The size of this download was cited as one of the reasons why the program was not successful.</p>
<p>Yet cutting out unused functions via a linker is not rocket science. All C linkers do this (as far as I know).</p>
<p>I don&#8217;t think including the JVM was the wrong decision. The file would not have been so excessively big if the download had included only those classes and methods of the Java runtime which I, or the libraries I had used, could actually call at runtime.</p>
<blockquote><p>I don’t write massive GUI apps in Perl.</p></blockquote>
<p>Unfortunately I do write massive apps in Perl (albeit not GUI ones). And I did use Java to write a downloadable GUI app (albeit a simple one).</p>
<p><span style="font-weight: bold;">Multiple compile errors</span></p>
<blockquote><p>I prefer to write a test, watch it fail, write the code to make it pass. </p></blockquote>
<p>Right, but I&#8217;m tired of having to write test cases for trivial methods.</p>
<p>If I write a setter, I have to write a test case in Perl, otherwise it might fail because I made a spelling mistake. (I know from experience, writing test cases for even such trivial things really does actually help in Perl.)</p>
<p>In Java I don&#8217;t bother testing trivial methods; they just work.</p>
<p><span style="font-weight: bold;">Formatted Strings</span></p>
<blockquote><p>I went through a long period of time wondering this myself. I thought sprintf was good enough all these years, why should I bother with iostreams. Well, I experienced one too many crashes from the simple error of mismatching the printf format specifier with argument type (%s -&gt; int). These instances usually occur in logging statements that you don’t always encounter in normal code paths. This problem goes away completely with iostreams, as the most important benefit is type safety.</p></blockquote>
<p>Ah that&#8217;s true. And one of the good things about modern systems (article forthcoming) is that they know what the types of things are at runtime. If they don&#8217;t (C++ by default), then I agree with you completely.</p>
<p>I suppose my point related more to needlessly leaving out good things which existed in the past. Java had to wait till 1.5 to get printf (and 1.4 for regular expressions). One should be more aware of the history of programming languages, and what things have already been thought of.</p>
<p><span style="font-weight: bold;">Auto-creation of variables</span></p>
<blockquote><p>I agree with you on this one. It should be noted this is considered horribly bad practice in Perl now. Adding one line, &#8220;use strict;&#8221;, stops this from happening and every program I write begins with that. I think the PHP folk have long since started declaring and initializing variables for the most part. So it didn’t work.</p></blockquote>
<p>That is true, that &#8220;use strict&#8221; helps.</p>
<p>Alas many languages such as PHP do not have such a &#8220;use strict&#8221;.</p>
<p>However, even in Perl with &#8220;use strict&#8221;, you can still misspell a function/method name and that will only get picked up at runtime (assuming you unit-test or click-through that path, otherwise it will go unnoticed), and if you misspell an attribute name in a $self hash, that only gets picked up at runtime.</p>
<p>I mean the flexibility that Perl offers (i.e. you can fill the $self hash with anything, and write an AUTOLOAD method which gets called when a method does not exist) would mean that it would not be possible to check those things at compile-time. However for me the benefit of catching errors at compile-time outweighs the benefits of the flexibility. But that is a matter of opinion, for sure.</p>
<blockquote><p>Several features are dropped from new languages because the designers consider it &#8220;very dangerous, no _real_ programmer would ever use that&#8221;. As that’s a matter of opinion, we lose several powerful features just because they are&#8230; hmm&#8230; powerful. For example: GOTOs and Multiple Inheritance.</p></blockquote>
<p>That&#8217;s true, for sure. However I would use the same argument to say that the features of totally dynamic runtimes and languages (such as the Perl $self hash and AUTOLOAD mentioned above) are too powerful (and mean that certain static checks cannot be done). But that&#8217;s a matter of opinion, for sure.</p>
<blockquote><p>If it’s Turing-complete, your language is ultimately fine.</p></blockquote>
<p>I&#8217;m not sure about that. For me, a programming language is firstly a communication tool from one programmer to another programmer (or to the first programmer, but later). Secondly it is a way to express as many invariants as possible. Only thirdly is it a way to command the machine (which, as you say, all languages, including assembler, are capable of).</p>
<p>In that respect, one should choose a language firstly giving you maximum expressiveness (e.g. using an object-oriented language to program an object-oriented design, using a language which does not penalise you for creating libraries even if not all functions in the library are used in every program, etc.).</p>
<p>And secondly one should choose a language which enables you to express as many invariants as possible (e.g. the object being passed here should <span style="font-style: italic;">always</span> be a User, this number should <span style="font-style: italic;">always </span>be between 2 and 20, this reference should <span style="font-style: italic;">never</span>  be null), serving both as mandatory documentation and as a way for a computation process (e.g. compiler) to check as many of these invariants as possible.</p>
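<p>A minimal Java sketch of the three invariants mentioned above. The names User, Quantity and order are hypothetical. Note the limits of the example: only the &#8220;always a User&#8221; invariant is checked by the Java compiler. Java has no built-in ranged integers or non-null reference types, so the range and null checks here still happen at runtime, although they are at least stated once, in the type, rather than in a comment.</p>

```java
import java.util.Objects;

public class InvariantSketch {
    // Hypothetical domain type: a parameter declared as User can never be an email address.
    public static final class User {
        private final String name;

        public User(String name) {
            // Invariant: the name is never null (checked at runtime).
            this.name = Objects.requireNonNull(name, "name must not be null");
        }

        public String name() {
            return name;
        }
    }

    // Hypothetical wrapper type: a Quantity is always between 2 and 20.
    public static final class Quantity {
        public static final int MIN = 2;
        public static final int MAX = 20;
        private final int value;

        public Quantity(int value) {
            // Invariant: checked once, at construction (at runtime).
            if (MIN > value || value > MAX) {
                throw new IllegalArgumentException(
                    "value must be between " + MIN + " and " + MAX + ", got " + value);
            }
            this.value = value;
        }

        public int value() {
            return value;
        }
    }

    // The signature documents what may be passed; the compiler enforces the types.
    public static String order(User user, Quantity quantity) {
        Objects.requireNonNull(user, "user must not be null");
        Objects.requireNonNull(quantity, "quantity must not be null");
        return user.name() + " ordered " + quantity.value();
    }

    public static void main(String[] args) {
        System.out.println(order(new User("adrian"), new Quantity(5))); // prints "adrian ordered 5"
    }
}
```

<p>Passing a String or an int directly to order is a compile error, and new Quantity(21) throws an IllegalArgumentException as soon as the bad value is created, rather than somewhere later in the program.</p>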
]]></content:encoded>
			<wfw:commentRss>http://www.databasesandlife.com/programming-languages-is-newer-always-better-part-2/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Programming Languages: Is newer always better?</title>
		<link>http://www.databasesandlife.com/programming-languages-is-newer-always-better/</link>
		<comments>http://www.databasesandlife.com/programming-languages-is-newer-always-better/#comments</comments>
		<pubDate>Wed, 26 Mar 2008 09:58:03 +0000</pubDate>
		<dc:creator>adrian</dc:creator>
		
		<category><![CDATA[Broken]]></category>

		<category><![CDATA[Coding]]></category>

		<guid isPermaLink="false">http://www.databasesandlife.com/programming-languages-is-newer-always-better/</guid>
		<description><![CDATA[I constantly hear the belief that modern programming languages and environments are better than older programming languages. More productive, easier to use, and so on. It would stand to reason: nobody would make a new programming language with worse features than an already existing programming language. Or would they?
Everyone seems to think that this is [...]]]></description>
			<content:encoded><![CDATA[<p>I constantly hear the belief that modern programming languages and environments are better than older programming languages. More productive, easier to use, and so on. It would stand to reason: nobody would make a new programming language with worse features than an already existing programming language. Or would they?</p>
<p>Everyone seems to think that this is fact. But surprisingly it&#8217;s not. There are many features in older programming languages which are not present in today&#8217;s languages. I predict these features will be re-invented by the next generation of programming language authors, and everyone will think they are geniuses for having come up with these ideas. But at the same time those new languages will omit most of the good points of today&#8217;s languages. This cycle can go on forever.</p>
<p>It&#8217;s like the cycle that tends to take place between &#8220;the network&#8221; and &#8220;the standalone computer&#8221;.</p>
<ul class=tight>
<li><strong>Central</strong> - IBM used to make mainframe computers, which one would access from terminals, i.e. central computing power, distributed usage.</li>
<li><strong>Local</strong> - But those computers were slow because they were remote. Then e.g. Sun invented the &#8220;workstation&#8221;. The PC then followed. Local power to everyone.</li>
<li><strong>Central</strong> - Then the web happened. Suddenly everything was remote again. &#8220;All you need is a browser!&#8221;. No local software installation nightmare. (Perhaps) independence from the single operating system vendor.</li>
<li><strong>Local</strong> - And now &#8220;using the web offline&#8221; is back in fashion. So that&#8217;ll be local computing again then.</li>
</ul>
<p>A few facts, for those who think there was no programming before JavaScript and the web:</p>
<ul class=tight>
<li><strong>1957</strong> - Fortran released: expressions, variables, loops, subroutines</li>
<li><strong>1959</strong> - LISP released: treating functions as data, enabling higher-order programming</li>
<li><strong>1967</strong> - Simula 67 released: Object-oriented programming</li>
</ul>
<p>Consider the following:</p>
<ul>
<li><strong>Variable Bounds.</strong> Ada, developed for the American military with a high emphasis on program correctness, allows one to define bounds for variables. For example &#8220;array with index between 1 and 100&#8221; or &#8220;0 and 10&#8221; or number &#8220;not more than 5&#8221;. Most variables, in reality, have allowed ranges. Why not express that in the program? It&#8217;s more self-documenting, and it allows the run-time, and to an extent the compiler, to check the constraints. Isn&#8217;t minimization of bugs something that affects not just the military?</li>
<li><strong>Strict typing.</strong> If you know an object being passed to a function is a &#8220;User&#8221;, it&#8217;s no good being passed an &#8220;Email Address&#8221;. The sets of operations those objects can perform are completely different, so even if the programming language is &#8220;advanced&#8221; enough to be able to accept the parameter, the first method call on the object will fail. Why not express that and let the compiler check it? C++ can do it (since 1983), so let&#8217;s use that, not Perl, which can&#8217;t. Recently I read an article making a joke about <a href="http://thedailywtf.com/Articles/Type-Safety-Considered-Harmful.aspx">casting everything to a string</a>, but in reality that&#8217;s the default behaviour (in fact the only behaviour) of all scripting languages.</li>
<li><strong>Knowing what&#8217;s going on.</strong> In C, it&#8217;s well defined what &#8220;0&#8221; means or what the string &#8220;abc&#8221; in a program means, and so on. Ask a C programmer if 0==NULL and ask a PHP programmer if 0==null and see a) their reaction times b) if they&#8217;re correct. The C programmer will know fast and be correct, the PHP programmer will not. Who do you think writes programs with fewer subtle bugs?</li>
<li><strong>Enumerated types.</strong> Is a user &#8220;active&#8221;, &#8220;disabled&#8221;, &#8220;inactive&#8221;? Having such options is common to all domains. C has been able to define an enumerated type since ANSI C (1989) and Lisp since 1959. Why did Java have to wait until Java 5.0 (in 2004), and why do we have to create unreadable programs with languages like Ruby which can&#8217;t do them at all? For example, what does the function error_log(&quot;user not found&quot;, 2) do in PHP? What does the <a href="http://php.net/error_log">2 mean</a>?</li>
<li><strong>No compiler.</strong> Every byte in an interpreted language costs time to interpret. So it makes sense to have short variable names, fewer comments, for run-time efficiency. Is this the sort of programming style one should be encouraging?</li>
<li><strong>No linker.</strong> You can build big libraries in a linked language, and only those functions used by the program (or used by the functions used by the program) will be included in the final executable. In Java, PHP etc, all the code you use is available all the time, taking up memory. I am often criticized for writing &#8220;too many libraries&#8221;, or code being &#8220;too object-oriented&#8221; in scripting languages, which is a fair criticism, as that code will run slower. However is it really an improvement to remove this function-pruning feature, which means bad programming practices will produce more efficient code?</li>
<li><strong>Multiple compile errors.</strong> Why do modern programming languages such as PHP only tell you the first error in your program, then abort? This is laziness on the part of the compiler writer. Old compilers tell you all the errors in your program, so you can correct them all, without having to correct one, retry, correct next one, retry, and so on.</li>
<li><strong>Formatted strings.</strong> There is nothing wrong with the format concept behind C&#8217;s &#8220;sprintf&#8221; function, originating from 1972 (apart from the inability to reorder parameters). You can print numbers and strings, specify precision, field length and so on. Why did C++ introduce the &#8220;&lt;&lt;&#8221; notation? (At least you can still use printf in C++.) Why is this re-invented, worse, in <a href="http://msdn2.microsoft.com/en-us/library/txafckwd.aspx">.net</a>? Why did Java have to wait until Java 5.0 to get this feature? Why do we have to reinvent the wheel (worse) all the time?</li>
<li><strong>Auto-creation of variables.</strong> When programming languages like C were created, the authors made the decision that it was an error to use a variable without declaring it. This caught all sorts of errors such as misspellings of variables. Why have these decisions been forgotten, so that every scripting language allows you to just use variables without declaring them? This means hours of searching for bugs when you simply misspell a variable name, something that&#8217;s going to happen to everyone at some point. We&#8217;re only human and we have to take that into account.</li>
</ul>
<p>The above is a list of things that have got worse over the last two decades; i.e. they haven&#8217;t merely failed to improve by staying the same, they have actually got worse.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.databasesandlife.com/programming-languages-is-newer-always-better/feed/</wfw:commentRss>
		</item>
	</channel>
</rss>
