“--” in XML comments

I was slightly surprised today when I checked XML containing the following comment for validity,

<!-- "svn info --xml" and "svn log --xml" command -->

and to see the following error from the validity check:

Low-level XML well-formedness and/or validity processing output
Error: -- in comment
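The same rejection can be reproduced with other parsers. Here is a minimal sketch using Python's standard library parser, which is not the tool that produced the message above, so its error wording differs. The <root> wrapper element and the helper name is_well_formed are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the stdlib parser accepts xml_text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# The comment from this article, wrapped in a hypothetical <root> element.
bad = '<root><!-- "svn info --xml" and "svn log --xml" command --></root>'
# A reworded comment that avoids the double hyphens.
good = '<root><!-- "svn info" and "svn log" with the xml flag --></root>'

print(is_well_formed(bad))
print(is_well_formed(good))
```

The first document should be rejected and the second accepted, so the only difference that matters is the "--" inside the comment.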

This error does what it says on the tin: the string "--" is indeed not allowed in comments. To quote Wikipedia:

The string "--" (double-hyphen) is not allowed inside comments; this means comments cannot be nested.

While technically correct, the implication seems to be that the objective was to disallow the nesting of comments, and that "--" therefore had to be disallowed. That ignores the question of why nesting of comments should be disallowed in the first place (why?), and the fact that one can disallow nested comments without disallowing "--" (one need only disallow "<!--"). (Did I mention Wikipedia is a site I don't really like, and I have a tendency to believe it is written by fools?)

Also, looking at the error I originally encountered, I got the feeling there was an "if" statement to detect exactly that situation. If you removed that "if" statement, everything would have worked fine. There was no need for it. But they had to put it in there, because that's what the XML specification demanded. I am glad I was not the programmer who had to implement that "if" statement, for I fear I would have become annoyed (i.e. more annoyed than I am now, simply encountering it).
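To make that "if" statement concrete, here is a rough sketch of a comment scanner. It is not taken from any real parser; the function name scan_comment and the strict flag are hypothetical. With strict=True it follows the XML rule that "--" may only appear as part of the closing "-->". With strict=False it simply looks for the first "-->", which is what a parser without the extra check might do.

```python
def scan_comment(text: str, start: int = 0, strict: bool = True) -> int:
    """Return the index just past the comment that begins at text[start]."""
    if not text.startswith("<!--", start):
        raise ValueError("not a comment")
    body = start + 4

    if not strict:
        # Lenient variant: the comment ends at the first "-->".
        end = text.find("-->", body)
        if end == -1:
            raise ValueError("unterminated comment")
        return end + 3

    # Strict variant: the first "--" in the body must be the terminator.
    dashes = text.find("--", body)
    if dashes == -1:
        raise ValueError("unterminated comment")
    # This is the "if" statement: "--" not followed by ">" is an error.
    if not text.startswith(">", dashes + 2):
        raise ValueError("-- in comment")
    return dashes + 3


comment = '<!-- "svn info --xml" and "svn log --xml" command -->'
print(scan_comment(comment, strict=False) == len(comment))
try:
    scan_comment(comment)
except ValueError as error:
    print("Error:", error)
```

In this sketch the lenient variant consumes the whole comment without complaint, while the strict variant stops at the first "--xml".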

http://www.howtocreate.co.uk/SGMLComments.html explains "why", at least from an SGML perspective:

To put it simply, the double dash at the start and end of the comment do not start and end the comment. Double dash indicates a change in what the comment is allowed to contain. The first -- starts the comment, and tells the browser that the comment is allowed to contain > characters without ending the comment. The second -- does not end the comment. It tells the browser that if it encounters a > character, it must then end the comment. If another -- is added, then it goes back to allowing the > characters:

<!-- this can contain > characters -- this can not,so the comment ends here>
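The toggling rule in that quote can be sketched as a tiny state machine. This is a simplified model of the rule as the quoted passage describes it, not a full SGML parser; the function name sgml_declaration_end is hypothetical.

```python
def sgml_declaration_end(text: str, start: int = 0) -> int:
    """Return the index just past the ">" that ends the declaration."""
    if not text.startswith("<!", start):
        raise ValueError("not a declaration")
    i = start + 2
    in_comment = False  # True while ">" is allowed without ending anything
    while i < len(text):
        if text.startswith("--", i):
            # Each "--" flips between "inside comment" and "outside comment".
            in_comment = not in_comment
            i += 2
        elif text[i] == ">" and not in_comment:
            # Outside a comment, ">" ends the whole declaration.
            return i + 1
        else:
            i += 1
    raise ValueError("unterminated declaration")


example = ("<!-- this can contain > characters -- "
           "this can not,so the comment ends here>")
print(sgml_declaration_end(example) == len(example))
```

Under this model the first ">" is skipped because it sits between the two "--" markers, and the declaration only ends at the final ">".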

So I tried exactly that example comment in my XML file:

Error: -- in comment

Brilliant. XML "nearly works". In the sense of "doesn't work". See CDATA.

This article is © Adrian Smith.
It was originally published on 10 Oct 2012