Monday, 28 September 2009

Documentation rant - part two

Last time I desperately tried to convince you to take source code documentation seriously and not treat it as a hurried afterthought, lest the technical debt management unit catches up with you and demands an explanation as to why you executed an explicit commit on that database handle, when it is running in auto-commit mode. Believe me, in a year you will have forgotten the completely valid reason for doing so unless you document it today.

I would like to dwell a little on the difference between API documentation and the rest, because it's an essential difference. In Java, API documentation can be generated from your source files, provided your code comments are properly formatted according to the Javadoc conventions. Modern development environments like Eclipse and NetBeans make life very easy in that regard.

On the other hand we have these little one-word or one-line comments scattered throughout. Let's call them inline comments. Whereas these are an integral part of your source code, API docs are extracted to be read separately, intended for your users. Users of a third-party software library are programmers, but users nonetheless insofar as they should be expected to use your API without reference to the sources. With many commercial software libraries they will not even have access to them. Therefore make sure that the documentation for each of your classes and methods is a mini-manual.

The manual tells you what your library (or toaster) does and how to operate it. It only touches on technical details where they are relevant for the operation. Documentation for a class or method should cover behaviour (what goes in, what comes out, what can go wrong) but no explanations as to why. If it becomes necessary to explain something unintuitive in your design it's your job to fix the design. Unless of course you're not in a position to do so.
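To make that concrete, here is a minimal sketch of what such a mini-manual could look like in Javadoc. The class name, the postcode format and the choice of exception are my own assumptions for the sake of illustration, not taken from any real library. Note that the comments say what goes in, what comes out and what can go wrong, and nothing about why.

```java
import java.util.regex.Pattern;

/**
 * Validates postcodes of the form "1234 AB" (four digits, an optional space,
 * two capital letters). The format is an assumption made for this example.
 */
final class PostcodeValidator {

    private static final Pattern POSTCODE_PATTERN =
            Pattern.compile("[1-9][0-9]{3} ?[A-Z]{2}");

    private PostcodeValidator() {
        // static helpers only
    }

    /**
     * Checks a postcode and returns it without surrounding whitespace.
     *
     * @param postcode the postcode to check; leading and trailing whitespace is ignored
     * @return the trimmed postcode
     * @throws IllegalArgumentException if the postcode is null or does not match the format
     */
    static String validate(String postcode) {
        if (postcode == null) {
            throw new IllegalArgumentException("postcode must not be null");
        }
        String trimmed = postcode.trim();
        if (!POSTCODE_PATTERN.matcher(trimmed).matches()) {
            throw new IllegalArgumentException("not a valid postcode: " + trimmed);
        }
        return trimmed;
    }
}
```

A caller who only ever sees the generated HTML should be able to use validate without opening the source file.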

If you find yourself writing a lot of inline documentation, one or more of the following may be the case:

  • You underestimate the intelligence of your fellow programmers. Assume that the reader of your code is a competent programmer. If a comment doesn't pass your own 'duh-test', leave it out.
    // loop through all Address objects
    for (Address adr : getAddresses()) {
        // throw a ValidationException if the postcode doesn't match the regular expression
        if (!POSTCODE_PATTERN.matcher(adr.getPostcode()).matches()) {
            throw new ValidationException(...);
        }
    }

  • Your typical opening and closing braces are regularly more than a few hundred lines apart. You keep getting tangled up in your own spaghetti and have to tell yourself what you're doing every three lines. This is really bad and I'm not even going to explain why. In fact, I won't even give you an example. It's time for some serious refactoring.

  • You're programming against an undocumented, unintuitive or buggy API (probably all three at the same time). It has methods called doStuff or patrick_solution. You have my sympathy. This is where inline comments are indispensable, because you can tell the world it's not you who is incompetent.
    Dog fido = zoo.getDogByName("Fido");
    // Yeah, it says Id, but it's actually a lookup by name
    Cat minou = zoo.getCatById("Minou");
    proxy.insertAndCommit(newRecord);
    // insertAndCommit doesn't seem to autocommit on MySQL 4.x. AARRGH!!
    proxy.getConnection().commit();

  • You get paid by lines of code. Where do you work, and are they hiring?

As far as inline comments go I see a lot of type one and type two, and not nearly enough of type three, especially in code that makes frequent use of open source libraries. There are a lot of unintuitive and badly documented libraries around that actually work fine once you know where the pitfalls are. Why any programmer would not go that extra mile to make their stuff actually usable is beyond me. Unless they really are only working for their own pleasure.
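One way to get more mileage out of a type-three comment is to write it exactly once, in a thin wrapper around the offending call. The sketch below reuses the getCatById example from the list above; the Zoo interface is a hypothetical stand-in for the third-party library and ZooAdapter is a name I made up, so treat this as an illustration rather than a recipe.

```java
// Hypothetical stand-in for the badly named third-party API.
interface Zoo {
    String getCatById(String id);
}

/** Keeps the library's naming quirk in one place so callers never see it. */
final class ZooAdapter {

    private final Zoo zoo;

    ZooAdapter(Zoo zoo) {
        this.zoo = zoo;
    }

    /**
     * Looks up a cat by name.
     *
     * @param name the name of the cat
     * @return whatever the underlying library returns for that name
     */
    String findCatByName(String name) {
        // Not a typo: despite what it is called, getCatById performs a lookup by name.
        return zoo.getCatById(name);
    }
}
```

The rest of your code calls findCatByName and the warning lives in a single spot, right next to the call that needs it.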

Friday, 18 September 2009

The passport from hell

This week I was going to post part two of my course in source code documentation, but something far more important has come up to rant about. It's the new Dutch passport, which will hold the owner's digitally encoded fingerprints, in time amounting to a huge biometric database. The year is 1984 again.

This post is not going to be about hackers logging in to the monster database with password "change_on_install"*. I don't doubt the system can be compromised. It's only a matter of time. Those responsible for guarding sensitive data in the Netherlands have proved themselves shockingly cavalier and nothing but an embarrassment of epic proportions is likely to effect a change. Apparently with a cheap set-up and some patience you can produce your own forged prints on plastic foil and wear them to the scene of the crime. It must be true, because I read it on the Internet.

You know, forget about fingerprints. They can be forged and therefore in court they don't always stand up. We all know the holy grail of forensic conclusiveness is DNA. Unless you have an identical twin sibling, your DNA is intimately yours. It's impossible to produce your own fake DNA to throw forensics off the scent. Next time you get a passport you'll be handing over a saliva sample, mark my words.

Anything that can go wrong will go wrong. Any technology or data that can be put to evil use will be abused. In the same way that the usefulness of a mobile phone network grows exponentially with each new user, so does the usefulness of a biometric database of all citizens, for good and for evil.

Imagine how easy it becomes for prospective wrongdoers to incriminate someone of their personal acquaintance, secure in the knowledge that their DNA is stored and can be pulled from the database in no time at all. Consider how easy it is to obtain a DNA sample from someone you know. I don't mean a printout from the lab, but actual tissue. Any article of clothing is teeming with it. Just steal a hairbrush and carefully place a few hairs (not too conspicuous, of course) over the murdered body of your choice. Make sure the intended suspect has no credible alibi and they have been in contact with the future victim, preferably with some supporting CCTV footage. Bob's your uncle.

* This is the default administrator password for Oracle databases, and it's appalling how often I have found it still used in production systems, despite the unambiguous hint in its name...

Saturday, 12 September 2009

Brush your code after every meal

Programmers have many pet hates -- hardware and software being just two of them. There is however a bewildering paradox that I would like to talk about today. It is the anguish of documenting your own code on the one hand and the torture of having to use someone else's undocumented code on the other. Actually there is one thing worse than not having documentation. That is bad or outdated documentation and the world of open source software is rife with it. Old docs are like an old copy of the Lonely Planet where you travel half a day to visit some must-see haunt only to find out it closes on Sundays.

To many programmers writing source documentation is up there with cleaning the rim of the toilet bowl in terms of satisfaction. It wouldn't be like that if these people took a more selfish approach. Taking your documentation seriously is not for the good of mankind. It's all about doing yourself a favour in the end.

Source documentation is different from functional requirements and specifications in terms of the intended audience: it is written by techies to be read by techies. It is also different from other technical documents such as UML diagrams in terms of its purpose. Requirements and specifications describe what the software should do, whereas source documentation describes what the system actually does. As a consequence it is impossible to document what some class or method does before you have coded and run it. If you do document beforehand you'll need to check carefully that what you have coded is in line with your documentation, otherwise your carefully crafted text is instantly useless. And we already know that bad documentation is worse than none at all.

There is another good reason why you should document after coding. The writing of a piece of software is a fluid process, especially in the early stages. Over a short period of time you will throw away some classes, split them up, merge them and add or remove arguments. The fewer functional requirements you have, the more this will be the case. All the while you keep a clear mental image of the whole, and once you're satisfied you go over it again and describe exactly what you have done. In an ideal world, that is.

The satisfaction of seeing your own code work makes you hungry to write more. Why should I write down what happens when I can see what happens? Try to resist the urge to steam ahead. Take a step back, revise what you have written and describe it. More often than not you'll spot a bug or two in the process. More importantly though, all those classes and their interrelations make perfect sense in your brain now, but won't in a year's time. Pay back your technical debt now before the interest eats up your team's budget. If you don't care for your employer's money at least protect your own future sanity. For those still unconvinced let me make it clear with a little dental metaphor. Brushing your teeth is not as much fun as the meal that preceded it, but it's nothing compared to the agony and expense this poor man went through:

Next week I'll share with you a way to make documentation more efficient and less of a chore. I have called it 'the duh test' – if that's too juvenile or American for your taste you may call it the “Well, obviously, my dear fellow” test. Whenever you feel you could stick a duh after your comment, leave it out altogether. Documentation is about stating the non-obvious.
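As a small foretaste, here is a sketch of the duh test applied to a few lines of Java. The class, the postcode format and the story about legacy records are all invented for this example. The only comment that survives is the one telling you something you could not have read from the code itself.

```java
import java.util.List;
import java.util.regex.Pattern;

final class AddressCheck {

    // Assumed format for this example: four digits, optional space, two capitals.
    private static final Pattern POSTCODE_PATTERN =
            Pattern.compile("[0-9]{4} ?[A-Z]{2}");

    private AddressCheck() {
    }

    static int countInvalid(List<String> postcodes) {
        int invalid = 0;
        // A comment like "loop through all postcodes" would fail the duh test here.
        for (String postcode : postcodes) {
            // Passes the test: null counts as invalid instead of blowing up, because
            // (in this made-up scenario) legacy records may have no postcode at all.
            if (postcode == null || !POSTCODE_PATTERN.matcher(postcode).matches()) {
                invalid++;
            }
        }
        return invalid;
    }
}
```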

Saturday, 5 September 2009

Would you download a car?

It's already a few years old, but if you ever bought a DVD in the UK you'll remember this one:
You see nasty people stealing all aforementioned items, and then a teenage girl behind her computer downloading a film, thereby instilling the notion in us that downloading content illegally is tantamount to mugging old ladies in the park. I love the tenuous British sense of proportion in what is otherwise an annoying and superfluous yet unskippable clip on a perfectly legal copy, but I won't go into that paradox now. Try YouTube for some of the hilarious parodies it inspired.

Propaganda works with imagery that evokes an emotional response to accompany your message, but can be completely unrelated to it. Don't underestimate the power of association. People will create a context between what they see, hear and smell, however flimsy the connection. Bad breath has nothing to do with a person's character, but it will ruin any date.

At the far end of the propaganda spectrum we have the infamous anti-Semitic pamphlets of the Third Reich and in a less pernicious form we have Michael Moore's controversial editing of George Bush's finest moments in Fahrenheit 9/11. Even the fact of me mentioning Moore in the same paragraph with the Nazis is purposely creating a connection in your brain right now. Objection, your honor!

I'm not a lawyer, but I did study Dutch law for one year and I will tell you what you and the people who commissioned this silly bit of agitprop already know.

Yes, copyright infringement is against the law in most countries, but if you equate downloading with stealing you practise justice by analogy. You may claim that the effect of sneaking a physical disc out of the Virgin Megastore or downloading it from the Pirate Bay boils down to the same thing – leaving Richard Branson out of pocket. You may even have convincing evidence that verbal abuse can be as bad as physical assault. However, a criminal act is defined by what people do, while the harm it causes to society is (or should be) expressed in the punishment.

There are no people fussier about wording than lawyers and judges, with the exception of good translators. Informally put, in Dutch law stealing means removing (1) a physical item (2) belonging to someone else (3) without permission (4), with the intention of keeping it (5). If the prosecution can't persuade the judges of all five elements, you're off the hook.
How about not returning my library books? They're physical, they're not mine, I took them out from the library and I don't intend to bring them back. Ah, but my client did have permission to remove them from the library, your worship. He's not a thief, he's a rotten embezzler of books.

If you want a better analogy, then downloading is like getting on a train or into a cinema without a ticket. You're enjoying something for free that other people paid for. Provided there are enough seats, you don't impede their enjoyment.

So much for this overly long pedantic preamble. I have a confession to make. I count many illegal downloaders among my friends, colleagues and acquaintances. None of them steal cars or beat their spouses as far as I know. So why do they do it?

Downloading is just too easy to do and too easy to get away with. Historically, when a crime is ubiquitous and the perpetrators are tough to track down, the law retaliates with excessive punishment. Charging Jammie Thomas almost two million dollars ($80,000 per song) is reminiscent of the practice of killing horse thieves in the Old West. You wouldn't break the speed limit if it cost you your car and your house. In the Netherlands no civilians are bankrupted for ripping a few albums. Although the political climate is set to change, right now the most compelling incentive for people not to download content would be their conviction that it is simply wrong. It appears most people, especially the young, don't feel that strongly.

The effect that being completely invisible would have on a person's morality has been argued by philosophers and explored in literature and film. I myself believe that the getting-away-with-it part of the attraction is weaker than the it's-not-that-big-a-deal conviction. The effect of illegal downloads is not tangible, like punching somebody in the face. When you don't see your victim you cannot contemplate the harm you have caused. Some will argue that their downloading does no harm at all. They still buy as many CDs as before and will claim it is a victimless crime.

If you want to convince people that downloading hurts, show some of the small record stores and video rental shops going out of business. I don't begrudge Metallica drummer Lars Ulrich his wealth, but when he spoke out against Napster I didn't feel sorry for him. The average punter has no compunction about making millionaires a little less wealthy.

I hope the Capitol v. Thomas case will prove to be a Pyrrhic victory in the end for MegaCorp Inc and that the Internet will prove to be a blessing, not a curse. Authors become publishers through print on demand. Bands let you download their music from their own web sites and sell must-have limited editions straight to the fans. Power to the people!