The Robustness Principle

It seems I’m becoming a man of principles. Last week I wrote about the principle of least surprise. This week I was reading this article about martians over at Joel it reminded me about my previous job in some non-obvious ways. And before you say it, no I didn’t work with martians or any sort of aliens for that matter. The main topic of Joel’s blog is very interesting but a side issue is the discussion of the robustness principle as defined by Jon Postel in RFC 793.

2.10. Robustness Principle

TCP implementations will follow a general principle of robustness:  
be conservative in what you do, be liberal in what you accept 
from others.

Whilst Joel is talking about HTML standards Jon is talking about TCP standards. Anyway, whilst this difference is minor the pedant in me is forced to point it out ;-).

However it’s fair to say that such a principle exists and that I’ve been living by it, without actually having a name for it, for some time. The reason I started doing it was because I inherited a large code-base for a distributed system once and I discovered that occasionally components would simply terminate, users would complain and I would have to go and invent some bullshit to explain why our system wasn’t working.

I felt like I had to invent some bullshit because I’d found that some components had statements in them like the following:

    if (such and such)

The ‘such and such’ that caused the termination would be some condition that should not have occurred. You see, the original programmer felt the need to have all unknown conditions delivered to him by aborting the process. This is rather like demanding a criminal’s head on a plate because aborting a process under UNIX will usually cause a core dump. This memory dump file contains the contents of all the memory and registers at the time that abort() was called. With the appropriate tools this gives something to indicate what the cause of the problem might have been. This style of programming is sometimes referred to as kamikaze programming but I think seppuku programming is probably more apt.

My sin was to systematically remove these little time-bombs and attempt to handle the condition that had occurred. The problem with this, of course, is that if any of these conditions have undesirable side-effects then they can store problems up for the future which, I guarantee, will be a lot harder to find and fix.

It seems there is no right solution to this problem either. Extending Joel’s conclusion we could say that the idealist (the original code author) wanted to control the error state and deal with each case appropriately by hand. Perhaps, eventually, all these little abort() statements would get to be unused code and the system would run very smoothly. Perhaps. The pragmatist (me) would have a hard enough time of keeping the system running as it was without having components voluntarily dying all the time.

I stand by my decision to improve stability but I think I probably should have left a couple of the more esoteric abort()’s in place. I did improve the stability of the system to some extent. But my effort to restore faith in the system by making it completely stable would be somewhat thwarted for other reasons (more closely related to Joel’s post about martians). As we shall see next week …

.NET programming

The Principle of Least Surprise

I first came across the principle of least surprise a couple of years ago when someone pointed out to me that some script that I had written had violated it. Naturally defensive, and of course mildly arrogant, I asked my critic what he was talking about. He said that a command line script should return an explanation of its accepted arguments when given a –help or -h argument. He was right and I was ashamed and so I naturally did what any respectful programmer would do, I fumed quietly for a few hours and then fixed it when he wasn’t looking.

It seems then that almost any user-interface should not violate the principle of least surprise and this is common sense. When I click on a cancel button in a dialog it should not accept my input it should return to where I had come from leaving everything unchanged. This was my first violation this week and it’s a common one, a ‘Cancel’ button in a dialogue did the opposite of what I expected. This was in a popular source control tool (that I won’t name because my version is old and the bug is doubtless fixed now) and it did in fact ‘Submit’ the broken change that I was so keen to cancel. After a short bout of tourrette’s and 5 minutes of trying to figure out how to undo the change I was back on track but my confidence shaken a little.

It’s not just user-interfaces that should not violate the principle of least surprise, APIs shouldn’t violate it either. You know the sort of thing. For instance, a ‘getter()’ should be idempotent and not change any state. Languages like C++ have support for this through the ‘const’ method modifier but I can’t think of another language that I know of that enforces this in the same way.

So I was incensed when I thought I’d discovered an example of such a violation in the .NET framework. I was having trouble using the method Uri.MakeRelative(Uri uri) because I had misunderstood exactly what it did (because I didn’t RTFM) and I expected it to simply strip the routing information from the Uri and return the document path. What it actually does is a more general case and it will effectively ‘subtract’ one Uri from the other. Which is more useful when you have things like embedded Uri’s. Anyway, the depth of my sickness isn’t important, the important thing is that I realised that I could have used the principle of least surprise to save me.

As far as I can tell I haven’t ever found a .NET framework method that didn’t do as I expected, that’s partly because of the number of people who use early-release versions of .NET and iron out these problems for me. Thanks! Not only that over the years I have entered into a love-hate relationship with Microsoft. I won’t enumerate the things I hate because disk space is finite but what I really like is the way that M$ have a vast array of tools and products that seem to co-exist with very little problems. This is no small feat, it’s a triumph of quality control and standards and I applaud them.

The conclusion is then, that I should abide and uphold the principle of least surprise wherever possible. Additionally though, I can apply the principle in reverse when I trust the source of the interface to not violate the principle. Moreover if I can succeed in not violating the principle myself then people who use my and my team’s work will have more confidence in it … I can dream, can’t I?


Virtual Host Development: Part 2

Last week I made the case for using virtual hosts for software development purposes. So this week we’re going to actually try it out. There are a number of considerations to make before we get stuck in so let’s go …

Choosing a Virtual Host Manager

As far as I can tell there are two main equivalent software products for virtual host solutions. They are: Microsoft Virtual PC 2007 (MVPC) and VMware Workstation 6 (WS6). The most notable difference between the hosts is that WS6 costs (at time of writing) $189 and MVPC is free.

However, free comes at a cost. There are a number of disadvantages (for me at least) with MVPC that mean that I will be ponying up the cashwa for vmware’s offering.

  • Guest Display Hardware – Since we have a virtual host it also has a virtual display card. The one that sits inside MVPC is only capable of screen resolutions upto 1600×1200. Since I have a wide-screen display that is set to 1920×1200 it’s a bit of a waste. However vmware can also display on multiple monitors making use of dual head displays.
  • Guest Support – Yes I love Linux and I have to say (because I tried it) that trying to create an Ubuntu virtual guest on Virtual PC was like trying to herd cats. I’d get one part working only to find that something else was bust, whilst it’s possible it’s a pain. If you decide to go ahead with this you should take a look at this article which explains the hoops you must get your kitty’s through. In theory, I guess, you should be able to get any x86 OS that works in WS6 working in MVPC (as long as it isn’t 64 bit). So far though, and this is an observation based on one data point, I think the virtual hardware in WS6 must be more generic than MVPC.
  • Guest OS Tools – Both MVPC and WS6 require you to install additional tools after the core OS install that enhance the operation of the virtual machine. Why I couldn’t really figure out exactly what the tools do it seems reasonable to assume that they’re useful. Indeed it’s not until you install the tools on WS6 that you can access the higher resolutions. With WS6 then it provides tools for Linux where as MVPC does not. Whether this makes an observable improvement to the WS6 experience for Linux would require more time to tell ..
  • Host OS Support – VMware will run on Linux. So should I decide that I want to take my removable disks and put them somewhere else I could. At least in theory
  • Converters – WS6 has a free ‘converter’ which allows the creation of WS6 virtual windows (only) hosts from MVPC ones. Which is nice. Not only that though the converters are bootable so that you can create a virtual ‘copy’ of an already installed non-virtual OS. Whilst I did successfully convert MVPC hosts I didn’t try and virtualise an existing host
  • Snapshot Manager – vmware has a snapshot manager which makes the management of snapshots a lot simpler to comprehend. You get a sort of snapshot time-line that you can chose from and cycle through. MVPC has a similar concept but it’s not as clean.

Snapshot Manager

The virtual host manager you chose will depend on what you need. For me MVPC is not a choice, however there are probably a fair few people who could get good results from MVPC making the benefits of WS6 not that important. Since you can download a trial version of WS6 I would suggest you at least try it before deciding. To me WS6 product feels mature and the user interface seems more intuitive. Your mileage may vary.

For the remainder of this discussion I will talk about using WS6. Whilst you could in theory use MVPC and mostly apply the same ideas the two are not equivalent, as already discussed.

Creating The Host

Having settled on vmware I should say a little about the settings I chose. I’m not saying that these are ideal but for development purposes they seemed like sensible defaults to me. Firstly I instructed WS6 to create me a custom virtual host. For the most part I used the presented defaults but deviated on a few items:

  1. Non-Local Disk – When I say non-local I mean non-local to the host OS. If the disk of a virtual host is the same physical disk of the non-virtual host then you will have I/O performance issues. This is especially true for laptops that tend to have slower hard disks than their desktop equivalents (4200rpm vs 7200rpm). That’s why I went and bought an external drive enclosure and speedy hard disk before I started
  2. Memory – Since these hosts are development hosts I’m probably going to need some RAM, especially since I intend to run VS2005 on the windows host. Therefore I chose 1Gb RAM for the windows hosts and 512Mb for the Ubuntu host. Note that for windows changing the amount of memory after the OS is activated can force you to reactivate your license. So getting it right-enough first time would be a bonus. After running both systems on this configuration for a couple of weeks I can say that the settings are convenient enough for me and I’ve not noticed any excessive paging and performance is fine.
  3. Networking – I elected to chose NAT as my virtual networking setup. This puts my virtual hosts on their own subnet, they access the internet as the non-virtual host and then the results are routed back by vmware. The other alternative might be to bridge virtual and real host but this, I think, would make my virtual host a peer of the real one on the domain. Something I want to avoid for now.
  4. Disk Size – I allocated 80Gb disks to each host. In practice I would probably only use about 10% of that space but WS6 does not allocate the space all at once. This means I only use as much space as I need. This obviously incurs an overhead because the disk is allocated on demand. However I wanted to see if this really causes a noticeable problem before I elect to allocate all the disk at once. Keeping the space to a minimum obviously has advantages for backup purposes but also (I would guess) mean that snapshots take less time.

Building The Host

Obviously I must now load an appropriate OS onto each of my virtual hosts. You do this by simply attaching an ISO to the machine and installing in the normal way. Once installed you should install the VMware tools that improve the integration between your host and guest OS. After installing the extensions (and not before) you should activate your license on Windows based-hosts.

Next you should apply all the updates that are necessary to make your host a good netizen, either through Microsoft Update or whatever package manager your OS uses.

It was at this point that I ran into an issue with Ubuntu which relates to the networking. By default Ubuntu 7.10 turns ‘roaming’ network mode on and this doesn’t play nice with WS6. So turning it off and selecting DHCP insted was needed.

Network Roaming Mode Is Bad

Once built and patched I created checkpoints for both systems to rollback to or clone. This will be useful when I need to create a ‘test’ host for deployment or system testing.

Stack ‘Em High

One of my complaints about developing for Windows is the height of the development stack. Not only do I have to install dev studio but I have to install a bunch of service packs, add-ins and 3rd party software, and their service packs, to have anything like a workable development system. The problem is, when something goes wrong and your installation gets knackered in some way (it’s happened to me twice in the last year) you can kiss goodbye to 1 days development time whilst you sort it out.

That’s why, once I had all the development tools installed, I then took another snapshot to record the base development install so I can revert to it in seconds rather than hours. It’s my intention that when I went to install a new software tool I will clone a snapshot, install the software and see what it does before I commit to installing it into the main disk image.

Shared Disks

One important feature of having a virtual host is being able to easily share data between the guest and host OS. The clipboard works fine for a lot of things but when you want to expose a 1Gb ISO from the host to the guest it becomes a bit painful. Sure you could FTP the files from one to the other but that would be a pain. What you really need is to expose a part of the host file system to your virtual machine. Under windows and WS6 this results in the creation of \\.host URI that is accessible from the guest windows OS.

However for Ubuntu it failed because the installation of the tools failed without me noticing!! It turns out there is a bug in one of the vmware headers that incorrectly sets which kernel API to use. This is one of the problems with Linux I guess in as much as vendors have to release source code drivers because of all the different versions the guest OS could be. However, because it’s source code it’s also easy to fix! By following the instructions on the Ubuntu forums I had the matter resolved in minutes.

Accessing Domain Hosts

There’s a couple of problems with interacting with other network hosts when using virtual hosts.

  1. If you’re developing on a shared network and you are unable to make your virtual machine a peer on the network then you will probably have chosen NAT networking. This, as discussed, puts you machines on their own subnet. The consequence of this is that you will need to start using qualified domain names to be able to access the hosts that are in the same domain as your non-virtual machine. This isn’t a huge discomfort and it’s arguably the right thing to do if you’re embedding those names into software that you’re writing anyway.
  2. The second problem relates to the use of Windows authentication. Applications that use windows authentication (like SQL Server Management Studio) will fail because your virtual machine is not part of the domain. One solution is again to make your virtual host a peer on the network. Whilst this would be the ideal it might create implications for how you’re going to manage a battery of virtual machines and might result in some loss of liberty because you’re system administrator got scared. The solution is to use a feature of windows networking that allows the user to manage multiple identities. This will allow you to enter a set of credentials for each machine that you will connect to. After making these changes applications that use Windows authentication work flawlessly.

Stored Passwords


After spending a week of development on virtual machines I like it. I don’t notice any performance lags on the widows host everything works beautifully. On Linux I occasionally have mouse problems where the mouse refuses to roll over a particular screen area. This needs to be reset by disconnecting control from the VM (Ctrl+Alt) and then clicking (to regain the control) and trying again. It’s possible that this problem is something to do with known mouse problems, perhaps more investigation is required.

The WS6 ribbon makes switching between machines and displays simple. The ribbon also makes using WS6 analgous to using remote desktop which feels natural.

WS6 Ribbon

It had been my original intention to run only my development needs inside the virtual machines and keep the real machine for my office needs like e-mail, spreadsheet and word processing. However that becomes awkward to juggle between and means that I sometimes miss e-mail or IMs when the development host is maximized over the top of the host desktop. Having said that though maybe not getting distracted is a good thing 😉

However, I’m starting to think that I should adjust the windows virtual machine RAM and install all of the office applications and IM software into the windows development host. Reducing the host OS to simply a shell for the VMware software. All in all though, I’d recommend all developers to try it out at least. It ticks all the boxes I wanted and after the setup and configuration it makes development a saner process. Which has to be good, right?


Virtual Host Development: Part 1

I got a new PC this week. Oooooh. It instantly presented me with a challenge:

Q. Is it possible to make full use of a standard mid-range specification development desktop host?
A. … errr … dunno …

As a developer I have long believed that we, me included, fool ourselves into purchasing new hardware and software that we don’t need. This constant upgrade cycle blind-sides us from separating what’s important from what is aesthetically pleasing. Since 2004 (when we approached the PC clock-speed ceiling) I can’t think of any advance in software or hardware that has really compelled me to want to go and buy something new. I discovered some time back that the biggest productivity bonus you can get from any developer is to give them two monitors (oh, and a graphics card that can support them too!). Giving those developers Vista or a bigger/faster machine won’t buy you much other less help-desk time supporting old kit and a better relationship with your hardware vendor.

The things that have changed since 2004 is the cheapness of RAM & disk and the addition of multi-core CPUs. Although 64 bit has also come online too that in itself only really gives you more addressable RAM. For the moment the circa 3Gb addressable in most 32 bit OS’s is fine for me and my development needs. This means that by default I get as much RAM as the OS can handle, 2 large disks and a multi-core CPU.

So, in the absence of a new OS to tax it the new hardware was begging for a thrashing and I thought knew just the thing. I would start doing virtual host development. That is, install a VMware style virtual host manager and run my development environment entirely inside a guest operating system. The host operating system, i.e. the real one, would be used for sending email, surfing, browsing documents and running the virtual host manager.

This way of working has some desirable consequences:

  1. Snapshots – Most virtual host managers allow the taking of snapshots. So the plan is to take a snapshot after each major milestone of the virtual host build. That is:
    • Right after the install (& patch) of the base OS
    • After the install of the standard development tools
    • After the install of personal development tools that I use

    Regardless of the OS you use your development environment will usually take you a fairly long time to build. This is because you have to load: IDEs, source control, useful editors and whatever else you fancy. The installation of the DevStudio stack alone can take the best part of a day. The ability to go back in time to anyone of those points will be very useful if something goes wrong. It’s not only that though, most virtual host managers will allow you to ‘clone’ a snapshot so that you can take your current environment, clone it, fire it up, install something wacky, and see if it works. If it doesn’t work as you expected then you live to fight another day.

  2. Security – I have never particularly liked installing my custom tools and utilities on my company’s hardware. It complicates matters when system administrator’s are needed to try and resolve problems. This is because, and rightly so, the first thing they’ll blame when something goes wrong is that piece of custom software I installed last week. Therefore, I think, at least in theory that I will no longer need administrator privileges on my own machine because I will have an administrator account on my virtual hosts. I’m not convinced of this view just yet, but I aim to demonstrate it here!
  3. Testing – Deployment of my developed code is always hard to test. It’s hard because you need a clean machine to work with to be able to prove that the test was indeed a success. With the aid of snapshots I can boot a clean machine and attempt a deployment to it. If it succeeds great, if it fails I just start again safe in the knowledge that I haven’t soiled my clean machine and the test is still a good one.
  4. OS Choice – I still think UNIX is far and away the most powerful and transparent OS, but that might be because I’ve been banging my head against it for a very long time. Anyway, I like having UNIX around and when I don’t have it it makes me kind-of sad and angry. Angry that I can’t: grep a file for a regex pattern, pipe it through gawk to get a particular column, pipe that through sort, then pipe that through uniq -c and pipe the whole lot through sort -g. How else would you answer the question of how can you do a:
    SELECT Value, COUNT(*) 
    WHERE Value LIKE '%XYZ%'
    GROUP BY Value 

    … on a flat file? On second thoughts don’t answer that.

In Part 2 I will look at what I discovered. The pitfalls of running a development host within a virtual host and more on why I really do love UNIX an awful lot.

(Ok perhaps I’ll leave that bit out :-))