windows


For some reason the Git that ships with Cygwin (v1.6.6.1) won’t do the right thing with NTLM proxies. It seems that Git uses cURL underneath and that cURL can correctly handle NTLM authentication if the options are set correctly.

However, the version of Git I have isn’t capable of passing this information through. In fact, some browsing of the issues that people have with this suggests that there is more than one part of Git that isn’t able to work correctly with NTLM. So even if you’re able to get past the initial connection you probably won’t be able to fetch any of the tree.

The solution, in full then, is to use ntlmaps to act as a proxy to the proxy. All you need to do is to download the latest version of the app from: ntlmaps. Change the config file to include your authentication and proxy details, start it up and then set the proxy to be your new local one:

git config --global http.proxy http://localhost:5865

I can confirm that it works just fine. Not only that you can use it for any app that requires NTLM authentication but does not provide full NTLM support.

I am ashamed to say that I wasn’t really sure if the Mono project would ever succeed. It seemed like such an enormous task, and one that would forever be playing catch-up with M$, that it just didn’t make sense.

MonoDevelop 2.0

Even so, I’ve sort of been following it from the sidelines for a while and I’m excited to say that Mono 2.4 seems to be a drop-in replacement for the bits of the .NET SDK I was actually using. Now with the release of MonoDevelop 2.0 it seems that I’m also able to get a functioning alternative to the Visual Studio GUI I’ve been using. It’s a colossal achievement and one that makes me even more ashamed. Ashamed that I didn’t think it was possible, and ashamed that I didn’t help out.

To get it working was a bit of a trial though. It turns out that because I already had the default MonoDevelop 1.0 installed that when I started trying to build a new version things got very confused. Not only that it turns out that the build instructions for MonoDevelop 2.0 aren’t exactly correct. After a lot of messing around I think I managed to damage the default installation and cross-link a few libraries here and there. I think if I was doing it again I should probably have followed these instructions Parallel Mono Environments before doing anything. Too late now though. After much building and swearing I ended up uninstalling all Mono environments and then rebuilding everything for Mono 2.4. I didn’t put the Hardy version back yet, I’ll do that if I need to.

Here’s what I did, in case it helps anyone.


First of all I decided that this is a temporary solution and that eventually I would be able to get MonoDevelop 2.0 running from APT, therefore I decided to install everything in /usr/local.

Ubuntu Development Packages

As ever with Ubuntu you will quite probably need to install a bunch of development libraries to make it work. I probably had a lot of the dependencies I needed installed before I began but here are the ones that I didn’t have that I needed. Your mileage might vary.

sudo apt-get install bison  gnunet-gtk-dev libpango1.0-dev libgtk2.0-dev
libglade2-dev libgnome-dev libart-dev libgnome2-dev libgnome-dev libgnomeui-dev
libgnomeprint2.2-dev libgnomeprintui2.2-dev libgnomecanvas2-dev 
libpanel-applet2-dev

Mono

As per the parallel environments guide I set up this environment in a script and executed it.

#!/bin/bash
MONO_PREFIX=/usr/local
GNOME_PREFIX=/usr
export DYLD_LIBRARY_PATH=$MONO_PREFIX/lib:$DYLD_LIBRARY_PATH
export LD_LIBRARY_PATH=$MONO_PREFIX/lib:$LD_LIBRARY_PATH
export C_INCLUDE_PATH=$MONO_PREFIX/include:$GNOME_PREFIX/include
export ACLOCAL_PATH=$MONO_PREFIX/share/aclocal
export PKG_CONFIG_PATH=$MONO_PREFIX/lib/pkgconfig:$GNOME_PREFIX/lib/pkgconfig
PATH=$MONO_PREFIX/bin:$PATH
MONO_GAC_PREFIX=/usr/local
PS1="[mono] \w @ "

I then downloaded the following libraries from the Mono 2.4 sources page.

wget http://ftp.novell.com/pub/mono/sources/libgdiplus/libgdiplus-2.4.tar.bz2 

http://ftp.novell.com/pub/mono/sources/mono/mono-2.4.tar.bz2


http://ftp.novell.com/pub/mono/sources/gtk-sharp212/gtk-sharp-2.12.8.tar.bz2


http://ftp.novell.com/pub/mono/sources/gnome-sharp220/gnome-sharp-2.20.1.tar.bz2


http://ftp.novell.com/pub/mono/sources/mono-addins/mono-addins-0.4.zip


http://ftp.novell.com/pub/mono/sources/mono-tools/mono-tools-2.4.tar.bz2


http://ftp.novell.com/pub/mono/sources/mono-debugger/mono-debugger-2.4.tar.bz2


http://ftp.novell.com/pub/mono/sources/monodevelop/monodevelop-2.0.tar.bz2


http://ftp.novell.com/pub/mono/sources/monodevelop-debugger-mdb/monodevelop-debugger-mdb-2.0.tar.bz2


http://ftp.novell.com/pub/mono/sources/monodevelop-database/monodevelop-database-2.0.tar.bz2

Extract all these and then run the below command in each. Because this is a dependency stack you will need to do them roughly in the order they're listed above.

./configure --prefix=/usr/local
make && sudo make install

After that you should be left with a working MonoDevelop 2.0!

Troubles

I encountered a few problems on the way, mostly there were problems relating to not having the correct Ubuntu development libraries and headers installed when the parts of the stack were configured.

I also ran into some problems with the GAC having pre-installed versions of conflicting libraries installed. Eventually though this problem was resolved by uninstalling the default Hardy version of Mono and starting again. Hopefully you won't have this problem because you will have setup a parallel environment in the first place ...

I first came across the principle of least surprise a couple of years ago when someone pointed out to me that some script that I had written had violated it. Naturally defensive, and of course mildly arrogant, I asked my critic what he was talking about. He said that a command line script should return an explanation of its accepted arguments when given a –help or -h argument. He was right and I was ashamed and so I naturally did what any respectful programmer would do, I fumed quietly for a few hours and then fixed it when he wasn’t looking.

It seems then that almost any user-interface should not violate the principle of least surprise and this is common sense. When I click on a cancel button in a dialog it should not accept my input it should return to where I had come from leaving everything unchanged. This was my first violation this week and it’s a common one, a ‘Cancel’ button in a dialogue did the opposite of what I expected. This was in a popular source control tool (that I won’t name because my version is old and the bug is doubtless fixed now) and it did in fact ‘Submit’ the broken change that I was so keen to cancel. After a short bout of tourrette’s and 5 minutes of trying to figure out how to undo the change I was back on track but my confidence shaken a little.

It’s not just user-interfaces that should not violate the principle of least surprise, APIs shouldn’t violate it either. You know the sort of thing. For instance, a ‘getter()’ should be idempotent and not change any state. Languages like C++ have support for this through the ‘const’ method modifier but I can’t think of another language that I know of that enforces this in the same way.

So I was incensed when I thought I’d discovered an example of such a violation in the .NET framework. I was having trouble using the method Uri.MakeRelative(Uri uri) because I had misunderstood exactly what it did (because I didn’t RTFM) and I expected it to simply strip the routing information from the Uri and return the document path. What it actually does is a more general case and it will effectively ‘subtract’ one Uri from the other. Which is more useful when you have things like embedded Uri’s. Anyway, the depth of my sickness isn’t important, the important thing is that I realised that I could have used the principle of least surprise to save me.

As far as I can tell I haven’t ever found a .NET framework method that didn’t do as I expected, that’s partly because of the number of people who use early-release versions of .NET and iron out these problems for me. Thanks! Not only that over the years I have entered into a love-hate relationship with Microsoft. I won’t enumerate the things I hate because disk space is finite but what I really like is the way that M$ have a vast array of tools and products that seem to co-exist with very little problems. This is no small feat, it’s a triumph of quality control and standards and I applaud them.

The conclusion is then, that I should abide and uphold the principle of least surprise wherever possible. Additionally though, I can apply the principle in reverse when I trust the source of the interface to not violate the principle. Moreover if I can succeed in not violating the principle myself then people who use my and my team’s work will have more confidence in it … I can dream, can’t I?

What would you do? You’ve got a C# List<T> full of thousands of objects of the same class. You think there’s a few duplicates hiding in there somewhere and you want to root them out. You think that you would really like to put your objects into a dictionary because if you did you could get that shiny O(1)-ness that the cool sleek lines of a dictionary is going to give you.

Only thing is your keys aren’t exact, they’re doubles or DateTimes or some such and you can’t use them because you really need to apply a tolerance to those values. What you need is an imperfect hash function.

The solution to this problem is obvious to me now but not so obvious at the time when I was trying to speed up my painfully O(N^2) application. To add your elements into a dictionary and get faster lookup you must obviously include all the elements of the comparison in the Equals() method but exclude the non-exact ones from the GetHashCode() method.

Before I start with this analysis I’d like to apologise to all vegetarians on the planet for eating your friends. Secondly, and more importantly, I’d like to apologise to all decent-folk for producing an example so contrived that it will make you want to vomit. Please pay particular attention to avoid your shoes.

Let’s begin:

public class Beefy
{
    public string    name;
    public double   percentage;

    public override bool Equals(object o)
    {
        Beefy b = o as Beefy;
        if (b == null) return false;
        return (name.Equals(b.name) && Math.Abs(percentage - b.percentage) >= 1);
    }
 
    public override int GetHashCode()
    {
         return name.GetHashCode();
    }
}

This works by only relying on the string "name" to derive the hashcode. What this means is kind-of subtle and relies on what the Dictonary<T> does to resolve collisions. A collision occurs when two items hash to the same location. Since we know that for Beefy we're not using a unique identifier to place the items into the Dictionary<T> then that dictionary must have a way of coping with the situation that two hash-codes resolve to the same place. Now you would need to check the language you love to find out exactly what it does in this instance. But for .NET 2.0, C# uses a technique called chaining. That is it creates a secondary data-structure which is commonly referred to as a bucket but is really just another list.

Looking at the following diagram, (and again my sincere apologies) you'll see that we have 3 topsides in the same bucket. This happened because the hashcode for the string "Topside" led us to the fourth bucket. Every time we see "Topside" as the name we will end up inserting the Beefy object into its bucket. Every time we want to retrieve a "Topside" we will start at the first element of the bucket and iterate over the chain until we find a match.

Dictionary Beef

In this way then we can get the trade-off we need. Hopefully our inexact matching fields will be few and our exact ones will be many so that we will get close to O(1) performance, however if it all goes horribly badly and every piece of beef we have is "Topside" then we will tend towards an O(N) lookup time.

This started me thinking. Exactly what does this continuum look like? In other words for a trivial class like the one above what sort of read access times would I get searching for each of the 1000 objects in a variety of data structures? Now what if I vary the number of hashcodes between 1 (i.e. everything hashes to the same bucket) and 1000 (where everything hashes to a different bucket) and try to access the same 1000 objects for each hashcode distribution. I performed the process on a Dictionary<T>, my List<T> and a Hashtable, which has a different collision resolution mechanism in C# to the dictionary.

Table vs Dictionary (Beef Off!)

The results were almost as I expected. The Dictionary<T> is faster than the Hashtable because when a Hashtable has an imperfect hash the collisions 'spill' over one another which means that other future values can potentially be mis-placed, not just the current one. However, that performance converges as the quality of the hash improves. Secondly, the way I set up my List<T> the items being searched for should on average be found in the middle (requiring iteration over half of the List<T> to find a single element). You can see that for a very imperfect hash where there are less than 100 distinct hashes in 1000 items, it's better to use a List<T> than either a Hashtable or a Dictionary<T>. At a 90% perfect hash then a List is 22x slower than a Hashtable and more than 50x slower than a Dictionary<T>.

All good news for me because my real hashcode was much more discriminating than that typically sitting above the 90% perfect range. So the List<T> has gone and my hash is meaty sister. Very meaty indeed.

Model-view-controller is an old, old, old but very good idea. It encourages the separation of model, presentation and control from each other. It’s used in so many places I can’t name them but frameworks like: Struts and Ruby-On-Rails actually enforce it.

For a long time it seems to me that Microsoft has lagged behind in allowing us to use this idea. Their once flagship product, Visual Basic 6, makes it almost impossible to write good MVC code. First of all, in VB6 there is no real inheritance which makes writing good models difficult. Secondly, if those models should contain any items that generate events then those items can not be defined in a class module and must be made public. Sure you can simulate and work-around these things by various means but in the end you will just be fighting the language. And that is never good.

So it’s good to see VB.NET, or .NET 2.0 to be precise, not only has excellent object support but a mechanism that can be used for MVC is actually built into the language. (more…)