The Excel Gambit

[Screenshot: XL-Gambit running in Excel]

In the spirit of shoving Lisp where Lisp should not go, I have jammed Gambit Scheme into Excel.

If you're interested in trying it out you should go and get the files RIGHT NOW at GitHub. You just need to add the compiled .XLL to your add-ins and use the formula =GAMBITC("(+ 1 1)") in a cell of your choosing.

Please note this is a proof-of-concept. You probably shouldn't use it for anything important, not least because it probably leaks memory like a sieve. You should also know that I have only really tested it on Excel 2003. It should load into Excel 2007 but I haven't tried that yet.

Why? Because it’s there, and I’m here

Why did I do it? Well, I wanted to write an app for Excel in Scheme, because I think it would be vastly superior to using .NET or VBA or COM. Having spent a few hours on it I think I could definitely achieve what I set out to do, but whether I'll actually go this route in the end, I'm still undecided. More on that later.

Mature like a good cheese

The Excel API, as it turns out, is … well … mature. Now I like old things quite a lot. I’m old, and I like me just fine. One consequence of age is that we get a little extra baggage on the way. This is true of Excel too.

Now, IMHO one of the more challenging aspects of programming a long-lived project is version management, especially version management of an API. Broken backwards/forwards compatibility in those APIs would be a serious impediment to a new Excel version's acceptance. Unsurprisingly, then, Excel has a number of APIs that have all undergone various enhancements over the years. To me though the switch from Excel 2003 to Excel 2007 was the most significant, requiring the most additional cognitive load to work with. For now, and for simplicity, I have chosen to totally deny the existence of Excel 2007. But I'm coming back to it, oh yes I am.

I Eat Pain For Breakfast

I had read a bit online about the Excel C API and found it all hugely confusing. So I bought a book, thinking that would make it clearer. Although the author does attempt to explain how you might write an Excel add-in, he does it in a way that makes programming from the book hard. For example, the information I need is spread all over it: answering a single question about Excel's behaviour with respect to calling the core Excel4/Excel12 function necessitates flicking between the index and more than ten different page sections.

The reason for this is probably that Excel is a very versatile tool, which is why a lot of the world's businesses (large and small :)) seem to run their databases, accounts and customer details from it. So there's not really one track through the development process. There are literally hundreds, limited only by your imagination!

Having said that the book in combination with the Excel 2007 SDK was sufficient to pick through the rubble and build something workable. Maybe I’ll try and produce some guides/tutorials of my own to make the topic clearer. Maybe.

On the Gambit side there's not much more documentation than the manual HTML page to help. Like the Excel book, it is very densely packed with information. IMHO its main failing is that it could really do with more documentation of the C macros. Did I ever mention that I hate C macros? Well, these undocumented C macros are effectively what Gambit Scheme is written in, and you need a fairly thorough explanation of how it all fits together to be able to build effective glue with them. I will come back to this later when I try to construct lists from cell ranges.

Finally I decided to do this all in MinGW rather than the more usual VC++. Whilst going this route did cost me a bit of time I’d much rather use the GCC toolchain because I understand it better.


Now that I have the interpreter in Excel I will probably work on tidying it up for Excel 2007. More importantly though, I want to see if I can exploit Excel 2007's all-important multi-threading capability.

If that all works out I might use it to make a custom add-in. I will develop the app from the REPL in Emacs using Gambit Scheme. When it’s done I’ll compile it up into a standalone XLL with Gambit. That’s the plan anyway.

Let me know if you like it, or can see a use for it. Encouragement is always good.


Mersenne Twister in Clojure

The Mersenne Twister is a random number generator that has a lot of applications, particularly in finance. After discovering that there was no Clojure implementation listed at Wikipedia I decided to give it a try as my first attempt at something useful in Clojure. As it turns out it's probably not a good candidate for implementation in a functional language, because the whole thing requires modifying a mutable array on every call to (genrand).

I'm not very excited by the solution because it doesn't seem very lispy. It much more resembles the reference implementation, and it gives the same results as the reference for all the tests I tried.

I’m sure it could be better though.

If it’s of any use to anyone you can find it here.


REPL to the rescue!

Yesterday I watched Dan Weinreb talking about ITA, Lisp, and well, really, really complex stuff 😉

The kind-of conclusion that he drew on the future of Lisp is that Clojure is the next Common Lisp. I've been dodging Clojure for about a year now. I bought the book from The Pragmatic Programmers while it was still in beta, eagerly followed the examples, and decided it had promise. But I figured that if I really needed to learn it (and by that I mean use it, not just talk about using it) it would have to come to me. It would do this, I reasoned, by being hard to ignore.

Well there seem to be more than enough smart-folks on-board now (like: Eric Normand, Bill Clementson to name only two) so I guess I better not miss the party. Not because I’m smart but because if I don’t make too much noise I can blend in and no one will notice I’m there.

So I came to the conclusion that I should try and use it for ad-hoc things that might cross-my-dome and are hard(er) to solve in non-functional languages.

For instance, this very afternoon I wanted to know how many possible hand distributions by suit there are in the card game Bridge. For example, four possible distributions might be:

  • All 13 hearts
  • 12 hearts, one diamond
  • 11 hearts, one diamond, one club
  • 10 hearts, one diamond, one club, one spade

The question is how many total distributions are possible?

I banged my head on the table hard trying to figure out the answer to this. At first it seemed like a simple counting problem, but if it is, I'm too simple to see it. Then I wondered if it could be an additive partitioning problem, but ordering is important so I don't think it is. It didn't feel NP-complete. I know one thing though: at this late hour it might as well be.

1:57 bridge=> (count (for [spades (range 0 14)
                           hearts (range 0 14)
                           diamonds (range 0 14)
                           clubs (range 0 14)
                           :when (= 13 (+ spades hearts diamonds clubs))]
                       [spades hearts diamonds clubs]))
560
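As it happens there is a closed form: this is the classic "stars and bars" counting problem, the number of non-negative integer solutions to spades + hearts + diamonds + clubs = 13, which is C(13+3, 3) = C(16, 3) = 560. A quick sanity check (choose here is just a throwaway helper of my own, not a library function):

```clojure
;; stars and bars: ways to split 13 cards across 4 suits
;; = C(13+4-1, 4-1) = C(16, 3)
(defn choose [n k]
  (/ (reduce * (range (inc (- n k)) (inc n)))
     (reduce * (range 1 (inc k)))))

(choose 16 3) ; => 560
```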

Functional programming rocks.


cl-mysql on github

I’ve seen the future and it is git. Some believe Mercurial is the way because it is more accessible, but I’m not so sure.

We’re programmers, we love this esoteric shit. The more hardcore the better. IMHO, the force of numbers, and its pedigree, will probably make the big fat git wade through the opposition like the monster it is.

Anyway, for a few months now I've been seeing this github thing linked here and there, and I figured it was about time I checked it out.

I went, I saw, and it was good. Fast, easy-to-use and beautifully in the spirit of free-software (you only have to pay to use it for private repositories).

So in the spirit of free lovecode I’ve added cl-mysql.


Embedding Lisp in Mono

I have been thinking about a new project that I wanted to write with Mono. One of the things that I realised I would need is the ability to execute arbitrary code (user or wizard generated) that might be stored in a database.

Code … data … code.

A very dim light went on in my head telling me that that sounded very much like a Lispy problem.

However, I particularly wanted to use Mono (and not Java, say) for the solution and so I started writing my own interpreter. After about 4 hours of toil I realised that maybe I should finish reading Lisp in Small Pieces before getting carried away. But the urge to create was large, so I went looking for a Scheme interpreter written in .NET and found S#.

10 minutes of fiddling in MonoDevelop later and I had an R4RS interpreter running in .NET on Linux. I heart open source …

stkni@ubuntu:~/source/SSharp-Mono-0.38a/SSharp.Console/bin/Debug$ ./SSharp.Console.exe 

 This is: SharpScheme v0.38 - R4RS compatible Scheme interpreter.
 A pure C# port of Peter Norvig's Scheme for the .NET Framework.
 License here: 

 Running on: Unix  

>>> (+ 1 2)
3

cl-mysql v0.2

I am pleased to announce that cl-mysql 0.2 is ready for use!

Here are some of the highlights of v0.2

  • Connection pooling – Thread safe allocation and release of connections from a central pool.
  • Use result/Store result – Ability to use mysql_use_result as well as mysql_store_result. This means that CL-MYSQL should be able to handle the processing of very large datasets without running out of memory.
  • Convenience functions/macros – with-rows / nth-row

The main difference between v0.1 and v0.2 is that version 0.1 didn't really manage its connections. I decided that making the user choose between pooled and non-pooled connections is a hassle. Much better, then, to let the user create as many connection pools as they want and specify the maximum and minimum number of connections each pool can hold. After all, a single connection is simply a special case of a pool with only one connection.

However, in theory this could hurt performance when attempting to do a large number of INSERT/UPDATEs, because every call would require the connection pool to be locked and a connection to be acquired. This could be overcome, though, by making use of the fact that CL-MYSQL will correctly pass multiple statements to the server: you could concatenate a large string of updates and execute them all at once.
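A sketch of what that batching might look like (the table and column names are made up, and I'm assuming query is the entry point that accepts multi-statement strings, as described above):

```lisp
;; Concatenate many INSERTs into one string so the pool is only
;; locked once for the whole batch.
(cl-mysql:query
 (format nil "~{INSERT INTO log (value) VALUES (~A);~^ ~}"
         '(1 2 3)))
```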

The good news though is that the API has changed only very slightly in the optional arguments it accepts. However, I have changed the way the result data comes back from query. Because CL-MYSQL returns multiple result sets it's necessary to place all of them into a sequence. Additionally, I did not like the way I was placing the column headers into the first item of the result data: it means you always have to allow for it. I considered doing it the way that CLSQL does, returning the column data as a second value, but I find this awkward to manage, because every layer of the API (and client code) must multiple-value-bind the columns out and either repackage them as a sequence or create a new value structure to pass them up the call-chain.

Therefore I have changed the result sequence structure to be as follows:

query-result ::= (<result-set>*)
result-set ::= (<result-data> <column-data>)
result-data ::= (<row>*) | <rows-affected>
row ::= (<column>*)
column-data ::= ((<column-name> <column-type> <column-flags>)*)
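To make the grammar concrete, here is a hypothetical result for a query string containing a SELECT (two rows, two columns) followed by an UPDATE that affected three rows; the column names, types and flags are invented, and the exact shape is inferred from the grammar above:

```lisp
((((1 "alice") (2 "bob"))                ; result-data: two rows
  (("id" :LONG 0) ("name" :VARCHAR 0)))  ; column-data
 (3 NIL))                                ; rows affected by the UPDATE, no column data
```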

I appreciate that this is a little complex. I did consider turning the result data into a struct, but that complicates how the user processes the data. For this reason I have added with-rows and nth-row to simplify the processing of the result data.

Finally, the whole thing is still only SBCL/x86 Linux compatible, that might change :-).

More information is available here. As always, any feedback is appreciated.


Announce: cl-mysql

After some deliberation I decided to try out what I was talking about in The Not So Super Super API. The idea of being able to hook a low-level API so that its functionality could be tweaked later seemed quite appealing to me. In reality, whilst what I had intended was indeed achievable, the performance sucked so hard as to make me rethink what I had done.

I did though, in the process, create an alternative to CLSQL: a library which supports stored procedures that return result sets, has better (IMHO) type inference, and is far simpler to deploy. So it wasn't totally in vain. I intend to extend the library a little, work on the performance, and provide a more faithful Common Lisp implementation in the near future.

Full details of the cl-mysql library are available on this page.

Thanks to the power of Lisp’s macros I am able to make the low-level API hooking dependent upon a compile-time special variable. That way I can generate the additional code for the API hooking, if I want to, or leave it out whilst I’m working on other aspects of the library. Don’t you just love Lisp?! Hopefully we’ll revisit this topic some time later too.

Finally, I realise now that announcing anything on April Fool's day is possibly a mistake. It seems that the signal-to-noise ratio in the world has gone down quite significantly in the last few hours :-). I guess that makes me the fool then.


Choosing a Common Lisp Unit Testing Framework

I have recently become dissatisfied with the unit testing framework I was using: LIFT. After reading Phil Gold’s fairly comprehensive Common Lisp Testing Frameworks I decided to switch to Stefil.

So what's so wrong with LIFT? Whilst I don't want to detract from metabang's efforts, LIFT was annoying me enough that I was considering writing my own unit-testing framework! No one wants YAUTF (yet another unit testing framework), especially mine, so I went shopping. I should also say that I'm overjoyed with other metabang creations like bind and log5, but LIFT doesn't seem to elevate me much any more (groan).

In my experience, your mileage may vary, LIFT seems slow for what it does. Yes, my machine is a little old and beat-up, but still, the unit-testing machinery should not be a significant burden on the unit testing process itself! To illustrate the point let's look at a highly subjective example. Suppose I want to test the plain and simple truth, but I want to do it 10,000 times – I do this because I never take "yes" for an answer. Here's a REPL snippet doing just that in LIFT:

CL-USER> (lift:deftestsuite test-lift () ()
           (lift:ensure t))

Start: TEST-LIFT
#<Results for TEST-LIFT [1 Successful test]>
CL-USER> (time (loop for i from 1 to 10000 do (lift:run-tests :suite 'test-lift)))

<snip 9,997 lines;>
Evaluation took:
  4.029 seconds of real time
  2.100131 seconds of user run time
  0.076005 seconds of system run time
  [Run times include 0.06 seconds GC run time.]
  0 calls to %EVAL
  0 page faults and
  60,780,256 bytes consed.

And then let's do the same for Stefil:

CL-USER> (stefil:defsuite* test-stefil)
CL-USER> (stefil:deftest test-true ()
           (stefil:is t))
CL-USER> (time (loop for i from 1 to 10000 do (test-true)))
<snip 9,997 lines;>
Evaluation took:
  1.238 seconds of real time
  0.932059 seconds of user run time
  0.116008 seconds of system run time
  [Run times include 0.357 seconds GC run time.]
  0 calls to %EVAL
  0 page faults and
  88,813,344 bytes consed.

Part of the slowness might be that LIFT prints “Start: TEST-LIFT” 10,000 times, but I didn’t dig any deeper. LIFT seems slow when just running a handful of suites. Apart from the slowness the output produced by LIFT isn’t really particularly useful, it’s better than nothing, but I can’t really be sure of the testing progress within a suite. Ideally I would just like to see some incremental idea of progress, and a single “.” per test and a new line after each suite, like Stefil does, is much cleaner.

Secondly, and this is the kicker, I find it difficult with LIFT to find out what went wrong and where, which is surely the whole point of unit testing. We expect stuff to fail, and hunting down the causes of failure in LIFT via the inspector is a bit tiresome. Conversely, Stefil handles test failures by dropping you straight into the debugger when an assertion fails. Which is perfect, because you can look at the code that caused the error, dig about in the source, fix it and continue the test. This is a natural way to go about developing test-driven software. It also leverages the REPL, making it a far more interactive experience. The only snag is that this sort of behaviour is not always what you want in an automated test & build environment. Stefil provides a special variable *debug-on-assertion-failure* which registers the failure but doesn't drop you into the debugger. It seems that LIFT does have a testing parameter break-on-error?, but this only catches errors; it probably needs a break-on-assertion? as well.

Finally, Stefil just seems more concise and natural. Since what we're doing here is creating functions that test other functions, surely we should be able to call tests like functions. In my view classes are not the primary units of a test; functions are. And so it is in Stefil, because every suite and test is a callable function. In LIFT you have to tell the function lift:run-test to find a test/suite class with a specific name and then run it.

I didn't want this blog entry to be a 'hatchet-job' on LIFT, because that's not constructive, and there's already way too much ranting on the internet. However, in the final analysis, LIFT could be made a lot better than it is. Since the effort in switching wasn't really that great I decided to switch to Stefil rather than persevere and try to improve LIFT directly.

Phil Gold actually recommends two frameworks in Common Lisp Testing Frameworks: Stefil and FiveAM. I would have tried FiveAM, which was Phil's framework of choice, but it wouldn't install via asdf. Whilst not being asdf-installable isn't a huge barrier to entry, it suggests something (perhaps wrongly) about the quality of the solution. So I skipped it.


Lisp. It Doesn’t Get In Your Way. Much ™

Recently I have been using Common Lisp's eval function a bit. Since it's eval that puts the E in REPL, it's fair to say that it is a fairly fundamental part of Lisp. However, none of the code I've seen appears to use it directly. I think I know why: making (eval …) always work the way you'd expect isn't that intuitive.

Paul Graham in On Lisp, has this to say about using eval in your own code:

Generally it is not a good idea to call eval at runtime, for two reasons:

  1. It's inefficient: eval is handed a raw list, and either has to compile it on the spot, or evaluate it in an interpreter. Either way is slower than compiling the code beforehand, and just calling it.
  2. It's less powerful, because the expression is evaluated with no lexical context. Among other things, this means that you can't refer to ordinary variables outside the expression being evaluated.

And so when I discovered a need in my project to persist and reload closures, I decided that my needs would not violate either of Paul's points: firstly, because I don't know ahead of time what code I'm going to persist, and secondly because no lexical context is needed to create my closures. Therefore I would store the closures as strings and use read and eval to restore them. This worked fine, so I put the code into the package and declared my work done.
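A minimal sketch of that round trip (the stored string here is just an example):

```lisp
;; Persist a closure by storing its source text, then restore it later.
(defparameter *stored-source* "(lambda (x) (* x x))")

(defun restore-closure (source)
  ;; READ rebuilds the list structure; EVAL turns it into a function.
  (eval (read-from-string source)))

(funcall (restore-closure *stored-source*) 4) ; => 16
```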

It turned out that once the code was in a package it didn't really work as I'd intended. When I tried to run it I got unknown-symbol conditions raised when I tried to restore the closures. Qualifying all the symbols with the correct package name worked, but it made my shiny new DSL all messy by requiring me to always prefix all my symbols. It turns out that eval works this way by design, because of this statement on the HyperSpec page for eval:


Evaluates form in the current dynamic environment and the null lexical environment.

I was expecting that since my eval was inside a package it would be able to see function symbols in that package. Not so: eval works in the dynamic environment, which implies that the current package is a special variable and hence part of the dynamic environment.

This means my code could only ever work when the current package is the library package. In any other package the code fails, because eval consults the dynamic environment to determine which symbols are visible without package qualifiers. Indeed, in SBCL, to make my code work the way I expect I need to wrap it in the following:

   (let ((*package* (find-package "MY-PACKAGE")))
      (eval ...))

And this works just fine. The most pleasing thing about this outcome is that it illuminated a point that I’d heard before but never been able to substantiate: Lisp, It Doesn’t Get In Your Way. eval has to work how it does otherwise Common Lisp would probably not work properly. However, because the package system is an accessible part of the language to the programmer it seems as if I can adapt any part of that system to suit my purpose.

You'd be right in thinking that too much of this sort of thinking is bad for maintainability, but this single-line hack allows me to safely persist executable code at run time. Since there are few languages that have closures to begin with, a minor hack to make them easily persisted too (with the help of a macro) seems a small price to pay.
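For what it's worth, the macro amounts to little more than this sketch, with MY-PACKAGE standing in for your library's package name:

```lisp
(defmacro with-library-package (&body body)
  "Evaluate BODY with *PACKAGE* bound to the library's package, so that
unqualified symbols READ/EVAL'd inside it resolve there."
  `(let ((*package* (find-package "MY-PACKAGE")))
     ,@body))
```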

“Lisp. It Doesn’t Get In Your Way. Much.” – I like the phrase so much I think I’m going to trade-mark it.


Know Your Idioms

A minor thought for today. Part of becoming an expert in any programming language is learning the idioms. Through experience, research and your peers you discover what makes your programming language hum: how to get the best performance, and how to trade that against the best expressiveness. Since spoken language affects the way we think, it's reasonable to expect that programming languages affect the way we think about programming problems.

I have on a number of occasions wanted to write an expression in Lisp that returns true if all the elements of a list are T and nil if any are not: a sort of 'and' for a list. It seemed like a classic case of a 'reduce', or fold, operation to me, so I went about doing that as a starting point.

CL-USER> (reduce #'(lambda (x y) (and x y)) '(t t t t t t t))

Hmm, because AND is a macro it can't be an argument to reduce, so I have to wrap it in a lambda. Which is OK, but it removes the short-circuit behaviour from the AND.

So, I wasn’t terribly excited by this result but it served the purpose and life continued as normal. That was until today when I discovered the LOOP tutorial which inspired me with this trivial alternative …

CL-USER> (loop for x in '(t t t t t t) always (eq t x))

Yes, it's a little bit less functional, but it's very clear what's going on here, and that was my original problem with the reduce form. I was interested to see how the LOOP expanded too, so:

CL-USER> (macroexpand-1 '(loop for x in '(t t t t t t) always (eq t x)))
(BLOCK NIL
  (LET ((X NIL) (#:LOOP-LIST-1613 '(T T T T T T)))
    (SB-LOOP::LOOP-BODY NIL
                        ((WHEN (ENDP #:LOOP-LIST-1613) (GO SB-LOOP::END-LOOP))
                         (SB-LOOP::LOOP-REALLY-DESETQ X (CAR #:LOOP-LIST-1613))
                         (SB-LOOP::LOOP-REALLY-DESETQ #:LOOP-LIST-1613
                                                      (CDR #:LOOP-LIST-1613)))
                        ((UNLESS (EQ T X) (RETURN-FROM NIL NIL)))
                        ((WHEN (ENDP #:LOOP-LIST-1613) (GO SB-LOOP::END-LOOP))
                         (SB-LOOP::LOOP-REALLY-DESETQ X (CAR #:LOOP-LIST-1613))
                         (SB-LOOP::LOOP-REALLY-DESETQ #:LOOP-LIST-1613
                                                      (CDR #:LOOP-LIST-1613)))
                        ((RETURN-FROM NIL T)))))

... Jeeeeezus. No way that can be efficient, surely? So I ran both versions with a temporary list of 1,000,000 elements, all equal to true:

CL-USER> (time (reduce #'(lambda (x y) (and x y)) (loop for i below 1e6 collect t)))
Evaluation took:
  0.149 seconds of real time
  0.100006 seconds of user run time
  0.040002 seconds of system run time
  [Run times include 0.068 seconds GC run time.]
  0 calls to %EVAL
  0 page faults and
  7,999,488 bytes consed.
CL-USER> (time (loop for x in (loop for i below 1e6 collect t) always (eq t x)))
Evaluation took:
  0.059 seconds of real time
  0.052003 seconds of user run time
  0.0 seconds of system run time
  0 calls to %EVAL
  0 page faults and
  7,995,392 bytes consed.

Guess again, brother. The more I thought about it the more solutions I could find to this problem. For instance, I could just count all the nulls, and if I find any then the answer is false.

(eq 0 (count-if #'null 
           (loop for i below 1e6 collect t)))

This was marginally faster than the 'loop' on my implementation, but then I reasoned that on a list of a million items 'AND' should really retain its short-circuit ability, so that it stops on the first false. So perhaps the correct thing to do is look for the first non-true argument:

(not (find-if #'null 
          (loop for i below 1e6 collect t)))

This solution is less imperative in style and less complicated than all the other solutions. Now that I've thought of it I can't imagine how I didn't think of it sooner. A winner!
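For completeness, the standard sequence functions also cover this idiom directly: every and notany both short-circuit, so either would have done from the start.

```lisp
(every #'identity '(t t t))   ; => T
(notany #'null '(t t nil))    ; => NIL, stops at the first NIL
```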

Lisp is such an old language that there seem to be literally dozens of ways to attack most problems, and that really is the point. Since I don't know what I don't know, there are probably dozens more ways to solve this trivial problem that only experience, research and interaction with my peers will reveal.

This is where books like the O'Reilly cookbook series should come in, right? They should have all that hard work done for me so I can stop making mistakes and start making systems. Books, however, are not very easily searchable when they're on the shelf, so on-line resources are best for searching for optimal solutions. I was therefore pleased to find that Common Lisp has an online cookbook. However, you still have to KNOW where to look in a resource like this, because I might not have found the 'find-if' solution if I was stuck in a rut banging on the 'reduce' solution. Not only that, every situation is different: 'reduce' definitely has its uses, but perhaps not in this case.

This further suggests to me that it's only your peers and your own research (i.e. trial and error) that can show you the way. Just make sure you give yourself enough time for the research and pick your peers carefully 😉