lisp


A minor thought for today. Part of becoming an expert in any programming language is learning the idioms. Through experience, research and your peers you discover what makes your programming language hum. How to get the best performance and how to trade that against the best expressiveness. Since spoken language affects the way that what we think it’s reasonable to expect that programming languages affect the way we think about programming problems.

I have on a number of occassions wanted to write an expression in Lisp that will return true if all the elements are T and nil if any are not T. A sort of ‘and’ for a list. It seemed like a classic case of a ‘reduce’ or fold operation to me and so I went about doing that as a starting point.

CL-USER> (reduce #'(lambda (x y) (and x y)) '(t t t t t t t))
T

Hmm, because AND is a macro it can’t be an argument to reduce therefore I have to wrap it in a lambda. Which is ok but it removes the short-circuit feature from the AND.

So, I wasn’t terribly excited by this result but it served the purpose and life continued as normal. That was until today when I discovered the LOOP tutorial which inspired me with this trivial alternative …

CL-USER> (loop for x in '(t t t t t t) always (eq t x))
T

Yes it’s a little bit less functional but it it’s very clear what’s going on here and that was my original problem with the second reduce form. I was interested to see how the LOOP expanded too, so:

CL-USER> (macroexpand-1 '(loop for x in '(t t t t t t) always (eq t x)))
(BLOCK NIL
  (LET ((X NIL) (#:LOOP-LIST-1613 '(T T T T T T)))
    (DECLARE (TYPE LIST #:LOOP-LIST-1613))
    (SB-LOOP::LOOP-BODY NIL
                        (NIL
                         (SB-LOOP::LOOP-REALLY-DESETQ X (CAR #:LOOP-LIST-1613))
                         NIL
                         (SB-LOOP::LOOP-REALLY-DESETQ #:LOOP-LIST-1613
                                                      (CDR #:LOOP-LIST-1613)))
                        ((UNLESS (EQ T X) (RETURN-FROM NIL NIL)))
                        ((WHEN (ENDP #:LOOP-LIST-1613) (GO SB-LOOP::END-LOOP))
                         (SB-LOOP::LOOP-REALLY-DESETQ X (CAR #:LOOP-LIST-1613))
                         NIL
                         (SB-LOOP::LOOP-REALLY-DESETQ #:LOOP-LIST-1613
                                                      (CDR #:LOOP-LIST-1613)))
                        ((RETURN-FROM NIL T)))))
T
CL-USER> 

... Jeeeeezus. No way that can be efficient surely? So I ran both arguments with a temporary list of 1,000,000 elements all equal to true:

CL-USER> (time (reduce #'(lambda (x y) (and x y)) (loop for i below 1e6 collect t)))
Evaluation took:
  0.149 seconds of real time
  0.100006 seconds of user run time
  0.040002 seconds of system run time
  [Run times include 0.068 seconds GC run time.]
  0 calls to %EVAL
  0 page faults and
  7,999,488 bytes consed.
T
CL-USER> (time (loop for x in (loop for i below 1e6 collect t) always (eq t x)))
Evaluation took:
  0.059 seconds of real time
  0.052003 seconds of user run time
  0.0 seconds of system run time
  0 calls to %EVAL
  0 page faults and
  7,995,392 bytes consed.
T

Guess again brother. The more I thought about it the more other solutions I could find to this problem. For instance I could just count all the nulls and if I find one then the answer is false.

(eq 0 (count-if #'null
           (loop for i below 1e6 collect t)))

This was marginally faster than the 'loop' on my implementation, but then I reasoned that on a list of a million items 'AND' should really retain that short-circuit ability to make it stop on the first false. So perhaps the correct thing to do is return the first non-true argument:

(not (find-if #'null
          (loop for i below 1e6 collect t)))

This solution is less imperative in style and less complicated than all the other solutions. Now that I've thought of it I can't imagine how I didn't think of it sooner. A winner!

Lisp is such an old language that there seems literally dozens of ways to attack most problems and that really is the point. Since I don't know what I don't know there's probably dozens more ways I could try to solve this trivial problem that only experience, research and interaction with my peers will get me.

This is where books like the O'Reilly cookbook series should come in right? They should have all that hard work done for me so I can stop making mistakes and start making systems. Books, however, are not very easily searchable when they are on the shelf. Therefore on-line resources are best for searching for optimal solutions. So I was pleased to find that Common Lisp has an online cookbook. However you still have to KNOW where to look in a resource like this because I might not have found the 'find-if' solution if I was stuck in a rut banging on the 'reduce' solution. Not only that, every situation is different, 'reduce' definitely has its uses but perhaps not in this case.

This further suggests to me that it's only your peers and your own research (i.e. trial and error) that can show you the way. Just make sure you give yourself enough time for the research and pick your peers carefully ;-)

A while’s back I wrote up my experiences of trying to learn Lisp through various books and I got this comment:

If you want to go down the rabbit hole however (and since you asked), here’s my advice: read SICP. If you’ve been programming in Algol-inspired languages for the last 15 years, your head will probably explode. It is worth it. Do the examples. Follow along with the videos: they contain all kinds of little nuggets of wisdom that may take time to sink in.

SICP, I thought, what the hell is that? It transpires that this book is, as they say in colloquial circles, like frikkin’ huge man. Huge in fame (it’s taught in a lot of places that I could only dream of attending when I was as an undergraduate) and, as it turns out, also huge in terms of sheer quantity of information inside it. So I ordered it on Amazon, waited a few weeks for it to arrive and then queued it up on my bookshelf.

About a month ago I started gently plodding my way through it, scratching my head, doing the exercises and all. It really is very good, but I have no idea how I would have done this as an introductory programming course at Uni. Let’s just say the exercises can be a little challenging, to me at least.

The amazing thing is I’ve only really just finished Chapter 1 and I feel like I’ve put my head through a wringer. So I’d really like it if you can come and visit me at the funny farm after I’ve done Chapter 5. Visiting hours are 9-5 and please don’t shout at the other patients it freaks them out. Anyway, today’s mind melt is Exercise 2.5 and it asks us to …

Exercise 2.5. Show that we can represent pairs of nonnegative integers using only numbers and arithmetic operations if we represent the pair a and b as the integer that is the product (2^a * 3^b). Give the corresponding definitions of the procedures cons, car, and cdr.

Up until now all the really hard problems had ‘Hints:’ in parentheses. That’s usually how I know when I’m going to be screwed. I wrote some numbers down, first on paper then in Egg-Smell but I just couldn’t see how I was going to extract two numbers from the encoding. Then, bizarrely, whilst reading about Church numerals (eek) I discovered that there’s a fundamental theorem of arithmetic that I never knew about. How does that happen? How did something that fundamental escape me all these years? Reading it was like being told by a German co-worker that the only time an apostrophe is appropriate in “it’s” is for a contraction of “it is”. Scheiße.

Not only is there a fundamental theorem but it suggests the answer to Exercise 2.5. Perhaps one day I’ll be able to program & punctuate. One day.

In Learning Ruby by Extension I claimed that learning a programming by writing extensions was a worthwhile exercise to really get to grips with a language. This is because to write a natural-feel extension you are forced to learn a fair bit about the internals of the language. I was wrong. Well sort of, let me explain.

Last week I bemoaned the poor network programming support in my Common Lisp implementation and so this week I resolved to get my beloved multicast sockets working by breaking the abstraction a little and just implementing the missing socket option call as a foreign function call. I was looking forward to it because I had anticipated that it would teach me something about Lisp. It didn’t. Well at least not in the way I was expecting. Absolutely nothing, and there’s a reason for this.

The foreign function interface for Lisp is at a higher level of abstraction than what I was doing with Ruby or Python. With Python, especially, I was actually creating Python objects and classes within C++ modules that Python would allow me to load as if they were native modules. This meant I had to delve into topics about Python’s garbage collection, threading and more. With Lisp’s CFFI (and all of the other FFI’s I’ve seen so far) things don’t work quite like that because they don’t need to. The language allows for much of the mechanics to be completely abstracted away and become almost transparent to the user. Let’s take a look at this by looking at what I did:


(require 'cffi)
(use-package :cffi)
(defcstruct ip-mreq
  (multiaddr :uint32)
  (interface :uint32))

(defcfun "setsockopt" :int
  (socket :int)
  (level :int)
  (option :int)
  (option-value :pointer)
  (option-length :int))

(defcfun ("inet_addr" inet-addr) :int
  (cp :string))

(defun socket-add-mcast-membership (socket-fd mcast-addr iface-addr)
  (with-foreign-object (mreq 'ip-mreq)
               (setf (foreign-slot-value mreq 'ip-mreq 'multiaddr)
                 (inet-addr mcast-addr)
                     (foreign-slot-value mreq 'ip-mreq 'interface)
                 (inet-addr iface-addr))
               ;; level = 0 = IPPROTO_IP,  option = 12 = IP_ADD_MEMBERSHIP
               (setsockopt socket-fd 0 12 mreq
                   (foreign-type-size 'ip-mreq))))

That's it, no external C modules or makefiles it's all done inside Lisp. I'd make this a bit cleaner before I incorporated it into my application because I've intentionally cut some corners to show the relative simplicity. However, it goes to show how easy it is to do. For example if you look at the inet_addr C function definition it only requires a two-line symbolic-expression to have a callable version inside my program . Indeed this solution is on about the same level of complexity as defining an entry point in .NET which couldn't really be any simpler. And like .NET there's no C 'glue' libraries to write because CFFI is doing all the heavy lifting. All arguments and return values are being marshalled into and out of dynamically allocated intermediate values. Which is pretty neat because it means all I have to do is concentrate on the abstraction. Even better is that CFFI allows the partial definition of structures which is great when you have a large and complicated struct which you only care about some of the values for. But that's another story.

Now if we're going to be fair to Ruby, Ruby actually has two ways to write non-local extensions. The first is to use the supplied C programming interface, similar (but simpler) to how Python does it. The second goes a step further and has a gem that allows you to embed inline C directly into the Ruby code. Although I don't think I could have used it for the tibcorv project (because I needed to support callbacks) that's also pretty impressive. Having said that Ruby can only achieve that by having access to a C compiler.

At the risk of stating the obvious a simple method for the definition and invocation of foreign functions is a requirement, I think, for any language to become and remain successful. The easier it is to get the foreigners in the more possibilities you can exploit because there's a C library somewhere for just about any task you can imagine. However, I qualify my original statement that writing extensions can teach us a lot about a language. More specifically I now think that writing extensions will teach you a lot about a language only if the level of abstraction that you have to work at to get the job done forces you to. And as far as I'm concerned, the more abstract the better.

Too much choice is challenging, to say the least. Well, it is to me anyway. I get hung up on trying to pick the best choice from a huge list of items and end up doing nothing at all. All too often just making a choice, any choice, would be the best idea. However when you’re writing software and trying to pick from an array of available libraries the penalties of a wrong choice can be amplified later. This penalty comes from having to bet early on your chosen horse, but here’s the kicker, you could well have a lot of capital in the race before you realise you backed a donkey.

Donkey

For instance, if you choose the wrong core library for the “Next Big Thing” now, you run the risk of tying yourself to an implementation that might have a flaw in it that you can’t fix because it’s not open source or because you don’t have the resources to undo it. So you have to adopt a work-around. Thankfully, most of the time things don’t turn out so bad. But if I stop talking to the donkeys for a minute and listen very carefully I can hear millions of developers happily using Microsoft’s tools and libraries without a care in the world. They don’t have to think about this sort of problem very often because Microsoft thought about it for them and gave them the library as part of the SDK. Those developers are real happy with what they’ve got and who can blame them, it’s like having an older brother who looks out for you. So what if he steals your lunch money, it’s part of his charm.

I don’t want to bash Microsoft here, by the way, I think they have a massively important role to play and I have to admit that some of the more recent tools they have made have been pretty awesome achievements for a gigadollar-company. It seems obvious, to me at least, that the higher Microsoft raises the bar in terms of tools the better it is for pretty much everyone. But the (real?) world outside of Redmond is a little more confusing, I know from (old) experience that when I want to use a SMTP client in Java I’ve got to go through a process of evaluation. In .NET the client that comes free in System.Net.Mail will do just fine, nothing to add. Sure sometimes .NET will sometimes be found lacking, but I’ve been pretty happy with the standard libraries in .NET so far and it’s only when I want something a little unusual that I need to go elsewhere (except for the annoying lack of a built-in Set class but that’s a different story).

Well all I wanted to do this week on my continuing Lisp learning exercise was send a multicast packet from my Common Lisp project. Guess what, network programming in Common Lisp is simply not possible. Gasp. It seems that the Common Lisp standard was set in 1994 and so predated the massive expansion of the internet. So CL projects like trivial-sockets and usocket have been created as CL vendor neutral APIs. Unfortunately, it seems trivial sockets is abandonware so I guess usockets is the way to go. But then, it seems broadcast networking is still work in progress for usocket so it seems fair to say that Common Lisp network programming for multicast will be some time off. In a platform independent way at least.

Even that’s ok. I don’t mind. I’m not producing any world-changing libraries here. Hell I might even pitch in on the usocket effort to help out. But right now I’m only messing about trying out some ideas to see what might fly, it’s what Lisp is good at after all. So I’m happy to use whatever multicast support SBCL has. Crap. They don’t seem to have implemented all the socket options needed . Then I read this very illuminating 2005 paper that lays the CL network programming problem out beautifully but when you scroll down to the “Proposed Solution” it says:

… To be written

If you stop talking to your own donkeys for a second and listen you might be able to hear me laughing. Very faintly.

Ok, I know what I’ll do. I’ll take a look at what I need to do make a foreign-function interface work. Yeah, that’s good, I’ll do that. Crap. I’ve got to chose between UFFI & CFFI. So I already know that I’m going to have to take a look at both to see what’s going on. A short while later and I’m starting to believe that CFFI is the cleaner of the two but then I stumble upon Verrazano and SWIG and my head starts to spin.

At this point it’s hard to be cross or frustrated about anything. This isn’t really anything that I can really blame CL for, it’s just how it is, if time-travel had of been available when the 1994 ANSI Common Lisp standard had been accepted then I’m sure they would have added a sockets layer. I can’t blame OpenSource because I can’t stop applauding it. I bow down to those people that have sacrificed their own personal time to make my life easier / better. I know that the fact that I have to do a little bit of work in the decision making process should make me happy because I have a choice to begin with. And anyway it’s better than having to do the whole thing from scratch myself.

But here’s the thing. When I stop shouting and beating my donkeys for being a bit lame and look up there’s millions of Microsoft developers cheerfully waving at the crazy guy who talks (and occasionally beats) his donkeys. I know if I was one of them, I’d think I was crazy too.

I think we donkey beaters can turn this around. I still believe that in general what I can hold in my hand for free is far more powerful than anything I can buy. The problem is essentially one of continuing improvement and better communication. All I have to do is stop cursing the donkeys and do something about it.

So last week I mentioned that I’d been practicing my Common Lisp by having a go at some puzzles on Project Euler.

A few days later I got an email asking me about what my experiences of the project were and how far I’d got on the puzzles. Well now I can reveal to all of you the heady extent of my genius. My correspondent also casually dropped in the fact that he was trying to do his puzzles in a functional, rather than imperative, programming style. Oh … yeah, “functional programming” I’d conveniently forgotten about that. I was so busy getting to grips with programming in CL I forgot about the important bit. I guess 15 years bashing your head against C++.Java and C# is going to have some long-term mental health effects.

Losing the plot

Now, my CS class at University (15 years ago) very briefly touched on functional languages so I’m not totally clueless. However, we learnt precious little and there is really no defense for that University to not place a larger emphasis on functional programming. I would argue, now, that it would have been a lot more useful for them to teach us a functional language than what they did teach us, which was: COBOL.

So, given my limited understanding, the point of functional programming is to write programs that behave more like mathematical functions. This has certain consequences since a mathematical function can not have side-effects like printing “Hello, World!” or delivering spam. Which surely is going to disappoint some people.

It is this “dirty” little secret of programming (i.e. that it should probably do something if we’re gonna get paid) that makes pure functional-programming a little less straightforward. Languages like Haskell have monads that address the problem of purity by effectively differentiating the pure from the impure. But I don’t think I’m quite ready to go there just yet. Maybe in another 15 years.

Taking the tablets

Ok, so where was I? Well, I’m going to try to make my code more functional too. Which means a diet of higher order functions and recursion for me. So let’s pick a trivial example and see how we do.

Problem #6 in Project Euler asks us:

Find the difference between the sum of the squares of the first one hundred natural numbers and the square of the sum.

My first solution looked like this:

(defun sum-of-arithmetic-series (n)
  (/ (* n (1+ n)) 2)

(- (expt (sum-of-arithmetic-series 100) 2)
   (loop for i from 1 to 100 sum (expt i 2)))

Which I was proud of because I’d figured out that I could use an equation to solve the first part. This isn’t terribly functional though because the second part uses a loop. So I tried again, this time I thought I could do better. I decided that what I was really doing was summing a series of natural numbers in the range 1..n. I could generalise this into: “apply a function (or the identity) over a series of numbers and sum the results”.

Looking at the problem this way generates an abstraction that did not previously exist in the earlier solution, making it a more general result and increasing the possibility of its reuse (perhaps!):

(defun sum-of-series (n &optional f)
  (let ((n-dash (if (null f) n
		    (funcall f n))))
    (if (= n 1) 1
     (+ n-dash (sum-of-series (1- n) f)))))

CL-USER> (- (expt (sum-of-series 100) 2)
	    (sum-of-series 100 (lambda (x) (* x x))))
25164150
CL-USER>

Ok, I know it's trivial but it made me feel better. There was only one problem though. When testing this for large n (> 30000) I got stack-overflows. Which surprised me because I was expecting it to be tail recursive and hence a tail-optimization to happen. I tried playing around with the debug settings and trying different variations but no amount of playing could make SBCL do the right-thing and optimize the recursion away.

Guess I need to do some more investigation into functional programming and that's probably an imperative! (groan)

« Previous PageNext Page »