Simple Observers in Haskell

I’ve just uploaded version 0.0.1 of the simple-observer package to hackage, the Haskell package repository.

This is an implementation of the Observer design pattern in Haskell. I am not the originator or even main author of the code in this package; that is Dutch computer scientist Bastiaan Heeren, who wrote the original as part of an exercise for some students. But recently I was doing some GUI programming with wxHaskell (which I’ve found to be great, by the way) and found myself hankering for an observer or something MVC-ish, and my googling led me to his code. I used it, and then thought it might be worth releasing to the world at large — particularly as it’s not wx-specific, or even GUI-specific, really. Bastiaan tells me he’s happy for me to do so, so here it is. I’ve changed names, separated it into two modules, and replaced IORefs with MVars in order to better support multi-threaded use — but at heart it’s identical to Bastiaan’s code.

The basic idea is that an observable is a piece of mutable state; when it’s changed, its list of observer functions is called in sequence, each observer being passed the new value. For wx programming – which is anyway quite stateful and not very idiomatically functional – I’ve found it to be quite effective. Here’s a small example which illustrates basic use…

My GUI has a start/stop button, which toggles a running flag. When running is true, the button says “stop”, and certain things happen in the app’s idle handler (to run whatever is supposed to be running); when running is false, the button says “start”, and certain other things happen in the app’s idle handler. So here’s how we set it up:

In my app’s setup phase, I create running, which is initially false, using createSub :: a -> IO (Sub a):

    running <- createSub False

Later on (actually in a different function, which receives running as a parameter), I create the start/stop button and use running‘s current value to initialise its appearance:

     r <- getValue running
     btnRunStop <- button p [text := runStopBtnTxt r, on command := runStopClicked running]

Here, runStopBtnTxt :: Bool -> String just returns “start” or “stop” appropriately. We use it elsewhere, which is why it’s factored out. When the button is clicked, runStopClicked toggles the state of running (and does other stuff not shown here):

    runStopClicked :: Sub Bool -> IO ()
    runStopClicked running = do r <- getValue running
                                running `setValue` not r

(This would actually be neater using changeValue :: Sub a -> (a -> a) -> IO (), but in the real version I use r later on, so I need the getValue anyway.)

So this setValue call will cause all of running‘s observers to be called with not r. The last thing we should do, then, is attach an observer:

    addObserver running (\x -> btnRunStop `set` [text := runStopBtnTxt x])

addObserver :: Sub a -> (a -> IO ()) -> IO () takes an observable subject and an observer function, and attaches that function to the subject’s list of observers. Here, the observer function just updates the text.

(It’s worth noting that the observer’s type is a -> IO () not Sub a -> IO () — i.e. observers receive values, not subjects. The subtext is that an observer shouldn’t modify the subject it’s observing, lest ye loop infinitely; you can get round this by passing the subject proper as an earlier argument, and provided it used setValue' not setValue you’d be fine — but I don’t know of any reason why you’d want to do this.)
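
For the curious, the whole thing is small enough to sketch. This isn’t the package’s actual source, just a minimal MVar-based version of the idea, using the names from above:

    import Control.Concurrent.MVar

    -- A subject is a mutable value plus a mutable list of observers.
    data Sub a = Sub (MVar a) (MVar [a -> IO ()])

    createSub :: a -> IO (Sub a)
    createSub x = do
        v  <- newMVar x
        os <- newMVar []
        return (Sub v os)

    getValue :: Sub a -> IO a
    getValue (Sub v _) = readMVar v

    addObserver :: Sub a -> (a -> IO ()) -> IO ()
    addObserver (Sub _ os) f = modifyMVar_ os (return . (++ [f]))

    -- Update the value, then notify each observer with the new value.
    setValue :: Sub a -> a -> IO ()
    setValue (Sub v os) x = do
        modifyMVar_ v (const (return x))
        readMVar os >>= mapM_ ($ x)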

The first win here is that the button’s on command function doesn’t need the button as a parameter, because it’s not directly updating the button text. The deeper win is relatively cheap and clean separation of concerns: the button only toggles the running value, and (separately) stuff happens when running changes. If some other mechanism (eg some network event) could also cause running to change, we’d already be dealing with it The Right Way: just add another observer. Without observers, if our reaction code was in the button’s on command handler, we’d need to call that, or factor the reaction code out, in order to deal with it then. This way’s clearly nicer. But hey, you’re reading a blog post containing type signatures, so I presumably don’t really need to explain separation of concerns, or the observer pattern, to you. :-)

I’m aware that there are more sophisticated approaches to this problem. I’m reliably informed that Functional Reactive Programming (FRP) subsumes observer quite naturally, but I don’t understand it enough to comment. Tom Lokhorst tells us that Erik Meijer & Co. at Microsoft have a good handle on this, as discussed in this entertaining video. Alternatively, if we like message-passing as our basic computational model, an implementation in CHP might be just the ticket — and that certainly looks like an impressively powerful approach. However, in my case, I wanted to keep things in a single thread as much as possible, because Haskell’s GUI frameworks (and indeed wx across all languages?) seem to interact with multi-threading in non-simple ways. This is also pretty much why I decided against implementing Neil Brown’s suggestion that the observers should be run in their own threads. Of course, if an observer wants to fork a thread, it absolutely can, and I may find myself doing this in my app soon — but I didn’t quite see the utility of baking it in. That’s probably my sequential imperative background showing through, and hopefully one day I’ll shake it off and welcome our new, multi-threaded masters. All of which goes some way to explaining why I released this as simple-observer.

Anyway, hopefully somebody will find this useful.

Generating an index of Haskell haddock docs automatically

Haskell’s great, and has a lovely documentation generation system called haddock. Even better, if you install one of Haskell’s many third-party libraries using the excellent cabal install, it will (if you configure it to do so) generate these docs for you. Having local copies of documentation like this is always good for when you’re offline, so this is A Good Thing.

Sadly, there’s no master index of what’s installed. I’ve got it configured to install globally, so the docs go into /usr/local/share/doc, and each (version of each) package gets a folder of its own there; if you’ve got cabal calling haddock, that folder will contain an html folder with the docs in, but it’s tedious to click through an (otherwise empty) folder to get to it each time, and the whole setup’s not very pretty or informative (and the lexicographic sorting is case-sensitive, which I don’t much like). Eg:

Bare index of Haskell docs

People have attacked this problem before, but PHP makes my skin itch, and I can’t be bothered with apache, so a simpler, static solution seemed right for me.

Thus, I’ve knocked up a (quick, dirty, nasty) python script to generate the index. As a happy side-effect, I’ve pointed it at the hackage CSS file, so it’s pleasingly pretty and familiar:

Pretty index of Haskell docs

I did it in python because that’s still my go-to “hack stuff together in a hurry” language, I guess; but I very nearly did it in Haskell. Mainly it was library knowledge that drove me. Also perhaps I fancied a break. Hmmm. Next time, for sure.

If anyone’s interested in the code (the doc path is hardcoded, so if you’re installing user docs, change it), you can view it pretty/download it via this link, or just check it out here. Oh, and what about the “automatically” part? Well, just stick it in a cron job… ;-)

Update: I realised I wasn’t linking to the actual index.html file, which kinda defeated the point of the script! However, it’s an easy fix. The line that says:


                s.write('<a href="file://%s">' % package.path)

should actually say:


                s.write('<a href="file://%s">' % os.path.join(package.path, 'html', 'index.html'))

Aaaanyway…


Really nice git cheatsheet

Here’s a nice git cheatsheet, with examples: Everyday GIT With 20 Commands Or So [via brunns]. Really good — and to think that I didn’t know about git fsck and git gc!

A Haskell shell for computing musical scales, in 58 lines of code.

On my way home from work I had an idea for something I thought I’d find handy. Two hours later, I’ve written it — in Haskell of course. :-)

Context: I occasionally make what might kindly be called music, using computers and the like, and I don’t have any formal musical training. Consequently, I sometimes run into trouble regarding keys and scales. The idea I had was for something which would tell me, given a set of notes (ie the ones I’ve already used in a piece), what scales those notes are in — and consequently what other notes might also fit well with them. Long ago when I used Bars & Pipes on the Amiga, it had a tool for doing this. I could buy one for my Mac for about 8UKP (I may yet — it has a pretty GUI, which I might not be bothered to do), but it seemed I should be able to knock something up. The data types aren’t exactly complicated.

So I did it. I’ll put the code at the end of this post, or you can easily download it from here, in 133 lines of literate Haskell (58 lines of actual code). Here’s a screenshot showing typical use:

Haskell scale finder screenshot

Once again, I’ve been super impressed by how easy Haskell made this — and I’m particularly loving the Shellac library which gives me an interactive shell with history and graceful recovery from errors — all in 4 lines of code!

> react = mapM_ (uncurry printScale) . checkFit . parseNotes
>     where printScale :: String -> [Note] -> Sh () ()
>           printScale n ns = shellPutStrLn (n ++ ": " ++ ppNotes ns)
> main = runShell (mkShellDescription [] react) haskelineBackend ()

OK, I’m not using any of its juicier features like completion, state, automatic help, etc., but even so, it’s a great little library.

I could wrap it in a GUI, but to be honest, this’ll probably do the trick for me, assuming I haven’t made any big screw-ups, or got the scale definitions wrong. If anyone’s looking for a minimal Shellac example, maybe it’ll be useful.

(Update 2009-09-16 13:37: To get this up and running, if you’re not already a Haskeller, the following should suffice: 1. Download and install The Haskell Platform – nice ‘n’ easy. 2. Open a shell and issue cabal install Shellac-haskeline or possibly sudo cabal install --global Shellac-haskeline. 3. Download the code (from hpaste.org) and (still in a shell) issue ghci Scales, and when that’s loaded issue main. Or if you want to compile an executable, try ghc --make -main-is Scale.lhs Scale.lhs, just like in the screenshot. That should do it — I’d love to know if I’ve missed anything and/or if that works!)

As for the code itself: it’s prettier and easier to grab from hpaste (linked above), so do that.

A “Monte Hall” problem solved in Haskell

I’ve just written a solution to today’s Programming Praxis puzzle, which requires you, essentially, to write a Monte Carlo attack on the Monty Hall problem. Thus, the awful pun in this post’s title, in case you missed it. ;-)

Here’s my solution.

It was a lot of fun. I started out feeling like I should be able to do it, but not quite knowing what direction to take, so first of all I was writing some data types for things like Door = Car | Goat, etc. but pretty quickly realised it was just a matter of door numbers. My first working solution was actually a bit longer, because of some intermediate steps (for example the first thing I figured out, in its own function, was how to compute which goat door would get shown to the player given a (car, choice) pair). Then hlint gave me a couple of hints, like using replicateM, and I compacted it down to what we see here. I’m sure it has naive aspects, but I’m pretty happy with it — and I’m loving doing the Praxis problems.
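
For anyone who doesn’t fancy clicking through, here’s the general shape of such a simulation. It’s a sketch rather than the linked solution, and it cheats slightly by not simulating the host’s goat door at all, since switching wins exactly when the initial guess was wrong:

import Control.Monad (replicateM)
import System.Random (randomRIO)

-- One round: pick a random car door and a random initial guess.
-- Sticking wins iff the guess was right; switching wins otherwise.
trial :: IO Bool
trial = do
    car   <- randomRIO (1, 3 :: Int)
    guess <- randomRIO (1, 3 :: Int)
    return (guess == car)

-- Run n trials and report how often each strategy would have won.
monty :: Int -> IO ()
monty n = do
    results <- replicateM n trial
    let stickWins  = length (filter id results)
        switchWins = n - stickWins
    putStrLn $ "stick:  " ++ show stickWins  ++ " / " ++ show n
    putStrLn $ "switch: " ++ show switchWins ++ " / " ++ show n

Running monty 100000 should put switching at around two wins in three, as expected.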

If Philosophers Were Programmers

If Philosophers Were Programmers [brunns].

Nice to see that Wittgenstein (or at least, one of the Wittgensteins) is also a Haskell man…

Modelling lightning with Haskell

Cool: Modelling lightning with Haskell [via dons].

The Algorithmic Beauty of Plants

The Algorithmic Beauty of Plants.

How to use git and svn together

How to use git and svn together — or more properly, how to use git on an svn repository. I’ve been doing this for a couple of things, having switched my brain and day-to-day work patterns from svn to git, and it works nicely. I haven’t tried anything nontrivial such as branching, mind. Anyway, putting it here because some friends were asking about it.

Creating/consuming JSON in Haskell, with the help of pattern guards

I’m going to write about two Haskell things I’ve used today, both of which were new to me: first, creating/consuming JSON using Sigbjorn Finne‘s json library; second, using pattern guards to make the “consuming” part prettier than my first butt-ugly attempts.

Earlier today I googled a bit looking for examples of using the JSON library, but didn’t find any clear or immediately useful ones. There were a few examples of people rolling their own JSON stuff in Haskell, but obviously that wasn’t what I was after. (I’ve just noticed that Real World Haskell talks about JSON, but doesn’t appear to immediately answer my questions — though there is loads of good stuff there of course.) I just wanted something to get me going quickly with the problem at hand, really.

The context is this: I’m working on some code which represents models of devices’ user interfaces as finite automata and tries to do some analysis on that. I want to use JSON to pump FA models between implementations in at least two, and maybe more, languages. None of that is really relevant to the rest of this post, except perhaps in the next paragraph. :-)

I have an abstract data type representing what I call actions, which are essentially lists of transitions, which are just integers (the destination state). Actions may be deterministic (in which case each transition is just an integer) or nondeterministic (then each transition is a list of integers). Again, the meaning isn’t so important; what’s important is that my ADT looks like this:

data Action = DAction [Int] | NAction [[Int]]
  deriving (Eq, Ord, Show)

Now, having imported the appropriate JSON module…

import Text.JSON

… I need to make this type an instance of the JSON type class by defining showJSON and readJSON functions for creating and consuming JSON, respectively:

instance JSON Action where
    showJSON (DAction xs) = showJSON ("DAction", xs)
    showJSON (NAction xs) = showJSON ("NAction", xs)
    readJSON _ = Error "not yet implemented" -- See below!

Let’s start by thinking about showJSON. It’s important to note that showJSON doesn’t produce a serialised JSON string directly. Instead, there’s an intermediate representation: the JSValue ADT; you serialise values of that type using the JSON module’s encode or encodeStrict functions. This extra layer of abstraction is a big win, because when we’re consuming JSON, we don’t have to write any code to parse the serialised string: we just call decode (or decodeStrict) to turn the string (whose contents should never be very exotic: that’s the whole point) into a JSValue. Then, readJSON‘s job is to turn that JSValue back into our domain-specific representation, whatever that is. That was the bit I was having trouble with (which is why I haven’t shown it yet), but I recognise that doing it this way was a lot easier than parsing the serialised string representation would have been, so I’m thankful for the extra abstraction layer.

In my case (and, I suspect, in most), showJSON is rather straightforward, thanks to some already-existing JSON type class instances. I simply construct a pair whose first element is a string identifying the action type, and whose second element is the action’s transition list. The JSON module already defines instances of the JSON type class for pairs, Strings, lists, and Ints, so the corresponding showJSON functions (which I’ve never seen) do all the work for me. Win.

Time for an example; let’s define a simple Action value…

da :: Action
da = DAction [0,1,2]

and take a look at its JSValue and string representations in ghci:

*Data.FA.JSON> showJSON da
JSArray [JSString (JSONString {fromJSString = "DAction"}),JSArray [JSRational (0%1),JSRational (1%1),JSRational (2%1)]]
*Data.FA.JSON> putStrLn $ encode $ showJSON da
["DAction",[0,1,2]]

The JSValue is quite verbose, but if you compare it with the encoded string, you can comprehend its structure easily enough. JSON doesn’t have tuples, so the pair we fed showJSON has been turned into a two-element list (a JSArray). The string “DAction” is wrapped in a JSONString (a more efficient representation of String) inside a JSString constructor. The list of integers is a JSArray, but each Int is represented as a Rational: the JSON module doesn’t know how to serialise integers directly (AFAICS JSON can, in general?)

So far, so good: serialising my data structure to JSON is pretty easy. Lists of Actions, maps from Ints to Actions, etc. all “just work” too, out of the box: handy.

Extraction of an Action from a JSValue is trickier, and I am by no means confident that I’m doing it in the most sensible manner — though what I’m doing does work, at least! The type signature of readJSON is:

readJSON :: JSValue -> Result a

where Result is a bit like Maybe or Either: it’s either Ok a (it worked) or Error String (fail!). readJSON takes a JSValue, and depending on what the contents of our decoded JSON string were, this could be pretty much anything! So we have a pseudo-parsing situation: we need to match it against what we’re expecting, and if it doesn’t fit, we raise an error. In other words, we have to unpack it, layer by layer. Here’s what I came up with initially:

readJSONAction' :: JSValue -> Result Action
readJSONAction' value =
    case value of
      JSArray [a,b] ->
          case a of
            JSString jsTag -> 
                case (fromJSString jsTag) of
                  "DAction" ->
                    case (readJSONs b)::Result [Int] of
                      Ok xs -> Ok (DAction xs)
                      _ -> Error "can't decode deterministic actions"
                  "NAction" ->
                    case (readJSONs b)::Result [[Int]] of
                      Ok xs -> Ok (NAction xs)
                      _ -> Error "can't decode nondeterministic actions"
                  _ -> Error "Incorrect action tag"
            _ -> Error "can't decode action tag"
      _ -> Error "not an action" 

It sure does look ugly, as you’ll no doubt agree. It unpacks the JSValue bit by bit, using a case distinction to catch unexpected content, which is translated into an Error value. Note the explicit type-tagging in the deeply-nested calls to readJSONs: there are many instances of readJSONs and we need to be explicit about which one we want to call — and they’re different for the two different kinds of action.

This works…

*Data.FA.JSON> readJSON (showJSON da) :: Result Action
Ok (DAction [0,1,2])

… (again, we need to explicitly specify which readJSON we want), but it’s ugly as hell. My friend and colleague pwb suggested “pattern guards might make it prettier”. So I checked them out, and came up with this:

readJSONAction :: JSValue -> Result Action
readJSONAction value
  | JSArray [a,b] <- value
  , JSString jsTag <- a = checkContents jsTag b
  | otherwise = Error "not an action"
    where checkContents tag cargo
              | "DAction" <- (fromJSString tag)
              , Ok xs <- ((readJSONs cargo)::Result [Int])
                      = Ok (DAction xs)
              | "NAction" <- (fromJSString tag)
              , Ok xs <- ((readJSONs cargo)::Result [[Int]])
                      = Ok (NAction xs)
              | otherwise = Error "not an action"

The main disadvantage over the previous version is that we get less detailed error messages, but I think I can live with that for more readable code. It’s still fairly ugly though, and I’m not 100% sure why I need to break out the checkContents part (I did write it in a hurry).
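
For what it’s worth, here’s a third shape that seems plausible: an untested sketch, leaning on Result’s Monad instance (it behaves like Either, as noted above) to sequence the payload decoding.

-- Sketch only: match the envelope in the function head, then decode the
-- payload in the Result monad. The constructor chosen in each branch fixes
-- which readJSONs instance is meant, so no type annotations are needed.
readJSONAction'' :: JSValue -> Result Action
readJSONAction'' (JSArray [JSString jsTag, b]) =
    case fromJSString jsTag of
      "DAction" -> do
          xs <- readJSONs b
          return (DAction xs)
      "NAction" -> do
          xs <- readJSONs b
          return (NAction xs)
      _ -> Error "incorrect action tag"
readJSONAction'' _ = Error "not an action"

The error messages are about as coarse as the pattern-guard version’s, but at least the nesting is gone.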

I just can’t shake the nagging suspicion that I’m missing some clever and elegant way of doing this. So if anyone has any pointers, I’d love to hear them. Otherwise, I can at least testify that the above techniques work: maybe that’ll be useful for another JSON adventurer in a hurry, one day. :-)

Neil Mitchell on F# from a Haskell perspective

Neil Mitchell on the F# language [dons].

Real Life Tron on an Apple IIgs

I can’t remember where I saw this, but: Real Life Tron on an Apple IIgs.

A short survey of some chart-drawing options for Python and Haskell

I’ve been thinking about chart drawing a bit lately, partially because I’ve been doing some work which needs it, and partially because I keep seeing pretty pictures like the ones here (or in these slides) and wondering how people produce them.

Perhaps old news, but today I came across the Google charts API, for drawing charts (line, bar, pie, scatter, radar, etc.) via URLs. It’s clearly not capable of the prettiness linked above, but seems quite neat for “workhorse” charting, e.g.

Example pie chart

I particularly like the maps option:

Example map

Naturally, there exist Python and Haskell bindings.

I’ve previously looked at NetworkX, matplotlib and gnuplot, all of which are a bit more heavyweight — though I think only NetworkX, if any, could handle the prettiness mentioned initially.

HaskellCharts was mentioned in the latest Haskell Weekly News.

Today I also found Cairoplot, Chaco (very plot-centric), and the fruity Mac goodness that is NodeBox — very pretty, and looks lots of fun, but not exactly a charting app.

Right. That should be enough to be getting on with, anyway…

Arrows in JavaScript

Arrows in JavaScript.

Arrows generalise computation schemes from the sequential version provided by monads, and are thus hard to bend one’s head around. It’s interesting to see them in JS.

SPJ/Haskell interview in Australian Computerworld

Simon Peyton Jones interview in Australian Computerworld [via galois].

A nice interview discussing various aspects of the history and philosophy of Haskell. SPJ sees purity, monads, and type classes as Haskell’s key aspects, and mentions a number of interesting other ideas along the way (along with uniqueness typing as seen in Clean, and functional reactive animation, which I first came across back when I was investigating music in Haskell).

More generally, this looks like a good series to keep an eye on: also contains interviews on Forth, JavaScript, Lua and other more boring languages such as C++. Shame there’s no obvious feed.

What Every Programmer Should Know About Memory

What Every Programmer Should Know About Memory (114 page PDF) [abstract absurdities].

Why functional programming?

Because it will warp your mind [raganwald].

So I’m here to say that mindwarp #3 is discovering the function as the basic unit of abstraction. Jaw-droppingly beautiful abstractions and generalizations can be created out of just functions. You can rediscover the usefulness of partial functions and currying, which were techniques created in the 1800s. You can be in the direct lineage of Alan Turing, who used higher order functions in the 1930s to define his theoretical Turing Machine in his paper “On Computable Numbers, with an Application to the Entscheidungsproblem.” And you can finally understand recursion in a deep and intuitive way, and you’ll feel like you’ve looked into the abyss and somehow come back to tell everyone else about it. And maybe, just maybe, you can explain to me what a freakin’ monad is.

Word.

LoC is a measure of cost, not productivity

Yet people talk about programming as if it were a production process and measure “programmer productivity” in terms of “number of lines of code produced”. In so doing they book that number on the wrong side of the ledger: we should always refer to “the number of lines of code spent”.

Dijkstra, The Fruits of Misunderstanding, and also a similar sentiment (earlier) in “Why is Software so Expensive?” An Explanation to the Hardware Designer — quote spotted on reddit.

… and from the second (first, chronologically) of those essays, Dijkstra hitting the nail on the head with regard to aspects of some recent rumblings about higher education:

To the economic question “Why is software so expensive?” the equally economic answer could be “Because it is tried with cheap labour.” Why is it tried that way? Because its intrinsic difficulties are widely and grossly underestimated.

To any of my students reading this: don’t underestimate the difficulty of the tasks we’ve been educating you to tackle; thus, don’t underestimate your worth if you get good at attacking those tasks; thus, hopefully you’ll appreciate (if not now then one day) the value of a degree in (actual, not pretend) Computer Science.

UK CCTV used to create a music video

UK CCTV used to create a music video [risks].

Unable to hire a production crew for a standard 1980’s era MTV music video, they performed their music in front of 80 of the 13 million CCTV “security” cameras available in England, including one on a bus.

Also good, from the same RISKS digest: How not to use SSL, viz SSL-encrypt the page data, but send the credit card details in cleartext in the URL — win!

Other good reasons: the existence of Python, Ruby, Haskell, Smalltalk, …

There are many, many reasons why you should never ever accept a job which involves programming in C++ [via TR].

Interaction design and programming: “best to market beats first to market”

Here are some very interesting notes on a keynote by Alan Cooper (of “The Inmates Are Running The Asylum” fame) at the recent IxDA Interaction08 conference.

Quote dump:

I don’t think we ought to be emphasizing innovation; in our industry, it will happen. Innovation means invention, but in the minds of business people, I believe it has come to mean “success”. But innovation doesn’t mean success (holds up that original clunky MP3 player that failed–”this was innovative, but was never successful”).

Business people don’t often do best-to-market because they don’t know how–we need to show them. Many have industrial age skills. They view software as mass production. Best-to-market only happens through craftsmanship. It’s all about quality–it’s all about getting it right, not to get it fast. It’s measured by quality, not speed. It’s a pure measurement, and a delightful one. Craftsmen do it over and over again until they get it right. In their training, they build things over and over so they get the experience they need to get it right.

Programming is not an industrial activity. It is a unique activity that has a lot in common with pre-industrial craft, and yet has a lot of unique characteristics that I call “post-industrial craft”. Programs are made one at a time, and each piece is different. It’s not scalable and it’s not formulaic. There’s no magic bullet in pre-industrial craft. We can’t say we’re all going agile and everything will come up roses. It’s incredibly nuanced and takes years of study to get right.

Programmers are draftsmen. However, they are different than pre-industrial workers. They are self-directed and know better than managers what to do. They respect intelligence, not authority. You can’t tell them what to do, you can only coerce them. Their satisfaction comes from the quality of their work.

But there are no economies of scale in software production, all you can do is reduce the quality that emerges. There are simply no good management tools for software construction. There are no appropriate tools for accounting for the creation of software. There’s no way to track any given feature, functionality, or behavior to the amount of money coming.

… and it goes on from there to say (less clearly, alas) what interaction designers can bring to this conflicted situation. Food for thought.

Database version control strategy

Quite an interesting five-part series on database version control. I haven’t used this particular approach myself, but it looks well thought through and sensibly structured, at first blush at least.

On prototypes and real applications

Quite so: prototypes and real applications.

Your prototype needs to be written quickly and then it needs to change quickly. You’ll only be able to do that with a maintainable, flexible code base. In short, a well-written code base. You’re a proficient software engineer, you know how to do this. You probably do it without even thinking.

And at some level, everyone knows this. That’s why prototypes are created in languages like Python. A language that you can write quickly, but also write well, quickly.

“This bug is at once easy to fix and impossible.”

The most common Unicode processing bug [reddit].

Data files should contain data

Data files should contain data — on the dangers of using Excel, among other things. True words – a lot of my exam mark processing headaches are due to data files with madly formatted content (not, I must admit, produced by Excel).

On the relative positions of C and Haskell on the Mohs scale

C isn’t hard; programming in C is hard. On the other hand: Haskell is hard, but programming in Haskell is easy.

Seen on this thread.

In good company

Fire And Motion — Joel on Software.

Once you get into flow it’s not too hard to keep going. Many of my days go like this: (1) get into work (2) check email, read the web, etc. (3) decide that I might as well have lunch before getting to work (4) get back from lunch (5) check email, read the web, etc. (6) finally decide that I’ve got to get started (7) check email, read the web, etc. (8) decide again that I really have to get started (9) launch the damn editor and (10) write code nonstop until I don’t realize that it’s already 7:30 pm.

Glad it’s not just me, then.

More infinite list tomfoolery in Haskell

More infinite list tomfoolery in Haskell — the one for pi’s particularly “WTF?”.

Practical Common Lisp

Practical Common Lisp — free book online [reddit]. One to read along with SICP some time in the next 12 months.

Core JavaScript 1.5 Guide

For my future reference: Mozilla Developer Centre Core JavaScript 1.5 Guide. TR tells me this is the most sensible way to learn Javascript.

He particularly recommends the page on Class-Based vs. Prototype-Based Languages. JavaScript has a lot in common with Self, apparently.

Parsing Expression Grammars in Ruby with Treetop

A couple of years ago I read and greatly enjoyed Bryan Ford’s 2004 paper “Parsing Expression Grammars: A Recognition-Based Syntactic Foundation” [PDF, 134Kb] (wow, it’s still on my desk, in fact); while to my disappointment it proved inapplicable to my research (I’m constrained in the parsing and language description technologies I can use by what I have to integrate with), it was clearly Cool Stuff.

Yeah, so, apparently some Ruby hacker read that paper too [reddit].

Why Haskell is Good for Embedded Domain Specific Languages

Why Haskell is Good for Embedded Domain Specific Languages — nice summary, I’d say.

A Subversion Pre-Commit Hook in Python

A Subversion Pre-Commit Hook (in Python) [smallcool]. That’s bound to come in handy some time… :-)

“Perl, I’m leaving you”

Damn straight.

How to disable underscore subscript in TeX mode in emacs

After some random recent upgrade, emacs started doing something annoying: when in TeX mode (eg when editing LaTeX docs, which I do a lot), it would treat every underscore in the document as a signal that the next symbol should be subscripted, and display it as such: the character would be offset vertically using font-lock magic. This is very annoying, because underscore only means subscript in maths mode; the rest of the time, it’s wrong and silly to visibly subscript the thing. In general, I don’t want emacs to do such semi-WYSIWYG-ness – syntax highlighting is about as much as I need, thank you.

Anyway, how to turn it off is non-obvious, and required several increasingly imaginative google search strings. In the end I came across this recent thread on gnu.emacs.help, one of whose posts suggested that turning down font-lock-maximum-decoration from “maximum” is the right thing to do. Yep, that works: I changed it to “nil” and now I’m happy.

A lucid explanation of closures

Here’s a pretty clear explanation [reddit] of closures, which have hit the mindspace in a big way since Rails made Ruby the hot sauce of the day. When I say “clear”, I mean, perhaps, from a “traditionalist imperative” point of view — the discussion of stack frames and the funarg problems in particular. Of course, lambda-heads and category theorists probably have other criteria for “clear”. ;-)

Ruby vs Java/PHP ads

Rails vs Java in style of PC vs Apple, PS3 vs wii, etc. [digg]

Full set on youtube: 1. Rails vs Java; 2. Rails vs PHP; 3. Rails vs PHP; 4. Rails vs PHP.

A pragmatic look at monads in Haskell

(Within an hour or two of publishing this, it was pointed out to me that this talk is really about the IO monad rather than monads in general, and that in particular the assertion that a monad represents a computation which performs a side-effect is not, in general, true. A nice example is the Maybe monad. So a better title for this talk is "A pragmatic look at monadic I/O in Haskell".)

A pragmatic look at monads in Haskell (PDF, 293KB) – slides from a talk I gave last Friday in Swansea University’s Computer Science department, as part of our “No Grownups” series of non-research talks given for and by postgraduate students.

The aim of the talk was to explain monads to a non-expert (and, largely, non-Haskell-programming) audience: why do we have them, what problems do they solve, and how are they used? The approach is pragmatic in that the talk explicitly does not go into technical details, instead focusing on a broad understanding, and on some specific useful “rules of thumb” for programming with monads in Haskell. I don’t claim to be an expert on monads or to have produced a talk which is authoritative or even necessarily completely correct. I do hope to have produced something reasonably comprehensible and useful, however. I would welcome any feedback, comments, corrections, clarifications, etc.

The talk was filmed, but I don’t know if my delivery was good enough to warrant putting it online. :-) Let me know if you’re interested. The post-talk discussion was quite useful, so it might be worth it for that. In particular, there was a question about when exactly the “non-purity” happens – when does referential transparency go away? My answer was that it happens when it needs to (ie when the result is required) and that yes, obviously somewhere there is some code which isn’t pure Haskell and which is doing the impure operation – eg an operating system call. Markus opined that a big part of the point of monads is to give us a clear indication, in the type system, that an operation is impure and thus, in some sense, unsafe/not nice. I thought that was a good enough point that I’ve since added a bullet saying that to the talk – but that’s the only addition I’ve made before publishing.

Background/reference material: A History of Haskell: being lazy with class (historical context), monads @ wikibooks (the nuclear waste metaphor), IO inside: down the rabbit’s hole (probably the point where I started understanding monads), rules for Haskell I/O (not an influence, but something with a similar flavour which I saw when I’d nearly finished), do-notation considered harmful (desugaring), monads on the unix shell (just because the dirty string “dramatisation” is so great).

Primes one-liner in Haskell – oh my!

mux on #haskell pasted what is perhaps the most beautiful code I’ve ever seen in my life:

      <mux> > nubBy(((>1) .) . gcd) [2..]
<lambdabot>  [2,3,5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,61,67,71,73,79,83,89,97,101,...

(Number two in an ongoing series of jaw-dropping Haskell one-liners which began with the fibonacci series.)
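
For anyone squinting at the point-free part, the composed section unpacks to something less cryptic; roughly this, if I’ve expanded it right:

import Data.List (nubBy)

-- ((>1) .) . gcd  ==  \x y -> gcd x y > 1
-- nubBy then keeps a number only if it shares no factor greater than 1
-- with any number already kept, i.e. exactly the primes.
primes :: [Integer]
primes = nubBy (\x y -> gcd x y > 1) [2..]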

Haskell packages gotcha: global vs. per-user package databases

This morning, I was having a problem involving Haskell’s Cabal build system. Since it’s something which could conceivably affect others, and since googling on the error text didn’t help me, I’m going to report my experience for the benefit of future travellers/googlers.

I was trying to (re)build xmonad, and got the following error:

[gimbo@orb xmonad] runghc -v Setup configure 
   Could not find module `Distribution.Simple':
       Use -v to see a list of the files searched for.

Python-style string split (and strip / trim) in Haskell

Haskell’s an incredible language but I’m repeatedly struck by (what I perceive as) gaps in its standard libraries (though as proven yesterday I could know them better). For example, there aren’t any cross-platform file path manipulation routines in the style of python’s os.path (though not for long, thanks to Neil Mitchell). Another lacking area seems to be string manipulation.

Today, I needed to split a string up according to some token, where that token is itself a string, not a single character. Thus, a split in the style of python’s strings’ built in split() method. That is, I wanted to be able to do this:

*StringSplit> split ", " "one, two, three, four"
["one","two","three","four"]
*StringSplit> split "an" "banana"
["b","","a"]

To my amazement, this isn’t a standard library function, and I couldn’t find hundreds of examples of how to do it via google. I couldn’t even find one! I could find a number of variants where the token is just a single Char, usually utilising takeWhile and the like. Gladly this led me to a reasonable (I hope) solution. This example splits on Chars but in the comments I was introduced to unfoldr, which looked tantalisingly close to what I wanted. All I needed to do now was write the function unfoldr calls. About half an hour and much hoogling later, hey pesto:

module StringSplit where

import List

-- | String split in style of python string.split()
split :: String -> String -> [String]
split tok splitme = unfoldr (sp1 tok) splitme
    where sp1 _ "" = Nothing
          sp1 t s = case find (t `isSuffixOf`) (inits s) of
                      Nothing -> Just (s, "")
                      Just p -> Just (take ((length p) - (length t)) p,
                                      drop (length p) s)

All of the complexity is in sp1 (short for “split once”), which performs a single split. sp1’s type is String -> String -> Maybe (String, String); the section (sp1 tok) then has the right type for unfoldr (sections just seem to be cropping up everywhere lately, eh?). The point of sp1 is to split s at the first occurrence of t. It does this by using inits to build all prefices of s and finding the first one having t as a suffix. If found, it splits the string at the token, throws the token away, and returns the (before, after) pair. If the token isn’t found, it just returns s paired with an empty string; that gets passed by unfoldr to sp1 on the next call, causing sp1 to return Nothing, causing unfoldr to finish – whew!

That last bit sounds a bit complicated, but it was necessary to fix a bug present in my first implementation of sp1:

sp1' t s = case find (t `isSuffixOf`) (inits s) of
               Nothing -> Nothing
               Just p -> Just (take ((length p) - (length t)) p,
                               drop (length p) s)

This misses the final substring, so split "n" "banana" would return ["ba", "a"] instead of ["ba", "a", "a"]. Hence the whole empty string/Nothing/stop machinery.

I could not have written this code two weeks ago, I think. Thus, I’m sure there are better ways to do this, and would love to hear them – and any criticism. I was a wee bit worried about using inits and find in this way for finding a substring (no standard substring function?). Haskell’s lazy nature ought to help me here though: inits will only build the entire list of prefices if t isn’t in s at all. For anything I’m planning to use this for, it ought to be fine; it’s probably not the best general way to do it though.

Postscript

For good measure, since I had to implement it later that evening, here’s a string.strip() (aka trim):

strip :: String -> String
strip s = dropWhile ws $ reverse $ dropWhile ws $ reverse s
    where ws = (`elem` [' ', '\n', '\t', '\r'])

I’m particularly proud of the ws part. :-) Look at me, all full of Haskell pride. ;-)

Postpostscript

I realised I can generalise split beyond strings, to work on lists of anything of class Eq:

-- | A function inspired by python's string.split().  A list is split
-- on a separator which is itself a list (not a single element).
split :: Eq a => [a] -> [a] -> [[a]]
split tok splitme = unfoldr (sp1 tok) splitme
    where sp1 _ [] = Nothing
          sp1 t s = case find (t `isSuffixOf`) $ (inits s) of
                      Nothing -> Just (s, [])
                      Just p -> Just (take (length p - length t) p,
                                      drop (length p) s)
*StringSplit> split [2,3] [1,2,3,4,3,2,1,2,3,4]
[[1],[4,3,2,1],[4]]

Parametric Polymorphism and the Girard-Reynolds Isomorphism (video)

Great stuff: Parametric Polymorphism and the Girard-Reynolds Isomorphism [reddit] – a 30 minute Google Tech Talk by Phil Gossett. A really clear presentation covering a number of interesting topics, with a very convincing historical sweep. Certainly cleared a few things up for me, at least. His response to the last question is really more Hindley-Milner than Girard-Reynolds, mind. ;-)

Update 2007-04-23

Phil Wadler comments.

More playing with sections (and flip) in haskell

Last week I wrote about sections in Haskell, and discovered flip. Today I hit upon a problem which got me thinking about them again, leading to one of those wonderful moments when Stuff You Just Learnt turns out to be exactly what you need to solve the problem you’ve got right now. (Then you blog about it and find an Even Better Way, as it turns out.)

I’ve been writing some file-manipulation code, and along the way had written the following function:

-- | Check if a given string (subject) ends with another (target).
endswith :: String -> String -> Bool
endswith sub tgt = drop (length sub - length tgt) sub == tgt

(Actually I’ve forgotten now whether I wrote this myself or found it somewhere, but it’s clearly not a new problem. If anyone knows a source for the above code, please do tell so I can attribute it. Anyway…)

So then you can do:

*Ew> endswith "banana" "na"
True
*Ew> endswith "banana" "moo"
False

So far so good. Later, I had a list of filenames, and I wanted to filter it down to just those ending with a particular suffix. Initially, unthinkingly, I tried:

*Ew> filter (endswith "na") ["banana", "cat", "iguana", "mastodon"]
[]

Spot the cunning use of a section. Actually not so cunning, as it turns out – the empty list is not the result I was hoping for! The problem is that the first parameter of endswith is the subject (the string to be searched) not the target (the string to be searched for) so the section (endswith "na") is not the function which checks if a string ends with “na”. It’s the function which, given a string, checks if “na” ends in that string:

*Ew> (endswith "na") "banana"
False
*Ew> (endswith "na") "a"
True
*Ew> (endswith "na") "na"
True

My first reaction was to just swap the parameters around. Let’s call this version endswith'. It’s exactly the same as endswith except for the parameter order:

endswith' :: String -> String -> Bool
endswith' tgt sub = drop (length sub - length tgt) sub == tgt

Then (endswith' "na") is the section I’m looking for, and in particular works as I want under filter:

*Ew> (endswith' "na") "banana"
True
*Ew> filter (endswith' "na") ["banana", "cat", "iguana", "mastodon"]
["banana","iguana"]

I only use endswith in a couple of places, so this change was totally viable. Lesson learnt (or so I thought): you have to think carefully about parameter order when writing functions, because some sections will be more useful than others. Obviously with endswith, the section with bound target is more generally useful than the section with bound subject.

Later, I realised this lesson is bogus, or at least not very strong. I mean, there’s probably something in it, but I didn’t actually have to swap the parameters to make my filter work. That’s exactly what flip does:

*Ew> ((flip endswith) "na") "banana"
True
*Ew> filter ((flip endswith) "na") ["banana", "cat", "iguana", "mastodon"]
["banana","iguana"]

What’s going on here? (flip endswith) is endswith with the parameters flipped. Thus ((flip endswith) "na") is exactly the same function as (endswith' "na") which, as we saw above, does what I want. So in the end, I didn’t have to replace endswith with endswith' to get my filter to work – I could just flip it. Bearing in mind I didn’t even know flip existed a week ago, I felt pretty smug to realise this. :-)

The next question on my mind is what you might call “generalised sections for multi-parameter functions”… If I have a 4-parameter function and I want to build a section where 2 of the parameters are bound (say), and it’s not the first two, how do I do it? Stupid toy example:

f :: Int -> Int -> Int -> Int -> Int
f a b c d = a + b + c + d

Can I make a section representing f with b fixed at 3, and d fixed at 5, say? I know I can do this:

g :: Int -> Int -> Int
g a c = f a 3 c 5

but can we do it inline, as it were, as with the section approach? Answers on a postcard to the comments page, please! :-) I suspect currying‘s going to wind up in there somewhere…
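
(Having said that, the boring inline answer is presumably just a lambda, which typechecks even if it doesn’t really feel like a section:)

-- b fixed at 3 and d fixed at 5, inline; h 1 10 == g 1 10 == 19
h :: Int -> Int -> Int
h = \a c -> f a 3 c 5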

Postscript

That’s all I originally intended to blog about; when chasing links to support this post, however, I read this and learnt two more interesting things.

First, someone has, of course, solved this problem already, and I can just use isSuffixOf to get this job done (Strings in haskell are Lists of Chars). So now I get to throw endswith away – w00t, less code. A fundamental lesson: once the language starts making sense, learn what’s in the standard libraries!

Second, thanks to Haskell’s ability to deal with functions in infix form, the section (isSuffixOf "na") can also be written as ("na" `isSuffixOf`) which is, I suppose, more natural (provided you’re not scared off by the backticks, which I’m trying not to be). Then we have:

*Ew> filter ("na" `isSuffixOf`) ["banana", "cat", "iguana", "mastodon"]
["banana","iguana"]

Sweet.

Playing with sections in Haskell

Consider the function (:) in Haskell…

Hugs> :t (:)
(:) :: a -> [a] -> [a]

So, (:) is a polymorphic function which takes an element of some type, and a list of elements of that type, and returns a list of elements of that type. In particular, it prepends the singleton element onto the list. For example:

Hugs> 'c' : "ello"
"cello"

Now for the fun part.

Hugs> :t ('c':)
('c' :) :: [Char] -> [Char]

Here, we see that ('c':) is a function which takes a list of characters (ie a string) and returns another one. As you might expect, it prepends it with a 'c':

Hugs> ('c':) "ello"
"cello"

('c':) is an example of a section: we “partially evaluate” a function by providing only some of its parameters, which yields a new function. Obviously this is only possible if we have higher-order functions in our language.

We can take the section “on the other side” of the :, and define the function which takes a character and prepends it onto the string "ello":

Hugs> (:"ello") 'c'
"cello"

What’s its type?

Hugs> :t (:"ello")
flip (:) "ello" :: Char -> [Char]

Interesting. The type is as we would expect (Char -> [Char]), but what’s this flip thing?

Hugs> :t flip
flip :: (a -> b -> c) -> b -> a -> c

So flip takes a function with type signature (a -> b -> c) and returns an equivalent function with type signature (b -> a -> c) – OK, simple enough. But why did hugs introduce it here? Why didn’t hugs just do what ghci does, and say:

Prelude> :t (:"ello")
(:"ello") :: Char -> [Char]

? I say: shrug.

In other news, I’ve just discovered Neil Mitchell’s Haskell blog. I met Neil at BCTCS 05 in Nottingham, and again at BCTCS 06 in Swansea (though at that time I felt like I was moving away from TCS). He’s a thoroughly nice bloke, and clearly knows considerably more about Haskell than I do, thus, a good blog to find. I enjoyed:

I think I was quite a bit less successful at explaining this one. The only way you can show sequence is not necessary is by writing an OS, a compiler, a web server etc in Haskell – all of which have been done. Unfortunately I only had a small piece of paper so couldn’t write any of those.

Neil Mitchell, “Describing Haskell to an Ada programmer”.

(The section saga continues here.)

How many?

How many software engineers does it take to change a lightbulb?
Just one. But the house falls down.

software engineering proverbs

There’s no escape

SQL injection attacks in a nutshell.

All about python and unicode

Handy for future reference and pointing confused ASCII-loving students at: all about python and unicode, just the thing to explain the mysteries of unicode in a friendly manner, especially, er, if you know python.

Monad wrangling, and the joy of #haskell

Three days ago, I enthused about the Python community, and what a great help to me they were back in the day.

These days it’s Haskell that has my brain feeling too small, and I’ve just had my first experience of the Haskell community. I installed an IRC client for the first time in four years just so I could log on to the #haskell IRC channel. I’m happy to report that the experience was entirely positive.

For an imperative programmer, learning Haskell is initially hard because you have to stop thinking in terms of issuing instructions which will be performed in order, and start thinking in terms of composing pure functions (which have no side effects) together to Get Things Done. It’s weird at first, but ultimately very powerful, as it lends itself much more nicely to higher-order programming, lazy evaluation, actually composing programs out of reusable parts, etc. I’d say I’m starting to get a reasonable feel for that, although I’ve got a long way to go.

Unfortunately, a little later, you realise that there are cases where you need side-effects (eg input/output), which leads you into this whole sort-of-imperative side of Haskell, at the heart of which lie the hairy monad things. You totally can’t avoid this, because sooner or later you need to do I/O. Monadic I/O (and monads in general) look & feel imperative at first, but you soon hit barriers thinking like that, and ultimately really have to read stuff like The I/O Inside Tutorial, Tackling The Awkward Squad, etc. in order to understand what’s really going on, which is actually really clever, decidedly not imperative, and at some point turns into lambda calculus or category theory (take your pick).

It’s monads that I’ve been wrangling with lately. I’ve been trying to do a fairly simple I/O task: I have some directory containing some files; I want to operate on a subset of the files in that directory, for each of them reading the file and creating some data object representing its contents. The details aren’t important. Doing this in Python (say) is trivial. Doing it in Haskell has had me stuck for nearly a week. :-) I spent all day last Friday working through “I/O Inside”, and now understand monads much better than I did, and maybe half as well as I should (but maybe not even). That was all very informative and educational, but still didn’t answer my problem.

Anyway, long story short, tonight I installed an IRC client, went on #haskell, asked the question, and got an answer immediately and in a wonderfully friendly fashion. Full #haskell chat log for today is here if anyone’s interested, but in essence it turns out that mapM is what I need for this task. Sweet, and actually incredibly simple when you know it. I think a lot of Haskell is like that…
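
For the record, here’s the shape of the thing: a sketch along the lines of what I ended up with, where the “.model” suffix and the parsing function are purely hypothetical stand-ins.

import Data.List (isSuffixOf)
import System.Directory (getDirectoryContents)

-- Hypothetical stand-in for "some data object representing its contents".
type Model = [String]

parseModel :: String -> Model
parseModel = lines

-- Read every matching file in the directory and build a Model from each;
-- mapM threads the IO through the whole list for us.
loadModels :: FilePath -> IO [Model]
loadModels dir = do
    names <- getDirectoryContents dir
    let paths = [ dir ++ "/" ++ n | n <- names, ".model" `isSuffixOf` n ]
    mapM (fmap parseModel . readFile) paths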

By the way, the #haskell channel has this neato lambdabot running, which among other handy functions, remembers and repeats amusing/apposite quotes when instructed to do so. Given the sad theory geek that I am becoming, this quote it kicked up tonight made me chortle:

Binkley: the sex is all in the operational semantics, denotational semantics only deals with love.

vincenz, 14:41:22

Threads should pass messages, not share memory

Highly recommended reading for any of my students out there: a comparison of message-passing concurrency vs. shared-memory concurrency, with a healthy dose of historical perspective. The author introduces Erlang-style concurrency in a Java-ish setting, and does so quite well, to my mind.

Reading the introductory remarks about candidates in interviews, I was pleased, nay, smug to realise that – albeit inadvertently – I came to multi-threaded programming via the message-passing route, and would probably have made him quite happy if he’d interviewed me. Back when I worked at Frontier I did my first multi-threading work, in Python, and made heavy use of its excellent Queue class for inter-thread communication. Queue provides a thread-safe message passing mechanism, hiding all the nasty details of locking from me, which was exactly what I was looking for. My threads shared almost no state, and what state they did share was mostly Queue objects. They communicated by passing messages through Queues (messages could be anything, and often were), and it was all lovely and clean.

Why did I go down that route? No genius; I just got lucky (yeah, lucky in that I was using Python not Java or C or C++). I had excellent advice from the good folk on comp.lang.python/python-list: this was the way to proceed. Of course, looking back I realise many of these guys knew all about message passing vs shared memory, they knew about Erlang, they knew about Haskell, hell some of them even knew about Lisp. A community as smart and welcoming as that one is a precious resource for a budding programmer.

Anyway, this led to two strongly noticeable results.

First, my code worked well, and didn’t suffer from mysterious hard-to-debug race conditions, etc. It “just worked”, as is often the way with Python.

Second (confession time), I didn’t actually learn properly about semaphores, monitors, shared memory concurrency and all its ridiculous fiddly baggage until I came to teach them in the Operating Systems module at Swansea! By then I’d already formed a strong sense that high-level languages (and Python in particular) made life so much sensibler, so the shared memory stuff slotted quite readily into the mental space of “low level stuff which has to be understood, but is best done by software not humans” (like much of that module).

I was discussing this whole issue with one of my students earlier in the week. If she closed her app’s main window while a worker thread was running, the program would exit uncleanly. This being Python, it was nothing more drastic than an exception/traceback, but she found this properly displeasing and wanted to clean it up (good, I said). It turned out that the main thread wasn’t waiting for the worker to finish: it exited immediately, cleaning up all data, including data the worker was trying to update. Hence, exception city. I showed the simple fix (make the main thread wait for the worker to finish, using a shared boolean “I’m not dead yet” variable), but then I tried to persuade her that message-passing concurrency was the way to go for all inter-thread communication. Even, said I, right down to the (frequent, many) interface updates caused by the worker thread. That is, I suggested, the worker shouldn’t update the GUI component directly, because the GUI is owned by the main thread. Instead, the worker should pass messages to the main thread telling it what updates to perform, and the main thread should poll for these messages and do the updates. I don’t think I sold her on it entirely, but maybe I planted sump’n.
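
In Haskell terms (since this is mostly a Haskell blog these days), the same shape is easy to sketch with a Chan playing the role of Python’s Queue. The message type here is purely hypothetical, and a real GUI main loop would poll from an idle handler rather than block, but the division of labour is the point:

import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.Chan

-- Hypothetical messages a worker might send to the thread owning the GUI.
data UiMsg = Progress Int | Done

-- The worker owns no GUI state; it only posts messages.
worker :: Chan UiMsg -> IO ()
worker chan = do
    mapM_ (\i -> writeChan chan (Progress i) >> threadDelay 100000) [1 .. 10 :: Int]
    writeChan chan Done

-- The "main thread" drains the channel and performs the updates itself.
mainLoop :: Chan UiMsg -> IO ()
mainLoop chan = do
    msg <- readChan chan
    case msg of
      Progress i -> putStrLn ("progress: " ++ show i) >> mainLoop chan
      Done       -> putStrLn "worker finished; exiting cleanly"

main :: IO ()
main = do
    chan <- newChan
    _ <- forkIO (worker chan)
    mainLoop chan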

(Caveat: yes, if performance really matters – eg you’re doing volume graphics – this may be poor advice. For the other 95% of us, however…)

The Quest of the Algorithm of the Chimes of the Bells of the Clock of the Long Now

I’m on a bit of a quest at the moment, which is turning up all sorts of interesting bits and bobs (and distracting me nicely from the work I should be doing today).

Yesterday I listened for the first time to Danny Hillis‘ 2004 Long Now seminar on “Progress on the 10,000 Year Clock” (listen here). Like most of these seminars (which I really should write about in more detail some time) it’s well worth a listen, bristling with neato facts, insights, and mind-expanding ideas. (A less time-consuming but still usefully detailed introduction to the clock project can be found here.) One of the things which particularly caught my imagination was the discussion of the bells. Apparently His Enoship noticed that the number of days in 10,000 years is almost exactly equal to the number of permutations of ten things (365 x10000 = 3650000; 10! = 3628800) – so the idea is that once per day these ten bells will chime in a certain order, never heard before, never to be heard again. (Eno has released a CD of “bell studies”, which is on its way to me even as I type.) Hillis invented an algorithm for the ordering of the chimes, so we know what ordering will be played for any day in the next 10,000 years (Eno’s CD plays the sequences for January 7003) and as you’d expect, people have run with this idea. I have an itch to run with it too, probably (predictably) in Haskell.

So where’s the algorithm? I haven’t found it yet, to my great surprise. :) I would have expected it to be fairly easily available, but apparently not. It might be in Stewart Brand’s book about the clock (also on its way to me), but I sure don’t see it anywhere online. However, there are a couple of implementations floating around, so perhaps some reverse engineering is in order…
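While I wait, just to convince myself the numbers work out, here's one standard way of turning a day number into a unique ordering of ten bells, using the factorial number system (Lehmer codes). To be clear, this is emphatically not Hillis's algorithm, just an illustration of how a day number can pick out a distinct ordering of the ten bells:

from math import factorial

def day_to_permutation(day, items=tuple(range(1, 11))):
    """Map day number 0 .. 10!-1 to a distinct permutation of `items`,
    via the factorial number system (Lehmer code)."""
    n = len(items)
    assert 0 <= day < factorial(n)
    pool = list(items)
    perm = []
    for i in range(n - 1, -1, -1):
        index, day = divmod(day, factorial(i))
        perm.append(pool.pop(index))
    return perm

print(day_to_permutation(0))         # day 0: bells 1..10 in order
print(day_to_permutation(3628799))   # day 10!-1: bells in reverse order

Whether Hillis's sequence looks anything like this is exactly what the reverse engineering should reveal.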

Sean Burke has created a web page for exploring the bell patterns, with visualisations and MIDI downloads – code in Perl (and a Postscript (!) version here). Not fancying reverse-engineering the Perl too much, I wrote and asked Sean if there was a better source for the algorithm. Apparently Danny’s original version is in Mathematica. I haven’t found it, but Sean says he’ll send me what he can find in a few days. Otherwise, I guess I’ll keep digging, ask around on the Long Now forums, etc.

Sean pointed out that, when reverse engineering, “there’s understanding, and there’s understanding”, and pointed me at this fantastic war story, which sounds like a Daily WTF candidate to me. I’ve been there in the past, and in my case it was spaghetti Perl I was banging my head against – not pleasant. Still, the whole process of reverse engineering, of picking apart the code slowly, gradually and gently teasing the tangled knot open, can be a wonderful thing in itself. Or it might just turn out that the Mathematica code is clean and easily Haskellised. I doubt it, though, from what I’ve seen of Mathematica. :-) I expect it’ll live in a much lower-level domain than I want to work in, which is, of course, more than half the fun. If I can take an esoteric algorithm in a difficult language and translate it into beautiful and readable higher-order code, that’d be something worth writing about. So, watch this space (but don’t, of course, hold your breath).

In the mean time, as I said, the search is turning up all sorts of cool stuff. We present:

Another version of the algorithm, by Joe McMahon, in OS X/AppleScript discussed here, here & here (code via that last link). Interesting mention of ChucK too, which looks quite shiny (though again, maybe a little low-level for my taste).

Prototype chime generator diagram – I would wear a t-shirt with that on it.

An interview with Alan Kay from two weeks ago, which also points at the Kay-says-it’s-a-must-read Doug Engelbart essays (and no, I must confess, I hadn’t heard of Engelbart).

Pop culture lives in the present; it doesn’t really live in the future or want to know about great ideas from the past. I’m saying there’s a lot of useful knowledge and wisdom out there for anybody who is curious, and who takes the time to do something other than just executing on some current plan. Cicero said, “Who knows only his own generation remains always a child.” People who live in the present often wind up exploiting the present to an extent that it starts removing the possibility of having a future.

Stewart Brand meets the Cybernetic Counterculture – whee, the 60s!

(These last two via this del.icio.us page on “admirable people”.)

Haskell study plan

Haskell study plan [smallcool].

Haskell and Category Theory

An introduction to Category Theory from a Haskell point of view. As always, monads are where your brain starts hurting.

Foozles!

Yikes. This is waaaaay too close to home: Foozles – a programming language feature comin’ atcha soon! [wadler]

2013: Francoise Boisbleu proves that under a certain formulation, Foozles are a categorical dual to Aspects, which gets everyone terribly excited.

It certainly would.

Classic comment / code impedance mismatch

Awesome, almost Dadaist WTF:

// MUST be set to 1!
Params->ParentalLevel = 3;  // QA now insist this is set to 2.

Made me laugh out loud, anyway.

Hacking lightstats for ultimate tagness

I’ve hacked the lightstats plugin to forget about categories, and instead display “posts per tag” and “bytes per tag”, querying the tables created/maintained by ultimate tag warrior. Examples here. Code available by request.

Arguably the “posts per tag” is a bit redundant given the w00ty uber tag cloud – particularly once I have that displaying post counts in the tooltips. I’m quite proud of the “bytes per tag”, though… One big SQL statement gives me all I need:

SELECT tag, sum(length(post_content)) AS bytes
  FROM wp_post2tag
  INNER JOIN wp_tags ON wp_post2tag.tag_ID=wp_tags.tag_ID
  INNER JOIN wp_posts ON wp_post2tag.post_ID=wp_posts.ID
GROUP BY (wp_tags.tag_ID)

Well, OK, not so big. Quite big as a one-liner though. :-)

I’ve also installed wp-cache, which may or may not be pointless… (And, the following morning, disabled it, because it would prevent comments appearing for up to an hour after they’re posted!)

Why the iPhone puts Java into the past tense (apparently)

In Which I Think About Java Again, But Only For A Moment [smallcool].

Me, I defected long ago. I’m another of those Apple Java engineers who dropped out. I spent five years as a raving Java fanboy, but I gave up after optimizing AWT, implementing drag and drop, and trying to make 1,200 pages of crappy APIs do the right thing on the Mac. Then I took a one-week Cocoa training course, and wrote the first prototype of iChat.

Desktop Java never worked because Sun tried to build their own OS on top of the real OS, duplicating every API and feature. This led to terrible bloat, making every app as heavyweight to launch as Photoshop. Worse, the GUI portions of the Java platform are awful, because Sun is a server company with no core competency at GUIs. The APIs are too clumsy to code to, and compared to any decent Mac app, the results look like a Soviet tractor built on a Monday.

He almost makes me want to get a Mac, ditch BSD and emacs, and start writing Cocoa apps – except that then life would just be too darn easy, and I’d never hear the end of it from TR (or Bash, probably).

It’s always somewhat depressing, or at least downheartening, to see someone like this tell me I shouldn’t be using emacs. It always makes me wonder if, maybe, they’re right – maybe out there there’s an editor that’ll do everything emacs does for me, but somehow nicer, more productive. Usually, as far as I can tell, that means the editor does IDE-like things such as autocompletion, code browsing, etc. – and yes, that’s stuff I just don’t use when coding. But gosh darn it, I’ve tried a lot of editors and nothing has ever come close, for me, to the feeling of power and (welcome) flexibility emacs gives me. Effortlessly editing multiple files in multiple split views in multiple windows (across multiple virtual desktops), powerful and easy regexp and macro capabilities, and (as one of the commenters on the linked article says) just doing The Right Thing with indentation in Python, Java, C, Haskell, … So the downheartening aspect is the tantalising feeling that there’s something else out there I should be using, but I just can’t find it! Of course, not using a Mac probably doesn’t help me, here. ;-)

I haven’t listened to Depeche Mode for a while, mind…

Oh, and here’s another “Java is rubbish” story (comparing EJB lines of code with python/django) from the same source.

You can’t beat python for seeing in the new year…

Gimbo’s New Year’s Eve 06/07: human_twister.py

Fibonacci series one-liner in Haskell

Here’s a beautiful example of why Haskell is the most advanced programming language on the planet – a one-line definition of the entire fibonacci series:

fiblist = 0 : 1 : (zipWith (+) fiblist (tail fiblist))

Stunning. Note that that is not a function to calculate the nth fibonacci number: it really is a definition of the entire series. If you want the nth fibonacci number, look up the nth element of that list. Let’s see you do that in Java! (Or C#)

(Via the introductory series on Haskell at the rather good Good Math, Bad Math (more on the author here).)

Rails vs Django

Rails vs Django [smallcool].

Interesting and balanced. I tried Django about a year ago and did indeed get going with it quite quickly, although the lack of migrations was a big pain in the butt; they sound like a killer feature in Rails.

Logo in Javascript

cs fd 70 repeat 11 [ repeat 17 [ rt 226 fd 75 ] fd 75 ]

Thanks, TR! How lucky the children whose first language was Logo.

Design patterns as responses to programming language deficiencies

We teach a third-year/MSc module called “Design Patterns and Generic Programming”. Instantiated by Oliver Kullman, and now taught by Chris Whyley, it introduces these topics in the context of C++ (and is, for most of our students, their first exposure to that language).

On Friday I suggested something to Chris which I’d been mulling over for a while, namely an “anti-patterns” lecture at the end of the module. It would be “anti” in two senses… First, a discussion of the idea of antipatterns: things to avoid. Second, and more interestingly, a discussion of the idea that patterns are just a stop-gap response (albeit a highly rational one) to deficiencies in your programming language, and that more advanced languages make them trivial or meaningless. Chris thought this sounded good, so now I’ve got to gather my thoughts. In timely fashion, along comes this piece (via raganwald) expounding the very same idea. (Disclosure: it’s not my idea, it’s something I’ve been seeing mulled over and expressed in varying depth and eloquence on the blogosphere of late.) This post is particularly interesting in that it looks for pre-GoF patterns, recognising that patterns aren’t a specifically object-oriented phenomenon, but rather a general software development phenomenon, and we can expect to see new patterns in the future, as the patterns of today fade into the undistinguished background.

Tail call optimisation at Raganwald

I command all of my students reading this to go and read this piece on tail call optimisation. If you’re anything like me, you’ll need to read it at least twice, write out the ruby examples by hand so they actually enter your brain (maybe I just needed more coffee), and follow many of the links for further explanation. It will be well worth it. This man has intelligent things to say, with which it is worth being familiar, if not intimate.
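For anyone who hasn't met the term: a tail call is a call that is the very last thing a function does, and an implementation which optimises tail calls can turn it into a jump rather than growing the stack. Raganwald's examples are in Ruby; here's the same idea as a quick sketch of my own in Python (which, note, does not perform this optimisation itself, hence the hand-written loop):

def fact_tail(n, acc=1):
    # Tail-recursive: the recursive call is the very last thing that happens.
    if n == 0:
        return acc
    return fact_tail(n - 1, acc * n)

def fact_loop(n):
    # What a tail-call-optimising implementation effectively turns it into:
    # a loop in constant stack space. In CPython, fact_tail(5000) blows the
    # default recursion limit, while fact_loop(5000) is fine.
    acc = 1
    while n > 0:
        acc, n = acc * n, n - 1
    return acc

print(fact_tail(10), fact_loop(10))   # 3628800 3628800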

“Ridiculously easy” distributed programming with Ruby

MapReduce for Ruby: Ridiculously Easy Distributed Programming [raganwald]

Great for tasks which trivially parallelise, anyway.

Searching for symbols via Google is hard

Hokay. Google is without a doubt the single most useful and successful tool ont’Internet, a marvellous success story and something most of us would miss deeply until the happy day it’s superseded by something Even Better. It’s fast, it’s got a nice simple interface, and most of the time it “just works” and gives you what you want.

Unfortunately, some of the time, it really doesn’t “just work” (for me, at least) and in an apparently non-fixable way. I usually hit this when I’m doing programming-related searches.


jobtimer.py – timing small repetitive jobs

It’s exam marking season hereabouts (and, thanks to the AUT industrial action, coursework marking time), so I’ve got my head down in piles of exam scripts.

One exam (on IT security) was a complete nightmare to mark – essay questions, loads of text, oh it just took ages. It really seemed to go on forever. At least I only had to mark half of it – but I really wasn’t looking forward to my other exams, to be marked all on my own.

Python to the rescue!

No, not a random number generator (though it’s sometimes tempting). Instead, a motivational tool: something to keep me focussed and “in the game”.

I have two problems when marking, basically: one is that when I’ve got a huge pile to get through, and it’s going fairly slowly, oh it’s sooo painful and you want it to be over, but it isn’t, and it won’t do itself, and, well, it’s all very antimotivational. The other problem is that I get distracted easily, so I’ll do a few, then chat, then do a few, then play Urban Dead for a bit, then do a few more, etc. Naturally these two problems feed into each other, and a snail’s pace is achieved.

It’s all mental – the issue is focus. Thus, we present jobtimer.py, a little script I knocked up in a hurry yesterday evening to help me stay focussed. And I gotta say, it’s proven instrumental in helping me hammer through the networks exams in record time.

Basically, jobtimer.py is a simple tool for keeping track of progress through a large batch of small repetitive jobs, where you want to know how long you’re spending on each job on average, and how many you’ve done so far. It has a simple text-mode keyboard interface, whose central feature is “you hit space to tell it you’ve finished one job and are starting another”. You can pause it, report on averages, and see how much time since you started has been spent “unpaused and working” (as opposed to “paused and playing Shartak”, say).

It’s very simple: no persistence between sessions, no flashy graphics, and probably only works on Unix – it uses select() on stdin to catch keypresses; a Windows version could be hacked using msvcrt, I guess. The code’s not beautiful, but it does the job beautifully well for me.
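If you just want the flavour without downloading it, here's a much-simplified sketch of the core idea; this is not the real jobtimer.py (it reads line-based input, so you hit Enter after each job rather than space, and there's no pause handling):

import time

# Press Enter at the end of each job; type 'a' then Enter for averages;
# 'q' then Enter to quit.
start = time.time()
job_start = start
durations = []

while True:
    cmd = input()
    now = time.time()
    if cmd == "q":
        break
    elif cmd == "a":
        if durations:
            avg = sum(durations) / len(durations)
            print(f"{len(durations)} done, average {avg:.1f}s, "
                  f"elapsed {now - start:.1f}s")
    else:
        durations.append(now - job_start)
        job_start = now
        print(f"job {len(durations)} took {durations[-1]:.1f}s")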

Read the comment at the start of the code to see exactly how to use it. It’s dead simple.

Screenshot:

Screenshot of jobtimer.py in action

The clock starts ticking at 15:06:07; the first job takes 27 seconds, then 2 mins 5, then 1:42, then 2:51. 10 seconds into the next job (at 15:06:23), the clock is paused. 15 seconds later, it’s unpaused, and three seconds later that job’s complete. At 15:06:44 we hit ‘a’ to get a reading of averages/stats: 5 done, average 1 min 27, elapsed wall clock time is 7 mins 36, 96% of which has been spent with the clock ticking. Etc., etc. – you get the picture. Actually, looking at this shows me a bug/feature: when you quit, it doesn’t count the job that was just running – so end a job before quitting if you care about accuracy. :-)

Understanding Ruby blocks, Procs and methods

Understanding Ruby blocks, Procs and methods.

Scripts in ruby a la python’s __name__ == ‘__main__’ idiom

A common idiom in python is to check the special variable __name__ to see if the current module is being run as a script or not. For example:

class Foo:
    ...

def bar():
    ...

if __name__ == '__main__':
    bar()

Here, if the module is run as a script (ie passed directly to the python interpreter), then __name__ has the value “__main__”; this is detected, and (in this case) the bar() function is called. On the other hand, if the module is just imported from some other module, __name__ has a different value (the module’s own name), and bar() doesn’t get called.

This is nice for a number of reasons – for example, you might put unit tests into bar().

How to do this in Ruby? It’s not in the FAQ, which surprised me. I was about to ask on ruby-talk but then remembered the biggest FAQ of them all, and turned to google. Aha (and eek, what a horrible mailing list interface). Anyway, it’s:

if __FILE__ == $0
    bar()
end

OK, so why does this work?

$0 contains the name of the script being executed – ie, the name of the file that was passed to the interpreter. Whatever code you’re executing, this value never changes over a particular run of ruby. On the other hand, __FILE__ is always the name of the current source file. If the two match, then the current source file is the one that was passed to the interpreter.

I guess that’s pretty clear. Cool.

Students: if you ever do this, you are compelled to commit seppuku

Lesson one in security: deny by default, allow with care. It is entirely brain dead for your login logic to be “if the logged_in cookie is false, they’re not logged in, otherwise they are”, rather than “if the logged_in cookie is true, they’re logged in, otherwise they’re not”.
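In other words (a toy sketch with made-up names, leaving aside the separate sin of trusting a bare client-side cookie for this at all):

def is_logged_in_broken(cookies):
    # Allow by default: anything except an explicit "false" counts as logged
    # in, including a missing cookie or one the user set to "wibble".
    return cookies.get("logged_in") != "false"

def is_logged_in_better(cookies, valid_sessions):
    # Deny by default: only a positive, server-side-verifiable signal lets you in.
    session = cookies.get("session_id")
    return session is not None and session in valid_sessions

print(is_logged_in_broken({}))                             # True -- oops
print(is_logged_in_better({}, valid_sessions={"abc123"}))  # False -- good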

Erlang tutorials

Erlang tutorials.

Erlang looks very exciting. I’m still trying to crowbar Haskell into my brain – and reaching the conclusion that my brain needs inflating a little before it will fit. But Erlang is calling.

Crash Bandicoot was written with Lisp

WTF? Crash Bandicoot was written with Lisp??? Seems everybody’s talking about how great Lisp is lately…

And on the subject of games, these are pretty cool too.

Don’t Invent XML Languages

I saw this a while ago and meant to blog it, if only for the super useful/interesting “Big Five” list of XML uberlanguages whose existence means it’s best if you Don’t Invent XML Languages [lambda]. Of the Big Five, the one I’m interested in Right Now is DocBook.

Paul Graham: How to Do What You Love

Paul Graham on How to Do What You Love.

Handy dandy Subversion tools

Some Subversion-related tools I need to keep in mind, though not use right now: SvnReporter; svn-copy-register and svn-import-releases. That is all.

Operational Semantics of a Simple Process Algebra in Python and Haskell

As promised, though I’m still working on the shiny LaTeX article which actually explains all this stuff…

From the README:

This is an investigative/learning exercise in transforming process algebraic expressions into labelled transition systems according to their operational semantics, using Python and Haskell.

It started out just as an exercise in operational semantics, using Python because it’s the language I get things done fastest in. That proved interesting enough that I wanted to re-do it in Haskell, which I’m learning and which arguably handles this sort of thing better.

The Python version is more advanced. It includes mu-recursion, which the Haskell doesn’t, and is quite tidy IMHO. OTOH the Haskell is less developed, and in particular the functions for actually creating the graphs could, I’m sure, be improved. Always nice to have some future work to do…

I’m publishing this in case anyone is interested in the code. In particular, it is perhaps useful as an example of:

I’m working on a paper describing the problem, the semantics, and the approaches taken in the two languages, but it’s very much an ongoing WIP and isn’t ready to publish yet.

Homepage for this work: http://www.cs.swan.ac.uk/~csandy/research/op_sem_py_hs/
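To give a flavour of what that means without wading through the real code, here's a toy sketch of the basic idea (far smaller than the published code: just nil, action prefix and choice, and none of the mu-recursion):

# A toy process algebra: nil, action prefix and binary choice, represented
# as plain (hashable) tuples so terms can be used directly as LTS states.

Nil = ("nil",)

def Prefix(action, proc):
    return ("prefix", action, proc)

def Choice(p, q):
    return ("choice", p, q)

def transitions(proc):
    """The SOS rules: a.P --a--> P, and P+Q inherits the moves of P and of Q."""
    tag = proc[0]
    if tag == "nil":
        return []
    if tag == "prefix":
        _, action, cont = proc
        return [(action, cont)]
    if tag == "choice":
        _, p, q = proc
        return transitions(p) + transitions(q)
    raise ValueError(f"unknown process term: {proc!r}")

def lts(start):
    """Unfold a term into its labelled transition system: (states, edges)."""
    states, edges, todo = {start}, [], [start]
    while todo:
        p = todo.pop()
        for action, q in transitions(p):
            edges.append((p, action, q))
            if q not in states:
                states.add(q)
                todo.append(q)
    return states, edges

def show(proc):
    tag = proc[0]
    if tag == "nil":
        return "0"
    if tag == "prefix":
        return f"{proc[1]}.{show(proc[2])}"
    return f"({show(proc[1])} + {show(proc[2])})"

# a.b.0 + c.0
example = Choice(Prefix("a", Prefix("b", Nil)), Prefix("c", Nil))
for src, act, dst in lts(example)[1]:
    print(show(src), f"--{act}-->", show(dst))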

gimbo gimbo, where art thou?

I’m still here, I’ve just been too busy to blog. However, while I’m waiting for ghc-6.4 to compile (that’s my excuse anyway), I thought I’d do a quick blogdump…

I was going to write about my week in London, and a bit about what I’ve been reading lately, but I started by writing the stuff below instead, and now I think that’s enough for one post, so I’ll publish this and follow up with the rest maybe tomorrow or Sunday. (Short version: London fun except bombs, Quicksilver OK, Accelerando completely barmy but getting a bit dull at 40%). Colin, I should warn you that the rest of this post is of the kind you don’t like. The London diary thing might be better. Sorry!

Work’s been truly excellent this week. No students so no teaching, and no admin for a while either. Some time I need to do some preparation for next year’s teaching, but for the next two months I’m mainly going to be focussing on research at last, aiming to break the back of my MPhil. I made a breakthrough on the parsing side last Sunday (or rather, I cleared up a misconception I’d been harbouring), but have spent this week on a different tack, learning about operational semantics through the magical medium of Python. Specifically, Markus gave me a toy process algebra and its semantics, and outlined the algorithm for turning expressions in the PA into their labelled transition systems, then I went away and programmed it.

It’s been excellent fun, and I got it basically done in no time at all, mainly because I chose to do it in Python. It’s quite remarkable, in fact… For months I’ve been struggling to get anywhere with my research, and it’s been very depressing and making me feel like I can’t hack this stuff. Now I realise it’s just that I’m a complete Haskell newbie. If I was doing my research in Python, I’d probably have finished by now. Alas, I have to do it in Haskell, because of the system we’re interacting with, but it’s encouraging to realise my problems relate to the language/paradigm I’m working in, not some basic failing on my part to understand what the heck I’m trying to do.

Anyway, I’m writing up an article explaining what I’ve done, and either later today or early next week I’ll publish it and my code, so anyone who reads the above and says “huh?” can take a look if they want. (JOn? You reading this? ;-))

Next week is graduation week at Swansea University and I’m acting as a marshal on Monday, which is Science day. So I get to see all this year’s third years do their stuff. With luck and effort, I should be there myself this time next year.

What else is new? I recently made the shift from ion to ion3. Ion’s been my window manager of choice for about three years now, mainly because I can’t be bothered with all that tedious point-and-click move-and-resize nonsense you have to do with windows in most WMs. TR occasionally moans at me that it’s modal but I don’t see it as a problem: it works for me and is extremely keyboard friendly and fast, so I’m happy. But I’ve been feeling behind the curve, and in particular some apps (eg the Gimp) don’t play well with the tiled model – which is why ion3 is nice, because it adds optional “float workspaces” which act more like a conventional tedious point-and-click move-and-resize window manager if and when that’s what you need. Making the move was non-trivial because my config files had to be ported to Lua, but now it’s done and I’m very happy with my window manager indeed. Once again, I’d recommend looking at Ion if you’re getting dissatisfied with your Gnome/KDE experience and want to strip things down.

Finally, a couple of Python goodies via the Python-URL: try/except vs if/else in terms of speed (an oldie but a goodie, especially when advocating Python to curmudgeons), and Sparklines, which are kinda weird and kinda cool, but I’ve no idea if they’d actually be useful.

Hackers and Painters

Developers have much to learn from Hackers & Painters, a review of Paul Graham‘s book “Hackers and Painters”. Sounds like there’s lots of functional programming evangelism going on here, and it’s interesting to read that Graham asserts:

The programmers you’ll be able to hire to work on a Java project won’t be as smart as the ones you could get to work on a project written in Python.

… particularly because I’ve heard that quote before, but it was Haskell that was being bigged up, not Python. All the same, it’s nice to have my status as one of the cognoscenti reaffirmed in public. ;-)

Lisp in Web-based Applications sees Graham expanding on what makes Lisp great, if you don’t want to buy the book just yet… There’s a particularly interesting bit about using closures to elegantly solve the problem of HTTP’s statelessness. Quote: “by using closures, we could make it look to the user, and to ourselves, as if we were just doing a subroutine call.” I’ve bolded the important bit: all web apps these days make it look to the user like you’re just doing a subroutine call, but to make it look like that to the developer is much more impressive. Sure, there are mechanisms in Java or whatever, but I love this idea of using closures: so much simpler and more elegant. (Here’s a nice explanation of closures, for the unsure.)
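For the unconvinced, here's the shape of the trick in miniature. This is a toy sketch of my own (nothing to do with Graham's actual Lisp, and with a dict and print calls standing in for a real web server and session store):

import uuid

continuations = {}   # token -> "the rest of the computation"

def remember(closure):
    token = uuid.uuid4().hex[:8]
    continuations[token] = closure
    return token

def place_order(basket):
    print("Ordering:", ", ".join(basket))

def show_basket_page(basket):
    # The closure captures `basket`, so when the confirmation request arrives
    # later (a completely separate, stateless HTTP hit) we can carry on as if
    # this were just a subroutine returning.
    token = remember(lambda: place_order(basket))
    print(f"Your basket has {len(basket)} items; POST to /confirm/{token} to buy.")
    return token

def handle_request(path):
    # "Routing": look the stored continuation up by its token and resume it.
    token = path.rsplit("/", 1)[-1]
    continuations.pop(token)()

token = show_basket_page(["dormouse", "teapot"])
handle_request(f"/confirm/{token}")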

Finally, I disagree that “it’s hard to find successful adults now who don’t claim to have been nerds in high school”. For my whole life the world has seemed to be full of successful ignorant bullies and deceivers, and I don’t really see any signs of that changing. It’s a nice dream for a geek to have, I guess, and I can see how rising to the top during the dotcom bubble would surround you with enough successful nerds that you might think it was even true.

But who cares about the iniquities of the world when we’ve got shiny shinies like Haskell and Python to play with, eh? :-)

Welcome to tenthousandfactorial.com

Gimboland continues to turn into www.tenthousandfactorial.com… Peter William Lount, editor of www.smalltalk.org, shows us how to do it in Smalltalk. Apparently Smalltalk’s been able to do this since about 1980. When I did my first degree, the boys doing straight Computing (I was doing Computing & Pure Maths, not gay Computing, by the way) met Smalltalk as their first O-O language, but alas I never had any serious exposure. The main impression I was left with was of every Smalltalk program being somehow an image of an entire virtual machine, which seemed odd and limiting but was probably neither (this was ten years ago, btw). *Shrug*

Moving on a few years, it had to happen, Daniel Berger wrote and told me how to do it in Ruby. He say:

Hi, I thought this might interest you. Here’s a “factorial” method in Ruby:

class Integer
    def factorial
        (1..self).inject { |f, n| f * n }
    end
end

… and, with that in place, we can now do this:

10.factorial -> 3628800
10000.factorial -> really big number

Or, you could just do that as a standalone method, rather than defining it within the Integer class like so:

(1..10).inject{ |sum,n| sum * n } -> 3628800

Yes, Ruby handles bignums “out of the box”. :) Regards, Dan

Fair enough, though to my mind this is at least as esoteric as the Pythonic reduce-based method mentioned previously. Again, provided you know the semantics of inject, I guess it makes perfect sense. For my money the Haskell version is still the nicest, just because it’s closest to how a mathematician would naturally go about defining factorial. Smalltalk comes close – it looks the part, but there’s still some odd syntax there, IMHO.

This is great. What started out as a glib comment which at the back of my mind I thought was probably wrong but hey, no-one ever reads Gimboland so what the heck, has turned into a nice little thread… :-)

Any other takers?

Domain Specific Languages

For my later digestion, a raw dump of a number of bookmarks on Domain Specific Languages and Haskell: one, two, three, four (which has many more links). Thank you.

More on factorial in Python

More stuff over at tiddly-pom on the whole 10,000 factorial thing. Julian does a good job of pointing at me and laughing (rightly so, rightly so), and there are a couple of imperative approaches to the problem suggested, including one using generators – yay.

Of course, something I forgot to point out about Stephen’s reduce based solution is that it’s a functional approach, and so has more in common with what’s happening in the Haskell version than these imperative versions. Anyway.

Python still amazingly cool also

Oh, how quickly we forget. Or to put it another way, how stupid I am. Or to put it another way, what a fickle language whore I am.

Further to my recent joy regarding Haskell, Stephen Judd appeared from nowhere and reminded me that of course, Python also handles arbitrary-length integers “out of the box”. How the hell did I forget that?

In fact, he went one better and gave me a single line of Python for calculating 10,000 factorial, viz:

reduce(lambda x,y:x*y, range(1,10001))

Admittedly this is harder to grok than the Haskell version:

bigFac :: Integer -> Integer
bigFac n
  | n == 0 = 1
  | n >  0 = n * bigFac(n-1)

(at least, without knowing the semantics of reduce) – but it does also seem to run a bit quicker on my box.

Hurrah for Python!

Of course, this all just reminds me that I’m spending too much time lately thinking about theoretical computer science, and not enough time getting my hands dirty programming… :-/

Update 2007-01-26: Of course, the python version is not easier to grok than this alternative Haskell version:

product [1..10000]

Now that’s beautiful, as is the definition of product.

Why Haskell is cool (again)

Why Haskell is cool (again). Over a year ago I was mucking around with Haskell but didn’t get deeply into it. Today I picked it up again, and while playing around with tutorials did something which, in my geeky way, I find cool.

With four lines of code, and with no more knowledge than a beginner has, I was able to calculate 10000 factorial. (For my non-mathsy readers, that is, 10000 times 9999 times 9998 times 9997 … times 3 times 2 times 1).

This is a Very Big Number (it’s here if you don’t believe me), and until today I hadn’t come across a programming language which can even represent/work with anything this large “out of the box”. I’m sure there are other languages which can – I just haven’t met them. It’s interesting.

It took about seven seconds to work it out. Nifty.

Where Python meets Haskell

Alex Martelli explaining how Python’s list comprehensions and (new and groovy) generator expressions relate to Haskell, and why Python doesn’t (in general) do lazy evaluation. As someone with a foot in both camps, but no deep understanding (alas – yet) of what’s actually happening, I find this pretty interesting stuff…
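The nub of it, in a tiny illustration of my own (not Martelli's):

from itertools import count, islice

# A list comprehension is eager: [n*n for n in count()] would never finish.
# A generator expression is lazy, which is the Haskell-flavoured bit:
squares = (n*n for n in count())
print(list(islice(squares, 5)))   # [0, 1, 4, 9, 16]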