Category: Computing

Scheme bindings (by )

One of the things I don't like about Scheme is that just about everything in it is mutable. The most awkward ones are the value bindings of symbols.

You see, when a Scheme implementation (compiler or interpreter) is processing some source code, at any point in the program it is aware of the current lexical environment - the set of names bound at that point in the program, and what they're bound to.
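To make that concrete, here's a rough sketch in Python (not any real Scheme implementation; all the names are invented) of a lexical environment as a chain of frames mapping names to values. It also shows why mutable bindings are awkward: something like `set!` can change a binding anywhere up the chain, so the implementation can never assume a symbol's value is fixed.

```python
# Hypothetical sketch of a lexical environment as a chain of frames.
# Each frame maps names bound in one scope to their values.

class Frame:
    def __init__(self, bindings, parent=None):
        self.bindings = dict(bindings)  # names bound in this scope
        self.parent = parent            # enclosing lexical scope, or None

    def lookup(self, name):
        env = self
        while env is not None:
            if name in env.bindings:
                return env.bindings[name]
            env = env.parent
        raise NameError(name)

    def set(self, name, value):
        # Like Scheme's set!: mutate the existing binding wherever it
        # lives in the chain, which is why a compiler can't treat a
        # binding's value as a constant.
        env = self
        while env is not None:
            if name in env.bindings:
                env.bindings[name] = value
                return
            env = env.parent
        raise NameError(name)

top = Frame({"x": 1})
inner = Frame({"y": 2}, parent=top)
print(inner.lookup("x"))  # 1, found in the enclosing frame
inner.set("x", 42)        # mutates the binding in `top`
print(top.lookup("x"))    # 42
```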


Paradigm shifts (by )

When I was a kid, I used to read a lot. I'd devour the technical sections of libraries for new things to learn about. Then I got an Internet connection, and tore into academic papers with a vengeance. Then when I left home and got a job, I had money, so I would buy a lot of books on things that I couldn't find in the library.

I look fondly back on when I read things like Henry Baker's paper on Linear Lisp, Foundational Calculi for Programming Languages, the Clean Book. Or when I first learnt FORTH and Prolog, and when I read SICP or when I learnt about how synchronous logic could implement any state machine.

All of these were discoveries that opened up a new world of possibilities. Mainly, new possibilities of interesting things I could design, which is one of my main joys in life.

However, after a while, I started to find it harder and harder to find new things to learn about. Nearly a decade ago I all but gave up on the hope of finding a good technical book to read when I went into even large bookshops with an academic leaning. I started browsing the catalogues of academic publishers like MIT Press and Oxford University Press, picking out good things here and there; that's where most of my Amazon wishlist comes from. But even then, most of the books I find there are merely ones that will give me more detail on things I already know the basics of, rather than wholly new ideas.

But, of course, the underlying problem is that my main field of interest - computer science - has only been pursued seriously for about seventy years. Modern computing (as most people see it) isn't really the product of current computer science research; industry lags far behind academia in many areas. The computer software we run today is primarily based on the products of academia from around the 1960s (imperative object-oriented programming languages, relational databases, operating systems with processes that operate on a filesystem, virtualisation, that sort of thing). This is so for a number of reasons, some more valid than others (and we are catching up, mainly thanks to the social effects of the Internet), but it means that there's little incentive for industry to actually fund more computer science. So the rate at which new ideas are developed is far less than the rate at which I can satisfy my curiosity by learning them!

One answer is to try and come up with new paradigm-shifting ideas myself. I'm trying, but I'm not really good enough - I can't compete with proper academics who get to spend all day bouncing ideas back and forth with other proper academics; I can't really get my head deep enough into the problem space to see as far as they do. All I can really do is solve second-level problems, such as how to integrate different systems of programming so that one can use the most appropriate one for each part of one's program without suffering too much unpleasantness at the boundary between them.

Which is why, whenever I read something about some fun new deep idea, I have to stretch my mind to encompass it in the first place.

And that's half the fun...

Flexible data models (by )

Most bits of software have, at heart, a data model for representing whatever it is they process. Be it an SQL database, a bunch of files on disk, a raw disk partition, something purely in memory, or just a file format that is processed sequentially, there is usually some kind of data structure being dealt with.

And when requirements change and the software evolves, that data model often becomes the Achilles' heel, as it turns out to be unable to represent new concepts, and changing it requires rewriting a lot of the application - since, as the "lowest layer", just about everything else depends upon it.

Therefore, I see a challenge in finding ways of designing data models, and the software that uses them, in the first place so that changing things requires the minimum of effort...
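One common partial answer, sketched below in Python with invented names, is to keep the rest of the program behind a small accessor layer, so that when the underlying representation changes, the change is absorbed in one place rather than rippling through every caller:

```python
# Hypothetical sketch: the stored representation changes (a single
# "name" field splits into "first"/"last"), but callers that go
# through the accessor are unaffected.

class PersonV1:
    def __init__(self, name):
        self._name = name

    @property
    def name(self):
        return self._name

class PersonV2:
    def __init__(self, first, last):
        self._first, self._last = first, last

    @property
    def name(self):
        # The old interface is preserved on top of the new representation.
        return f"{self._first} {self._last}"

# Code written against .name keeps working across the model change:
assert PersonV1("Ada Lovelace").name == PersonV2("Ada", "Lovelace").name
```

This only shields callers from representational changes, of course; a genuinely new concept still forces the interface itself to grow, which is the harder part of the problem.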

Generalisation (by )

One of the things I instinctively do when designing software, given a client's requirements, is to generalise things as much as possible, in order to make it easier to deal with changing requirements in future, or to avoid having to write special-case code to deal with more unusual situations that they already need handled.

E.g., somebody might say "I want a system to transport email and files between computers in my organisation". So you might think: Ok, I'll start by designing a general packet-switching system to transfer data across an interlinked network of computers, with routing algorithms to work out the best paths, retransmission systems to deal with failures, and so on. Then on top of that I'll build an email system and a file transfer system. That way, most of the difficult stuff is done in a single module that deals with getting data from A to B across an unreliable, changing, network. Email and file transfer are then much simpler modules, with as little duplication of work between them as possible. So it's easy to add more functions to the system in future, and any improvements to the underlying routing engine benefit email, file transfer, and any other application equally.
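The layering described above might be sketched like this in Python (the class names are illustrative, not from any real system): one transport module that applications hand bytes to, and two thin applications on top of it.

```python
# Hypothetical sketch of the layering: applications never see routing.

class Network:
    """The general transport layer: applications hand it a destination
    and some bytes; routing and retransmission would live in here."""
    def __init__(self):
        self.delivered = []  # stand-in for actual delivery

    def send(self, dest, payload):
        # Real routing/retransmission logic would go here; the
        # applications above never see any of it.
        self.delivered.append((dest, payload))

class EmailService:
    def __init__(self, net):
        self.net = net

    def send_mail(self, dest, message):
        self.net.send(dest, b"MAIL:" + message)

class FileTransferService:
    def __init__(self, net):
        self.net = net

    def send_file(self, dest, contents):
        self.net.send(dest, b"FILE:" + contents)

net = Network()
EmailService(net).send_mail("hostB", b"hello")
FileTransferService(net).send_file("hostB", b"\x00\x01")
# Both applications share one transport; neither duplicates routing logic.
```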

Standard good software engineering practice, right? Modularise and have an abstract API between layers of a system?

However, sometimes I do this, but am then faced with an uphill struggle, as the client starts wanting changes that break the abstraction layers between the modules...

For example, they might suddenly start saying that they want all the email to go via their fast but expensive transatlantic cable, so it gets there quickly, while spending as little as possible - they pay by the megabyte, but emails are small. Meanwhile, they'd like the file transfers to go via the cheap satellite link, which is slow. But nobody's in a hurry with a large file transfer.

Ok...

But the nice routing module we designed doesn't care what application is using it; it just gets given a bunch of data and told to send it somewhere.

So we have two main classes of choice:

  1. Make the routing system, at the point where it has to choose between satellite or transatlantic cable, break the layers a bit by peeking inside the bunch of data it's given to decide if it's part of a file transfer or an email, and decide how to route it based on that. This is quick and easy, but it means that the routing system now needs to know a bit about the applications, so it'll now need updating if extra applications are added or the rules change, which increases maintenance overhead and scope for error.
  2. Sit down and have a think about this requirement, and how it might impact future applications (a bit of prediction and guesswork is required here), and design a change to the API to fulfill that need. For example, adding a "type of service" field to every chunk of data given to the routing system, saying whether it needs to get there quickly or cheaply. This creates a more maintainable system in future, but is also more up-front work.
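Option 2 might look something like this sketch in Python (the names and service-class values are invented for illustration): the router chooses a link purely from a generic "type of service" hint, and still knows nothing about email or file transfer.

```python
# Hypothetical sketch of option 2: extend the transport API with a
# type-of-service hint instead of having the router peek at payloads.

class Router:
    def __init__(self):
        self.log = []  # (link, dest, size) records, for illustration

    def send(self, dest, payload, service="cheap"):
        # The link is chosen from the generic hint alone; the router
        # has no idea what application the data belongs to.
        link = "cable" if service == "fast" else "satellite"
        self.log.append((link, dest, len(payload)))

router = Router()
router.send("hostB", b"a short email", service="fast")  # small, urgent
router.send("hostB", b"x" * 10_000, service="cheap")    # big, patient
```

The email and file transfer modules then each tag their own traffic, so adding a third application later needs no change to the router at all.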

However, it really makes my life hard when people request a system with so many esoteric variant cases of a complex operation (and the expectation that more will arrive in future) that it has to be heavily modularised to control the complexity, yet where one case is by far the most common - and then start requesting changes to the system that totally ignore the fact that there are any exceptions to that common case.

Which is then a real headache to deal with, as you have to figure out how their feature applies to all the other variant cases as well, and try to explain this to them...

Debugger is not a naughty word (by )

Computers are famed for harbouring bugs, and the high rate of failures in software compared to other industries is a constant cause of embarrassment. I'd like to explore why this is, with an example. And what we might be able to do about it.

Note: Although a lot of the details of the crash discussed in the remainder are unfortunately very technical, I have done my best to explain things in a way that lay people should be able to make some sense of. However, some things would require a lot of background information, in which case I've just ploughed on without explanation. So if you come across things that you don't understand, feel free to skip ahead a bit; you shouldn't lose too much.


Creative Commons Attribution-NonCommercial-ShareAlike 2.0 UK: England & Wales