Scheme bindings (by )

One of the things I don't like about Scheme is that just about everything in it is mutable. The most awkward ones are the value bindings of symbols.

You see, when a Scheme implementation (compiler or interpreter) is processing some source code, at any point in the program it is aware of the current lexical environment - the set of names bound at that point in the program, and what they're bound to.

There are three standard kinds of bindings (and, possibly, other implementation-dependent kinds, too):

  1. Value bindings. Since these are mutable, what the name binds to is not a value, but rather a location in memory where the value may be found. In a compiler this may be a binding to a location in memory that won't actually be allocated until run time; in an interpreter it'll probably point to the actual memory location with the value in it.
  2. Macro bindings, which associate a name with a macro transformer
  3. Special form bindings, which associate the names of special forms (low-level constructs handled directly by the implementation) to some implementation-defined object that selects the special implementation semantics. These are the names of the core syntax of the language (or whatever it's been renamed to).

Notably, the arguments bound by a lambda expression are always made as value bindings; one cannot pass a macro or a special form into a function as an argument, as that would make compilation really hard.

But since value bindings point to a memory location rather than to a value, the implementation cannot do much by way of partial evaluation - an important optimisation. For example, any addition such as (+ 1 2 3) is a call to the + function - but functions are just value bindings. When a compiler compiles such code, it sees that + is a value binding. Where it a macro binding, it would run the macro transformer on the expression, then attempt to compile the result; where it a special form binding, it would handle it as per the type of special form. But it's usually a value binding, pointing to a memory location, so the compiler has to output code that looks in that location, checks the contents is a function (and complains if not), then calls that function with the arguments. This is because the program could, at any point, run (set! + (lambda args ...)) and redefine addition at run time...

If we had constant bindings, that bound a name to an actual value, then where such a bound symbol appeared in a function call position, the compiler could know precisely which function was being invoked in the same way that a C compiler does, do all the type checking at compile time, and compile a direct call - or even inline it.

More excitingly, we could partially evaluate things. Functions that were known to not have side effects could, if all their arguments were literals or constant symbols, be applied at compile time and replaced with constants. Inlining is just a variant on this; where the source code of a function is known, calls to it can be expanded inline, even if the arguments aren't all known.

This can produce significant speed increases. But Scheme currently has only mutable value bindings. Luckily, it's a very extensible language...

Now, there are three ways I can think of that constant bindings could be added to Scheme.

  1. By creating a new set of binding forms that create them. Perhaps define. defines a constant (the full stop implying there's no changing after the definition). lambda. would define a function whose arguments were not mutable within the body. let., let*., letrec. would follow. The upside is that it's a completely compatible extension to the standard; the downside is that it's like the const modifier in C - you have to remember to use it almost all the time, and leave it off only when you do need a mutable binding, and the temptation would be to forget to put it on in the first place.
  2. By breaking the language and making bindings constant by default, and then adding special syntax to create mutable bindings instead. This produces a better result... but breaks existing code. Admittedly not a lot of existing code since most bindings are actually never modified, and it'd be easy to change the few that are to use mutable bindings by searching for references to set!.
  3. By being cunning and making the compiler work it out. If it is compiling the whole program as a unit, it's quite easy to see which symbols are used as the first parameter of set!. Whenever a symbol is bound, then a scan of the scope of that binding can be done for set!s to that symbol, excluding any scopes in which the symbol is shadowed, and thus every binding found to be mutable or not. Function call arguments are always passed by value, not reference, so there's no way for a reference to sneak out of its lexical scope. The only hard part is when separate modules are compiled separately for later linking - but then we can fix that by making our module system such that when we export a binding from a module, we have to mark it as mutable there and then if we wish to allow other modules to mutate it, which would count just like a set! and cause a mutable binding to be made.

Option 3 looks promising... no extra work for programmers (ok, a tiny bit if they want to export a mutable binding from a module), no violation of the standard, and we get mutability information to allow the compiler to do all sorts of deep optimisation. Yum.

...so the next target for immutablising would be the lowly cons cell. Little chance of static mutability analysis there, since they can be passed around at run time in arbitrary ways. I heard of some Scheme implementation that's been experimenting with immutable cons cells and found that very little code ever mutates cons cells, but I can't remember what. However, it's definitely a standard-departing change; can't really keep calling it Scheme after that...

1 Comment

  • By Peter Bex, Fri 11th Apr 2008 @ 7:36 am

    I think the scheme that's been experimenting with immutability was PLT. Clojure is also a pure Lisp-1 (but not really Scheme)

    If you have a proper module system I think you can do pretty decent inlining within the module. Anything that imports the module and uses set! on it can operate on its own copy, for example. To make mutable exports you could add a special export operator that indicates that it is mutable.

Other Links to this Post

RSS feed for comments on this post.

Leave a comment

WordPress Themes

Creative Commons Attribution-NonCommercial-ShareAlike 2.0 UK: England & Wales
Creative Commons Attribution-NonCommercial-ShareAlike 2.0 UK: England & Wales