Generic Functions (by )

When most people think of object-oriented programming, they think of the interpretation of it that originated in Simula, that has since bubbled into Java, C++, and so on. Or maybe the prototype-based variant found in Self or JavaScript.

Under this interpretation, every object has a set of fields (which are typically mutable), and a set of methods. Under the widespread class-based paradigm, "classes" define the set of fields and methods for objects; each object is an "instance" of a class, and thus has the fields described in the class, and the methods. Under the prototype system, objects are created and given fields and methods directly, or "cloned" from a "prototype object" when lots of objects need the same sort of behaviour.

So if one were implementing an address book application, one might define a class of people, with a method that describes them (in a syntax of no particular language):

class Person {
    String name;
    String address;

    method describe() {
        print name + " of " + address;
    }
}

Then one might create a particular person whose name is "Bob" and whose address is "123 Any Street", and then calling the describe method will generate the text Bob of 123 Any Street; while perhaps another class in the system might model UK limited companies with the fields name and registrationNumber, and whose describe method might return name + " a limited company registered in England and Wales, number " + registrationNumber. The end result is that one can take objects that might be people or companies (or anything else with a describe method) and call describe on them to get a description; your code does not even need to know that people and companies exist or anything about them, all it needs to know is that it can call describe on some things to get a description.

The idea is very much that methods live "inside" the classes, and therefore, inside the objects that are instances of those classes.

This has two downsides:

  1. As the list of methods for a class needs to be in a certain place - inside the class definition - only the person who is responsible for the class definition can add methods. Say you write a powerful piece of software to translate text between languages; clearly, this should be a method of class String, as any string can be (attempted to be) translated. That's not an immediate problem if you have the full source code for the entire system; you can go and modify libraries to add methods to classes you use from them; but sometimes you don't have the source code of a library, or distributing your translation system as patches to your programming language's implementation of strings is a bit messy.
  2. It's lovely that you can call methods such as describe on random objects, without needing to know what class of object they are; this is called Type polymorphism. However, some operations involve more than one object. Consider addition. The way you add two integers is quite different to the way you add two floating-point numbers, or a float and an integer, or two matrices. How do you handle this? Do you give all number-like classes methods such as addToInteger, addToFloat, and so on? If you do that, then when adding two objects you know of by references a and b, then if you know b is a float you can write a.addToFloat(b) to ask a to add b to it, and it'll work whether a is an integer or a float because the correct version of the method for its class will be invoked. But in order to do that, we need to know b's a float; we'd need to know about floats, integers, complex numbers, and other such objects that could be added to a number, and check b for each one, and then call the appropriate method on a. We're automatically polymorphic in a, but not in b, where we have to do it manually.

However, the Generic Function approach works rather differently. Methods are not inside the classes. A class just defines the fields instances of that class will have. One then defines things like describe or add as a generic function somewhere - independently of any class or anything, merely saying describe takes one argument and returns a string or add takes two arguments and returns something.

Then one provides methods to that generic function that implement it for given argument types.

So one might write:

  generic function describe(o);

  method describe(Person p) {
      print p.name + " of " + p.address;
  }

  method describe(Company c) {
      print c.name + " a limited company registered in England and Wales, number " ` + c.registrationNumber;
  }

But we can also do things like:

  generic function add(a,b);

  method add(Integer a, Integer b) { ... }
  method add(Integer a, Float b) { ... }
  method add(Float a, Integer b) { ... }
  method add(Float a, Float b) { ... }
  method add(Integer a, Complex b) { ... }
  ...

Of course, in this case, it helps if your generic function system is clever enough to be told "this generic function is commutative" and then not have to provide both Integer+Float and Float+Integer, or to be able to just give Integer+Integer, Float+Float, Complex+Complex, and then methods to convert an Integer to a Float and a Float to a Complex, and let it work the rest out for itself - but that's an extension you can easily do.

And one can write an add method for matrices; but any attempt to add an integer to a matrix will fail, because there's no method to do it, nor a way to promote an integer to a matrix or vice versa to use the Integer+Integer or Matrix+Matrix methods.

And, of course, whenever we wish, we can write:

  generic function translate (text, fromLanguage, toLanguage);

  method translate(String text,Language fromLanguage=ENGLISH, Language fromLanguage=FRENCH) {
     ...
  }
  method translate(String text,Language fromLanguage=ENGLISH, Language fromLanguage=GERMAN) {
     ...
  }

...which illustrates that methods don't have to be selected just on the classes of their arguments, but also potentially on values, or indeed arbitrary conditions (one might define a method only on prime integers). Having defined the generic function translate, one can use entirely different translation libraries for different languages, but they all fall under the same translate method on strings. If you had type-promotion rules as we described for the addition case, then you could define a rule somewhere to promote a file to a string (by reading the contents of the file, as a string), and then you'll be able to translate files, too.

In other words, generic functions are MAGIC and POWERFUL and ELEGANT and FUN.

But they present some interesting issues to language designers.

Pages: 1 2

1 Comment

  • By Peter Bex, Tue 18th Aug 2009 @ 2:15 pm

    I agree that generic functions are an interesting idea, but I think you've not given enough credit to prototype based systems, or "open class" systems like Ruby.

    In a prototype-based system you can manipulate any object as you please, so you can add new methods which become available to all clones of that object (unless the clone already overrides the method with its own).

    In the "open class" systems you can mix in any number of modules into any class you wish.

    Of course, with both systems you pay the same price as with generic functions; because everything is so open, it is hard to compile things efficiently because anything can happen at run-time.

    A more interesting question is; do generic methods get you into trouble as easily as mixins do? (the dreaded "monkey-patching" problems)

    I don't know if prototype systems can get you into these troubles so easy either, since I haven't done any big systems hacking with such an object system.

Other Links to this Post

RSS feed for comments on this post.

Leave a comment

WordPress Themes

Creative Commons Attribution-NonCommercial-ShareAlike 2.0 UK: England & Wales
Creative Commons Attribution-NonCommercial-ShareAlike 2.0 UK: England & Wales