ARGON (by )

I've been tinkering with the design for ARGON since I was about eleven or twelve, I think.

I was always an avid reader of books, and the local library's computing section had a mixture of "Learn to program in BASIC on your ZX Spectrum!" and textbooks about minicomputer operating systems, PROLOG, LISP, C, and the like.

Plus, my mother was doing an HNC (later an HND) in Business Studies at the local technical college, and I often went in there to meet her after school, and sat in their library, which had even more textbooks. Plus leaflets on the college's computer system, which was a VAX cluster. I didn't get my hands on the VAX cluster, but I was interested to read up on its operations.

And another source of inspiration was a book called The New Hacker's Handbook. This tome was about exploring computer networks, but it had lots of background material on various minicomputer operating systems, including the likes of UNIX as well as much more primitive systems.

The more primitive systems interested me, because I could see myself implementing them. I remember things like Hewlett-Packard Time Shared BASIC from the early 1970s (up to 32 users! In 32K words of RAM!); although I recognised that reliably executing multiple machine-code programs in parallel on my IBM PC would be a non-trivial problem (particularly with no hardware memory protection), writing an interpreter that executes multiple programs in parallel with language-enforced crash isolation seemed tractable.
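
A minimal sketch of why that seemed tractable, in Python rather than anything ARGON-specific. The instruction set (push, add, div), the run_all function and the task layout are all made up for illustration. Each program gets its own stack, the interpreter executes one instruction per program in turn, and a fault terminates only the program that caused it.

```python
from collections import deque

def run_all(programs, max_steps=1000):
    """Round-robin over tiny stack programs; a fault kills only the faulting program."""
    # Hypothetical task layout: name, instruction list, program counter, private stack.
    tasks = deque({"name": n, "code": c, "pc": 0, "stack": []}
                  for n, c in programs.items())
    results = {}
    steps = 0
    while tasks and steps < max_steps:
        task = tasks.popleft()
        steps += 1
        try:
            if task["pc"] >= len(task["code"]):
                # Ran off the end of its code: finished normally.
                results[task["name"]] = ("done", list(task["stack"]))
                continue
            op, *args = task["code"][task["pc"]]
            task["pc"] += 1
            stack = task["stack"]
            if op == "push":
                stack.append(args[0])
            elif op == "add":
                b, a = stack.pop(), stack.pop()
                stack.append(a + b)
            elif op == "div":
                b, a = stack.pop(), stack.pop()
                stack.append(a // b)
            else:
                raise ValueError(f"unknown op {op!r}")
        except Exception as exc:
            # The interpreter catches the fault; the other programs keep running.
            results[task["name"]] = ("crashed", type(exc).__name__)
            continue
        tasks.append(task)  # still alive: back of the queue
    return results

programs = {
    "good": [("push", 2), ("push", 3), ("add",)],
    "bad": [("push", 1), ("push", 0), ("div",)],
}
print(run_all(programs))
```

Programs still running when max_steps is reached are simply not reported; a real system would need a proper scheduler and resource limits.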

And the thought of having an operating system that could really make the most of my hardware appealed greatly. I couldn't buy more hardware; but I could write better software. DOS certainly sucked in a big way compared to the systems I was reading about.

So I started to design an operating system. I didn't feel the need to write one from the bare metal upwards; DOS would be a fine base to build on, as it would hardly get in my way, but would provide useful filesystem access functions. The problem was that I kept finding more reading material as time passed, which expanded my horizons further and further. I felt no passion for backwards compatibility; as I spent my free time writing my own software, all I needed was a better platform on which to do so. So as well as designing an operating system, I was happy to design the entire programming environment on top of it.

I didn't want my computer to just be a desktop machine - I wanted it to be controlling machinery in the background, while being my interactive console in the foreground.

And I wanted all of this done with a system that was simple enough for me to build myself. So a lot of the 'design work' was in finding tractable ways to solve the big problems.

That's why I like things like FORTH and LISP - they're simple enough to implement in a week, but powerful enough to let you incrementally build the language up with metaprogramming at will.

But this has a secondary benefit; modern operating systems and programming environments are terribly heavyweight things, fat with overcomplicated mechanisms, and ripe with opportunity for bugs. A minimalist, pared-down system upon which powerful features can be built in layers of simple well-defined modules, as well as being easier to write and maintain, is easier to get right.

I was inspired by TAOS, an OS aimed at distributed systems that used a low-level virtual machine to make code portable; BeOS didn't really impress me much, but Oberon had some interesting ideas (if woefully inadequate on the multiprogramming side).

Needless to say, having tinkered with designs for VM-based protection and core portability for years, Java held few surprises for me.

TAOS got me thinking about distribution, though. It seemed logical that a network of computers should be able to share resources as effortlessly as resources within a single computer could be shared. I remembered the leaflets about the VAX cluster; one had explained how the cluster involved multiple computers sharing devices and a file system. I still only had one PC, but I had Internet access, so I was happily tinkering with IP networking, and I could see the possibilities.

I heard about the release of Plan 9 to the general public in 1995; I read a magazine article about it that explained its very elegant approach to distribution. Plan 9 is a microkernel, with all inter-process communication accomplished through a file system interface; every process has a file system tree with various other server-processes mounted as subtrees of the file system. Services from remote machines could be mounted using the 9P protocol. When a user logged on, a session manager would create a /dev file system for them with /dev/console pointing at the physical hardware console; if you asked a remote machine to run a command for you, then 'your' filesystem was shared and used as the root filesystem for the remote process, so it saw all of your files - and your /dev/console. The console device could have text written to and read from it like a UNIX console, but with the right escape sequences it could also draw graphics and accept input from the mouse. The Plan 9 GUI was just a program that used the full-screen /dev/console you are initially given to display a windowing system. Creating a new window involved creating a new file system mapping for the subprocess that inherited from the parent, but re-mounted /dev/console to be served from the GUI program itself, restricting output to that program's window (so you could run an instance of the GUI from within itself, sort of like Xnest).
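
The per-process file system mapping described above can be sketched roughly as follows. This is a toy model in Python, not Plan 9's actual interface: the Namespace class, its method names and the server labels are all hypothetical, and servers are represented as plain strings. It shows only the key move, that a child inherits a copy of its parent's mounts and can then re-mount /dev/console without the parent noticing.

```python
class Namespace:
    """A per-process mount table mapping path prefixes to servers (hypothetical model)."""

    def __init__(self, mounts=None):
        self.mounts = dict(mounts or {})

    def mount(self, prefix, server):
        self.mounts[prefix] = server

    def fork(self):
        # The child starts with a copy, so later changes do not affect the parent.
        return Namespace(self.mounts)

    def resolve(self, path):
        # The longest matching mount prefix wins.
        matches = [p for p in self.mounts
                   if path == p or path.startswith(p.rstrip("/") + "/")]
        if not matches:
            raise FileNotFoundError(path)
        best = max(matches, key=len)
        return self.mounts[best], path[len(best):].lstrip("/")

# A login session sees the real console; a window sees one served by the GUI.
session = Namespace({"/": "fileserver", "/dev/console": "hardware-console"})
window = session.fork()
window.mount("/dev/console", "gui-window-1")

print(session.resolve("/dev/console"))
print(window.resolve("/dev/console"))
print(window.resolve("/home/notes.txt"))
```

Running the GUI inside one of its own windows is then just another fork and re-mount, one level down.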

You could have servers that were dedicated storage servers, exposing actual disk-backed file systems full of normal files; and servers that were dedicated CPU servers, with no real file systems of their own beyond what was required to boot (perhaps a /tmp and some swap), that existed just for processes to be run on them. And then workstations that provided real hardware /dev/console servers, and handled logging users in to create a session. If you took a bunch of machines and linked them to the same authentication database, you had a cluster.

I read up a lot about databases. Database replication has been discussed for decades, but until recently, only really implemented in academic test-beds and in high-end mainframe systems. But I wanted the fault-tolerance and maintainability of a mainframe, built from standard hardware. A mainframe, really, is just a lot of CPUs, disks, and memory joined by a network; they're just all built into one big box (using proprietary, nonstandard, expensive hardware). There's nothing they can do that a network of PCs can't.

Database replication is hard; particularly when you want to deal with network partitions, nodes that fail, and growing/shrinking the cluster without downtime. But all of these problems are solvable (indeed, I am delighted to currently have a job solving them).

So I learnt, and tinkered, and the current large-scale structure of ARGON slowly took form: a fluid cluster of machines of all sorts, joined by a network, with a shared replicated transactional file system, and software that needed executing for whatever reason automatically run on the least loaded node. I moved towards an object model, where entities in the file system contained both code and data. All the "commands" to the system were to be messages sent to entities, that would make them run some code, reading and writing their internal state and returning something to the caller. I found ways of mapping expected OS functionality - access control, users, user interfacing, and the like - into this model, meaning the core itself had less to do.
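
As a rough illustration of that object model (a sketch under heavy assumptions, not ARGON's design): the Entity and Cluster classes below are hypothetical, entity state lives in one Python dictionary standing in for the replicated transactional store, and "load" is just a counter of in-flight messages per node.

```python
class Entity:
    """Code (message handlers) plus data (state), addressed only by messages."""

    def __init__(self, handlers, state=None):
        self.handlers = handlers
        self.state = dict(state or {})

    def receive(self, message, *args):
        # Run the handler for this message against the entity's own state.
        return self.handlers[message](self.state, *args)


class Cluster:
    def __init__(self, node_names):
        self.load = {name: 0 for name in node_names}  # in-flight messages per node
        self.entities = {}  # stand-in for the shared replicated store

    def send(self, entity_id, message, *args):
        # Pick the least loaded node to run the handler on.
        node = min(self.load, key=self.load.get)
        self.load[node] += 1
        try:
            result = self.entities[entity_id].receive(message, *args)
        finally:
            self.load[node] -= 1
        return node, result


def increment(state, amount):
    state["count"] = state.get("count", 0) + amount
    return state["count"]

cluster = Cluster(["a", "b"])
cluster.entities["counter"] = Entity({"increment": increment})
cluster.load["a"] = 3  # pretend node "a" is busy with other work
print(cluster.send("counter", "increment", 2))
```

Because the caller only ever names the entity and the message, which node ends up doing the work is invisible to it.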

Somewhere along the line, I abandoned DOS, and migrated to three possible implementation paths: one as a portable "runtime" like the JVM that could be compiled up and run on anything vaguely POSIX-like, one as bare-metal implementations for embedded systems or custom hardware, and one as a dedicated replacement for init to be run on a customised NetBSD kernel. I wasn't so crazy as to seriously consider writing my own x86 operating system core.

On the programming environment side, I had spent enough time reading about unusual approaches to programming that I had long since abandoned any idea of adapting BASIC, Pascal or C (the widespread languages of my early career) to run on it. I was far more interested in languages like LISP, Prolog, Concurrent Clean and Occam. I settled on the core philosophy of LISP, as it could encompass anything else; but I liked the Concurrent Clean approach to mutating state. I felt that Prolog was great for certain kinds of tasks (searching things in complex ways, for example), but was painful for most everyday programming, so I moved towards having a Prolog-like interface to the distributed file system and in-memory "temporary work areas" rather than basing the language around it.

Tinkering with the idea of knowledge-based data storage shortly led me to the current design for CARBON, the ARGON "file system" as seen by users, and its integration with TUNGSTEN, the "file system" as provided by physical disk storage (Plan 9's FS metaphor and RISC OS's ability to load user file systems willy-nilly had led me to separate the two, long ago). TUNGSTEN provides a replicated persistent knowledge base for each 'entity' that holds its code and data (or, more often, simply references to shared code, and then per-entity data; but that's a design pattern rather than enforced), while CARBON provides a protocol for accessing the knowledge held within an entity (or, at least, what it wishes to publish). TUNGSTEN lets an entity partition its knowledge store into a tree; each node of a tree contains some assertions, and also 'inherits' the assertions from its children, allowing an entity to create a knowledge sub-tree for public data to be exposed automatically via CARBON - and then provide additional sub-trees to be merged on top for readers who meet access control requirements. But the entity can also place dynamic gateways into CARBON, which are assertion patterns that, if queried, result in arbitrary code within the entity being invoked; these would allow entities to combine static publishing of data with dynamic generation of it, all through one "getting structured data through a knowledge-base model" protocol.
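
A much-simplified sketch of how static assertions and dynamic gateways can sit behind one query interface. Everything here is an assumption made for illustration: the KnowledgeBase class, the (subject, predicate, value) triple shape, and the flattening of the knowledge sub-trees into named layers ("public" plus extras merged on top for readers with more rights). It is not the CARBON protocol or the TUNGSTEN storage model.

```python
class KnowledgeBase:
    def __init__(self):
        self.layers = {}    # layer name -> list of (subject, predicate, value)
        self.gateways = []  # (predicate, function) pairs answered by running code

    def assert_fact(self, layer, subject, predicate, value):
        self.layers.setdefault(layer, []).append((subject, predicate, value))

    def add_gateway(self, predicate, compute):
        self.gateways.append((predicate, compute))

    def query(self, subject, predicate, visible_layers=("public",)):
        # Static answers: stored assertions in the layers this reader may see.
        answers = [v
                   for layer in visible_layers
                   for (s, p, v) in self.layers.get(layer, [])
                   if s == subject and p == predicate]
        # Dynamic answers: gateways whose pattern matches run code on demand.
        answers += [compute(subject)
                    for (p, compute) in self.gateways if p == predicate]
        return answers

kb = KnowledgeBase()
kb.assert_fact("public", "printer", "name", "Office printer")
kb.assert_fact("staff", "printer", "location", "Room 12")
kb.add_gateway("queue-length", lambda subject: 3)  # would query real hardware

print(kb.query("printer", "name"))
print(kb.query("printer", "location"))
print(kb.query("printer", "location", ("public", "staff")))
print(kb.query("printer", "queue-length"))
```

The reader cannot tell from the answer whether it came from a stored assertion or from code run at query time, which is the point of putting both behind the same protocol.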

One function of CARBON is to form navigational meshes; the core CARBON schema defines properties for the name of an entity, and how to represent it in various ways (icons and the like), as well as relationships the entity has with others. At the simplest level, an entity might have no code in it whatsoever, and just be a repository of information that the owner edits by hand; or it might have enough code to present a specialised editing interface to those with sufficient rights, and then expose the results to others for viewing; or it might have very little state within it at all, and dynamically answer queries (by querying external hardware, or by computing answers, or by consulting some external resource like an SQL database or a POSIX filesystem...). Inspired by the benefits of virtualisation, I came up with ways for entities to pretend to be any number of other entities, by appending arbitrary parameters to their identities, in the way that a PHP script on a web server can be an infinite number of different "web pages" by accepting different query string parameters.
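
The parameterised-identity trick can be sketched by taking the PHP analogy literally. The "?key=value" syntax below is borrowed from URL query strings purely for illustration and is not ARGON's identity format; split_identity, lookup and the calendar entity are hypothetical names.

```python
from urllib.parse import parse_qsl

def split_identity(identity):
    """Split 'base?key=value&...' into the real entity's name and its parameters."""
    base, _, query = identity.partition("?")
    return base, dict(parse_qsl(query))

def lookup(identity, entities):
    # One real entity answers for every parameterised variant of its identity.
    base, params = split_identity(identity)
    return entities[base](params)

def calendar(params):
    return "Calendar for " + params.get("year", "all years")

entities = {"calendar": calendar}
print(lookup("calendar?year=2009&month=6", entities))
print(lookup("calendar", entities))
```

From the outside, "calendar?year=2009" and "calendar?year=2010" look like two distinct entities, although only one piece of code and state exists.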

As well as being inspired by interesting systems, as I've described above, another design input for ARGON has been looking at problems, and thinking how I'd solve them. I don't have any immediate plans to develop ARGON as a desktop OS, since that's a very hard market to break into - although much power can be obtained through simplicity, you still need to write a LOT of code to be a usable desktop OS, since a usable desktop OS has to interact with a lot of complex things such as "HTML with CSS and JS and Flash", "Microsoft Office file formats", and a plethora of file systems, devices, and protocols before it can be considered useful. However, I have nonetheless charted out how an ARGON-based desktop environment could be built, purely to make sure nothing in my core architecture unnecessarily hinders that path.

All in all, the current design for ARGON is really just a combination of things that have been done elsewhere; the challenge has been finding a compatible model that lets them all happen. And, of course, such a model is one that's so simple it allows anything to flourish on top of it, rather than one so complex that it encompasses all the possibilities. The market I now aim for is still to be a wonderful environment to develop software, but now it's Internet software, providing services via IP protocols such as the Web - although I'm interested in potentially developing ARGON for embedded control systems, and maybe even for smartphones (although writing a Web browser still daunts me).

At first, the design changed rapidly, as I learnt about new approaches and thought of new challenges, but slowly, more and more of it has survived each new challenge I find, or turned out to be superior to new approaches. And so I am moving from the big picture to details, which is a whole new ball game!

The good news is that the spec for HYDROGEN, the lowest level of ARGON (a sort of hardware abstraction layer) is firming up as we speak. The latest development version can be viewed (as DocBook) at http://www.argon.org.uk/specs/HYDROGEN.docbook; I don't have an automated process to generate HTML from it in place, alas!

When the spec is done (one day) I'd like to throw together a simple portable implementation of a minimal core in C, using the techniques pioneered by Anton Ertl for GForth, then extend that with portable (i.e., written purely in terms of that kernel) implementations of the rest of the standard library. This will then provide all the platform needed to start writing HELIUM, the resource management system, and the rest of ARGON above it. Meanwhile, work can proceed in parallel on highly optimised native-code compilers for x86, with hand-coded assembly implementations of the parts of the standard library that will benefit from it, and on bare-metal implementations of the kernel in ARM assembly that can run the portable implementation of the standard library (parts of which can then, in turn, be replaced by ARM assembly versions). This will slowly but surely open up a wide range of underlying platforms that can host the ARGON system as it grows up.

I'd like to make money from it, if I can, but it'll always be open source; there's no way a programming environment can really succeed as a closed-source endeavour without a lot of capital behind it, and that's no fun. I might make money from it by developing specific implementations for embedded platforms and the like, or just selling consultancy and support on top of an open-source product. Or maybe it'll just always be a hobby, ticking away in the background.

My one stringent requirement is that I'd really like it implemented to a usable level before I die. It'd be a shame for it all to go to waste.

3 Comments

  • By Faré, Mon 29th Jun 2009 @ 4:26 pm

    I'd like to "me too" at a lot of what you said. However, what you actually propose in the end leaves me dubious.

    HYDROGEN looks way low-level. Why would anyone want to use it? And fixed stack sizes only makes it worse.

    A modern system would provide users with abstractions in terms no lower-level than some well-typed high-level graph. How are you going to reconcile that with the low-level aspect of HYDROGEN?

  • By Faré, Mon 29th Jun 2009 @ 4:50 pm

    Of course, my "solution" would be to start standardizing, not the base level of the implementation tower, but the protocol of implementation/virtualization of a computing system with another.

  • By alaric, Tue 30th Jun 2009 @ 9:38 am

    HYDROGEN's not meant for (normal) users to see - it's more the assembly language of a VM, that core bits of the system can be written in but will mainly be the target for an HLL compiler (that will provide sandboxing through pointer safety etc. for a start!)

    It is indeed way low level :-)

    However, this means that little else can really be implemented without it (sure, we could experiment with writing HLL implementations, but that'd be wasted effort in the longer run).

Creative Commons Attribution-NonCommercial-ShareAlike 2.0 UK: England & Wales