HYDROGEN: Conclusion

As promised, here is the sixth and final part of my series on HYDROGEN, where I draw some conclusions and discuss the road ahead.

Introduction

I hope the previous postings have proven interesting. I'm doing something unusual, and taking an unusual approach to it, but I hope you'll agree that the result elegantly meets my design criteria: a runtime environment for operating systems and embedded software that is easy to implement, yet can scale from small embedded environments up to powerful hardware.

HYDROGEN is an odd mixture of things; it's a virtual machine, a programming language, and a hardware abstraction layer in the classical sense of a library of operating-system-level functions that can be implemented afresh on each platform to allow higher-level software to be platform-independent. Bundling the VM and the HAL together makes sound sense in abstracting out computation as well as devices and memory management, which are all justifiably parts of the same platform; actually defining a programming language isn't strictly necessary (we could do everything with static VM bytecode files that are compiled elsewhere from some other form), but is a pragmatically useful thing to do.

But HYDROGEN itself is a very basic environment. It's only as rich as it is because I wished to define a language core sufficient to cover the kinds of platform-dependent actions an operating system or embedded application might require; I have made no mention of data structures, for example. Also, you may recall that I specified that device drivers are all asynchronous: the application requests an I/O operation and supplies a callback to be invoked when it's done, and the request may be rejected if the device is busy and has no internal queue space left, so it's even up to the application to decide how to handle that. I propose an idle word you can call to wait until an interrupt of some kind occurs, so you could write a callback that sets a flag, then loop around idle until the flag gets set. But then you need to account for the fact that the callback interrupt could be delivered to another CPU, in which case your original one will still be suspended in the idle; so you'll need to make your callback detect that it's on the "wrong" CPU and issue a wakeup interrupt to the right one (yeah, I'll need an interface to do that, and I've not designed one yet...)!
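
To make that concrete, here's a rough sketch in C of the pattern I have in mind. The names device_read_async, idle, current_cpu and wake_cpu are made up for illustration - they're stand-ins for whatever the real HYDROGEN interfaces end up being, not actual words from the spec:

    #include <stdatomic.h>
    #include <stdbool.h>

    /* Hypothetical HYDROGEN-style primitives, assumed for illustration: */
    extern bool device_read_async(int device, void *buf, int len,
                                  void (*done)(void *arg), void *arg); /* false = queue full */
    extern void idle(void);              /* wait until some interrupt arrives */
    extern int  current_cpu(void);
    extern void wake_cpu(int cpu);       /* the cross-CPU wakeup I still need to design */

    struct wait_state {
        atomic_bool done;
        int waiting_cpu;
    };

    static void on_io_done(void *arg)
    {
        struct wait_state *w = arg;
        atomic_store(&w->done, true);
        /* The completion interrupt may have landed on a different CPU from
           the one sleeping in idle, so kick the original CPU awake. */
        if (current_cpu() != w->waiting_cpu)
            wake_cpu(w->waiting_cpu);
    }

    bool blocking_read(int device, void *buf, int len)
    {
        struct wait_state w;
        w.waiting_cpu = current_cpu();
        atomic_init(&w.done, false);

        if (!device_read_async(device, buf, len, on_io_done, &w))
            return false;            /* device busy, no queue space: caller decides */

        while (!atomic_load(&w.done))
            idle();                  /* sleep until the next interrupt, then re-check */
        return true;
    }

The point is that the blocking behaviour lives entirely in application code; HYDROGEN itself only supplies the asynchronous request and the idle primitive.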

Also, although I've designed HYDROGEN to be a reasonably pleasant low-level language, within my constraints - and the ability to write your own parsing words means you can make it more and more pleasant if you like - it's still terribly low level. You have to manage your own storage, and it's up to you to preserve type safety; code sequences such as 123 3.0 + (integer addition when there's a floating-point number on top of the stack) have undefined behaviour. It will be easy to crash the entire virtual machine with something like 12345 call, where call is the primitive that does an indirect call, reading a subroutine handle from the top of the stack and calling it.
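
To illustrate why, here's a tiny sketch in C of roughly what such an indirect call primitive boils down to - this is an illustration of the idea, not HYDROGEN's actual implementation:

    typedef void (*subroutine)(void);

    long stack[256];          /* the VM's data stack */
    int  sp = 0;              /* next free slot */

    void push(long v) { stack[sp++] = v; }
    long pop(void)    { return stack[--sp]; }

    /* The call primitive does no checking at all: it treats whatever is on
       top of the stack as a subroutine handle and jumps through it. */
    void prim_call(void)
    {
        subroutine s = (subroutine)pop();
        s();   /* if the "handle" was 12345, this jumps to address 12345 and crashes */
    }

Pushing 12345 and invoking prim_call is exactly the 12345 call sequence above, and the behaviour is just as undefined.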

The next levels

HYDROGEN isn't actually meant for too much programming to happen in it. If the virtual machine operations - literal pushes and calls - are its machine code, then HYDROGEN source code is its macro assembler; however, I'm also working on a high-level language, CHROME, which will use the HYDROGEN code generation interface to compile code at run-time. CHROME will be a Lisp-esque language with type safety; there will be no way in CHROME to make a pointer up out of thin air. So CHROME will only be able to use whatever underlying system functionality is given to it. A gamut of 'safe' operations (arithmetic and the like) is provided by the core, but systems code can be given access to a CHROME primitive that lets its users embed arbitrary HYDROGEN code, allowing it to interact with the full capabilities of the underlying HYDROGEN VM (including the ability to crash things...); like the HYDROGEN NATIVE word, that will generally only be used to implement higher-level libraries.

But CHROME doesn't provide garbage collection. Or, indeed, the basic data structures that represent CHROME programs, and that CHROME programs operate upon; that's to be provided by IRON, the data model. IRON is designed to have a written notation that is compact enough to be used to represent source code, like Lisp s-expressions. The IRON implementation will also provide garbage collection. Since IRON is designed for functional languages, it supports immutable objects (which can therefore only point to objects that existed when they were created) and linear objects that can be updated, with the invariants that pointers from immutable to linear objects are illegal and that at most one pointer to a linear object may exist. This means that a linear object may be deallocated as soon as the sole reference to it is lost, and the linear objects can be treated as a root set for garbage-collecting the immutable objects within the thread, using Joe Armstrong and Robert Virding's elegant GC algorithm. Special support is required for sending immutable objects to other threads down communications channels or making them globally accessible (as either could introduce inter-thread sharing), and for passing initial state into a thread or getting results back to the parent when the thread terminates, but that's something I'm working on.

In order to implement IRON's dynamic typing efficiently, I've defined an optional HYDROGEN feature for tagged arithmetic, so that it can be hand-coded to take advantage of platform-specific tricks. I do wish modern CPUs had better support for this; it would not be hard to provide a set of operations to perform arithmetic on tagged values, automatically triggering the correct hardware for integer or floating-point values and trapping to a handler for the two other possibilities of a two-bit tag field. An ever-increasing amount of code is written in dynamic languages, and performance improvements on the order of 10%-50% might easily be obtained, depending on the workload. Low-hanging fruit, people! 🙂
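
As a rough illustration of the tagged-arithmetic idea, here's what the integer fast path might look like in C, assuming a two-bit tag in the low bits with 0 meaning "small integer" - this particular encoding is mine for the example, not necessarily IRON's:

    #include <stdint.h>

    #define TAG_MASK   0x3
    #define TAG_FIXNUM 0x0

    typedef uintptr_t value;

    /* Assumed slow-path handler for floats and the other two tag values. */
    extern value slow_add(value a, value b);

    static inline value    make_fixnum(intptr_t n) { return (value)(n << 2); }
    static inline intptr_t fixnum_value(value v)   { return (intptr_t)v >> 2; }

    value tagged_add(value a, value b)
    {
        /* If both tags are zero, plain machine addition of the shifted
           representations yields the correctly tagged sum (overflow ignored here). */
        if (((a | b) & TAG_MASK) == TAG_FIXNUM)
            return a + b;
        return slow_add(a, b);   /* trap to the handler for everything else */
    }

Hardware support would essentially mean getting that tag test and dispatch for free on every add.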

Then there's HELIUM, the resource manager. HELIUM's main job is to manage blocked threads in queues. That means scheduling access to CPUs, and providing queues of threads waiting to submit jobs to I/O devices or waiting for I/O to complete. HELIUM will provide the tools to wrap access to devices, so that threads can simply block until an I/O can be started, then block again for it to complete. At the core of HELIUM is the concept of thread priority and resource reservation, which will allow it to manage the queues for resources. Queues generally operate on a prioritised first-come, first-served basis, but a thread can register to be allowed straight to the head of a queue a given number of times per second, as long as its estimate of how long its job will take won't result in the resource the queue controls being oversubscribed; HELIUM will measure how long those requests actually take and penalise the thread if it goes over quota. This is the core of real-time scheduling for both CPU and I/O.
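
To sketch what that reservation bookkeeping might look like, here's some C with field and function names invented for the example rather than taken from HELIUM:

    #include <stdbool.h>
    #include <stdint.h>

    struct reservation {
        uint32_t credits_per_second;   /* how many times per second it may jump the queue */
        uint32_t credits_left;         /* refilled once a second by a timer */
        uint64_t estimated_cost_us;    /* the thread's claimed cost per request */
        int64_t  overrun_us;           /* measured time beyond the estimate (the penalty) */
    };

    /* Hypothetical check: may this request go straight to the head of the queue? */
    bool may_jump_queue(struct reservation *r, uint64_t spare_capacity_us)
    {
        if (r->credits_left == 0)
            return false;              /* used up this second's reservation */
        if (r->overrun_us > 0)
            return false;              /* penalised (how the penalty decays is left out here) */
        if (r->estimated_cost_us > spare_capacity_us)
            return false;              /* would oversubscribe the resource */
        r->credits_left--;
        return true;
    }

    /* After the job completes, account for the measured time. */
    void account_job(struct reservation *r, uint64_t actual_us)
    {
        if (actual_us > r->estimated_cost_us)
            r->overrun_us += (int64_t)(actual_us - r->estimated_cost_us);
    }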

The HELIUM notion of a thread also caters for a special case of short-lived thread used to handle an event. Rather than having to maintain pools of threads waiting to handle asynchronous requests from various sources, HELIUM is designed to make it practical to just create new lightweight threads to perform a task; only if those tasks use up their CPU time slice or block on an I/O queue do they become full-fledged threads that will need to be rescheduled in future. HELIUM also provides a memory allocator, which allocates memory into per-thread pools; when a thread terminates, any memory not explicitly salvaged from that thread's pool (on account of being passed to other threads or lodged in global state) is automatically deallocated, reducing the garbage collection load on IRON. This means that only long-running threads and global state need garbage collection, as all the garbage from short-lived tasks gets thrown away before an actual garbage collection is required.
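
A very rough sketch in C of the per-thread pool idea - again, the names and the representation are made up for the example, not HELIUM's actual allocator interface:

    #include <stdlib.h>

    struct pool_block {
        struct pool_block *next;
        /* payload follows the header */
    };

    struct thread_pool {
        struct pool_block *head;   /* every allocation this thread still owns */
    };

    void *pool_alloc(struct thread_pool *p, size_t size)
    {
        struct pool_block *b = malloc(sizeof *b + size);
        if (!b) return NULL;
        b->next = p->head;
        p->head = b;
        return b + 1;              /* hand back the payload */
    }

    /* Salvage one allocation into another pool (e.g. it was passed to another
       thread or lodged in global state) so it survives the owner's death. */
    void pool_salvage(struct thread_pool *from, struct thread_pool *to, void *payload)
    {
        struct pool_block *b = (struct pool_block *)payload - 1;
        struct pool_block **pp = &from->head;
        while (*pp && *pp != b) pp = &(*pp)->next;
        if (*pp) { *pp = b->next; b->next = to->head; to->head = b; }
    }

    /* On thread exit, everything left in the pool is thrown away at once. */
    void pool_destroy(struct thread_pool *p)
    {
        while (p->head) {
            struct pool_block *next = p->head->next;
            free(p->head);
            p->head = next;
        }
    }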

HELIUM also has an explicit concept of a daemon. A daemon isn't an entire thread to itself; instead, it's a subroutine handle registered for background processing. When there are no normal threads to run, the active daemons will be invoked in turn. This is for proper CPU-bound daemons such as your idle-cycles grid computation; stuff that has to happen once a second is a matter of setting up a timer interrupt to trigger a task. The daemon system explicitly round-robins the daemon handlers registered with it. If there are no active daemons, the CPU will actually be halted pending further interrupts. By avoiding long-lived background threads for this kind of thing, we reduce the state complexity of the system.
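
In pseudo-C, the idle path might look something like this; runnable_threads_exist and halt_until_interrupt are stand-ins for whatever the real scheduler and HAL provide:

    #include <stdbool.h>

    #define MAX_DAEMONS 16

    typedef void (*daemon_fn)(void);

    static daemon_fn daemons[MAX_DAEMONS];
    static int       daemon_count;
    static int       next_daemon;

    /* Assumed primitives from the rest of the system: */
    extern bool runnable_threads_exist(void);
    extern void halt_until_interrupt(void);

    void register_daemon(daemon_fn fn)
    {
        if (daemon_count < MAX_DAEMONS)
            daemons[daemon_count++] = fn;
    }

    /* Called by the scheduler when there is nothing else to run. */
    void idle_loop(void)
    {
        while (!runnable_threads_exist()) {
            if (daemon_count == 0) {
                halt_until_interrupt();   /* no active daemons: really halt the CPU */
            } else {
                daemons[next_daemon]();   /* give the next daemon one turn */
                next_daemon = (next_daemon + 1) % daemon_count;
            }
        }
    }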

But there's more

There are plenty of features of the HYDROGEN design that I've not mentioned in this discussion (such as the interface for loading and saving the virtual CPU state for threading, and the related interface for saving and restoring continuations to implement control structures). Nor have I spoken about code generator optimisations such as tail calls, about tools for managing data structures that can be safely modified by multiple CPUs or within interrupt handlers, and so on.

And there's stuff I'm planning to add one day, but don't need yet. Snazzy optimising compilers for OO languages can optimise generic function dispatch to within an inch of its life using various tricks. What about having multiple entry points to a subroutine, by placing "markers" in it while compiling, then getting multiple subroutine handles out when the compilation of the subroutine is done? That can be useful in the presence of polymorphism: a subroutine can start with a set of argument type checks that dispatch off to error handlers or to subroutines implementing alternative methods, with another entry point at the start of the body of the actual code, so that call sites that know their argument types are correct can skip over the type checks.

It'd also be a good idea to have a standard interface for creating "indirect subroutines": a handle that is constant but can be redirected at run time to refer to different handles. Although this can be implemented with indirect calls (the simple version is sketched below), it's possible to implement it more efficiently on some platforms by peephole-compiling calls to an indirect handle into direct calls to its current target, while registering each call site with the handle; when the handle is retargeted, all call sites referring to it are modified in place, thus eliminating the redirection (and the system can even gather counts of call sites and changes, to decide whether it would just be more efficient to return to indirection).

I'm proud that the HYDROGEN VM architecture means primitives handled specially by the compiler look just the same, from the perspective of users of the code generator, as calls to arbitrary user subroutines. It means that something like indirect calls can be written as a user-level toolkit; then, if they are identified as a performance issue, they can be implemented in the core, with the user-level implementation simply moved to the standard library for platforms that don't have an optimised implementation.
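
For reference, here's the simple, portable version of an indirect subroutine sketched in C: the handle is just a cell holding the current target, every call pays one extra load, and retargeting is a single store. The cleverer scheme described above replaces that load with call-site patching. The names are illustrative, not part of the HYDROGEN spec:

    #include <stdatomic.h>

    typedef void (*subroutine)(void);

    struct indirect_handle {
        _Atomic(subroutine) target;   /* the subroutine currently referred to */
    };

    /* Calling through the handle: one extra indirection per call. */
    void indirect_call(struct indirect_handle *h)
    {
        subroutine s = atomic_load(&h->target);
        s();
    }

    /* Retargeting: all future calls through the handle go to new_target. */
    void indirect_retarget(struct indirect_handle *h, subroutine new_target)
    {
        atomic_store(&h->target, new_target);
    }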

And so on.

I'm writing all of this up in the HYDROGEN specification, the latest draft of which can be viewed (in DocBook) from the Subversion repository. It's a slow process, though, since I can only do it in my spare time, which I've had far too little of lately!

Creative Commons Attribution-NonCommercial-ShareAlike 2.0 UK: England & Wales