HYDROGEN: Code generation

Conclusions

I hope that's given you an idea of some of the cunning tricks I have planned for HYDROGEN. The key thing to remember is that, by having very simple yet flexible interfaces, it should be possible to quickly knock together naive HYDROGEN compilers for new platforms, to get running fast - while not preventing complex optimisations to get the best out of a platform, when time permits.

But the first implementation of HYDROGEN I have planned won't even compile native code. To get the maximum initial reach, I want to implement a portable (well, as portable as gcc is) version, in C, that "compiles" to an indirect-threaded-code representation like GForth uses; this means that subroutines will be implemented as a header struct in memory, immediately followed by a list of pointers to primitive operations that are actual machine code addresses accessible via a gcc-specific computed goto. Calls to non-primitive words, and literal pushes, are implemented with push and call primitives that then read a literal operand from the instruction stream ahead of them.

This format will be easily disassembled to retrieve the original PUSH/CALL VM ops, so there's no need for a separate bytecode version; therefore, it can still be inlined easily, and it's just as amenable to peephole optimisation as a native code compiler.

Indeed, with a little work, it's even possible to have several register int variables in your core interpreter function as virtual registers, and to produce a version of each primitive for each possible assignment of registers to stack elements, and thus to have the compiler maintain register->stack mappings and to choose the appropriate primitive implementations for the current context. Anton Ertl has been working on this sort of thing.
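To make the threaded-code layout a bit more concrete, here is a minimal sketch of what such an interpreter core might look like. It is not the planned HYDROGEN implementation: the names (hydrogen_run, word, the OP_ constants, demo) and the tiny primitive set are all made up for illustration, the header struct holds nothing but a name, and there is no bounds checking. It also stores primitive addresses straight in the instruction stream, which strictly speaking is closer to direct threading; a truly indirect-threaded layout would add one more level of indirection through each word's header. It relies on gcc's labels-as-values extension (clang accepts it too).

```c
#include <stddef.h>
#include <stdint.h>

typedef intptr_t cell;  /* wide enough for an integer or a pointer */

/* Hypothetical primitive set; indices into the label table. */
enum { OP_PUSH, OP_CALL, OP_ADD, OP_MUL, OP_DUP, OP_EXIT, OP_HALT, OP_COUNT };

/* Hypothetical word layout: a header followed by the threaded code. */
typedef struct {
    const char *name;  /* header (illustrative only) */
    cell code[8];      /* primitive addresses and inline operands */
} word;

/* Runs threaded code starting at ip and returns the top of stack at HALT.
 * If labels_out is non-NULL it only hands back the table of primitive
 * addresses, since label addresses are visible only inside this function. */
cell hydrogen_run(const cell *ip, void ***labels_out)
{
    static void *labels[OP_COUNT] = {
        [OP_PUSH] = &&do_push, [OP_CALL] = &&do_call, [OP_ADD]  = &&do_add,
        [OP_MUL]  = &&do_mul,  [OP_DUP]  = &&do_dup,  [OP_EXIT] = &&do_exit,
        [OP_HALT] = &&do_halt,
    };
    cell stack[64];
    cell *sp = stack;            /* data stack, grows upwards */
    const cell *rstack[64];
    const cell **rp = rstack;    /* return stack */

    if (labels_out) {
        *labels_out = labels;
        return 0;
    }

/* Fetch the next primitive address and jump to it (gcc computed goto). */
#define NEXT goto *(void *)(*ip++)

    NEXT;

do_push:  /* literal operand follows in the instruction stream */
    *sp++ = *ip++;
    NEXT;
do_call: {  /* address of the callee's code follows in the stream */
    const cell *target = (const cell *)*ip++;
    *rp++ = ip;
    ip = target;
    NEXT;
}
do_add:
    sp--;
    sp[-1] += sp[0];
    NEXT;
do_mul:
    sp--;
    sp[-1] *= sp[0];
    NEXT;
do_dup:
    sp[0] = sp[-1];
    sp++;
    NEXT;
do_exit:  /* return to the caller */
    ip = *--rp;
    NEXT;
do_halt:
    return sp[-1];

#undef NEXT
}

/* Builds "square" (DUP MUL EXIT) and a top-level word that computes
 * (3 + 4) squared by calling it. */
cell demo(void)
{
    void **prim;
    word square = { "square", {0} };
    word top = { "top", {0} };

    hydrogen_run(NULL, &prim);  /* fetch the primitive addresses */

    square.code[0] = (cell)prim[OP_DUP];
    square.code[1] = (cell)prim[OP_MUL];
    square.code[2] = (cell)prim[OP_EXIT];

    top.code[0] = (cell)prim[OP_PUSH]; top.code[1] = 3;
    top.code[2] = (cell)prim[OP_PUSH]; top.code[3] = 4;
    top.code[4] = (cell)prim[OP_ADD];
    top.code[5] = (cell)prim[OP_CALL]; top.code[6] = (cell)square.code;
    top.code[7] = (cell)prim[OP_HALT];

    return hydrogen_run(top.code, NULL);
}
```

Calling demo() runs PUSH 3, PUSH 4, ADD, CALL square, HALT and returns 49. Because every cell in a word is either a known primitive address or an operand following a PUSH or CALL, walking the code array and comparing against the label table is enough to recover the original VM ops, which is what makes disassembly and inlining straightforward.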

3 Comments

  • By Gavan, Thu 16th Jul 2009 @ 11:53 am

    Do you have any mechanism for assuring that a certain bit of code which will be used later must be compiled by a certain point in the code?

    I can think of several cases (mostly within device drivers) where the latency of certain routines is critically important, and having to wait (even the first time) for the parser to do its thing could lead to lots of unpleasantness.

  • By alaric, Thu 16th Jul 2009 @ 12:35 pm

    The definition of a subroutine with ( ... ) compiles it there and then, in the implementations I have planned (except for the case of tethered systems, but you'll have to wait to hear about them). Either way, by the time you call a subroutine, it ought to be compiled.

    It's all up to the implementation, though - weird JIT stuff could be done. It's just that those implementations would suck for real time stuff, and should say so on the tin 🙂

  • By alaric, Fri 17th Jul 2009 @ 2:07 pm

    This is also interesting reading:

    http://factor-language.blogspot.com/2009/07/improved-value-numbering-branch.html

Creative Commons Attribution-NonCommercial-ShareAlike 2.0 UK: England & Wales