
Writing a Java byte-code compiler for JRuby
===========================================
$Id: Compiler.txt,v 1.4 2002/11/06 00:55:41 ndrs Exp $
Copyright (c) Anders Bengtsson 2002

0. About
--------

I'm writing this document at the same time as we are implementing the
compiler, so not everything here is necessarily correct or even
remotely true.

1. Introduction
---------------

The transformation of Ruby code into Java byte-codes is done in several
steps. The first step is to transform Ruby source into an AST, which
was already done as part of the interpreter.

The compilation of the AST to Java byte-codes is done in two steps,
since we don't want to deal with the horrors of the JVM at the same
time as the horrors of the AST.

2. AST -> Ruby byte-code
------------------------

The first step in compilation is translating the AST into custom
high-level byte-codes. The byte-codes assume a VM with an "operand
stack", which is the same model the Java VM uses. These byte-codes
serve several purposes: they are intended to be easy to translate to
JVM byte-code, and they could also be interpreted directly. Ideally
the byte-codes should also be on a slightly higher abstraction level
than the syntax-oriented AST.

We flatten the AST and extract some information that is hidden in it,
such as the number of arguments a given method call passes.

A simple example, "x = 10", is in AST form a tree like this:

    newline-node
        local-assignment-node (variable-index = 3)
            fixnum-node (value = 10)

When transformed to byte-codes it looks like this:

    push-fixnum (value = 10)
    assign-local (variable-index = 3)
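
The flattening above can be sketched as a post-order walk of the tree:
each node first emits the code for its operand and then its own
instruction. This is only an illustration. The class names below are
hypothetical, chosen to mirror the node names in the example, and the
instructions are plain strings rather than a real byte-code format.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical AST node type; every node appends its byte-codes to 'out'.
interface Node {
    void emit(List<String> out);
}

class FixnumNode implements Node {
    final long value;

    FixnumNode(long value) { this.value = value; }

    public void emit(List<String> out) {
        out.add("push-fixnum (value = " + value + ")");
    }
}

class LocalAssignmentNode implements Node {
    final int variableIndex;
    final Node valueNode;

    LocalAssignmentNode(int variableIndex, Node valueNode) {
        this.variableIndex = variableIndex;
        this.valueNode = valueNode;
    }

    public void emit(List<String> out) {
        // Operand first, so its value is on the operand stack
        // when the assignment instruction runs.
        valueNode.emit(out);
        out.add("assign-local (variable-index = " + variableIndex + ")");
    }
}

class NewlineNode implements Node {
    final Node next;

    NewlineNode(Node next) { this.next = next; }

    public void emit(List<String> out) {
        // Assumption of this sketch: a newline node adds no instruction.
        next.emit(out);
    }
}

class FlattenDemo {
    static List<String> flatten(Node root) {
        List<String> out = new ArrayList<>();
        root.emit(out);
        return out;
    }

    public static void main(String[] args) {
        Node ast = new NewlineNode(
                new LocalAssignmentNode(3, new FixnumNode(10)));
        for (String instruction : flatten(ast)) {
            System.out.println(instruction);
        }
    }
}
```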

An interpreter working on these byte-codes would probably be faster
than the AST-walking interpreter. But since we already have a working
interpreter, we instead focus on getting to JVM byte-code.
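
For illustration only, a direct interpreter for the two instructions
in the example could be as small as the loop below. The Instruction
class and its opcode names are assumptions made for this sketch. It
also simplifies assign-local to consume the value, whereas a Ruby
assignment is itself an expression.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical instruction representation: an opcode plus one operand.
final class Instruction {
    enum Op { PUSH_FIXNUM, ASSIGN_LOCAL }

    final Op op;
    final long operand;

    Instruction(Op op, long operand) {
        this.op = op;
        this.operand = operand;
    }
}

class StackInterpreterDemo {
    // Runs the instructions against an operand stack and a local
    // variable table.
    static void run(Instruction[] code, Object[] locals) {
        Deque<Object> stack = new ArrayDeque<>();
        for (Instruction insn : code) {
            switch (insn.op) {
                case PUSH_FIXNUM:
                    stack.push(Long.valueOf(insn.operand));
                    break;
                case ASSIGN_LOCAL:
                    // Simplification: the assigned value is popped.
                    locals[(int) insn.operand] = stack.pop();
                    break;
            }
        }
    }

    public static void main(String[] args) {
        Instruction[] code = {
            new Instruction(Instruction.Op.PUSH_FIXNUM, 10),
            new Instruction(Instruction.Op.ASSIGN_LOCAL, 3),
        };
        Object[] locals = new Object[4];
        run(code, locals);
        System.out.println("local 3 = " + locals[3]);
    }
}
```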

3. Ruby byte-code -> JVM byte-code
----------------------------------

3.1 Invoking compiled code
--------------------------

This is where it gets interesting. The big question is perhaps not how
to do the compilation, but how to use the resulting byte-code. ASTs and
Ruby byte-code can be passed around as objects and used in many
different ways, but Java byte-code has to be neatly placed in methods
within classes. How do we integrate that into our Ruby runtime?

We can always use reflection callbacks to reach our generated code,
but that would be slow, probably slower than our interpreted code,
which uses indexed callbacks(*). A better idea would involve direct,
compiled calls to our methods. This could be done with custom
generated callback classes or with something similar to the old
indexed callbacks.

*) See IndexedCallback, ReflectionCallback
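
To make the two options concrete, here is a rough sketch. All names
in it (CompiledCode, CodeCallback and the two implementations) are
made up for this illustration and are not the existing
IndexedCallback or ReflectionCallback classes. The first
implementation reaches the compiled method through
java.lang.reflect; the second is what a generated callback class
might look like, containing an ordinary compiled call.

```java
import java.lang.reflect.Method;

// Hypothetical stand-in for a class holding compiled Ruby code.
class CompiledCode {
    public static Object greet(Object self, Object[] args) {
        return "hello " + args[0];
    }
}

// Hypothetical interface the runtime would call through.
interface CodeCallback {
    Object execute(Object self, Object[] args);
}

// Option 1: look the method up once, then invoke it reflectively.
class ReflectiveCallback implements CodeCallback {
    private final Method method;

    ReflectiveCallback(Class<?> owner, String name) {
        try {
            this.method = owner.getMethod(name, Object.class, Object[].class);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public Object execute(Object self, Object[] args) {
        try {
            // Static method, so the receiver argument is null.
            return method.invoke(null, self, args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}

// Option 2: a callback class generated per method; the call is direct.
class GreetCallback implements CodeCallback {
    public Object execute(Object self, Object[] args) {
        return CompiledCode.greet(self, args);
    }
}

class CallbackDemo {
    public static void main(String[] args) {
        Object[] callArgs = { "world" };
        CodeCallback viaReflection = new ReflectiveCallback(CompiledCode.class, "greet");
        CodeCallback direct = new GreetCallback();
        System.out.println(viaReflection.execute(null, callArgs));
        System.out.println(direct.execute(null, callArgs));
    }
}
```

Both callbacks produce the same result; the difference is only in how
the call reaches the compiled method.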

3.2 Compilation units
---------------------

The most direct mapping from Ruby code structure into Java's class
structure is to compile each Ruby file into a Java class file. The
outer code of the file is compiled into one rubyMain() method, and all
the other method and block bodies are compiled into their own methods.

This makes sense from a user's point of view too: Each '.rb'-file that
they see can be compiled into a corresponding binary file.

Note that we do not use Java's object-oriented features here. Since we
have an entirely separate OO model, we just use Java classes as a place
to store code. For this reason it is important that we do not use the
extension '.class' for the generated files, since that would confuse
the users.
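
As a rough sketch of this layout, a hypothetical file "hello.rb" might
end up as a class like the one below, written here as Java source
instead of generated byte-code. Only the name rubyMain() comes from
the text above. The class name, the parameter lists and the direct
call to greet() are assumptions of this sketch; real generated code
would go through Ruby method dispatch and use Ruby objects, not Java
strings.

```java
// Hypothetical Ruby source being compiled (hello.rb):
//
//     def greet(name)
//       "Hello, " + name
//     end
//     greet("world")
//
// The class is only a container for code, so it has no instance
// state and is never instantiated.
final class HelloRb {
    private HelloRb() {}

    // The outer code of the file.
    static Object rubyMain(Object runtime, Object self) {
        // Simplification: a direct Java call stands in for a Ruby
        // method call.
        return greet(runtime, self, new Object[] { "world" });
    }

    // The body of the Ruby method "greet".
    static Object greet(Object runtime, Object self, Object[] args) {
        return "Hello, " + args[0];
    }
}

class CompilationUnitDemo {
    public static void main(String[] args) {
        System.out.println(HelloRb.rubyMain(null, null));
    }
}
```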

[Add example here]

3.3 The environment
-------------------

The compiled code doesn't run in a vacuum. It needs access to the Ruby
runtime as much as interpreted code does. To make the code generation
as easy as possible, this environment must be very simple.

For this purpose the two variables 'runtime' and 'self' are passed to
every Java method that implements Ruby code.
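
A minimal sketch of what this means for a generated method: everything
it needs arrives through its parameters, and it keeps no state of its
own. MiniRuntime is a made-up, heavily reduced stand-in for the real
runtime object, and the extra 'args' parameter is an assumption; only
the names 'runtime' and 'self' come from the text above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the Ruby runtime; here it only holds
// global variables.
class MiniRuntime {
    private final Map<String, Object> globals = new HashMap<>();

    Object getGlobal(String name) { return globals.get(name); }

    void setGlobal(String name, Object value) { globals.put(name, value); }
}

class EnvironmentDemo {
    // A Java method implementing some Ruby code. It reaches global
    // state only through 'runtime' and the receiver only through 'self'.
    static Object describe(MiniRuntime runtime, Object self, Object[] args) {
        return runtime.getGlobal("$prefix") + ": " + self;
    }

    public static void main(String[] args) {
        MiniRuntime runtime = new MiniRuntime();
        runtime.setGlobal("$prefix", "receiver");
        System.out.println(describe(runtime, "main", new Object[0]));
    }
}
```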

3.4 Handling exceptions
-----------------------

[To be written]
