Virtual Machine: Implementation

In this chapter we'll dive into some of the more interesting and important implementation details of our virtual machine.

To begin with, we'll lay out a struct for a single thread of execution. This struct should contain everything needed to execute the output of the compiler.

pub struct Thread {
    /// An array of CallFrames
    frames: CellPtr<CallFrameList>,
    /// An array of pointers any object type
    stack: CellPtr<List>,
    /// The current stack base pointer
    stack_base: Cell<ArraySize>,
    /// A dict that should only contain Number keys and Upvalue values. This is a mapping of
    /// absolute stack indeces to Upvalue objects where stack values are closed over.
    upvalues: CellPtr<Dict>,
    /// A dict that should only contain Symbol keys but any type as values
    globals: CellPtr<Dict>,
    /// The current instruction location
    instr: CellPtr<InstructionStream>,
}

Here we see every data structure needed to represent:

  • function call frames
  • stack values
  • closed-over stack values (Upvalues)
  • global values
  • bytecode to execute

The VM's primary operation is to iterate through instructions, executing each in sequence. The outermost control struture is, therefore, a loop containing a match expression.

Here is a code extract of the opening lines of this match operation. The function shown is a member of the Thread struct. It evaluates the next instruction and is called in a loop by an outer function. We'll look at several extracts from this function in this chapter.

    /// Execute the next instruction in the current instruction stream
    fn eval_next_instr<'guard>(
        &self,
        mem: &'guard MutatorView,
    ) -> Result<EvalStatus<'guard>, RuntimeError> {
        // TODO not all these locals are required in every opcode - optimize and get them only
        // where needed
        let frames = self.frames.get(mem);
        let stack = self.stack.get(mem);
        let globals = self.globals.get(mem);
        let instr = self.instr.get(mem);

        // Establish a 256-register window into the stack from the stack base
        stack.access_slice(mem, |full_stack| {
            let stack_base = self.stack_base.get() as usize;
            let window = &mut full_stack[stack_base..stack_base + 256];

            // Fetch the next instruction and identify it
            let opcode = instr.get_next_opcode(mem)?;

            match opcode {
                // Do nothing.
                Opcode::NoOp => return Ok(EvalStatus::Pending),

                ...

The function obtains a slice view of the register stack, then narrows that down to a 256 register window for the current function.

Then it fetches the next opcode and using match, decodes it.

Let's take a closer look at the stack.

The stack

While some runtimes and compilers, particularly low-level languages, have a single stack that represents both function call information and local variables, our high-level runtime splits the stack into:

  1. a stack of CallFrame objects containing function call and return information
  2. and a register stack for local variables.

Let's look at each in turn.

The register stack

In our Thread struct, the register stack is represented by the two members:

pub struct Thread {
    ...
    stack: CellPtr<List>,
    stack_base: Cell<ArraySize>,
    ...
}

Remember that the List type is defined as Array<TaggedCellPtr> and is therefore an array of tagged pointers. Thus, the register stack is a homogenous array of word sized values that are pointers to objects on the heap or values that can be inlined in the tagged pointer word.

We also have a stack_base variable to quickly retrieve the offset into stack that indicates the beginning of the window of 256 registers that the current function has for it's local variables.

The call frame stack

In our Thread struct, the call frame stack is represented by the members:

pub struct Thread {
    ...
    frames: CellPtr<CallFrameList>,
    instr: CellPtr<InstructionStream>,
    ...
}

A CallFrame and an array of them are defined as:

#[derive(Clone)]
pub struct CallFrame {
    /// Pointer to the Function being executed
    function: CellPtr<Function>,
    /// Return IP when returning from a nested function call
    ip: Cell<ArraySize>,
    /// Stack base - index into the register stack where register window for this function begins
    base: ArraySize,
}

pub type CallFrameList = Array<CallFrame>;

A CallFrame contains all the information needed to resume a function when a nested function call returns:

  • a Function object, which references the Bytecode comprising the function
  • the return instruction pointer
  • the stack base index for the function's stack register window

On every function call, a CallFrame instance is pushed on to the Thread's frames stack and on every return from a function, the top CallFrame is popped off the stack.

Additionally, we keep a pointer to the current executing function (the function represented by the top CallFrame) with the member instr: CellPtr<InstructionStream>.

For a review of the definition of InstructionStream see the bytecode chapter where we defined it as a pair of values - a ByteCode reference and a pointer to the next Opcode to fetch.

The VM keeps the InstructionStream object pointing at the same ByteCode object as is pointed at by the Function in the CallFrame at the top of the call frame stack. Thus, when a call frame is popped off the stack, the InstructionStream is updated with the ByteCode and instruction pointer from the CallFrame at the new stack top; and similarly when a function is called into and a new CallFrame is pushed on to the stack.

Functions and function calls

Function objects

Since we've mentioned Function objects above, let's now have a look at the definition.

#[derive(Clone)]
pub struct Function {
    /// name could be a Symbol, or nil if it is an anonymous fn
    name: TaggedCellPtr,
    /// Number of arguments required to activate the function
    arity: u8,
    /// Instructions comprising the function code
    code: CellPtr<ByteCode>,
    /// Param names are stored for introspection of a function signature
    param_names: CellPtr<List>,
    /// List of (CallFrame-index: u8 | Window-index: u8) relative offsets from this function's
    /// declaration where nonlocal variables will be found. Needed when creating a closure. May be
    /// nil
    nonlocal_refs: TaggedCellPtr,
}

Instances of Function are produced by the compiler, one for each function definition that is compiled, including nested function definitions.

A Function object is a simple collection of values, some of which may be nil. Any member represented by a TaggedCellPtr may, of course, contain a nil value.

Thus the function may be anonymous, represented by a nil name value.

While the function name is optional, the parameter names are always included. Though they do not need to be known in order to execute the function, they are useful for representing the function in string form if the programmer needs to introspect a function object.

Members that are required to execute the function are the arity, the ByteCode and any nonlocal references.

Nonlocal references are an optional list of (relative_stack_frame, register) tuples, provided by the compiler, that are needed to locate nonlocal variables on the register stack. These are, of course, a key component of implementing closures.

We'll talk about closures shortly, but before we do, we'll extend Functions with partial application of arguments.

Partial functions

A partial function application takes a subset of the arguments required to make a function call. These arguments must be stored for later.

Thus, a Partial object references the Function to be called and a list of arguments to give it when the call is finally executed.

Below is the definition of Partial. Note that it also contains a possible closure environment which, again, we'll arrive at momentarily.

#[derive(Clone)]
pub struct Partial {
    /// Remaining number of arguments required to activate the function
    arity: u8,
    /// Number of arguments already applied
    used: u8,
    /// List of argument values already applied
    args: CellPtr<List>,
    /// Closure environment - must be either nil or a List of Upvalues
    env: TaggedCellPtr,
    /// Function that will be activated when all arguments are applied
    func: CellPtr<Function>,
}

The arity and used members indicate how many arguments are expected and how many have been given. These are provided directly in this struct rather than requiring dereferencing the arity on the Function object and the length of the args list. This is for convenience and performance.

Each time more arguments are added to a Partial, a new Partial instance must be allocated and the existing arguments copied over. A Partial object, once created, is immutable.

Closures

Closures and partial applications have, at an abstract level, something in common: they both reference values that the function will need when it is finally called.

It's also possible, of course, to have a partially applied closure.

We can extend the Partial definition with a closure environment so that we can use the same object type everywhere to represent a function pointer, applied arguments and closure environment as needed.

Compiling a closure

The compiler, because it keeps track of variable names and scopes, knows when a Function references nonlocal variables. After such a function is defined, the compiler emits a MakeClosure instruction.

Referencing the stack with upvalues

The VM, when it executes MakeClosure, creates a new Partial object. It then iterates over the list of nonlocal references and allocates an Upvalue object for each, which are added to the env member on the Partial object.

The below code extract is from the function Thread::eval_next_instr() in the MakeClosure instruction decode and execution block.

The two operands of the MakeClosure operation - dest and function - are registers. function points at the Function to be given an environment and made into a closure Partial instance; the pointer to this instance will be written to the dest register.

                // This operation should be generated by the compiler after a function definition
                // inside another function but only if the nested function refers to nonlocal
                // variables.
                // The result of this operation is a Partial with a closure environment
                Opcode::MakeClosure { dest, function } => {
                    // 1. iter over function nonlocals
                    //   - calculate absolute stack offset for each
                    //   - find existing or create new Upvalue for each
                    //   - create closure environment with list of Upvalues
                    // 2. create new Partial with environment
                    // 3. set dest to Partial
                    let function_ptr = window[function as usize].get(mem);
                    if let Value::Function(f) = *function_ptr {
                        let nonlocals = f.nonlocals(mem);
                        // Create an environment array for upvalues
                        let env = List::alloc_with_capacity(mem, nonlocals.length())?;

                        // Iter over function nonlocals, calculating absolute stack offset for each
                        nonlocals.access_slice(mem, |nonlocals| -> Result<(), RuntimeError> {
                            for compound in nonlocals {
                                // extract 8 bit register and call frame values from 16 bit nonlocal
                                // descriptors
                                let frame_offset = (*compound >> 8) as ArraySize;
                                let window_offset = (*compound & 0xff) as ArraySize;

                                // look back frame_offset frames and add the register number to
                                // calculate the absolute stack position of the value
                                let frame = frames.get(mem, frames.length() - frame_offset)?;
                                let location = frame.base + window_offset;

                                // look up, or create, the Upvalue for the location, and add it to
                                // the environment
                                let (_, upvalue) = self.upvalue_lookup_or_alloc(mem, location)?;
                                StackAnyContainer::push(&*env, mem, upvalue.as_tagged(mem))?;
                            }

                            Ok(())
                        })?;

                        // Instantiate a Partial function application from the closure environment
                        // and set the destination register
                        let partial = Partial::alloc(mem, f, Some(env), &[])?;
                        window[dest as usize].set(partial.as_tagged(mem));
                    } else {
                        return Err(err_eval("Cannot make a closure from a non-Function type"));
                    }
                }

The Upvalue struct itself is defined as:

#[derive(Clone)]
pub struct Upvalue {
    // Upvalue location can't be a pointer because it would be a pointer into the dynamically
    // alloocated stack List - the pointer would be invalidated if the stack gets reallocated.
    value: TaggedCellPtr,
    closed: Cell<bool>,
    location: ArraySize,
}

An Upvalue is an object that references an absolute register stack location (that is the location member.)

The initial value of closed is false. In this state, the location on the stack that contains the variable must be a valid location. That is, the stack can not have been unwound yet. If the closure is called, Upvalues in this state are simply an indirection between the function and the variable on the register stack.

The compiler is able to keep track of variables and whether they are closed over. It emits bytecode instructions to close Upvalue objects when variables on the stack go out of scope.

This instruction, CloseUpvalues, copies the variable from the register stack to the value member of the Upvalue object and sets closed to true.

From then on, when the closure reads or writes to this variable, the value on the Upvalue object is modified rather than the location on the register stack.

Global values

pub struct Thread {
    ...
    globals: CellPtr<Dict>,
    ...
}

The outermost scope of a program's values and functions are the global values. We can manage these with an instance of a Dict. While a Dict can use any hashable value as a key, internally the VM will only allow Symbols to be keys. That is, globals must be named objects.

Let's dive into the compiler!