Main
Links
Games
Main
Download
Documentation
Development
Bug list

The Sierra PMachine

Lars Skovlund, Dark Minister and Christoph Reichenbach

Version 1.0, 6. July 1999

This document describes thee design of the Sierra PMachine (the virtual CPU used for executing SCI programs). It is a special CPU, in the sense that it is designed for object oriented programs.

There are three kinds of memory in SCI: Variables, objects, and stack space. The stack space is used in a Last-In-First-Out manner, and is primarily used for temporary space in a routine, as well as passing data from one routine to another. Note that the stack space is used bottom-up by the original in- terpreter, instead of the more usual top-down. I don't know if this has any significance for us.

Scripts are loaded into the PMachine by creating a memory imagee of it on the heap. For this reason, the script file format may seem a bit obscure at times. It is optimized for in-memory performance, not readability. It should be mentioned here that a lot of fixup stuff is done by the interpreter. In the script files, all addresses are specified as script-relative. These are converted to absolute offsets. The species and superClass fields of all objects are converted into pointers to the actual class etc.

There are four types of variables. These are called global, local, temporary, and parameter. All four types are simple arrays of 16-bit words. A pointer is kept for each type, pointing to the list that is currently active. In fact, only the global variable list is constant in memory. The other pointers are changed frequently, as scripts are loaded/unloaded, routines called, etc. The variables are always referenced as an index into the variable list. I'll explain the four types below - the names in parantheses will be used occasionally in the rest of the text:

Local variables (LocalVar)

This variable type is called "local" because it belongs to a specific script. Each script may have its own set of local variables, defined by script block type 10. As long as the code from a specific script is running, the local variables for that script are "active" (pointed to by the mentioned pointer).

Global variables

These, like the local variables, reside in script space (in fact, they are the local variables of script 0!). But the pointer to them remains constant for the whole duration of the program.

Temporary variables

These are allocated by specific subroutines in a script. They reside on the PMachine stack and are allocated by the link opcode. The temp variables are automatically discarded when the subroutine returns.

Parameter variables

These variables also reside on the stack. They contain information passed from one routine to another. Any routine in SCI is capable of taking a variable number of parameters, if need be. This is possible because a list size is pushed as the first thing before calling a routine. In addition to this, a frame size is passed to the call* functions.

Objects

While two adjacent variables may be entirely unrelated, the contents of an object is always related to one task. The object, like the variable tables, provides storage space. This storage space is called properties. Depending on the instructions used, a property can be referred to by index into the object structure, or by property IDs (PIDs). For instance, the name property has the PID 17h, but the offset 6. The property IDs are assigned by the SCI compiler, and it is the "compatible" way of accessing object data. Whereas the offset method is used only internally by an object to access its own data, the PID method is used externally by objects to read/write the data fields of other objects. The PID method is also used to call methods in an object, either by the object itself, by another object, or by the SCI inter- preter. Yes, this really happens sometimes.

The PMachine "registers"

The PMachine can be said to have a number of registers, although none of them can be accessed explicitly by script code. They are used/changed implicitly by the script opcodes:

Acc - the accumulator. Used for result storage and input for a number of opcodes.
IP - the instruction pointer.[1] Points to the currently executing instruction
Vars - an array of 4 values, pointing to the current variables of each mentioned type
Object - points to the currently executing object.
SP - the current stack pointer. Note that the stack in the original SCI interpreter is used bottom-up instead of the more usual top-down.

The PMachine, apart from the actual instruction pointer, keeps a record of which object is currently executing.

The instruction set

The PMachine CPU potentially has 128 instructions (however, a couple of these are invalid and generate an error). Some of these instructions have a flag which specify whether the opcode has byte- or word-sized operands (I will refer to this as variably-sized parameters, as opposed to constant parameters). Other instructions have only one calling form. These instructions simply disregard the operand size flag. Ideally, however, all script instructions should be prepared to take variably-sized operands. Yet another group of instructions take both a constant parameter and a variably-sized parameter. The format of an opcode byte is as follows:

The instructions

The instructions are described below. I have used Dark Minister's text on the subject as a starting point, but many things have changed; stuff explained more thoroughly, errors corrected, etc. The first 23 instructions (up to, but not including, bt) take no parameters.

These functions are used in the pseudocode explanations:

pop(): sp -= 2; return *sp;
push(x): *sp = x; sp += 2; return x;

The following rules apply to opcodes:

Parameters are signed, unless stated otherwise. Sign extension is performed.
Jumps are relative to the posisition of the next operation.
*TOS refers to the TOS (Top Of Stack) element.
"tmp" refers to a temporary register that is used for explanation purposes only.

Notes

[1]

FreeSCI calls this the "Program Counter" or PC, which is the more general term.