ASM76 Specification

This document describes both the VM/76 virtual machine platform and the assembly language itself.

Definitions and conventions in this specification

Note that:

76 Virtual Machine and VM/76 CPU

Introduction

Instruction structure

An instruction is internally represented by a sequence of 10 bytes. Each instruction consists of three parts: an opcode and exactly two operands. For opcodes that need fewer than two operands, the unused part is usually filled with zero bytes.

Byte #   0 1      2 3 4 5     6 7 8 9
Field    Opcode   Operand 1   Operand 2
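
To make the layout concrete, here is a minimal Python sketch that packs and unpacks one instruction. It assumes a 2-byte opcode followed by two 4-byte operands, all little-endian, which is consistent with the ADDL $7 $6 bytes shown in the object code appendix. The opcode value 0x0001 for ADDL is read off that same example, and the function names are hypothetical.

```python
import struct

# Assumed layout: 2-byte opcode, then two 4-byte operands, little-endian.
INSTRUCTION = struct.Struct("<HII")

def encode_instruction(opcode, operand1=0, operand2=0):
    """Pack one instruction; unused operands default to zero bytes."""
    return INSTRUCTION.pack(opcode, operand1, operand2)

def decode_instruction(raw):
    """Unpack 10 bytes into (opcode, operand1, operand2)."""
    if len(raw) != INSTRUCTION.size:
        raise ValueError("an instruction is exactly 10 bytes")
    return INSTRUCTION.unpack(raw)

# ADDL $7 $6, with the opcode value assumed from the appendix example.
addl = encode_instruction(0x0001, 7, 6)
print(" ".join(f"{b:02x}" for b in addl))  # 01 00 07 00 00 00 06 00 00 00
print(decode_instruction(addl))            # (1, 7, 6)
```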

Instruction sets

A variety of instruction sets are provided.

Registers

Special register Purpose Default value
$100..$103 Instruction Pointer 0x01000000
$104..$107 Stack Pointer 0x01003000
$109 A ⋚ B Flag 0
$110 Instruction Set Flags 0

$109

This flag register is used in CMPx.

$110

This flag register is used to enable or disable certain instruction sets. Each bit corresponds to one instruction set.

Bit 0 was originally intended for 76-Base, but since that set can't be disabled, the bit is unused.

Bit # Instruction set
0 N/A
1 76-Float
2 76-Vector
3..6 (reserved)
7 BIOS Instructions
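
The bit assignments above translate into masks as in the following Python sketch. It is only an illustration: it assumes $110 is a single byte, and the helper names are hypothetical.

```python
# Bit positions in $110, taken from the table above.
FLOAT_BIT = 1
VECTOR_BIT = 2
BIOS_BIT = 7

def enable(flags, bit):
    """Return the flag byte with the given instruction set enabled."""
    return (flags | (1 << bit)) & 0xFF

def disable(flags, bit):
    """Return the flag byte with the given instruction set disabled."""
    return flags & ~(1 << bit) & 0xFF

def is_enabled(flags, bit):
    """Check whether the given instruction set is enabled."""
    return bool(flags & (1 << bit))

flags = enable(0, FLOAT_BIT)   # mask 0x02
flags = enable(flags, BIOS_BIT)  # mask 0x80
print(hex(flags))  # 0x82
```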

32-bit memory address

Address range Size Name Usage
0x0...0x400000 4MB Global memory For sharing data between CPU instances
0x400000...0x1000000 12MB IO For transferring data between the CPU (ASM76) and the outside world (VMDE)
0x1000000...∞ depends* Local memory For private use inside a CPU instance

* The local memory is 16KB by default and can be resized with LCMM.
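
As a quick illustration of the address map, the Python sketch below classifies a 32-bit address into one of the three regions. It assumes the upper bound of each range is exclusive, which the table does not state explicitly; the function name is hypothetical.

```python
GLOBAL_END = 0x400000   # global memory: 0x0 up to 0x400000
IO_END = 0x1000000      # IO: 0x400000 up to 0x1000000

def region_of(address):
    """Return the name of the memory region that contains the address."""
    if address < 0:
        raise ValueError("addresses are unsigned")
    if address < GLOBAL_END:
        return "global"
    if address < IO_END:
        return "io"
    return "local"

# The default Instruction Pointer, 0x01000000, is the first local address.
print(region_of(0x01000000))  # local
```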

The beginning of the local memory stores the program. Because the machine architecture has no memory protection, the program can modify itself freely at runtime.

The ASM76 language syntax

Warning: The assembly source must be saved in Unix format (with \n as line ending).

# ASM76 Example
# This example is only demonstrative.
# Although it is syntactically correct, it is not meaningful or practical.

# Hash signs start comments.
# Only whole line comments are allowed,
# i.e. you can't start a comment in the middle of a line.

# Hexadecimal numbers.
# You can use lower or upper case letters in hexadecimal numbers.
# Of course, we don't recommend mixing them.
LCMM 0xAaBb

# Decimal numbers and registers.
# Parameters are separated by spaces, not commas.
DATI 8012 $1

# Register numbers are decimal.
MVRL $1 $31

# A tag (label) can be used as an address.
JMPA [PlaceToStart]

# You can make address tags by using [].
# Use [some_name] to tag the current address.
[PlaceToStart]

# If you are bored with messing with register numbers,
# you can leave the tiring task to the assembler.
# Register variables are both readable and automatically assigned.
{AllocRegVar MyVar 8}
DATI 4 $MyVar
DIVL $1 $MyVar

# Things contained in a pair of curly brackets are called macros.
{FreeRegVar MyVar}

# You can place arbitrary data after HALT.
# Since the machine is HALTed, they will not get executed.
HALT
RAWD 0x0123 0x4567 0x89ab 0xcdef 0x0000

# Raw strings are supported.
# The new line and the star, which informs the assembler to save raw bytes,
# are stripped. However, a zero byte is appended to every line of string.
*Hello, world!
*A slow lazy fox swirled across the brown dog.
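
The raw string rule at the end of the example can be sketched in Python as follows. This is not the assembler's actual code: the function name is hypothetical, and it assumes the characters are stored as the same bytes that appear in the source file (UTF-8 here).

```python
def encode_raw_string_line(line):
    """Turn one '*'-prefixed source line into the bytes to be stored."""
    if not line.startswith("*"):
        raise ValueError("raw string lines start with '*'")
    text = line.rstrip("\n")[1:]           # strip the newline and the star
    return text.encode("utf-8") + b"\x00"  # a zero byte ends every line

print(encode_raw_string_line("*Hello, world!\n"))  # b'Hello, world!\x00'
```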

Appendix: object code format

The virtual machine can load a packed program format called the VM/76 object code. It is usually stored in a .obj file, which contains a program. The implementation has an ObjectCode module that deals with this format.

The format

Offset  Description   Example value                  Explanation
0       Magic number  A3 EF A3 E2 56 4D 37 36        ‘obVM76’ in GB2312
8       Program size  1E 00 00 00                    30 bytes, excluding the header and this size field
12      Instructions  01 00 07 00 00 00 06 00 00 00  ADDL $7 $6
                      00 00 00 00 00 00 00 00 00 00  NOOP
                      28 00 00 00 00 00 00 00 00 00  HALT
~       Random data   01 23 45 67 89 AB CD           won't be loaded
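
A loader for this format could look like the Python sketch below. It is not the ObjectCode module itself: the function name is hypothetical, and it assumes the program size is a little-endian 32-bit integer, which agrees with the example value 1E 00 00 00 meaning 30.

```python
import struct

MAGIC = bytes.fromhex("A3EFA3E2564D3736")

def load_object_code(blob):
    """Return the program bytes, ignoring anything after the declared size."""
    if blob[:8] != MAGIC:
        raise ValueError("not a VM/76 object code file")
    (size,) = struct.unpack_from("<I", blob, 8)
    program = blob[12:12 + size]
    if len(program) != size:
        raise ValueError("file is shorter than the declared program size")
    return program

# Rebuild the example from the table: header, three instructions, extra data.
example = (
    MAGIC
    + struct.pack("<I", 30)
    + bytes.fromhex("01000700000006000000")  # ADDL $7 $6
    + bytes.fromhex("00000000000000000000")  # NOOP
    + bytes.fromhex("28000000000000000000")  # HALT
    + bytes.fromhex("0123456789ABCD")        # trailing data, not loaded
)
program = load_object_code(example)
print(len(program))  # 30
```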

Instruction set reference

Conventions

Pseudo instructions

These are not executable instructions; they behave more like assembler macros.

Embedding raw data

Instruction Description
RAWD directly embed a piece of data into the program
FILL same as above, without having to specify the size

RAWD/FILL

These are the most powerful instructions in the computer world. With them, you can create anything directly.

For these instructions, the assembler works at the byte level and inserts the data directly into the program, so the program can reach it through the local memory. You usually want to put an address tag in front of the data to get its address.

To stay aligned to 10 bytes, the size of an instruction, only RAWD is provided: RAWB, RAWI and RAWL do not exist, and RAWD accepts five ‘double-bytes’ (five 16-bit values).
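
The arithmetic behind this (five double-bytes make exactly one 10-byte slot) can be sketched in Python. The byte order inside each double-byte is not stated in this specification, so the sketch takes it as a parameter; the little-endian default and the function name are assumptions.

```python
def pack_rawd(values, byteorder="little"):
    """Pack five 16-bit values into one 10-byte, instruction-sized slot."""
    if len(values) != 5:
        raise ValueError("RAWD takes exactly five double-bytes")
    # int.to_bytes raises OverflowError if a value does not fit in 16 bits.
    return b"".join(v.to_bytes(2, byteorder) for v in values)

slot = pack_rawd([0x0123, 0x4567, 0x89AB, 0xCDEF, 0x0000])
print(len(slot))  # 10
```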

The FILL instruction is rather special: it accepts both integer constants and string literals as its ‘operand’. This makes printing functions easier to use. Note that it does not append a zero byte to string literals automatically, so you must add one yourself. Without it, printing only stops at the first zero byte that happens to follow the string; zero bytes are common in machine code, so that is usually soon after the end of the string.

76-Base

76-Base is the most essential instruction set. It is enabled by default: you don't need to enable it, and you cannot turn it off in any way.

Memory and registers

To simplify the description, we'll use some word macros.

Instruction           Description
LCMM size_in_bytes    set local memory size
DATx data $A          $A ← data
LDxA A $B             $B ← [A]
LDxR $A $B            $B ← [$A]
SLxA A $B             [A] ← $B
SLxR $A $B            [$A] ← $B
MOVx A B              [B] ← [A]
MVRx $A $B            $B ← $A
MVPx $A $B            [$B] ← [$A]

LCMM

Specify the local memory size. In theory there is no maximum limit; the default is 16KB. When the instruction runs, the new memory is cleared and initialized with zeros, and then the original data is copied into it. If the original data is longer than the new size, it is truncated.
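
The resize behaviour described above can be modelled in a few lines of Python. This is a sketch of the described semantics only, with a hypothetical function name, not the VM's implementation.

```python
def resize_local_memory(old, new_size):
    """Model LCMM: zero-filled memory of new_size, old contents copied in."""
    new = bytearray(new_size)        # cleared and initialized with zeros
    keep = min(len(old), new_size)   # longer original data gets truncated
    new[:keep] = old[:keep]
    return new

print(resize_local_memory(bytearray(b"abcd"), 2))  # bytearray(b'ab')
print(resize_local_memory(bytearray(b"ab"), 4))    # bytearray(b'ab\x00\x00')
```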

DATx

Note that due to the limitation of the instruction size, the DATL instruction doesn't exist.

LDxR

For example:

DATI 0x00FF0000 $0
LDLR $0 $12

After executing the piece of code above, 8 bytes of data from addresses 0x00FF0000..0x00FF0007 will be stored in $12..$19.
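
The same example can be traced with a small Python model. Several details here are assumptions rather than statements of this specification: registers are modelled as one byte each, an int occupies 4 consecutive registers, multi-byte values are little-endian, and memory is a plain dictionary from address to byte. The register count of 112 and the function name are arbitrary.

```python
# Operand widths in bytes; 8 for long follows the example above,
# 4 for int and 1 for byte are assumptions.
WIDTHS = {"L": 8, "I": 4, "B": 1}

def ldxr(registers, memory, suffix, a, b):
    """Model LDxR $a $b: copy bytes from the address held in $a into $b."""
    width = WIDTHS[suffix]
    address = int.from_bytes(registers[a:a + 4], "little")
    for i in range(width):
        registers[b + i] = memory.get(address + i, 0)

registers = bytearray(112)
registers[0:4] = (0x00FF0000).to_bytes(4, "little")     # DATI 0x00FF0000 $0
memory = {0x00FF0000 + i: 0x10 + i for i in range(8)}   # sample contents
ldxr(registers, memory, "L", 0, 12)                     # LDLR $0 $12
print(registers[12:20].hex())  # 1011121314151617
```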

Basic algebra

All mathematical operations treat the register contents as unsigned numbers.

Instruction    Description
ADDx $A $B     $A ← $A + $B
MINx $A $B     $A ← $A − $B
MTPx $A $B     $A ← $A × $B
DIVx $A $B     $A ← ⌊$A ÷ $B⌋
MODx $A $B     $A ← $A mod $B
CMPx $A $B     compare two long/int/byte arithmetically

DIVx/MODx

CMPx

Compare $A to $B. It updates $109 according to the result of the comparison.

$109 Meaning
0 $A < $B
1 $A = $B
2 $A > $B
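
In Python terms, the value written to $109 can be modelled as below. The function name is hypothetical, and the operands are taken as unsigned integers, as stated at the start of this section.

```python
def cmp_flag(a, b):
    """Return the value CMPx would store in $109 for unsigned a and b."""
    if a < b:
        return 0
    if a == b:
        return 1
    return 2

# 0xFF is 255 when read as an unsigned byte, so it compares greater than 1.
print(cmp_flag(0xFF, 0x01))  # 2
```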

Logical operations

Instruction Description
ANDx $A $B bitwise logical AND
OR_x $A $B bitwise logical inclusive OR
XORx $A $B bitwise logical exclusive OR
NOTx $A boolean NOT for long/int/byte

NOTx

Equivalent to $A = !$A; in C, that is, a logical rather than bitwise NOT.

Flow control

Instruction Description
NOOP waste some time
HALT halt the CPU and stop
JMPR $A jump to memory address stored in $A
JMPA address jump to memory address address
JILR/JIER/JIGR $A jump to memory address stored in $A if $109 indicates less/equal/greater (0/1/2, as set by CMPx)
JILA/JIEA/JIGA address jump to address if $109 indicates less/equal/greater (0/1/2, as set by CMPx)
CALR $A jump to memory address stored in $A and push the next instruction's address into stack
CALA address jump to address and push the next instruction's address into stack
RETN return from a call: pop the return address into $100 (same as POP_ on $100)
PUSH $A length push registers from $A...$(A + length) onto the stack
POP_ $A length pop data from stack to registers $A...$(A + length)

76-Float

76-Float provides instructions for processing floating point values.

You need to set bit 1 of $110 to 1 first in order to make it work.

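# Example code to enable 76-Float
# Save $99, which is used as a scratch register.
PUSH $99 1
# 0x2 is the mask for bit 1.
DATB 0x2 $99
# Set bit 1 of $110 by OR-ing the mask into it.
OR_B $110 $99
# Restore $99.
POP_ $99 1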

Basic floating point arithmetic

Instruction Description
   

76-Vector

76-Vector provides instructions for vector calculations, similar to those in the OpenGL Shading Language.

You need to set bit 2 of $110 to 1 first in order to make it work.

Basic vector arithmetic

Instruction Description
   

BIOS Instructions

BIOS is an acronym for Basic Input/Output System, of course.

Interrupts are handled by the firmware attached to the VM. A BIOS can contain at most 255 interrupt handlers, numbered from 1 to 255. Interrupt #0 is defined as the null call. An interrupt accepts an address as its parameter.

You need to set bit 7 of $110 to 1 first in order to make it work.

BIOS Access

Instruction Description
INTX function_id address Do an operation provided by the BIOS
INTR $A address Same as above, but the function ID is put in $A

INTX/INTR

VMDE provides the following BIOS functions.

Note that functions 1–8 are currently unimplemented.

Function ID BIOS Function Description
0 null(uintptr_t a) Return NOT (a XOR 0x76ABCDEF) — the null call
1 putc(uint8_t c) Print c as a byte character
2 puts(uint32_t addr) Print a null-terminated string starting from memory address addr
3 putlx(uint64_t x) Print x as a hexadecimal number
4 putix(uint32_t x) Print x as a hexadecimal number
5 putbx(uint8_t x) Print x as a hexadecimal number
6 putl(uint64_t x) Print x as a decimal number
7 puti(uint32_t x) Print x as a decimal number
8 putb(uint8_t x) Print x as a decimal number
11 GDrawable_render Documentation needed
12 GDrawable_renderOnce Idem
13 GDrawable_batch Idem
14 GDrawable_batchOnce Idem