Wednesday, 8 September 2010

The Beginnings of a 6502 Emulator in Haskell

It's been ages since I've written some assembly code, but I'm also enjoying learning Haskell. Therefore, my next random coding exercise is to code a simple CPU emulator.

The 6502 Processor is an 8-bit processor introduced in 1975 and it's still used in embedded systems. It was a hugely popular chip used by such classic machines as the Atari 2600, the original Nintendo Entertainment System and the BBC Micro.
The 6502 Micro Processor!

A CPU is defined by its instruction set architecture (ISA). According to Computer Architecture - A Quantitative Approach (seems to be the authoritative book on Computer Architecture) an ISA is defined by:

  1. Class  - Most processors today are general-purpose register architectures where operands are either registers or memory locations.  General purpose GPUs are a different class of ISA
  2. Memory Addressing - Almost every processor uses byte addressing to access memory.  Alignment can be an issue for performance and some processors such as the Itanium have more specific alignment requirements
  3. Addressing Modes - An ISA can specify various ways of addressing memory.  Examples include register, constant and displacement.  The 6502 processor supports a dozen or so different addressing modes.  A modern day Intel 64-bit processor supports even more including another level of indirection known as segment addressing. 
  4. Types and Sizes of operands - As an 8bit processor the 6502 just supports 8 bit operands.  A more sophisticated architecture like the 80x86 supports various sizes of integers and floating point
  5. Operations - There used to be a divide amongst RISC / CISC. Now it seems most modern processors are a combination of the two, though I guess GPUs are the emergence of simple instructions again?
  6. Control Flow Instructions - The various choices for branching.  The 6502 supports a simple selection of branching instructions that move the program counter based on some arithmetic test
  7. Encoding refers to how the assembler instructions are encoded into byte code.  There are two choices here, fixed length encoding which is easier to read but may take up more space and variable length encoding which is slower to deal with.
Thankfully the 6502 is one of the simplest processors available.  After reading this great description of the registers, I modelled the CPU with the following simple data structure.

The RAM (a maximum addressing space of 64Kb) is a mutable Data.Vector. The CPU consists of a number of registers.  The program counter indicates where the next instruction to execute is going to come from.  Jump instructions move the program counter to a new location. The xr and yr registers are commonly used to hold offsets or counters for memory addressing. The accumulator (ac) register is used by most arithmetic and logical operations. The status register contains various flags represented by Flag. These flags can be set as instructions are executed and are hopefully self-explanatory. Finally the stack pointer contains a pointer to the next free location on the stack. The stack is held within a 256 byte stack between 0x0100 and 0x01FF.

In order to access the memory we need to understand how the various addressing modes work, again the 6502 site provides a very clear description.

Accumulator indicates that the instruction works directly on the accumulator. Immediate is an 8 bit constant value within the constructor. In assembler this is usually indicated with #VAL. A ZeroPage address is a byte offset relative to 0, so this only allows indexing into the first 256 bytes of memory. ZeroPageX and ZeroPageY addressing modes use a zero page address together with the offset specified in the xr and yr registers. Relative addressing is only used by the branch instructions and gives a signed 8 bit number to indicate where the program counter should jump to. Absolute, AbsoluteX and AbsoluteY are like ZeroPage addressing but allow access to the full memory range because it supports a full 16 bit address range. Finally there are 3 indirect addressing modes. Indirect gives a 16 bit address which identifies the location of the LSB of another byte that contains the real target of the instruction. IndexedIndirect and IndirectIndexed is used similar to indirect but uses the xr and yr registers to index with offset.

Once we've got our address mode we need a handful of functions to read from the various memory addresses.

These functions do exactly as they say on the tin. With a few small exceptions... For example, readWord16 should never be called with an AddressMode of accumulator (you can read 16 bits from an 8 bit value). For now I've just left these functions with "error" definitions until I can think of a better way of expressing it.

After getting these bits and pieces in place it's time to look at the CPU instructions, apparently developed by Cyberdyne Systems and used in "The Terminator".

Instructions consist of a three letter op code, together with an optional argument.  For example, LDA loads the supplied value into the accumulator and sets the zero or negative fields appropriately. For some simple instructions no argument is needed (for example, CLC clears the carry flag but requires no argument).

Currently, I've represented instructions as just the three letter mnemonic and an optional address mode. In the future I'll try and make it more a specification by including the flags the operation is allowed to set, together with restricting to only the allowed addressing mode (for example, some instructions don't support all addressing modes, LDA wouldn't make much sense with a relative address).

Armed with this, all that needs to be done is implement the various operations. A simple function execute :: CPU -> Instruction -> IO () simply matches the instruction and executes it.

Finally after all that we can execute a really simple lump of code to see if it works.

In the example above we load a value into the accumulator. Shift it left (i.e. multiply by 2), store this value in some location in memory. Shift it left again (multiply by 4) and left once more (multiple by 8). We then add the original multipled by 8 with the value in the accumulator. For this really simple lump of code, it seems to work :)

My next plans are to:

  • Make things more type-safe (e.g. no wrong addressing modes, no modifying the wrong flags)
  • Write a simple parser using Parsec so that I can write little bits of assembler in a nicer syntax
  • Finish implementing the remainder of the operations (need to get the design right before I go much further)
  • Encode the instructions so that I actually use the program counter and potentially load in pre-existing code.
  • Adding some IO to actually make it useful...