A couple of people have recently asked for an introduction to how emulation actually works at a lower level. I will attempt to begin to answer that here 🙂 Some programming experience is extremely helpful to understand this.
Introduction
At their lowest level, digital computers run programs that are ultimately composed of a stream of numbers (this is called machine language). For a given CPU chip, certain numbers always mean certain things. For instance, on a 6502, the number “105” translates to “ADC” or Add with Carry. It’s important to realize that every time the 6502 encounters instruction 105 it will always perform the same addition operation. You could, in fact, write a valid program composed entirely of instruction 105, although it wouldn’t be very interesting.
Incidentally, the 3-letter code for the instruction (“ADC” in this case) is a mnemonic, and programs written with these mnemonics are in what is called assembly language, sometimes abbreviated ASM. Most programs for 8-bit processors were written in this manner – it’s somewhat cryptic, but it also gives you full control with minimum size. The number itself is known as an opcode (operation code).
Inside the CPU there’s hardware to fetch the next instruction number from memory, figure out what it is, and execute it. Now, there’s nothing digital hardware can do that digital software can’t also do. So emulation starts off with a block of code (a CPU core) which does the same thing for a given type of processor, like a 6502 or Z80. In C-like pseudo-code, it would look something like this:
CPU_Start:
opcode = fetch_next_opcode();
if (opcode == 105) do_addition_with_carry();
else print("ERROR: Unknown opcode!");
goto CPU_Start;
That is remarkably close to how most real CPU cores work, incidentally. The major difference is usually that a C switch statement is used to efficiently go to the right place for each opcode.
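As a rough sketch of that switch-based dispatch, here is one possible shape in C. The names `Cpu`, `fetch` and `cpu_step` are invented for illustration and are not taken from any particular emulator. Only opcode 105 (0x69, the immediate form of ADC, which is followed by one operand byte) is handled. Decimal mode and every flag other than carry are ignored, and memory is assumed to be a flat 64K array.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, heavily simplified 6502 state: only what ADC needs here. */
typedef struct {
    uint16_t pc;         /* program counter: address of the next byte to fetch */
    uint8_t  a;          /* accumulator */
    uint8_t  carry;      /* carry flag, 0 or 1 */
    const uint8_t *mem;  /* assumed to point at a flat 65,536-byte array */
} Cpu;

/* Fetch the byte at PC and advance PC (wraps around at 64K). */
static uint8_t fetch(Cpu *cpu)
{
    return cpu->mem[cpu->pc++];
}

/* Execute one instruction. Returns 0 on success, -1 on an unknown opcode. */
int cpu_step(Cpu *cpu)
{
    assert(cpu != NULL && cpu->mem != NULL);

    uint8_t opcode = fetch(cpu);
    switch (opcode) {
    case 0x69: {  /* 105 decimal: ADC with an immediate operand */
        unsigned sum = cpu->a + fetch(cpu) + cpu->carry;
        cpu->carry = sum > 0xFF;   /* carry out of the top bit */
        cpu->a = (uint8_t)sum;     /* keep the low 8 bits */
        return 0;
    }
    default:
        fprintf(stderr, "ERROR: Unknown opcode %u!\n", opcode);
        return -1;
    }
}
```

A driver loop would simply call `cpu_step` over and over, which plays the role of the `goto CPU_Start` in the pseudo-code. A fuller core would add one `case` per opcode.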
Yeah, but where’s the wakka-wakka come from?
Each CPU has a limited amount of memory it can talk to. For 8-bit CPUs this was usually 64K (65,536 units). This means that each unit of memory has an address (similar to the one on your home) from 0 to 65,535. Some of this memory must be dedicated to the program itself (usually stored in ROMs). Some must be dedicated to a work area for the program (so-called work RAM). But addresses don’t have to be just memory – they can also be devices. For instance, in Pac-Man, one set of memory addresses controls the horizontal and vertical positions of Pac-Man. By writing different numbers there, the video hardware will dutifully draw Pac-Man at various places on the screen without further CPU intervention. (This is quite a time-saver for the CPU!) Such special addresses are called registers or sometimes switches.
For emulation, this is relatively easy. CPU cores don’t want to tie themselves to one machine, so they call out to other code in the emulator (sometimes called the memory manager or memory mapper). To read a memory location, they call out with an address and expect to get an answer back from elsewhere in the emulator. To write a memory location, they call out with the address and the new number to place at that address. Code elsewhere in the emulator figures out for the current game if that address is work RAM, program storage, or something else entirely. The complete set of valid memory addresses for a machine is called a memory map.
A simplified real-world example is Pac-Man once again. Addresses 0 to 16,383 are program storage (ROM chips in this case). Addresses 20,464 through 20,479 control the screen positions of Pac-Man and the ghosts. Address 20,480 contains the current status of the joystick. And so on. If you respond properly to all of the necessary addresses you get a running game. That is how basic emulation works.
