Slide 1

Slide 1 text

A journey into hardware emulation: building a GameBoy emulator from scratch
In glorious Rust!

Slide 2

Slide 2 text

Agenda
Introduction to emulation
GameBoy hardware 101
Emulating the CPU
Emulating the MMU and MBC
Emulating the GPU
Emulating the timer and keypad
Putting the pieces together: Rust GameBoy emulator code walk
Closing thoughts
Surprise!
Q&A

Slide 3

Slide 3 text

Disclaimer
I'm no expert! Take everything with a grain of salt!

Slide 4

Slide 4 text

Introduction to emulation
What is an emulator?
  A virtual hardware replica
Basic structure of an emulator
  CPU, Memory, User input, Display and sound, etc.
Emulation is hard
  Complex hardware design, many edge cases, runtime overhead, accuracy issues, lack of documentation, etc.

Slide 5

Slide 5 text

Introduction to emulation
Emulation is important!
  Hardware preservation
Every emulated system is a different world
  Custom hardware makes this especially difficult
Emulation will never be 100% accurate
  It's virtually impossible!

Slide 6

Slide 6 text

GameBoy hardware 101
Z80-like 8-bit CPU (Sharp LR35902) with custom modifications
8 KB RAM (+ Memory Banking)
8 KB Video RAM
Screen size: 160 x 144 (23,040 pixels)
Sound: 4 channels
Serial port (link cable)
8-button keypad

Slide 7

Slide 7 text

Emulating the CPU
8-bit CPU: 1 byte at a time
Stack pointer (sp)
Program counter (pc)
Flags (f)
Registers (a, b, c, d, e, h, l)
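A minimal sketch of how this register file could be modelled in Rust. The struct layout and the `hl` / `set_hl` helpers are assumptions made for illustration, not necessarily how the emulator shown later does it. The helpers show a common trick: two 8-bit registers paired into one 16-bit value.

```rust
/// CPU register file (field names are illustrative assumptions).
#[derive(Debug, Default, Clone, Copy)]
pub struct Registers {
    pub a: u8,
    pub f: u8, // flags
    pub b: u8,
    pub c: u8,
    pub d: u8,
    pub e: u8,
    pub h: u8,
    pub l: u8,
    pub stack_pointer: u16,
    pub program_counter: u16,
}

impl Registers {
    /// Read H and L as one 16-bit value (H is the high byte).
    pub fn hl(&self) -> u16 {
        u16::from_be_bytes([self.h, self.l])
    }

    /// Split a 16-bit value back into H and L.
    pub fn set_hl(&mut self, value: u16) {
        let [high, low] = value.to_be_bytes();
        self.h = high;
        self.l = low;
    }
}
```

Calling `set_hl(0x9FFF)` stores `0x9F` in `h` and `0xFF` in `l`; the other pairs (bc, de, af) would follow the same pattern.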

Slide 8

Slide 8 text

Emulating the CPU
Emulation flow:
  Check for interrupts
  Fetch opcode from memory
  Look up opcode
  Call opcode function
  Store cycles taken
Interrupts:
  Check actions from outside the CPU
  Handle them to perform I/O, etc. (keypad, serial port, etc.)
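The flow above can be sketched as a single `step` function. This is a toy model under stated assumptions: flat 64 KiB memory instead of an MMU, only two opcodes (NOP and INC A), and an interrupt check that merely clears a flag. The `Cpu` type and all its fields are hypothetical.

```rust
/// Toy CPU: flat memory, one register, no MMU (assumptions for brevity).
pub struct Cpu {
    pub a: u8,
    pub program_counter: u16,
    pub memory: Vec<u8>,
    pub interrupt_pending: bool,
    pub total_cycles: u64,
}

impl Cpu {
    /// Load `program` at address 0; it must fit in 64 KiB.
    pub fn new(program: &[u8]) -> Self {
        let mut memory = vec![0u8; 0x10000];
        memory[..program.len()].copy_from_slice(program);
        Cpu {
            a: 0,
            program_counter: 0,
            memory,
            interrupt_pending: false,
            total_cycles: 0,
        }
    }

    /// One iteration: interrupts, fetch, look up, execute, count cycles.
    pub fn step(&mut self) -> u32 {
        // 1. Check for interrupts (this sketch only acknowledges them).
        if self.interrupt_pending {
            self.interrupt_pending = false;
        }
        // 2. Fetch the opcode and advance the program counter.
        let opcode = self.memory[self.program_counter as usize];
        self.program_counter = self.program_counter.wrapping_add(1);
        // 3-4. Look up the opcode and run it; each arm yields machine cycles.
        let cycles = match opcode {
            0x00 => 1, // NOP
            0x3C => {
                // INC A
                self.a = self.a.wrapping_add(1);
                1
            }
            other => panic!("unimplemented opcode {:#04x}", other),
        };
        // 5. Store the cycles taken so other hardware can be kept in sync.
        self.total_cycles += u64::from(cycles);
        cycles
    }
}
```

A real emulator would feed the returned cycle count to the GPU and timer after every step.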

Slide 9

Slide 9 text

Emulating the CPU
Opcodes: CALL a16 (0xCD)
    0xCD => {
        // make room on the stack for the 2-byte return address
        self.registers.stack_pointer = self.registers.stack_pointer.wrapping_sub(2);
        // push the return address: the instruction after the 2-byte operand
        self.mmu.write_word(
            self.registers.stack_pointer,
            oldregs.program_counter.wrapping_add(2)
        );
        // jump: load the program counter with the address of the called function
        self.registers.program_counter = self.read_word();
        6 // machine cycles taken to perform the operation
    },

Slide 10

Slide 10 text

Emulating the CPU
ALU operations
    fn alu_and(&mut self, b: u8) {
        let r = self.registers.a & b;
        self.registers.flag(Z, r == 0);
        self.registers.flag(H, true);
        self.registers.flag(C, false);
        self.registers.flag(N, false);
        self.registers.a = r;
    }
Perform basic operations
Store result in register
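To make the `flag(...)` calls above concrete, here is a hedged sketch of a flag helper plus an 8-bit ADD. The bit masks are the standard positions in the F register; the `Registers` type and the free function `alu_add` are simplified assumptions, not the emulator's actual code.

```rust
/// Flag bit positions in the F register.
pub const Z: u8 = 0x80; // zero
pub const N: u8 = 0x40; // subtract
pub const H: u8 = 0x20; // half carry
pub const C: u8 = 0x10; // carry

/// Only the registers needed for this sketch.
#[derive(Default)]
pub struct Registers {
    pub a: u8,
    pub f: u8,
}

impl Registers {
    /// Set or clear one flag bit.
    pub fn flag(&mut self, mask: u8, set: bool) {
        if set {
            self.f |= mask;
        } else {
            self.f &= !mask;
        }
    }
}

/// ADD A, b: add `b` to the accumulator and update all four flags.
pub fn alu_add(regs: &mut Registers, b: u8) {
    let a = regs.a;
    let (result, carry) = a.overflowing_add(b);
    regs.flag(Z, result == 0);
    regs.flag(N, false);
    // Half carry: overflow from bit 3 into bit 4.
    regs.flag(H, (a & 0x0F) + (b & 0x0F) > 0x0F);
    regs.flag(C, carry);
    regs.a = result;
}
```

Adding `0xC6` to `0x3A` wraps to zero, so the zero, half-carry and carry flags all end up set.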

Slide 11

Slide 11 text

Emulating the CPU
After a very long time...
Protip: copy and paste from your favorite implementation :)

Slide 12

Slide 12 text

Emulating the MMU
16-bit address space
Address ranges mapped to:
  Sound, Timer, GPU VRAM, GPU OAM, etc.
  High RAM (Zero-page)

Slide 13

Slide 13 text

Emulating the MMU
Memory Banking Controller
Interface: read / write
MMU: handles interrupts
  GPU Vertical and Horizontal Blank
  Keypad, Serial port, etc.

Slide 14

Slide 14 text

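Emulating the MMU
Address spaces and read/write functions
    pub fn read_byte(&mut self, address: u16) -> u8 {
        match address {
            // cartridge ROM, through the memory banking controller
            0x0000..=0x7FFF => self.mbc.read_rom(address),
            // video RAM
            0x8000..=0x9FFF => self.gpu.read_byte(address),
            // cartridge (external) RAM
            0xA000..=0xBFFF => self.mbc.read_ram(address),
            // remaining regions are not shown on the slide; 0xFF is a placeholder
            _ => 0xFF,
        }
    }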

Slide 15

Slide 15 text

Emulating the MMU
Memory Banking Controller:
  Original problem: cannot fit entire game in available memory
  Chips inside cartridges
  Allow persistence
  Interfaced with MMU
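A simplified sketch of the banking idea, loosely modelled on MBC1. It handles ROM bank switching only; RAM banking, the mode register and persistence are left out. The `Mbc1` type is an assumption for illustration, and `read_rom` expects addresses in 0x0000-0x7FFF.

```rust
const BANK_SIZE: usize = 0x4000; // 16 KiB per ROM bank

/// Simplified MBC1: ROM banking only (assumption for brevity).
pub struct Mbc1 {
    rom: Vec<u8>,
    rom_bank: usize,
}

impl Mbc1 {
    pub fn new(rom: Vec<u8>) -> Self {
        Mbc1 { rom, rom_bank: 1 }
    }

    /// 0x0000-0x3FFF always reads bank 0; 0x4000-0x7FFF reads the selected bank.
    pub fn read_rom(&self, address: u16) -> u8 {
        let address = address as usize;
        let offset = if address < BANK_SIZE {
            address
        } else {
            self.rom_bank * BANK_SIZE + (address - BANK_SIZE)
        };
        // Reads past the end of the ROM image return 0xFF in this sketch.
        self.rom.get(offset).copied().unwrap_or(0xFF)
    }

    /// Writes to 0x2000-0x3FFF select the switchable ROM bank.
    pub fn write_rom(&mut self, address: u16, value: u8) {
        if (0x2000..=0x3FFF).contains(&address) {
            let bank = (value & 0x1F) as usize;
            // Bank 0 cannot be selected here; the hardware maps it to bank 1.
            self.rom_bank = if bank == 0 { 1 } else { bank };
        }
    }
}
```

The game "writes" to ROM addresses to switch banks; the cartridge chip intercepts those writes, which is why the MMU routes them to the MBC.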

Slide 16

Slide 16 text

Emulating the GPU
Graphics Processing Unit
160x144 pixels
4 colors (shades of grey)
8 KB video memory (VRAM): stores raw graphics data
160 bytes of OAM (Object Attribute Memory): stores sprite attributes

Slide 17

Slide 17 text

Emulating the GPU
Graphics data
Tilesets:
  Maps
  Saves memory but complicates design
Background and window
Sprites
  Attributes (OAM)
Palettes
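As an illustration of how tile data turns into colors, here is a small sketch. Each row of an 8x8 tile is stored as two bytes: the first holds the low bit of every pixel, the second the high bit. A palette register then maps each 2-bit color id to one of the four shades. The function names are hypothetical.

```rust
/// Decode one 8-pixel tile row into color ids 0-3.
pub fn decode_tile_row(low: u8, high: u8) -> [u8; 8] {
    let mut pixels = [0u8; 8];
    for (x, pixel) in pixels.iter_mut().enumerate() {
        let bit = 7 - x; // bit 7 is the leftmost pixel
        let lo = (low >> bit) & 1;
        let hi = (high >> bit) & 1;
        *pixel = (hi << 1) | lo;
    }
    pixels
}

/// Map a color id to a shade using a palette byte (2 bits per id).
pub fn apply_palette(palette: u8, color_id: u8) -> u8 {
    (palette >> (color_id * 2)) & 0x03
}
```

With the palette byte `0xE4` every color id maps to itself; `0x1B` inverts the shades.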

Slide 18

Slide 18 text

Emulating the GPU
GPU flow
Scanline: draw background and sprites
  Read VRAM / OAM data
Horizontal blank
  Move from end of line to start of next line
Vertical blank
  Move from bottom right to top left
Timed like the original hardware
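The GPU flow can be sketched as a small state machine driven by the CPU's clock cycles. The per-mode durations (80, 172 and 204 cycles, 456 per line) are the fixed approximations commonly used by emulators; real hardware varies the transfer time. The `Gpu` and `Mode` types are assumptions for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    OamScan,  // reading OAM
    Transfer, // reading VRAM and drawing the scanline
    HBlank,
    VBlank,
}

pub struct Gpu {
    pub mode: Mode,
    pub line: u8,
    pub clock: u32,
}

impl Gpu {
    pub fn new() -> Self {
        Gpu { mode: Mode::OamScan, line: 0, clock: 0 }
    }

    /// Advance by `cycles` clock cycles; returns true when VBlank starts.
    pub fn step(&mut self, cycles: u32) -> bool {
        self.clock += cycles;
        let mut frame_done = false;
        loop {
            let needed = match self.mode {
                Mode::OamScan => 80,
                Mode::Transfer => 172,
                Mode::HBlank => 204,
                Mode::VBlank => 456, // one full line per VBlank step
            };
            if self.clock < needed {
                break;
            }
            self.clock -= needed;
            match self.mode {
                Mode::OamScan => self.mode = Mode::Transfer,
                // A real GPU would render the current scanline here.
                Mode::Transfer => self.mode = Mode::HBlank,
                Mode::HBlank => {
                    self.line += 1;
                    if self.line == 144 {
                        frame_done = true;
                        self.mode = Mode::VBlank;
                    } else {
                        self.mode = Mode::OamScan;
                    }
                }
                Mode::VBlank => {
                    self.line += 1;
                    if self.line == 154 {
                        self.line = 0;
                        self.mode = Mode::OamScan;
                    }
                }
            }
        }
        frame_done
    }
}
```

A full frame is 154 lines of 456 cycles each (144 visible plus 10 of vertical blank), which is what keeps the emulated screen in step with the emulated CPU.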

Slide 19

Slide 19 text

Emulating the GPU
Renderscan algorithm (bg and window):
1. Calculate BG & window Y position based on current line
2. For each X until the total width of the screen:
   1. Calculate base addresses to read from VRAM
   2. Fetch graphic data from VRAM
   3. Calculate color from raw graphic data
   4. Set background priority based on color
   5. Set pixel data for this line
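The address calculation step for the background can be sketched as follows, assuming the usual 32x32 tile map and 8-bit scroll registers that wrap around. The function names and parameters are hypothetical; the window layer and the tile data lookup are left out.

```rust
/// Index into the 32x32 background tile map for screen pixel (x, line).
pub fn tile_map_index(x: u8, line: u8, scroll_x: u8, scroll_y: u8) -> usize {
    // The background is 256x256 pixels and wraps, so u8 wrapping is enough.
    let bg_x = x.wrapping_add(scroll_x);
    let bg_y = line.wrapping_add(scroll_y);
    (bg_y as usize / 8) * 32 + (bg_x as usize / 8)
}

/// Column and row of that pixel inside its 8x8 tile.
pub fn pixel_in_tile(x: u8, line: u8, scroll_x: u8, scroll_y: u8) -> (u8, u8) {
    (
        x.wrapping_add(scroll_x) % 8,
        line.wrapping_add(scroll_y) % 8,
    )
}
```

The map index selects a tile number; the row inside the tile then selects which two bytes of tile data to fetch from VRAM.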

Slide 20

Slide 20 text

Emulating the GPU
Renderscan algorithm (sprites):
1. Iterate over the objects stored in OAM (40 maximum)
2. Check that the sprite is on the current line
3. Calculate its position on the screen
4. Check object attributes: Y flip (need to flip pixels)
5. Draw sprite pixels (8 pixels)
   1. Check that the sprite pixel is still on screen
   2. Check pixel priority (above or below)
   3. Calculate color and save pixel to the pixels Vector
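Steps 2 to 4 could look like the sketch below, assuming 8x8 sprites. Each OAM entry is four bytes: Y position plus 16, X position plus 8, tile number and attribute flags. The `Sprite` type and its method names are assumptions for illustration.

```rust
/// One decoded OAM entry (8x8 sprites assumed).
pub struct Sprite {
    pub y: i16,
    pub x: i16,
    pub tile: u8,
    pub below_bg: bool,
    pub y_flip: bool,
    pub x_flip: bool,
}

impl Sprite {
    /// Decode the four OAM bytes of one object.
    pub fn from_oam(bytes: [u8; 4]) -> Self {
        Sprite {
            // Positions are stored with an offset so sprites can be partly off screen.
            y: i16::from(bytes[0]) - 16,
            x: i16::from(bytes[1]) - 8,
            tile: bytes[2],
            below_bg: bytes[3] & 0x80 != 0,
            y_flip: bytes[3] & 0x40 != 0,
            x_flip: bytes[3] & 0x20 != 0,
        }
    }

    /// Tile row to draw on `line`, or None if the sprite does not cover it.
    pub fn row_on_line(&self, line: u8) -> Option<u8> {
        let row = i16::from(line) - self.y;
        if !(0..8).contains(&row) {
            return None;
        }
        let row = row as u8;
        // A vertically flipped sprite is read bottom row first.
        Some(if self.y_flip { 7 - row } else { row })
    }
}
```

The renderer would call `row_on_line` for each of the 40 entries and skip those that return `None`.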

Slide 21

Slide 21 text

Other hardware parts
Timer
  Periodic actions
  Period-based algorithms
How it works
  Divider
  Counter: 4 frequencies (programmable)
  Modulo
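A hedged sketch of the three timer parts working together. The divider ticks at a fixed rate, the counter ticks at one of four programmable rates, and when the counter overflows it is reloaded from the modulo and an interrupt is requested. The `Timer` type and its field names are assumptions; periods are in clock cycles.

```rust
pub struct Timer {
    pub divider: u8, // free-running, one tick every 256 cycles
    pub counter: u8, // programmable counter
    pub modulo: u8,  // reload value after the counter overflows
    pub enabled: bool,
    pub period: u32, // cycles per counter tick
    div_clock: u32,
    counter_clock: u32,
}

impl Timer {
    pub fn new() -> Self {
        Timer {
            divider: 0,
            counter: 0,
            modulo: 0,
            enabled: false,
            period: 1024,
            div_clock: 0,
            counter_clock: 0,
        }
    }

    /// Bit 2 enables the counter; bits 0-1 pick one of the four frequencies.
    pub fn set_control(&mut self, value: u8) {
        self.enabled = value & 0x04 != 0;
        self.period = match value & 0x03 {
            0 => 1024, // 4096 Hz
            1 => 16,   // 262144 Hz
            2 => 64,   // 65536 Hz
            _ => 256,  // 16384 Hz
        };
    }

    /// Advance by `cycles`; returns true if a timer interrupt should be requested.
    pub fn step(&mut self, cycles: u32) -> bool {
        self.div_clock += cycles;
        while self.div_clock >= 256 {
            self.div_clock -= 256;
            self.divider = self.divider.wrapping_add(1);
        }
        let mut interrupt = false;
        if self.enabled {
            self.counter_clock += cycles;
            while self.counter_clock >= self.period {
                self.counter_clock -= self.period;
                let (next, overflow) = self.counter.overflowing_add(1);
                if overflow {
                    self.counter = self.modulo;
                    interrupt = true;
                } else {
                    self.counter = next;
                }
            }
        }
        interrupt
    }
}
```

Games use the modulo to choose how often the interrupt fires: a modulo of `0xF0` means one interrupt every 16 counter ticks.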

Slide 22

Slide 22 text

Other hardware parts
Keypad
  8 buttons: Start, Select, A, B, Up, Down, Right, Left
  Read and written by the MMU
When a key is pressed, an interrupt is triggered
  Push stack to save current position
  Point to interrupt handling position
  Opcode is fetched and interrupt handled
  Return to previous stack to continue execution
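The interrupt sequence above can be sketched like this. The handler addresses (0x40 for vertical blank up to 0x60 for the keypad, 8 bytes apart) are the standard ones; the `Cpu` type, its flat 64 KiB `memory` and the field names are simplifying assumptions.

```rust
pub const JOYPAD: u8 = 0x10; // bit 4 of the interrupt flags

/// Minimal CPU state; `memory` must hold the full 64 KiB address space.
pub struct Cpu {
    pub program_counter: u16,
    pub stack_pointer: u16,
    pub interrupts_enabled: bool, // master enable
    pub interrupt_enable: u8,     // which interrupts the game accepts
    pub interrupt_flags: u8,      // which interrupts are pending
    pub memory: Vec<u8>,
}

impl Cpu {
    /// Push a 16-bit value on the stack (little-endian).
    fn push_word(&mut self, value: u16) {
        self.stack_pointer = self.stack_pointer.wrapping_sub(2);
        let [low, high] = value.to_le_bytes();
        let sp = self.stack_pointer as usize;
        self.memory[sp] = low;
        self.memory[(sp + 1) & 0xFFFF] = high;
    }

    /// Service the highest-priority pending interrupt, if any.
    pub fn handle_interrupts(&mut self) -> bool {
        let pending = self.interrupt_enable & self.interrupt_flags & 0x1F;
        if !self.interrupts_enabled || pending == 0 {
            return false;
        }
        // Lowest set bit wins: 0 = VBlank, 1 = LCD, 2 = timer, 3 = serial, 4 = keypad.
        let bit = pending.trailing_zeros() as u16;
        self.interrupts_enabled = false;
        self.interrupt_flags &= !(1u8 << bit);
        // Save the current position, then point to the handler.
        let return_address = self.program_counter;
        self.push_word(return_address);
        self.program_counter = 0x0040 + 8 * bit;
        true
    }
}
```

The handler ends with a return instruction that pops the saved address, which is the "return to previous stack" step.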

Slide 23

Slide 23 text

Putting the pieces together
GameBoy emulator written in Rust

Slide 24

Slide 24 text

Putting the pieces together
Available features:
  Full CPU (Z80)
  MMU and MBC1
  GPU
  Timer / Keypad
  Tons of docs!
TODO:
  Sound
  MBC2-5 and save game states
  Tests :(

Slide 25

Slide 25 text

Putting the pieces together
Code Walk time!

Slide 26

Slide 26 text

Closing thoughts
Accuracy of emulated systems
100% accuracy is hard!
  Normally some degree of accuracy is good enough
  In more complex systems things get complicated! GameCube, Nintendo 64, etc.
  Many, many game-specific hacks
  Gigantic hardware designs and no documentation make for difficult emulation!
100% accurate emulation is slow!
  Emulating 100% takes a lot of power
  A 100% accurate SNES emulator requires a >3 GHz processor!
  100% accuracy is accurate: no hacks required for specific games

Slide 27

Slide 27 text

Closing thoughts
Hardware and software analogies
Hardware is full of "ñapas" (kludges): what a surprise!
Designing hardware and designing software are actually quite similar activities
Hardware decisions are tougher:
  Once shipped, it's already there!
  Need to think thoroughly about the architecture
  Changes to hardware require... people buying your new version!
Hardware design is intrinsically more difficult: working with bits, basic operations, simple data structures to handle complex operations, etc.

Slide 28

Slide 28 text

Closing thoughts
New technology learning approaches
Look for 10+ projects that implement what you need
The language doesn't matter, focus on the architecture
Understand every line!
“I like to totally understand every line of code I wrote - my method of understanding code is always the same and independent of language. Find an example program that works then reduce lines until I can reduce no more making a minimal example that works - then understand every line.” - Joe Armstrong

Slide 29

Slide 29 text

Closing thoughts
Simplicity betrayed: thoughts about complexity
Read this article: Simplicity Betrayed
Even simple hardware (like the GameBoy) reveals a vast and complex array of undocumented internal behavior.
“What is troublesome is the increased effort required by the host CPU to pull this off. The work involved is many times greater than before. [...] It is hard to believe that drawing a 30-year-old computer's display takes up so much of a modern system. This is one reason why accurate emulation takes so long to perfect. We can decide to make a better display, but today's platforms may not have the horsepower to accomplish it.”

Slide 30

Slide 30 text

Extra! bare metal hacking
Extracting the GameBoy loader boot ROM
Problem:
  The boot ROM (Nintendo logo scrolling) is hardwired into the hardware
  We need to extract it in order to emulate it

Slide 31

Slide 31 text

Extra! bare metal hacking
Extracting the GameBoy loader boot ROM
Solution: extract the boot loader bit by bit!
Steps:
  Open the GameBoy hardware and desolder the Z80 chip
  Put the chip under an electron microscope
  Observe the wires and extract the data bit by bit

Slide 32

Slide 32 text

Extra! bare metal hacking
256-byte (2048-bit) boot ROM

Slide 33

Slide 33 text

Extra! bare metal hacking
Result:
        LD SP,$fffe         ; $0000  Setup Stack
        XOR A               ; $0003  Zero the memory from $8000-$9FFF (
        LD HL,$9fff         ; $0004
    Addr_0007:
        LD (HL-),A          ; $0007
        BIT 7,H             ; $0008
        JR NZ, Addr_0007    ; $000a
        LD HL,$ff26         ; $000c  Setup Audio
        LD C,$11            ; $000f
        LD A,$80            ; $0011
        LD (HL-),A          ; $0013
        LD ($FF00+C),A      ; $0014
        INC C               ; $0015
        LD A,$f3            ; $0016
        LD ($FF00+C),A      ; $0018
        LD (HL-),A          ; $0019
        LD A,$77            ; $001a
        LD (HL),A           ; $001c
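To show what the first loop of that listing does, here is a line-by-line Rust mirror of it. This is only an illustration of the assembly's logic, not emulator code: `clear_vram` is a hypothetical helper and `memory` is assumed to cover at least addresses 0x0000-0x9FFF.

```rust
/// Mirror of the boot ROM loop at $0004-$000a; returns the final value of HL.
pub fn clear_vram(memory: &mut [u8]) -> u16 {
    let a: u8 = 0; // XOR A leaves the accumulator at zero
    let mut hl: u16 = 0x9FFF; // LD HL,$9fff
    loop {
        memory[hl as usize] = a; // LD (HL-),A: store, then decrement HL
        hl = hl.wrapping_sub(1);
        // BIT 7,H / JR NZ: keep looping while the top bit of H is set,
        // that is, while HL is still at or above 0x8000.
        if hl & 0x8000 == 0 {
            break;
        }
    }
    hl
}
```

The loop stops with HL at `0x7FFF`, so exactly the range 0x8000-0x9FFF is cleared using a single bit test instead of a 16-bit comparison.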

Slide 34

Slide 34 text

Extra! Creating games for the GameBoy
Downloading and installing the GBDK
Ready to use project: https://github.com/albertofem/gameboy-gbdk-examples
Compile and install the GBDK compiler and linker
Try the Hello World!

Slide 35

Slide 35 text

Extra! Creating games for the GameBoy
GBDK Hello World example
Code Walk!

Slide 36

Slide 36 text

Links and documentation
Available at the SafeBoy and GameBoy examples repositories
SafeBoy: https://github.com/albertofem/safeboy
GBDK Example: https://github.com/albertofem/gameboy-gbdk-examples

Slide 37

Slide 37 text

Q&A