
A journey into hardware emulation: building a GameBoy emulator from scratch


Introductory talk about emulators and the implementation of a GameBoy emulator in Rust.


Alberto Fernández

July 21, 2016


  1. A journey into hardware emulation: building a GameBoy emulator from scratch

    In glorious Rust!
  2. Agenda

    Introduction to emulation; GameBoy hardware 101; Emulating the CPU; Emulating the MMU and MBC; Emulating the GPU; Emulating the timer and keypad; Putting the pieces together: Rust GameBoy emulator code walk; Closing thoughts; Surprise!; Q&A
  3. Disclaimer

    I'm no expert! Take everything with a grain of salt!
  4. Introduction to emulation

    What is an emulator? A virtual hardware replica. Basic structure of an emulator: CPU, memory, user input, display and sound, etc. Emulation is hard: complex hardware design, many edge cases, runtime overhead, accuracy issues, lack of documentation, etc.
  5. Introduction to emulation

    Emulation is important! Hardware preservation. Every emulated system is a different world: custom hardware makes this especially difficult. Emulation will never be 100% accurate: it's virtually impossible!
  6. GameBoy hardware 101

    CPU: Sharp LR35902 (a Zilog Z80 relative with custom modifications). 8 KB RAM (+ memory banking). 8 KB video RAM. Screen size: 160 x 144 (23,040 pixels). Sound: 4 channels. Serial port (link cable). 8-button keypad.
  7. Emulating the CPU

    8-bit CPU: 1 byte at a time. Stack pointer (sp). Program counter (pc). Flags (f). Registers (a, b, c, d, e, h, l).
  8. Emulating the CPU

    Emulation flow: check for interrupts, fetch opcode from memory, look up opcode, call opcode function, store cycles taken. Interrupts: check for actions from outside the CPU and handle them to perform I/O (keypad, serial port, etc.).
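The emulation flow above can be sketched as a single `step` function. This is a minimal illustration rather than the talk's actual code: the `Cpu` struct, its flat 64 KB `memory`, the stubbed `handle_interrupts` and the choice of two opcodes (NOP `0x00` and INC A `0x3C`, with flag updates omitted) are all assumptions made to keep the example short.

```rust
/// Hypothetical, heavily simplified CPU: one register, a program counter
/// and a flat 64 KB memory (no MMU, no flags).
pub struct Cpu {
    pub a: u8,
    pub pc: u16,
    pub memory: Vec<u8>,
    pub total_cycles: u64,
}

impl Cpu {
    /// Creates a CPU with `program` copied to address 0x0000.
    pub fn new(program: &[u8]) -> Cpu {
        let mut memory = vec![0u8; 0x10000];
        memory[..program.len()].copy_from_slice(program);
        Cpu { a: 0, pc: 0, memory, total_cycles: 0 }
    }

    /// Stub: a real emulator would inspect the interrupt registers here
    /// and return the cycles spent dispatching an interrupt.
    fn handle_interrupts(&mut self) -> u32 {
        0
    }

    /// Reads the byte at the program counter and advances it.
    fn fetch_byte(&mut self) -> u8 {
        let byte = self.memory[self.pc as usize];
        self.pc = self.pc.wrapping_add(1);
        byte
    }

    /// Runs one instruction and returns the machine cycles it took.
    pub fn step(&mut self) -> u32 {
        // 1. Check for interrupts.
        let interrupt_cycles = self.handle_interrupts();
        // 2. Fetch the opcode from memory.
        let opcode = self.fetch_byte();
        // 3 and 4. Look up the opcode and run its implementation.
        let cycles = match opcode {
            0x00 => 1, // NOP
            0x3C => {
                // INC A (flag updates omitted in this sketch)
                self.a = self.a.wrapping_add(1);
                1
            }
            other => panic!("unimplemented opcode {:#04x}", other),
        };
        // 5. Store the cycles taken so other components can be timed.
        self.total_cycles += u64::from(interrupt_cycles + cycles);
        interrupt_cycles + cycles
    }
}
```

A caller would typically feed the returned cycle count to the GPU and timer so they advance in lockstep with the CPU.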
  9. Emulating the CPU

    Opcodes: CALL a16 (0xCD)

    0xCD => {
        // make room on the stack for the 16-bit return address
        self.registers.stack_pointer -= 2;
        // push the return address: the instruction that follows the 2-byte operand
        self.mmu.write_word(
            self.registers.stack_pointer,
            oldregs.program_counter + 2
        );
        // jump: point the program counter to the called function
        self.registers.program_counter = self.read_word();
        6 // machine cycles taken to perform the operation
    },
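  10. Emulating the CPU

    ALU operations: perform basic operations, store the result in a register.

    fn alu_and(&mut self, b: u8) {
        let r = self.registers.a & b;
        self.registers.flag(Z, r == 0); // zero flag: set when the result is 0
        self.registers.flag(H, true);   // half-carry flag: always set by AND
        self.registers.flag(C, false);  // carry flag: always cleared by AND
        self.registers.flag(N, false);  // subtract flag: always cleared by AND
        self.registers.a = r;
    }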
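The `flag` helper called by `alu_and` is not shown on the slide. A minimal sketch of how it might work follows, assuming a `Registers` struct that holds only `a` and `f` and a free-standing `alu_and` (both hypothetical simplifications). The bit positions are the standard GameBoy layout: Z, N, H and C live in bits 7 to 4 of the F register, and the low nibble always reads as zero.

```rust
/// CPU flags, each encoded as its bit mask inside the F register.
#[derive(Clone, Copy)]
pub enum Flag {
    Z = 0b1000_0000, // zero
    N = 0b0100_0000, // subtract
    H = 0b0010_0000, // half carry
    C = 0b0001_0000, // carry
}

/// Hypothetical register file reduced to the accumulator and flags.
#[derive(Default)]
pub struct Registers {
    pub a: u8,
    pub f: u8,
}

impl Registers {
    /// Sets or clears a single flag bit.
    pub fn flag(&mut self, flag: Flag, set: bool) {
        let mask = flag as u8;
        if set {
            self.f |= mask;
        } else {
            self.f &= !mask;
        }
        // The low nibble of F is always zero.
        self.f &= 0xF0;
    }

    /// Reads a single flag bit.
    pub fn get_flag(&self, flag: Flag) -> bool {
        self.f & (flag as u8) != 0
    }
}

/// Same logic as the slide's `alu_and`, written as a free function.
pub fn alu_and(regs: &mut Registers, b: u8) {
    let r = regs.a & b;
    regs.flag(Flag::Z, r == 0);
    regs.flag(Flag::H, true);
    regs.flag(Flag::C, false);
    regs.flag(Flag::N, false);
    regs.a = r;
}
```

Storing the flags packed in one byte, rather than as four booleans, keeps `PUSH AF` and `POP AF` trivial because F can be read and written as a plain register.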
  11. Emulating the CPU

    After a very long time... Protip: copy and paste from your favorite implementation :)
  12. Emulating the MMU

    16-bit address space. Address ranges mapped to: sound, timer, GPU VRAM, GPU OAM, etc. High RAM (zero page).
  13. Emulating the MMU

    Memory Banking Controller. Interface: read / write. The MMU also handles interrupts: GPU vertical and horizontal blank, keypad, serial port, etc.
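  14. Emulating the MMU

    Address spaces and read/write functions

    pub fn read_byte(&mut self, address: u16) -> u8 {
        match address {
            // cartridge ROM, served by the memory banking controller
            0x0000..=0x7FFF => { self.mbc.read_rom(address) },
            // video RAM, owned by the GPU
            0x8000..=0x9FFF => { self.gpu.read_byte(address) },
            // cartridge (external) RAM
            0xA000..=0xBFFF => { self.mbc.read_ram(address) },
            // remaining regions are not shown on the slide
            _ => unimplemented!(),
        }
    }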
  15. Emulating the MMU

    Memory Banking Controller. Original problem: the entire game cannot fit in the available address space. Chips inside cartridges. Allow persistence. Interfaced with the MMU.
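To make the banking idea concrete, here is a minimal sketch of ROM bank switching in the style of MBC1. It is not the talk's implementation: the `Mbc1` struct and its method names are assumptions, and RAM banking, the upper bank bits and the banking mode register are left out. The behaviour shown (bank 0 fixed at `0x0000-0x3FFF`, a switchable bank at `0x4000-0x7FFF`, bank selected by writing to `0x2000-0x3FFF`, value 0 treated as 1) is standard MBC1 behaviour.

```rust
/// Hypothetical, ROM-only subset of an MBC1 controller.
pub struct Mbc1 {
    rom: Vec<u8>,
    rom_bank: usize,
}

impl Mbc1 {
    /// Wraps a ROM image; bank 1 is selected at power-up.
    pub fn new(rom: Vec<u8>) -> Mbc1 {
        Mbc1 { rom, rom_bank: 1 }
    }

    /// Reads from the ROM area (0x0000-0x7FFF).
    pub fn read_rom(&self, address: u16) -> u8 {
        let index = match address {
            // Fixed bank 0.
            0x0000..=0x3FFF => address as usize,
            // Switchable bank: each bank is 16 KB (0x4000 bytes).
            0x4000..=0x7FFF => self.rom_bank * 0x4000 + (address as usize - 0x4000),
            _ => return 0xFF,
        };
        // Out-of-range reads return 0xFF in this sketch.
        self.rom.get(index).copied().unwrap_or(0xFF)
    }

    /// "Writes" to ROM are how the game talks to the controller.
    pub fn write_rom(&mut self, address: u16, value: u8) {
        if (0x2000..=0x3FFF).contains(&address) {
            // Lower 5 bits select the bank; bank 0 cannot be mapped here.
            let bank = (value & 0x1F) as usize;
            self.rom_bank = if bank == 0 { 1 } else { bank };
        }
    }
}
```

The game never sees more than 32 KB of ROM at once; it swaps the upper 16 KB window whenever it needs data from another bank.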
  16. Emulating the GPU

    Graphics Processing Unit. 160x144 pixels. 4 colors (shades of grey). 8 KB video memory (VRAM): stores raw graphics data. 160 bytes of OAM (Object Attribute Memory): stores sprite attributes.
  17. Emulating the GPU

    Graphics data. Tilesets and maps: save memory but complicate the design. Background and window. Sprites: attributes (OAM). Palettes.
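As an illustration of how tile data turns into pixels, the sketch below decodes one row of a tile and applies a palette. The function names are assumptions, not the talk's code. The encoding itself is the standard GameBoy one: each 8-pixel row takes two bytes, the first holding the low bit of every pixel and the second the high bit, with bit 7 being the leftmost pixel. The palette register packs four 2-bit shades, one per color index.

```rust
/// Decodes one 8-pixel tile row into color indices (0-3).
/// `low` holds the low bit of each pixel, `high` the high bit.
pub fn decode_tile_row(low: u8, high: u8) -> [u8; 8] {
    let mut pixels = [0u8; 8];
    for (x, pixel) in pixels.iter_mut().enumerate() {
        // Bit 7 is the leftmost pixel, bit 0 the rightmost.
        let bit = 7 - x;
        let lo = (low >> bit) & 1;
        let hi = (high >> bit) & 1;
        *pixel = (hi << 1) | lo;
    }
    pixels
}

/// Maps a color index (0-3) to a shade (0-3) through a palette byte,
/// where bits 1-0 are the shade for index 0, bits 3-2 for index 1, etc.
pub fn apply_palette(palette: u8, color_index: u8) -> u8 {
    (palette >> (color_index * 2)) & 0b11
}
```

For example, the row bytes `0x3C` and `0x7E` decode to the indices `[0, 2, 3, 3, 3, 3, 2, 0]`, and the palette `0xE4` maps every index to the shade with the same number.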
  18. Emulating the GPU

    GPU flow. Scanline: read VRAM / OAM data, draw background and sprites. Horizontal blank: move from the end of a line to the start of the next line. Vertical blank: move from bottom right to top left. Timed like the original hardware.
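The GPU flow above is commonly modelled as a small state machine driven by the CPU's cycle count. The sketch below is one way to do it, not the talk's code: the `Gpu` and `Mode` names are assumptions, and the per-mode durations (80, 172 and 204 clock cycles, 456 per line, 154 lines per frame) are the approximations most simple emulators use. It also assumes `step` is called after every instruction, so at most one mode change happens per call.

```rust
/// The four phases the GPU cycles through.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    OamRead,  // scanline, part 1: reading sprite attributes
    VramRead, // scanline, part 2: reading tile data
    HBlank,   // end of a line
    VBlank,   // end of a frame
}

/// Hypothetical GPU timing state (no pixel output in this sketch).
pub struct Gpu {
    pub mode: Mode,
    pub line: u8,
    clock: u32,
}

impl Gpu {
    pub fn new() -> Gpu {
        Gpu { mode: Mode::OamRead, line: 0, clock: 0 }
    }

    /// Advances the GPU by `cycles` clock cycles.
    /// Returns true when vertical blank starts (time to raise the interrupt).
    pub fn step(&mut self, cycles: u32) -> bool {
        self.clock += cycles;
        let mut entered_vblank = false;
        match self.mode {
            Mode::OamRead if self.clock >= 80 => {
                self.clock -= 80;
                self.mode = Mode::VramRead;
            }
            Mode::VramRead if self.clock >= 172 => {
                self.clock -= 172;
                // A real emulator would render the current scanline here.
                self.mode = Mode::HBlank;
            }
            Mode::HBlank if self.clock >= 204 => {
                self.clock -= 204;
                self.line += 1;
                if self.line == 144 {
                    self.mode = Mode::VBlank;
                    entered_vblank = true;
                } else {
                    self.mode = Mode::OamRead;
                }
            }
            Mode::VBlank if self.clock >= 456 => {
                self.clock -= 456;
                self.line += 1;
                // Lines 144-153 are the vertical blank period.
                if self.line > 153 {
                    self.line = 0;
                    self.mode = Mode::OamRead;
                }
            }
            _ => {}
        }
        entered_vblank
    }
}
```

Rendering a whole line at the transition into horizontal blank is a simplification: it is fast and good enough for most games, but effects that change registers mid-line will not be reproduced.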
  19. Emulating the GPU

    Renderscan algorithm (bg and window):
    1. Calculate the BG and window Y position based on the current line
    2. For each X until the total width of the screen:
        1. Calculate base addresses to read from VRAM
        2. Fetch graphic data from VRAM
        3. Calculate color from raw graphic data
        4. Set background priority based on color
        5. Set pixel data for this line
  20. Emulating the GPU

    Renderscan algorithm (sprites):
    1. Iterate over the objects stored in OAM (40 at most)
    2. Check that the sprite is on the current line
    3. Calculate its position on the screen
    4. Check object attributes: Y flip (need to flip pixels)
    5. Draw sprite pixels (8 pixels):
        1. Check that the sprite pixel is still on screen
        2. Check pixel priority (above or below the background)
        3. Calculate color and save the pixel to the pixels Vector
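Steps 2 to 4 of the sprite algorithm can be sketched as follows. The `Sprite` struct and function names are assumptions for illustration, and only 8x8 sprites are handled. The OAM entry layout is the standard one: four bytes per object (Y + 16, X + 8, tile number, attributes), with attribute bit 7 for priority, bit 6 for Y flip and bit 5 for X flip.

```rust
/// Hypothetical decoded OAM entry.
pub struct Sprite {
    pub y: i16, // top edge in screen coordinates (may be negative)
    pub x: i16, // left edge in screen coordinates (may be negative)
    pub tile: u8,
    pub behind_background: bool,
    pub y_flip: bool,
    pub x_flip: bool,
}

/// Decodes one 4-byte OAM entry.
pub fn parse_sprite(entry: [u8; 4]) -> Sprite {
    Sprite {
        // Stored positions are offset so sprites can slide in from off screen.
        y: i16::from(entry[0]) - 16,
        x: i16::from(entry[1]) - 8,
        tile: entry[2],
        behind_background: entry[3] & 0x80 != 0,
        y_flip: entry[3] & 0x40 != 0,
        x_flip: entry[3] & 0x20 != 0,
    }
}

/// Returns which row of the sprite's tile (0-7) falls on `line`,
/// or None when the sprite does not cover that line. Assumes 8x8 sprites.
pub fn sprite_row_on_line(sprite: &Sprite, line: u8) -> Option<u8> {
    let row = i16::from(line) - sprite.y;
    if !(0..8).contains(&row) {
        return None;
    }
    // With Y flip the tile is drawn upside down, so read rows bottom-up.
    let row = if sprite.y_flip { 7 - row } else { row };
    Some(row as u8)
}
```

The returned row index selects which two bytes of the tile to decode before drawing the sprite's 8 pixels on the current line.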
  21. Other hardware parts

    Timer: periodic actions, period-based algorithms. How it works: a divider, a counter with 4 programmable frequencies, and a modulo.
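A minimal sketch of the divider, counter and modulo follows. The `Timer` struct and its field names are assumptions. The periods are the standard GameBoy values: the divider ticks every 256 clock cycles, and the counter ticks every 1024, 16, 64 or 256 clock cycles depending on the low two bits of the control register, with bit 2 enabling it. When the counter overflows it is reloaded from the modulo and an interrupt is requested.

```rust
/// Hypothetical timer state.
pub struct Timer {
    pub divider: u8, // free-running, always counting
    pub counter: u8, // programmable counter
    pub modulo: u8,  // reload value used when the counter overflows
    pub control: u8, // bit 2: enable, bits 1-0: frequency select
    divider_clock: u32,
    counter_clock: u32,
}

impl Timer {
    pub fn new() -> Timer {
        Timer {
            divider: 0,
            counter: 0,
            modulo: 0,
            control: 0,
            divider_clock: 0,
            counter_clock: 0,
        }
    }

    /// Clock cycles between two counter increments.
    fn counter_period(&self) -> u32 {
        match self.control & 0b11 {
            0b00 => 1024, // 4096 Hz
            0b01 => 16,   // 262144 Hz
            0b10 => 64,   // 65536 Hz
            _ => 256,     // 16384 Hz
        }
    }

    /// Advances the timer by `cycles` clock cycles.
    /// Returns true when the counter overflowed (timer interrupt requested).
    pub fn step(&mut self, cycles: u32) -> bool {
        self.divider_clock += cycles;
        while self.divider_clock >= 256 {
            self.divider_clock -= 256;
            self.divider = self.divider.wrapping_add(1);
        }

        let mut interrupt = false;
        if self.control & 0b100 != 0 {
            self.counter_clock += cycles;
            let period = self.counter_period();
            while self.counter_clock >= period {
                self.counter_clock -= period;
                let (next, overflowed) = self.counter.overflowing_add(1);
                if overflowed {
                    self.counter = self.modulo;
                    interrupt = true;
                } else {
                    self.counter = next;
                }
            }
        }
        interrupt
    }
}
```

Games pick the frequency and modulo to get a periodic interrupt at the rate they need, for example to drive music playback.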
  22. Other hardware parts

    Keypad: 8 buttons (Start, Select, A, B, Up, Down, Right, Left). Read and written through the MMU. When a key is pressed, an interrupt is triggered: the current position is pushed onto the stack, the program counter is pointed at the interrupt handler, the opcode is fetched and the interrupt handled, and the saved position is popped from the stack to continue execution.
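The interrupt sequence described above can be sketched like this. The `Cpu` struct with a flat 64 KB memory and the method names are assumptions made for the example. The register addresses (`0xFF0F` for requested interrupts, `0xFFFF` for enabled interrupts) and the handler addresses (`0x40` for vertical blank up to `0x60` for the keypad, 8 bytes apart) are the standard GameBoy ones.

```rust
const INTERRUPT_FLAG: usize = 0xFF0F; // requested interrupts, one bit each
const INTERRUPT_ENABLE: usize = 0xFFFF; // interrupts the game wants to receive

/// Hypothetical CPU reduced to what interrupt dispatch needs.
pub struct Cpu {
    pub pc: u16,
    pub sp: u16,
    pub ime: bool, // interrupt master enable
    pub memory: Vec<u8>,
}

impl Cpu {
    /// Called by hardware components; bit 4 is the keypad.
    pub fn request_interrupt(&mut self, bit: u8) {
        self.memory[INTERRUPT_FLAG] |= 1 << bit;
    }

    /// Pushes a 16-bit value onto the stack, low byte at the lower address.
    fn push_word(&mut self, value: u16) {
        self.sp = self.sp.wrapping_sub(2);
        self.memory[self.sp as usize] = value as u8;
        self.memory[self.sp.wrapping_add(1) as usize] = (value >> 8) as u8;
    }

    /// Dispatches the highest-priority pending interrupt, if any.
    /// Returns true when the program counter was redirected.
    pub fn handle_interrupts(&mut self) -> bool {
        let pending =
            self.memory[INTERRUPT_ENABLE] & self.memory[INTERRUPT_FLAG] & 0x1F;
        if !self.ime || pending == 0 {
            return false;
        }
        // The lowest set bit has the highest priority.
        let bit = pending.trailing_zeros() as u16;
        // Acknowledge the interrupt and block nested ones.
        self.memory[INTERRUPT_FLAG] &= !(1u8 << bit);
        self.ime = false;
        // Save the current position, then jump to the handler.
        let return_address = self.pc;
        self.push_word(return_address);
        self.pc = 0x0040 + 8 * bit;
        true
    }
}
```

The handler normally ends with `RETI`, which pops the saved address back into the program counter and re-enables interrupts.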
  23. Putting the pieces together: GameBoy emulator written in Rust

  24. Putting the pieces together

    Available features: full CPU (Z80), MMU and MBC1, GPU, timer / keypad, tons of docs! TODO: sound, MBC2-5 and save game states, tests :(
  25. Putting the pieces together: Code Walk time!

  26. Closing thoughts

    Accuracy of emulated systems. 100% accuracy is hard! Normally some degree of accuracy is good enough. In more complex systems (GameCube, Nintendo 64, etc.) things get complicated: many, many game-specific hacks. Gigantic hardware designs and no documentation make for difficult emulation! 100% accurate emulation is slow: it takes a lot of processing power. 100% accurate SNES emulation requires a >3 GHz processor! The upside of 100% accuracy: no hacks required for specific games.
  27. Closing thoughts

    Hardware and software analogies. Hardware is full of "ñapas" (quick-and-dirty hacks): what a surprise! Designing hardware and designing software are actually quite similar activities. Hardware decisions are tougher: once shipped, it's already out there! You need to think thoroughly about the architecture. Changes to hardware require... people buying your new version! Hardware design is intrinsically more difficult: working with bits, basic operations, simple data structures to handle complex operations, etc.
  28. Closing thoughts

    Approaches to learning new technology. Look for 10+ projects that implement what you need. The language doesn't matter, focus on the architecture. Understand every line! "I like to totally understand every line of code I wrote - my method of understanding code is always the same and independent of language. Find an example program that works then reduce lines until I can reduce no more making a minimal example that works - then understand every line." - Joe Armstrong
  29. Closing thoughts

    Simplicity betrayed: thoughts about complexity. Even simple hardware (like the GameBoy) reveals a vast and complex array of undocumented internal behavior. Read this article: Simplicity Betrayed. "What is troublesome is the increased effort required by the host CPU to pull this off. The work involved is many times greater than before. [...] It is hard to believe that drawing a 30-year-old computer's display takes up so much of a modern system. This is one reason why accurate emulation takes so long to perfect. We can decide to make a better display, but today's platforms may not have the horsepower to accomplish it."
  30. Extra! Bare metal hacking

    Extracting the GameBoy boot ROM. Problem: the boot ROM (the scrolling Nintendo logo) is hardwired into the hardware. We need to extract it in order to emulate it.
  31. Extra! Bare metal hacking

    Extracting the GameBoy boot ROM. Solution: extract the boot loader bit by bit! Steps: open the GameBoy and desolder the Z80 chip, put the chip under an electron microscope, observe the wires and extract the data bit by bit.
  32. Extra! bare metal hacking 256 bytes (2048 bits) boot ROM

  33. Extra! Bare metal hacking

    Result:

    LD SP,$fffe          ; $0000  Setup Stack
    XOR A                ; $0003  Zero the memory from $8000-$9FFF (
    LD HL,$9fff          ; $0004
    Addr_0007:
    LD (HL-),A           ; $0007
    BIT 7,H              ; $0008
    JR NZ, Addr_0007     ; $000a
    LD HL,$ff26          ; $000c  Setup Audio
    LD C,$11             ; $000f
    LD A,$80             ; $0011
    LD (HL-),A           ; $0013
    LD ($FF00+C),A       ; $0014
    INC C                ; $0015
    LD A,$f3             ; $0016
    LD ($FF00+C),A       ; $0018
    LD (HL-),A           ; $0019
    LD A,$77             ; $001a
    LD (HL),A            ; $001c
  34. Extra! Creating games for the GameBoy

    Downloading and installing the GBDK. Ready-to-use project: compile and install the GBDK compiler and linker, then try the Hello World! https://github.com/albertofem/gameboy-gbdk-examples
  35. Extra! Creating games for the GameBoy

    GBDK Hello World example: Code Walk!
  36. Links and documentation

    Available at the SafeBoy and GameBoy examples repositories. SafeBoy: https://github.com/albertofem/safeboy GBDK example: https://github.com/albertofem/gameboy-gbdk-examples
  37. Q&A