Slide 1

Slide 1 text

lowRISC - a first look
Alex Bradbury
@asbradbury @lowRISC
ORConf 2014

Slide 2

Slide 2 text

What is lowRISC?
● An open-source SoC with a RISC-V CPU
  ○ Initially targeting 40 or 28nm
  ○ Released under an open, permissive license
  ○ Novel security features
  ○ Programmable I/O
  ○ AMBA bus
  ○ Performance: run Linux ‘well’
● A Community Interest Company (i.e., we are not-for-profit)
  ○ Intend to manufacture the SoC in volume and produce low-cost development boards

Slide 3

Slide 3 text

Who are we?
● Robert Mullins - Computer Laboratory, University of Cambridge, co-founder of Raspberry Pi
● Gavin Ferris - Dreamworks, Radioscape (co-founder), Aspect Capital (former CIO)
● Alex Bradbury - Computer Laboratory, University of Cambridge and Raspberry Pi
Technical advisory board:
● Krste Asanovic (UC Berkeley)
● Julius Baxter (OpenRISC)
● Bunnie Huang (Hacker)
● Dominic Rizzo (Google ATAP)
● Michael Taylor (UCSD)

Slide 4

Slide 4 text

lowRISC motivation
● A belief in open-source hardware
● Encourage innovation and semiconductor startups
● Research platform
● The opportunity for contributors to see their HDL used in a mass-produced SoC
  ○ Regular tape-outs

Slide 5

Slide 5 text

How are we going to do this?
● Received an initial private donation
● Work with collaborators (e.g. Berkeley)
● Additional funding (e.g. research councils)
● Community
● OpenCores IP and tools
● Build the core dev team. Just advertised and filled two positions

Slide 6

Slide 6 text

Software
● RISC-V has GCC, LLVM, glibc
● Work with community on upstreaming these
● Add support for lowRISC extensions
● Target: RISC-V Debian port

Slide 7

Slide 7 text

Timeline to V1
● Release of an initial FPGA version. Next 6 months
● Production of a test chip. End of 2015
● Tape out of production silicon. 2016
● Produce low cost development boards. “Raspberry Pi for grown-ups” :)

Slide 8

Slide 8 text

Target market and philosophy
● 100-200k boards is ~1 month of Raspberry Pi sales
● Hackers, tinkerers, researchers, the OSHW community
● Target the embedded, connected world. IoT
● Security is essential. We would be negligent to not consider how to improve on security features available in shipping processors
  ○ Tagged memory
  ○ Traditional features: RNG, crypto accelerator, encrypted off-chip memory, secure boot
● Flexible IO. Flexibility of a Zynq-like platform but in software.
  ○ Vendors incentivized to make low level peripherals arbitrarily different for lock-in and ‘differentiation’.

Slide 9

Slide 9 text

System diagram for test chip

Slide 10

Slide 10 text

Tagged memory
Initial motivation: security.
Associate each memory location with metadata (tags)

Slide 11

Slide 11 text

Shellcode injection via buffer overflow
What happens if we pass an argument larger than the buffer?

    #include <string.h>

    int main(int argc, char **argv)
    {
        char buffer[512];
        if (argc > 1)
            strcpy(buffer, argv[1]);  /* no bounds check: overflows if argv[1] has 512 or more characters */
        return 0;
    }

Slide 12

Slide 12 text

Shellcode injection via buffer overflow
Solution: NX bit. Mark pages as being executable/not executable. All stack pages are non-executable.
What about overflows on the heap?

Slide 13

Slide 13 text

And the attacker responds with new tricks...
● Return-to-libc (overwrite RA to a handy function like system(const char*))
● Return-Oriented Programming (ROP) and variants: JOP etc

Slide 14

Slide 14 text

Respond with more software countermeasures
● ASLR: randomize locations of the stack, heap, functions
● Stack canaries: put a secret value on the stack and detect if it is overwritten
● Software sandboxing (e.g. Android, iPhone, browser sandboxes)
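The stack canary bullet can be illustrated with a small hand-rolled model. This is only a sketch: real stack protectors are emitted by the compiler and use a per-process random value, whereas `struct frame`, `CANARY`, `unsafe_copy` and `canary_intact` below are hypothetical names invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a stack frame: a buffer, then a canary, then
 * the saved return address. A real compiler chooses this layout itself;
 * the struct only makes the ordering explicit. */
#define BUF_SIZE 16
#define CANARY   0xC0FFEE1234ABCDEFull  /* assumption: would be random per process */

struct frame {
    char     buffer[BUF_SIZE];
    uint64_t canary;
    uint64_t return_address;
};

/* Models the function prologue: place the canary below the return address. */
static void frame_init(struct frame *f, uint64_t ra)
{
    memset(f->buffer, 0, sizeof f->buffer);
    f->canary = CANARY;
    f->return_address = ra;
}

/* Deliberately unchecked copy into the frame, standing in for strcpy. */
static void unsafe_copy(struct frame *f, const void *src, size_t n)
{
    assert(n <= sizeof *f);  /* stay inside the model itself */
    memcpy(f, src, n);       /* buffer sits at offset 0 of the frame */
}

/* Models the epilogue check: 1 if the canary is intact, 0 if clobbered. */
static int canary_intact(const struct frame *f)
{
    return f->canary == CANARY;
}
```

A copy that fits in `buffer` leaves the canary alone; a copy long enough to reach the return address must first overwrite the canary, which the epilogue check then detects.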

Slide 15

Slide 15 text

Time for a new hardware countermeasure
● We call this class of attacks control flow hijack attacks
● All of these attacks (so far) require violating spatial memory safety, i.e. writing beyond the bounds of an object
  ○ More specifically, they require overwriting a code pointer
Aim: protect code pointers from overwrites

Slide 16

Slide 16 text

Does this problem matter?
Vulnerabilities in the CVE database with ‘high’ severity
Source: 25 Years of Vulnerabilities: 1988-2012 - Sourcefire www.sourcefire.com/25yearsofvulns

Slide 17

Slide 17 text

Solution: tagged memory
● 2 tag bits per word. Storage overhead 2/64=~3%
● Tag cache logically extends width of word to 66 bits
● Tag bits copied into L2 and L1 cache lines
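The storage overhead figure can be checked with a software model of a packed tag table. This is a sketch under assumptions: the table size `NUM_WORDS`, the packing of four 2-bit tags per byte, and the helper names are all hypothetical and say nothing about how the hardware tag cache is actually organised.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_WORDS     1024            /* 64-bit words of modelled data memory */
#define TAG_BITS      2               /* from the slide: 2 tag bits per word */
#define TAGS_PER_BYTE (8 / TAG_BITS)  /* four 2-bit tags fit in one byte */

/* Hypothetical packed tag storage, kept separate from the data words. */
static uint8_t tag_table[NUM_WORDS / TAGS_PER_BYTE];

/* Read the 2-bit tag belonging to a given data word. */
static unsigned tag_get(size_t word_index)
{
    assert(word_index < NUM_WORDS);
    unsigned shift = (unsigned)(word_index % TAGS_PER_BYTE) * TAG_BITS;
    return (tag_table[word_index / TAGS_PER_BYTE] >> shift) & 0x3u;
}

/* Overwrite the 2-bit tag of a data word, leaving its neighbours alone. */
static void tag_set(size_t word_index, unsigned tag)
{
    assert(word_index < NUM_WORDS && tag < 4);
    unsigned shift = (unsigned)(word_index % TAGS_PER_BYTE) * TAG_BITS;
    uint8_t *slot = &tag_table[word_index / TAGS_PER_BYTE];
    *slot = (uint8_t)((*slot & ~(0x3u << shift)) | (tag << shift));
}

/* Tag bytes as a percentage of data bytes: 2/64 = 3.125%. */
static double tag_overhead_percent(void)
{
    return 100.0 * (double)sizeof tag_table
                 / (double)(NUM_WORDS * sizeof(uint64_t));
}
```

With these sizes the tag table is 256 bytes for 8192 bytes of data, which is the 2/64 ratio quoted on the slide.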

Slide 18

Slide 18 text

Protecting the return address
Apply tag bits to the return address. If the buffer overflows, an exception is triggered.
Do the same for VTable pointers.
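The behaviour described here can be simulated in software. The sketch below assumes a policy in which an ordinary store to a tagged word raises an exception; the tag encoding `TAG_PROTECTED`, the struct `tagged_mem` and the function names are hypothetical and are not the lowRISC instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MEM_WORDS     16
#define TAG_NONE      0u
#define TAG_PROTECTED 1u  /* assumed encoding: "code pointer, do not overwrite" */

struct tagged_mem {
    uint64_t word[MEM_WORDS];
    uint8_t  tag[MEM_WORDS];
    int      faults;          /* number of simulated exceptions */
};

/* Ordinary store: refuses to write a protected word and records a fault. */
static int store(struct tagged_mem *m, size_t i, uint64_t v)
{
    assert(i < MEM_WORDS);
    if (m->tag[i] == TAG_PROTECTED) {
        m->faults++;
        return -1;
    }
    m->word[i] = v;
    return 0;
}

/* Tag-aware store, as compiler-generated prologue code might use when
 * saving the return address. */
static void store_tagged(struct tagged_mem *m, size_t i, uint64_t v, uint8_t tag)
{
    assert(i < MEM_WORDS);
    m->word[i] = v;
    m->tag[i] = tag;
}

/* Models an overflowing copy: writes words until done or a store faults.
 * Returns the number of words actually written. */
static size_t copy_words(struct tagged_mem *m, size_t start,
                         const uint64_t *src, size_t n)
{
    size_t written = 0;
    while (written < n && start + written < MEM_WORDS) {
        if (store(m, start + written, src[written]) != 0)
            break;
        written++;
    }
    return written;
}
```

If a four-word buffer sits at words 0 to 3 and the return address is saved with `store_tagged` at word 4, an eight-word copy starting at word 0 stops after four words, and the saved return address keeps its original value.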

Slide 19

Slide 19 text

But what about use-after-free?
● A temporal memory safety issue
● Problem: we can protect the vtable pointer for the lifetime of an object, but if the object is used after it was freed, the attacker could control the contents of that memory location.
● Solution: Check presence of tag bits. Augment with existing segregated allocator techniques
● A good example of the effort attackers are willing to go to: http://blog.exodusintel.com/2013/11/26/browser-weakest-byte/
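The "check presence of tag bits" idea can be sketched as follows. This is a software model under assumptions: the allocator clears the tag on free, an attacker's ordinary write cannot set a tag, and a virtual call checks the tag before trusting the pointer. `struct object`, `TAG_VTABLE` and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TAG_NONE   0u
#define TAG_VTABLE 1u  /* assumed encoding: "this word holds a valid vtable pointer" */

struct object {
    uint64_t vtable_ptr;
    uint8_t  vtable_tag;  /* models the tag bits attached to the vtable_ptr word */
};

/* Constructor: compiler-generated code stores the vtable pointer with its tag. */
static void object_construct(struct object *o, uint64_t vtable)
{
    o->vtable_ptr = vtable;
    o->vtable_tag = TAG_VTABLE;
}

/* The modified allocator clears tags when memory is freed. */
static void object_free(struct object *o)
{
    o->vtable_tag = TAG_NONE;
}

/* An attacker refilling the freed memory can only use ordinary stores,
 * which change the data but leave the word untagged. */
static void untagged_write(struct object *o, uint64_t fake_vtable)
{
    o->vtable_ptr = fake_vtable;
    o->vtable_tag = TAG_NONE;
}

/* Checked load before a virtual call: 0 on success, -1 if the tag is missing. */
static int load_vtable(const struct object *o, uint64_t *out)
{
    assert(o != NULL && out != NULL);
    if (o->vtable_tag != TAG_VTABLE)
        return -1;  /* would raise an exception in hardware */
    *out = o->vtable_ptr;
    return 0;
}
```

A live object passes the check; after `object_free` and an attacker-controlled refill, `load_vtable` fails instead of handing back the fake pointer.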

Slide 20

Slide 20 text

Other uses of tagged memory
We implement general purpose tagged memory that can be configured for use in a wide range of different scenarios:
● Infinite memory watchpoints
● Better version of traditional canaries
● Garbage collection
● Accelerate AddressSanitizer/ThreadSanitizer/MemorySanitizer
  ○ If larger tags are required, update shadow memory in the exception handler
● Locks on every word
● Apply tag bits to instructions to mark valid targets of indirect branches
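As one example from this list, memory watchpoints can be modelled by tagging any number of words and invoking a handler when a tagged word is written. The sketch is a software simulation only; `struct watched_mem`, `TAG_WATCH` and the callback signature are hypothetical choices made for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MEM_WORDS 64
#define TAG_WATCH 1u  /* assumed encoding: "trap on write to this word" */

/* Hypothetical handler type, standing in for the tag exception handler. */
typedef void (*watch_handler)(size_t index, uint64_t old_value,
                              uint64_t new_value, void *ctx);

struct watched_mem {
    uint64_t      word[MEM_WORDS];
    uint8_t       tag[MEM_WORDS];
    watch_handler handler;
    void         *ctx;
};

/* Set a watchpoint. Unlike debug registers, the number of watchpoints is
 * limited only by the number of words, since each word carries its own tag. */
static void watch(struct watched_mem *m, size_t i)
{
    assert(i < MEM_WORDS);
    m->tag[i] = TAG_WATCH;
}

/* Store that reports writes to watched words, then completes the write. */
static void store(struct watched_mem *m, size_t i, uint64_t v)
{
    assert(i < MEM_WORDS);
    if (m->tag[i] == TAG_WATCH && m->handler != NULL)
        m->handler(i, m->word[i], v, m->ctx);
    m->word[i] = v;
}

/* Example handler: counts hits in an int supplied through ctx. */
static void count_hits(size_t index, uint64_t old_value,
                       uint64_t new_value, void *ctx)
{
    (void)index;
    (void)old_value;
    (void)new_value;
    ++*(int *)ctx;
}
```

Watching every second word and then writing the whole memory calls the handler once per watched word, with no limit imposed by a fixed set of debug registers.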

Slide 21

Slide 21 text

Compatibility and implementation tasks
Requirements:
● Addition of tag memory cache and widening of cache lines
● New instructions to manipulate tags
● Compiler modifications to protect and check RA and vtable pointers
● Modify memory allocator to clear tags upon free, and modify memcpy and memmove to copy tags
● Update kernel virtual memory system to persist tag bits when moving a page to secondary storage
Compatibility: metadata in binaries can be used to rewrite instructions at load time
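The allocator and memcpy/memmove requirements can be sketched on a software model in which each word carries a tag. This is not the real library code: `struct tagged_mem`, `tagged_memmove` and `clear_tags` are hypothetical names, and the model works in whole words rather than bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MEM_WORDS 32

struct tagged_mem {
    uint64_t word[MEM_WORDS];
    uint8_t  tag[MEM_WORDS];  /* one tag per word */
};

/* memmove analogue: copies n words and their tags, handling overlap by
 * choosing the copy direction. */
static void tagged_memmove(struct tagged_mem *m, size_t dst, size_t src, size_t n)
{
    assert(n <= MEM_WORDS && dst <= MEM_WORDS - n && src <= MEM_WORDS - n);
    if (dst < src) {
        for (size_t i = 0; i < n; i++) {
            m->word[dst + i] = m->word[src + i];
            m->tag[dst + i]  = m->tag[src + i];
        }
    } else if (dst > src) {
        for (size_t i = n; i-- > 0; ) {
            m->word[dst + i] = m->word[src + i];
            m->tag[dst + i]  = m->tag[src + i];
        }
    }
}

/* What a modified free() would do: drop the tags, leave the data alone. */
static void clear_tags(struct tagged_mem *m, size_t start, size_t n)
{
    assert(n <= MEM_WORDS && start <= MEM_WORDS - n);
    for (size_t i = 0; i < n; i++)
        m->tag[start + i] = 0;
}
```

Copying tags together with data keeps a protected pointer protected after it is moved, while clearing tags on free ensures stale tags do not survive into the next allocation.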

Slide 22

Slide 22 text

Minions for programmable I/O

Slide 23

Slide 23 text

Programmable I/O motivation
● Flexibility and ability to add completely new interfaces
● Easy to program (vs programmable logic)
● Off-load work from main core
  ○ Do more work close to I/O
  ○ Combine with tagged memory for more complex security policies/checks
● Filter I/O to only wake up the main core when needed
● Avoid writing HDL for all controllers and interfaces.
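The "filter I/O" bullet can be illustrated with a sketch of code a minion might run: it buffers incoming bytes and signals the main core only when a complete newline-terminated message has arrived. The message framing, `struct minion_filter`, `MSG_CAPACITY` and `minion_rx_byte` are assumptions made for the illustration, not part of any lowRISC interface.

```c
#include <assert.h>
#include <stddef.h>

#define MSG_CAPACITY 64  /* assumed maximum message length, including the NUL */

struct minion_filter {
    char     buf[MSG_CAPACITY];  /* holds the most recent complete message */
    size_t   len;                /* bytes of the message currently being received */
    unsigned wakeups;            /* how often the main core was signalled */
};

/* Called for every received byte. Returns 1 when the main core should be
 * woken because buf now holds a complete NUL-terminated message, else 0. */
static int minion_rx_byte(struct minion_filter *f, char c)
{
    assert(f != NULL && f->len < MSG_CAPACITY);
    if (c == '\n') {
        f->buf[f->len] = '\0';
        f->len = 0;
        f->wakeups++;
        return 1;
    }
    if (f->len < MSG_CAPACITY - 1)
        f->buf[f->len++] = c;  /* bytes beyond the capacity are dropped */
    return 0;
}
```

For a stream of nine bytes such as "hello\nhi\n" the main core is signalled twice rather than nine times.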

Slide 24

Slide 24 text

Related work
● Idea goes back at least to the CDC6600 multi-threaded I/O processors (‘peripheral processor units’)
● Motorola 68332, 68302 both had configurable, programmable ‘timer units’
● TI PRUs
● XMOS
● NXP LPC4370 (M4 and 2xM0 for peripherals)
● ...

Slide 25

Slide 25 text

Minion architectural considerations
● I/O ‘shim’
  ○ Bit-banging pins directly would be painful.
  ○ Provide a small amount of configurable logic, buffers, timers, and clocks to reduce overheads
  ○ Support routing physical pins to different cores
● Timing/events
  ○ Execute out of scratchpads to help provide bounded execution time
  ○ Precise timing (e.g. wait for counter)
● Considering multi-threaded operation and low-latency communication with main core (e.g. FIFO links)
● Minions are not coherent between themselves, but are coherent with main processors.
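A FIFO link between a minion and the main core could behave like the bounded queue below. This is a single-threaded software model only: the depth, the 32-bit payload and the names `struct fifo`, `fifo_push` and `fifo_pop` are hypothetical, and a hardware link would not need the index arithmetic shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FIFO_DEPTH 8u  /* assumed depth; must be a power of two */

struct fifo {
    uint32_t slot[FIFO_DEPTH];
    uint32_t head;  /* total number of pushes so far */
    uint32_t tail;  /* total number of pops so far */
};

/* Producer side (e.g. the minion). Returns 1 on success, 0 if full. */
static int fifo_push(struct fifo *q, uint32_t v)
{
    assert(q != NULL);
    if (q->head - q->tail == FIFO_DEPTH)
        return 0;
    q->slot[q->head % FIFO_DEPTH] = v;
    q->head++;
    return 1;
}

/* Consumer side (e.g. the main core). Returns 1 on success, 0 if empty. */
static int fifo_pop(struct fifo *q, uint32_t *out)
{
    assert(q != NULL && out != NULL);
    if (q->head == q->tail)
        return 0;
    *out = q->slot[q->tail % FIFO_DEPTH];
    q->tail++;
    return 1;
}
```

Because the producer only advances `head` and the consumer only advances `tail`, neither side needs to modify the other's index, which suits a link where each end is owned by a different core.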

Slide 26

Slide 26 text

Roadmap
● Test chip: Rocket x (1 or 2) + IO + memory controller + tagged memory. Reusable research BGA package from UCSD
● V1: Complete, secure embedded SoC appropriate for headless applications.
● Ultimate ambition is an SoC with a broad set of features including GPU, appropriate for a mobile phone or set-top box. V1 is a stepping stone towards that goal.
What is success for a project like lowRISC?

Slide 27

Slide 27 text

How you can help
● Join our efforts: http://www.lowrisc.org
● Join our announcement and discussion lists
● Help direct our plans
● FPGA version to come soon
Questions?