Slide 1

Slide 1 text

LLVM Integrated Assembler for OpenRISC 1000
Simon Cook, Embecosm
October 13, 2012
Copyright © 2012 Embecosm. Freely available under a Creative Commons license

Slide 2

Slide 2 text

Compiling with clang
Without Integrated Assembler
clang -target or1k -ccc-host-triple or1k-elf -v helloworld.c
- clang <...> -o /tmp/helloworld-Q1WvbT.s -x c helloworld.c
- or1k-elf-gcc -v -c -o /tmp/helloworld-BZmisI.o -x assembler /tmp/helloworld-Q1WvbT.s
    <...>lib/gcc/<...>/as -o /tmp/helloworld-BZmisI.o /tmp/helloworld-Q1WvbT.s
- or1k-elf-gcc -v -o a.out /tmp/helloworld-BZmisI.o
    <...>libexec/gcc/<...>collect2 <...> /tmp/helloworld-BZmisI.o

Slide 6

Slide 6 text

LLVM Machine Code
LLVM project to solve problems surrounding assembly, disassembly, etc.
Wasteful to format a .s file and then call an assembler to parse this new file.
An estimated 20% of compile time is spent on assembly; gain some of this back.
Avoids issues of incompatible or out-of-date assemblers.
Uses the instruction definitions already written in TableGen for the compiler.

Slide 7

Slide 7 text

TableGen
Language for defining records (instructions, registers) in a flexible manner.
Class-based structure which allows many similar records to be created quickly.
- class ALU_RR<bits<4> subOp, string asmstr, list<dag> pattern> : <...> { let Inst{25-21} = rD; <...>
- def ADD : ALU1_RR<0x0, "l.add", add>;
These records are then parsed to build many different compiler components from the same definition.
- e.g. Assembler Tables, Disassembler Tables, Calling Convention Information, etc.
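As a rough analogy for how one set of records can drive several generated tables, the Python sketch below mimics the class/def pattern. It is not TableGen and not the OR1K backend's code: the names `Record`, `alu_rr`, `build_asm_table` and `build_disasm_table` are hypothetical. Only `l.add` with sub-opcode 0x0 comes from this slide; the 0x3 used for `l.and` is read off the last byte of the `l.and` encoding in the testing slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    name: str       # record name, e.g. "ADD"
    mnemonic: str   # assembly string, e.g. "l.add"
    sub_op: int     # sub-opcode value passed to the class


def alu_rr(name, sub_op, mnemonic):
    """Hypothetical stand-in for a TableGen class: stamps out similar records."""
    return Record(name=name, mnemonic=mnemonic, sub_op=sub_op)


# One definition per instruction, analogous to "def ADD : ALU1_RR<...>".
RECORDS = [
    alu_rr("ADD", 0x0, "l.add"),
    alu_rr("AND", 0x3, "l.and"),
]


def build_asm_table(records):
    """Assembler view: look a record up by its mnemonic."""
    return {r.mnemonic: r for r in records}


def build_disasm_table(records):
    """Disassembler view: look the same record up by its sub-opcode."""
    return {r.sub_op: r for r in records}


asm_table = build_asm_table(RECORDS)
disasm_table = build_disasm_table(RECORDS)
print(asm_table["l.add"].name)
print(disasm_table[0x3].mnemonic)
```

Both tables are derived from the same list, so adding one record extends the assembler and disassembler views together.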

Slide 8

Slide 8 text

OR1K Implementation
The same steps can be applied to most targets.
Four main components have been implemented:
- Assembly Parser
- Instruction Encoder
- Instruction Decoder
- ELF Object Writer

Slide 9

Slide 9 text

OR1K Implementation
Parsing Assembly Files
Done primarily for debugging and testing.
- Currently aiming for a working assembler; will make it standalone later.
Given an instruction, an MCTargetAsmParser parses the operands to determine their type and value.
- e.g. r1 −→ a register operand
Operands (the mnemonic is also an operand) are then passed to a TableGen-generated function which carries out type checking (l.add 1, 2, 3 makes no sense) and stores the result in the instruction stream.
- $ echo "l.add r1, r2, r3" | llvm-mc -arch or1k -show-inst
- llvm-mc prints l.add r1, r2, r3 followed by a comment giving its MCInst and MCOperand representation.
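The parse-then-type-check flow can be sketched in a few lines of Python. This is an illustration only, not the LLVM API: `Operand`, `parse_operand`, `parse_instruction` and the `SIGNATURES` table are hypothetical names, and the register range r0 to r31 is an assumption taken from the OpenRISC 1000 register file rather than from the slides.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Operand:
    kind: str      # "token" (the mnemonic), "reg" or "imm"
    value: object


REG_RE = re.compile(r"r(\d+)$")

# Hypothetical operand signatures; in LLVM these come from TableGen records.
SIGNATURES = {
    "l.add": ("reg", "reg", "reg"),
    "l.addi": ("reg", "reg", "imm"),
}


def parse_operand(text):
    """Classify one operand string as a register or an immediate."""
    text = text.strip()
    m = REG_RE.match(text)
    if m and int(m.group(1)) < 32:       # assumed register range r0..r31
        return Operand("reg", int(m.group(1)))
    try:
        return Operand("imm", int(text, 0))
    except ValueError:
        raise ValueError(f"unrecognised operand: {text!r}")


def parse_instruction(line):
    """Parse a line into operands (mnemonic first), then type-check them."""
    mnemonic, _, rest = line.strip().partition(" ")
    operands = [Operand("token", mnemonic)]
    operands += [parse_operand(p) for p in rest.split(",") if p.strip()]
    expected = SIGNATURES.get(mnemonic)
    if expected is None:
        raise ValueError(f"unknown mnemonic: {mnemonic!r}")
    actual = tuple(op.kind for op in operands[1:])
    if actual != expected:
        raise ValueError(f"{mnemonic}: expected {expected}, got {actual}")
    return operands


print(parse_instruction("l.add r1, r2, r3"))
```

Calling `parse_instruction("l.add 1, 2, 3")` raises `ValueError`, which corresponds to the type-checking failure described on the slide.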

Slide 10

Slide 10 text

OR1K Implementation
Instruction Encoding
Majority of this is already handled by TableGen.
- Uses the Inst field to store bit patterns and definitions of where operands go within the instruction.
Define register encodings.
- r1 −→ 0x1, etc.
Define custom operand encodings.
- e.g. memory operands
- Encode the register as before, then shift it and OR it with the immediate operand.
It does not matter where an operand is in the instruction; TableGen manages this.
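To make the bit placement concrete, here is a small Python sketch of an encoder for four instructions. It is not the backend's encoder. The rD position (bits 25-21) is from the TableGen slide; the remaining field positions and the major opcodes are assumptions taken from the OpenRISC 1000 architecture manual, and the results can be compared with the encodings listed in the testing slide. The function names, and the 16-bit shift in `encode_mem_operand`, are hypothetical.

```python
# Major opcodes (bits 31-26), assumed from the OpenRISC 1000 manual.
OPCODES = {"l.add": 0x38, "l.and": 0x38, "l.addi": 0x27, "l.andi": 0x29}
# Sub-opcodes (bits 3-0) for the register-register forms.
SUB_OPS = {"l.add": 0x0, "l.and": 0x3}


def encode_rr(mnemonic, rd, ra, rb):
    """Register-register form: opcode | rD | rA | rB | sub-opcode."""
    word = (
        (OPCODES[mnemonic] << 26)
        | (rd << 21)          # rD in bits 25-21
        | (ra << 16)          # rA in bits 20-16 (assumed)
        | (rb << 11)          # rB in bits 15-11 (assumed)
        | SUB_OPS[mnemonic]
    )
    return list(word.to_bytes(4, "big"))


def encode_ri(mnemonic, rd, ra, imm):
    """Register-immediate form: opcode | rD | rA | 16-bit immediate."""
    word = (OPCODES[mnemonic] << 26) | (rd << 21) | (ra << 16) | (imm & 0xFFFF)
    return list(word.to_bytes(4, "big"))


def encode_mem_operand(base_reg, offset):
    """Custom operand sketch: shift the register above a 16-bit offset, then OR."""
    return (base_reg << 16) | (offset & 0xFFFF)


print([hex(b) for b in encode_rr("l.add", 1, 2, 3)])
print([hex(b) for b in encode_ri("l.addi", 3, 4, 2)])
```

The register encoding itself is the identity mapping used on the slide (r1 becomes 0x1), so the caller passes register numbers directly.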

Slide 11

Slide 11 text

OR1K Implementation
Instruction Decoding
As before, most of the work is done by TableGen-built functions.
Requires all instruction definitions to be unambiguous.
Added (primarily) for completeness, but allows more debugging to take place.
Anything that required custom encoding requires custom decoding.
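The unambiguity requirement can be illustrated with a table-driven decoder sketch in Python. It is a simplification and not the generated LLVM decoder: the table names and `decode` are hypothetical, the opcodes and field positions are assumptions from the OpenRISC 1000 architecture manual, and immediates are printed unsigned for brevity.

```python
# Register-immediate forms are identified by the major opcode alone.
RI_TABLE = {0x27: "l.addi", 0x29: "l.andi"}
# Register-register forms share opcode 0x38 and differ in the sub-opcode.
RR_TABLE = {(0x38, 0x0): "l.add", (0x38, 0x3): "l.and"}


def decode(data):
    """Decode four big-endian bytes into assembly text."""
    word = int.from_bytes(bytes(data), "big")
    opcode = word >> 26
    rd = (word >> 21) & 0x1F
    ra = (word >> 16) & 0x1F

    if opcode in RI_TABLE:
        imm = word & 0xFFFF       # shown unsigned; no sign extension here
        return f"{RI_TABLE[opcode]} r{rd}, r{ra}, {imm}"

    mnemonic = RR_TABLE.get((opcode, word & 0xF))
    if mnemonic is None:
        raise ValueError(f"cannot decode 0x{word:08x}")
    rb = (word >> 11) & 0x1F
    return f"{mnemonic} r{rd}, r{ra}, r{rb}"


print(decode([0xE0, 0x22, 0x18, 0x00]))
```

Because every key in the two tables is distinct, each bit pattern maps to at most one mnemonic; two definitions with the same opcode and sub-opcode would make the lookup ambiguous.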

Slide 12

Slide 12 text

OR1K Implementation
ELF Object Writing
Defining all the important constants
- Architecture identifier, etc.
Defining relocations, LLVM fixups
- So that the linker can do its job correctly
Driven by TableGen
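As an illustration of where constants such as the architecture identifier end up, the Python sketch below packs a bare ELF32 file header for a relocatable object. It is not the LLVM object writer and covers neither sections nor relocations. `elf32_header` is a hypothetical helper; the constant values follow the ELF specification (EM_OPENRISC is 92), and the big-endian data encoding is an assumption consistent with the byte order of the encodings in the testing slide.

```python
import struct

EM_OPENRISC = 92    # e_machine value assigned to OpenRISC
ET_REL = 1          # relocatable object file
ELFCLASS32 = 1      # 32-bit objects
ELFDATA2MSB = 2     # big-endian data encoding (assumed for or1k)
EV_CURRENT = 1      # current ELF version


def elf32_header(shoff=0, shnum=0, shstrndx=0):
    """Pack a 52-byte ELF32 header; section table fields default to empty."""
    e_ident = bytes(
        [0x7F, ord("E"), ord("L"), ord("F"), ELFCLASS32, ELFDATA2MSB, EV_CURRENT]
    ) + bytes(9)                         # pad e_ident to 16 bytes
    return struct.pack(
        ">16sHHIIIIIHHHHHH",
        e_ident,
        ET_REL,                          # e_type
        EM_OPENRISC,                     # e_machine
        EV_CURRENT,                      # e_version
        0,                               # e_entry (unused in a .o file)
        0,                               # e_phoff (no program headers)
        shoff,                           # e_shoff
        0,                               # e_flags
        52,                              # e_ehsize
        0,                               # e_phentsize
        0,                               # e_phnum
        40,                              # e_shentsize (ELF32 section header size)
        shnum,                           # e_shnum
        shstrndx,                        # e_shstrndx
    )


header = elf32_header()
print(len(header), header[:4])
```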

Slide 13

Slide 13 text

Testing
Encoding and Decoding
Instructions painstakingly encoded and tests added.
    # RUN: llvm-mc -arch=or1k -mattr=mul,div,ror -show-encoding %s | FileCheck %s

        l.add r1, r2, r3
    # CHECK: # encoding: [0xe0,0x22,0x18,0x00]

        l.addi r3, r4, 2
    # CHECK: # encoding: [0x9c,0x64,0x00,0x02]

        l.and r1, r2, r3
    # CHECK: # encoding: [0xe0,0x22,0x18,0x03]

        l.andi r3, r4, 2
    # CHECK: # encoding: [0xa4,0x64,0x00,0x02]

Slide 14

Slide 14 text

Testing
Assembling
Tests for each instruction have been written for assembly and disassembly.

Slide 15

Slide 15 text

The Future
Near term
- Fix bugs (primarily calculating relocations)
- Add or1k-specific object test cases
- More testing in general (making -integrated-as the default)
Not-so-near term
- More standalone assembly features (i.e. gas alternative)
Long term
- Mainline adoption
- Further LLVM integration, e.g. lld (LLVM linker)

Slide 18

Slide 18 text

Thank You
www.embecosm.com