LLVM Assembler for OpenRISC 1000

Simon Cook
October 13, 2012

Transcript

  1. LLVM Integrated Assembler for OpenRISC 1000

    Simon Cook, Embecosm. October 13, 2012.
    Copyright © 2012 Embecosm. Freely available under a Creative Commons license.
  2. Compiling with clang: Without Integrated Assembler

    clang -target or1k -ccc-host-triple or1k-elf -v helloworld.c
    - clang <...> -o /tmp/helloworld-Q1WvbT.s -x c helloworld.c
    - or1k-elf-gcc -v -c -o /tmp/helloworld-BZmisI.o -x assembler /tmp/helloworld-Q1WvbT.s
      <...>lib/gcc/<...>/as -o /tmp/helloworld-BZmisI.o /tmp/helloworld-Q1WvbT.s
    - or1k-elf-gcc -v -o a.out /tmp/helloworld-BZmisI.o
      <...>libexec/gcc/<...>collect2 <...> /tmp/helloworld-BZmisI.o
  6. LLVM Machine Code

    LLVM project to solve problems surrounding assembly, disassembly, etc.
    - Wasteful to format a .s file and then call an assembler to parse this new file.
    - An estimated 20% of compile time is spent on assembly; an integrated assembler gains some of this back.
    - Avoids issues with incompatible or out-of-date assemblers.
    - Uses the instruction definitions already written in TableGen for the compiler.
  7. TableGen

    Language for defining records (instructions, registers) in a flexible manner.
    Class-based structure which allows many similar records to be created quickly.
    - class ALU_RR<bits<4> subOp, string asmstr, list<dag> pattern> : <...> { let Inst{25-21} = rD; <...>
    - def ADD : ALU1_RR<0x0, "l.add", add>;
    These records are then parsed to build many different compiler components from the same definition
    - e.g. Assembler Tables, Disassembler Tables, Calling Convention Information, etc.
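As a loose analogy in Python (this is not TableGen), the sketch below shows how one list of record definitions can feed several derived tables. The `AluRR` class and the table names are hypothetical; the `ADD` row comes from the slide, and the `AND` sub-opcode is inferred from the test encodings later in this deck.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AluRR:
    """Hypothetical stand-in for a TableGen record: a register-register ALU op."""
    name: str     # record name, e.g. "ADD"
    sub_op: int   # 4-bit sub-opcode, as in ALU_RR<bits<4> subOp, ...>
    asmstr: str   # assembly mnemonic


# One list of definitions...
RECORDS = [
    AluRR("ADD", 0x0, "l.add"),  # from the slide
    AluRR("AND", 0x3, "l.and"),  # sub-opcode inferred from the deck's test encodings
]

# ...from which several tables are derived.
ASM_TABLE = {r.asmstr: r for r in RECORDS}     # mnemonic -> record (assembler side)
DISASM_TABLE = {r.sub_op: r for r in RECORDS}  # sub-opcode -> record (disassembler side)

print(ASM_TABLE["l.add"].name)
print(DISASM_TABLE[0x3].asmstr)
```

Adding a record to `RECORDS` updates both tables at once, which is the property the slide attributes to TableGen definitions.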
  8. OR1K Implementation

    The same steps can be applied to most targets.
    Four main components have been implemented:
    - Assembly Parser
    - Instruction Encoder
    - Instruction Decoder
    - ELF Object Writer
  9. OR1K Implementation: Parsing Assembly Files

    Done primarily for debugging and testing
    - Currently aiming for a working assembler; will make it standalone later
    Using an MCTargetAsmParser and a given instruction, parse operands to determine type and value.
    - e.g. r1 → <MCOperand Reg:2>
    Operands (the mnemonic is also an operand) are then passed to a TableGen'erated function which carries out type checking (l.add 1, 2, 3 makes no sense) and stores the instruction in the instruction stream.
    - $ echo "l.add r1, r2, r3" | llvm-mc -arch or1k -show-inst
    - l.add r1, r2, r3 # <MCInst #17 ADD # <MCOperand Reg:2> # <MCOperand Reg:3> # <MCOperand Reg:4>>
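The parse-then-type-check flow described above can be sketched in Python. This is a toy model, not the LLVM implementation: `parse_operand`, `parse_instruction` and the `SIGNATURES` table are hypothetical names, and the mapping of rN to register number N + 1 is an assumption read off the slide's example output (r1 → Reg:2).

```python
import re

# Assumption: register rN maps to enum value N + 1, matching the slide's
# r1 -> Reg:2, r2 -> Reg:3, r3 -> Reg:4.
REG_RE = re.compile(r"r(\d+)$")


def parse_operand(text):
    """Classify one operand string as ("reg", n) or ("imm", n)."""
    text = text.strip()
    m = REG_RE.match(text)
    if m and 0 <= int(m.group(1)) <= 31:
        return ("reg", int(m.group(1)) + 1)
    try:
        return ("imm", int(text, 0))
    except ValueError:
        raise ValueError(f"unrecognised operand: {text!r}")


# Hypothetical operand-type signatures; in LLVM these checks are TableGen-generated.
SIGNATURES = {
    "l.add": ("reg", "reg", "reg"),
    "l.addi": ("reg", "reg", "imm"),
}


def parse_instruction(line):
    """Split a line into mnemonic and operands, then type-check the operands."""
    mnemonic, _, rest = line.strip().partition(" ")
    operands = [parse_operand(op) for op in rest.split(",")]
    kinds = tuple(kind for kind, _ in operands)
    if SIGNATURES.get(mnemonic) != kinds:
        raise ValueError(f"invalid operands for {mnemonic}: {kinds}")
    return mnemonic, operands


print(parse_instruction("l.add r1, r2, r3"))
```

Here `parse_instruction("l.add 1, 2, 3")` raises `ValueError`, because three immediates do not match the register-register-register signature.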
  10. OR1K Implementation: Instruction Encoding

    The majority of this is already handled by TableGen
    - Uses the Inst field to store bit patterns and definitions of where operands go within the instruction
    Define register encodings
    - r1 → 0x1, etc.
    Define custom operand encodings
    - e.g. memory operands
    - Encode the register as before, bit-shift it and OR it with the immediate operand.
    It does not matter where an operand is in the instruction; TableGen manages this.
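To make the bit packing concrete, here is a minimal Python sketch of the kind of encoding the generated code performs. The function names are hypothetical. The rD position (bits 25-21) comes from the TableGen slide; the opcode value 0x38 and the rA and rB positions are inferred from the test encodings later in this deck, and the 16-bit immediate width in the memory operand is an assumption for illustration.

```python
def encode_rr(opcode, rd, ra, rb, sub_op):
    """Pack a register-register ALU instruction into a big-endian 32-bit word."""
    word = (opcode << 26) | (rd << 21) | (ra << 16) | (rb << 11) | sub_op
    return word.to_bytes(4, "big")


def encode_mem_operand(reg, imm):
    """Custom operand encoding: shift the register encoding left and OR it
    with the immediate (16-bit width assumed for this sketch)."""
    return (reg << 16) | (imm & 0xFFFF)


# l.add r1, r2, r3 with opcode 0x38 and sub-opcode 0x0
print(encode_rr(0x38, 1, 2, 3, 0x0).hex())  # prints "e0221800"
```

The printed bytes agree with the `[0xe0,0x22,0x18,0x00]` encoding that the deck's test expects for `l.add r1, r2, r3`.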
  11. OR1K Implementation: Instruction Decoding

    As before, most of the work is done by TableGen-built functions.
    Requires all instruction definitions to be unambiguous.
    Added (primarily) for completeness, but allows more debugging to take place.
    Anything that required custom encoding requires custom decoding.
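Decoding reverses the field packing. The Python sketch below is a simplified illustration, not the generated decoder: `decode_rr`, `disassemble` and the `MNEMONICS` table are hypothetical, the field positions are inferred from the deck's test encodings, and any other opcode bits in the word are ignored. The lookup table also shows why definitions must be unambiguous: each (opcode, sub-opcode) key can name only one instruction.

```python
def decode_rr(data):
    """Split a 4-byte big-endian word into register-register ALU fields."""
    word = int.from_bytes(data, "big")
    return {
        "opcode": (word >> 26) & 0x3F,
        "rd": (word >> 21) & 0x1F,
        "ra": (word >> 16) & 0x1F,
        "rb": (word >> 11) & 0x1F,
        "sub_op": word & 0xF,
    }


# Hypothetical decode table keyed on (opcode, sub_op); each key must be unique.
MNEMONICS = {(0x38, 0x0): "l.add", (0x38, 0x3): "l.and"}


def disassemble(data):
    """Turn 4 bytes back into assembly text for the two known instructions."""
    f = decode_rr(data)
    mnemonic = MNEMONICS[(f["opcode"], f["sub_op"])]
    return f"{mnemonic} r{f['rd']}, r{f['ra']}, r{f['rb']}"


print(disassemble(bytes([0xE0, 0x22, 0x18, 0x00])))  # prints "l.add r1, r2, r3"
```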
  12. OR1K Implementation: ELF Object Writing

    Defining all the important constants
    - Architecture identifier, etc.
    Defining relocations, LLVM fixups
    - So that the linker can do its job correctly
    Driven by TableGen
  13. Testing: Encoding and Decoding

    Instructions painstakingly encoded and tests added.
    # RUN: llvm-mc -arch=or1k -mattr=mul,div,ror -show-encoding %s | FileCheck %s
    l.add r1, r2, r3
    # CHECK: # encoding: [0xe0,0x22,0x18,0x00]
    l.addi r3, r4, 2
    # CHECK: # encoding: [0x9c,0x64,0x00,0x02]
    l.and r1, r2, r3
    # CHECK: # encoding: [0xe0,0x22,0x18,0x03]
    l.andi r3, r4, 2
    # CHECK: # encoding: [0xa4,0x64,0x00,0x02]
  14. Testing: Assembling

    Tests for each instruction have been written for assembly and disassembly.
  15. The Future

    Near term
    - Fix bugs (primarily calculating relocations)
    - Add or1k-specific object test cases
    - More testing in general (making -integrated-as the default)
    Not-so-near term
    - More standalone assembly features (i.e. gas alternative)
    Long term
    - Mainline adoption
    - Further LLVM integration, e.g. lld (LLVM linker)
  18. Thank You

    www.embecosm.com