
Making archive IL2C #6-55 dotNET600 2018

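These slides explain the core technology of IL2C as of the time of the presentation.

IL2C is a toolset that translates .NET assemblies into C language code.
See also the video recording: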
YouTube: https://www.youtube.com/watch?v=Y--YjQQLdcg

What is IL2C
* Building scheme

Translation details
* The runtime types – primitive and string
* How the garbage collector works
* The value type / boxing
* The enum types
* The delegate types
* How exceptions work
* How virtual methods work (virtual, override and interface implementations)

-----

.NET Conf 2019 meetup in AICHI
https://centerclr.connpass.com/event/143949

Kouji Matsui

October 05, 2019

Transcript

  1. Kouji Matsui – kozy, kekyo
    • NAGOYA city, AICHI pref., JP
    • Twitter – @kozy_kekyo, @kekyo2 / Facebook
    • Self-employed (I’m looking for a job)
    • Microsoft Most Valuable Professional, VS and DevTech, 2015-
    • Certified Scrum Master / Scrum Product Owner
    • Center CLR organizer
    • .NET/C#/F#/IL/metaprogramming and the like…
    • Bike rider
  2. Agenda
    Abstract
    What is IL2C
    ◦ Building scheme
    Translation details
    ◦ The runtime types – primitive and string
    ◦ How the garbage collector works
    ◦ The value type / boxing
    ◦ The enum types
    ◦ The delegate types
    ◦ How exceptions work
    ◦ How virtual methods work (virtual, override and interface implementations)
  3. Abstract
    WARNING:
    ◦ This session covers very complex technical topics, rated LEVEL 600 or above.
    ◦ Are you ready? ;)
    How does IL2C work, and how does it aim for tiny resource requirements?
    How does IL2C do AOT (ahead-of-time) compilation?
    What has the IL2C project done, what is it doing, and what will it do?
  5. What is IL2C
    A translator from ECMA-335 CIL/MSIL to the C language.
    [Diagram: C# code / F# code → IL2C → target native binary]
  6. What is IL2C
    IL2C's implementation priorities. We are aiming for:
    ◦ Better predictability of runtime costs and better human readability of the translated C source code.
    ◦ A very small footprint; we are thinking about how to fit both tiny embedded systems and large systems with many resources.
    ◦ Better code/runtime portability; the minimum requirement is only a C99 compiler.
    ◦ Better interoperability with existing C libraries; we can use standard .NET interop techniques (such as P/Invoke).
    ◦ A seamless build system for major C toolchains, for example CMake, Arduino IDE, VC++ ...
  7. What is IL2C
    [Diagram: C# code → C# compiler (Roslyn), F# code → F# compiler (FCS), or another compiler → assembly (*.dll) → IL2C → C language (*.c, *.h) → target dev C language compiler (VS, Arduino, mbed, etc.) → target native binary]
  8. What is IL2C
    [Diagram: C# code → C# compiler (Roslyn), F# code → F# compiler (FCS) → assembly (*.dll), annotated “CIL / MSIL, ECMA-335 specific binaries” → IL2C → C language (*.c, *.h), annotated “C language source code” → target dev C language compiler (VS, Arduino, mbed, etc.) → target native binary]
  9. Building schemes (aiming for)
    [Diagram: C# code → C# compiler (Roslyn) → assembly (*.dll) → IL2C → C language (*.c, *.h) → target dev C language compiler (VS, Arduino, mbed, etc.) → target native binary. Further labeled inputs: NuGet (*.dll), C language (*.c, *.h), IL2C Runtime (*.c, *.h), prebuilt libraries, 3rd party libraries]
  11. The runtime types – primitive and string
    typedef aliases
    ◦ stdint.h, stdbool.h, wchar_t, float.h
    byte → uint8_t
    short → int16_t
    int → int32_t
    long → int64_t
    sbyte → int8_t
    ushort → uint16_t
    uint → uint32_t
    ulong → uint64_t
    float → float
    double → double
    bool → bool
    char → wchar_t
    IntPtr → intptr_t
    UIntPtr → uintptr_t
  12. The runtime types – primitive and string
    System.String – variable storage space
    [Diagram: System_String (heap) with fields vptr0__ (→ System_String_VTable) and string_body__ (“ABCDEFGHIJ\0”)]
  13. The runtime types – primitive and string
    Constant literal string
    [Diagram: System_String (.rdata) with fields vptr0__ (→ System_String_VTABLE) and string_body__ (→ const wchar_t[] (.rdata) “ABCDEFGHIJ\0”)]
  14. The runtime types – primitive and string
    Constant literal string
    [Diagram: IL2C_REF_HEADER (.rdata) with fields pNext, type, gcMark, followed by System_String with fields vptr0__ (→ VTABLE) and string_body__ (→ const wchar_t[] (.rdata) “ABCDEFGHIJ\0”)]
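The constant literal layout on slides 13 and 14 can be sketched in C roughly as follows. This is a minimal sketch, not IL2C's actual declarations: the field names come from the slides, while the field types, the `ConstStringObject` wrapper, the `GCMARK_CONST` value and the two helper functions are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/* Field names follow the slides; every field type here is an assumption. */
typedef struct IL2C_REF_HEADER {
    struct IL2C_REF_HEADER* pNext;   /* next object header in the heap list */
    const void* type;                /* runtime type information */
    int32_t gcMark;                  /* mark state used by the collector */
} IL2C_REF_HEADER;

typedef struct System_String {
    const void* vptr0__;             /* would point at the string VTABLE */
    const wchar_t* string_body__;    /* points at the character data */
} System_String;

/* Hypothetical helper type: header and body laid out back to back. */
typedef struct ConstStringObject {
    IL2C_REF_HEADER header;
    System_String body;
} ConstStringObject;

#define GCMARK_CONST (-1)            /* hypothetical "never collect" mark */

/* Both parts are const, so a compiler may place them in read-only data. */
static const wchar_t literal_chars[] = L"ABCDEFGHIJ";
static const ConstStringObject literal_object = {
    { NULL, NULL, GCMARK_CONST },
    { NULL, literal_chars }
};

/* Hand out the literal as an ordinary string reference. */
const System_String* get_literal(void)
{
    return &literal_object.body;
}

/* The header sits directly in front of the object body. */
const IL2C_REF_HEADER* header_of(const System_String* str)
{
    return (const IL2C_REF_HEADER*)
        ((const char*)str - offsetof(ConstStringObject, body));
}
```

Giving the literal the same header as a heap object means code that walks back from an object pointer to its header does not need a special case for constants.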
  16. How the garbage collector works
    Basic strategy – mark-and-sweep algorithm (NOT including compaction)
    Phases: 1. Clear mark, 2. Set mark, 3. Free unmarked
    [Diagram: object graph reachable from three roots]
  17. How the garbage collector works: Phase 1
    [Diagram: g_pBeginHeader → chain of IL2C_REF_HEADER (heap) linked by pNext, each with fields pNext, type, gcMark; every gcMark is set to GCMARK_NOMARK]
  18. How the garbage collector works: Phase 2
    [Diagram: starting from g_pBeginFrame, reachable objects such as a System_String (vptr0__, string_body__ “ABCDEFGHIJ\0”) get gcMark = GCMARK_LIVE in their IL2C_REF_HEADER (heap)]
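  19. How the garbage collector works: Phase 2
    [Diagram, step 2 details: g_pBeginFrame leads to the frames of Function1(), Function2() and Function3(); each frame lists its locals (local1, local2, local3), some of which are null; the objects referenced by the other locals are marked GCMARK_LIVE]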
  20. How the garbage collector works: Phase 3
    [Diagram: g_pBeginHeader → chain of IL2C_REF_HEADER (heap) linked by pNext; some headers are GCMARK_LIVE and the others are still GCMARK_NOMARK, which are the ones to free]
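The three phases can be sketched in C as below. This is a simplified model under stated assumptions, not IL2C's runtime: `g_pBeginHeader`, `g_pBeginFrame`, `pNext`, `gcMark` and the two mark values are named on the slides, while the `FRAME` layout, the single `child` reference per object, and the functions `gc_alloc`, `gc_collect` and `gc_demo` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum { GCMARK_NOMARK = 0, GCMARK_LIVE = 1 };

typedef struct IL2C_REF_HEADER {
    struct IL2C_REF_HEADER* pNext;   /* next allocated object */
    int gcMark;
    struct IL2C_REF_HEADER* child;   /* hypothetical: one outgoing reference */
} IL2C_REF_HEADER;

/* Hypothetical execution frame: addresses of a function's local references. */
typedef struct FRAME {
    struct FRAME* pNext;
    int count;
    IL2C_REF_HEADER** pLocals[4];
} FRAME;

IL2C_REF_HEADER* g_pBeginHeader = NULL;  /* head of the list of all objects */
FRAME* g_pBeginFrame = NULL;             /* head of the list of frames (roots) */

IL2C_REF_HEADER* gc_alloc(void)
{
    IL2C_REF_HEADER* p = calloc(1, sizeof *p);
    if (p == NULL) abort();
    p->pNext = g_pBeginHeader;           /* link into the object list */
    g_pBeginHeader = p;
    return p;
}

static void mark(IL2C_REF_HEADER* p)
{
    /* Follow the reference chain; stop at null or an already marked object. */
    while (p != NULL && p->gcMark != GCMARK_LIVE) {
        p->gcMark = GCMARK_LIVE;
        p = p->child;
    }
}

/* Runs one collection and returns the number of freed objects. */
int gc_collect(void)
{
    int freed = 0;
    /* Phase 1: clear every mark. */
    for (IL2C_REF_HEADER* p = g_pBeginHeader; p != NULL; p = p->pNext)
        p->gcMark = GCMARK_NOMARK;
    /* Phase 2: mark everything reachable from the locals of each frame. */
    for (FRAME* f = g_pBeginFrame; f != NULL; f = f->pNext)
        for (int i = 0; i < f->count; i++)
            mark(*f->pLocals[i]);
    /* Phase 3: unlink and free whatever is still unmarked (no compaction). */
    IL2C_REF_HEADER** pp = &g_pBeginHeader;
    while (*pp != NULL) {
        IL2C_REF_HEADER* p = *pp;
        if (p->gcMark == GCMARK_NOMARK) { *pp = p->pNext; free(p); freed++; }
        else pp = &p->pNext;
    }
    return freed;
}

/* Usage: one frame with a live local and a null local, plus one orphan. */
int gc_demo(void)
{
    IL2C_REF_HEADER* local1 = gc_alloc();
    IL2C_REF_HEADER* local2 = NULL;
    FRAME frame = { g_pBeginFrame, 2, { &local1, &local2 } };
    g_pBeginFrame = &frame;
    local1->child = gc_alloc();          /* reachable through local1 */
    (void)gc_alloc();                    /* unreachable from any root */
    int freed = gc_collect();            /* frees only the orphan */
    local1 = NULL;
    freed += gc_collect();               /* frees the remaining two objects */
    g_pBeginFrame = frame.pNext;         /* unlink the frame on exit */
    return freed;
}
```

Because the frame stores the addresses of the locals rather than their values, setting `local1` to null between the two collections is enough to make its objects collectable.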
  22. The value type / boxing
    [Diagram: a boxed System_Int32 on the heap: IL2C_REF_HEADER (pNext, type, gcMark), then System_ValueType (vptr0__), then the System_Int32 value, 32 bit (4 bytes)]
    The boxed size is exactly sizeof(IL2C_REF_HEADER) + sizeof(System_ValueType) + sizeof(System_Int32).
  23. The value type / boxing
    [Diagram: unboxing from IL2C_REF_HEADER (heap), System_ValueType (vptr0__), System_Int32: 1. get a pointer (System_Int32*) to the value inside the box; 2. dereference it, copying the value into a System_Int32 local]
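Boxing and unboxing as drawn on slides 22 and 23 can be sketched like this. It is a sketch under assumptions: the three type names come from the slides, but their field types and the functions `boxed_int32_size`, `box_int32`, `unbox_int32`, `free_boxed` and `box_demo` are hypothetical, and the pointer arithmetic assumes the usual alignment where a pointer-sized struct is followed by a 4-byte integer without padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct IL2C_REF_HEADER {
    struct IL2C_REF_HEADER* pNext;
    const void* type;
    int32_t gcMark;
} IL2C_REF_HEADER;

typedef struct System_ValueType {
    const void* vptr0__;
} System_ValueType;

typedef int32_t System_Int32;

/* Size of the boxed object as given on slide 22: header + base + value. */
size_t boxed_int32_size(void)
{
    return sizeof(IL2C_REF_HEADER) + sizeof(System_ValueType) + sizeof(System_Int32);
}

/* Boxing: allocate the three parts back to back and copy the value in. */
System_ValueType* box_int32(System_Int32 value)
{
    IL2C_REF_HEADER* header = calloc(1, boxed_int32_size());
    if (header == NULL) abort();
    System_ValueType* boxed = (System_ValueType*)(header + 1);
    memcpy(boxed + 1, &value, sizeof value);
    return boxed;
}

/* Unboxing, step 1: get a pointer to the value stored inside the box. */
System_Int32* unbox_int32(System_ValueType* boxed)
{
    return (System_Int32*)(boxed + 1);
}

void free_boxed(System_ValueType* boxed)
{
    free((IL2C_REF_HEADER*)boxed - 1);
}

/* Usage: returns 1 when the copy and the in-place update both behave as expected. */
int box_demo(void)
{
    System_ValueType* boxed = box_int32(123);
    System_Int32 local = *unbox_int32(boxed);   /* step 2: dereference (copy) */
    *unbox_int32(boxed) = 456;                  /* write through the raw pointer */
    int ok = (local == 123) && (*unbox_int32(boxed) == 456);
    free_boxed(boxed);
    return ok;
}
```

The local keeps its copied value after the box is modified, which is the difference between the copying unbox on slide 23 and the copy-free pointer access discussed on the following slides.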
  24. The value type / boxing
    Valid? Roslyn does not allow a value type method to be marked “virtual”, but we can force virtual dispatch by assigning the value to an interface. The value will be boxed for the call (Q2).
  25. The value type / boxing: Q1
    [Diagram: 1. get a pointer (System_Int32*) to the System_Int32 value; 2. pass it as arg0 (this) to System_Int32_ToString(System_Int32* this)]
    Copy-free invocation.
  26. The value type / boxing: Q3
    [Diagram: a boxed System_Int32 (IL2C_REF_HEADER (heap): pNext, type, gcMark; System_ValueType: vptr0__; System_Int32) whose vptr0__ refers to System_Int32_VTABLE (.rdata): offset__, System_Int32_Equals(…), System_Object_Finalize(…), System_Int32_GetHashCode(…), System_Int32_ToString(…). The target is System_Int32_ToString(System_Int32* this).]
    Where is the unboxing?
  27. The value type / boxing: Q3
    [Diagram: System_Int32_VTABLE (.rdata): offset__, System_Int32_Equals_VFunc(…), System_Object_Finalize(…), System_Int32_GetHashCode_VFunc(…), System_Int32_ToString_VFunc(…). System_Int32_ToString_VFunc(System_ValueType* this) unboxes (without copying) and calls System_Int32_ToString(System_Int32* this).]
    We can (and have to) manipulate the instance fields.
  28. The value type / boxing: Q2
    [Diagram: System_Int32_IFoo_VTABLE (.rdata): offset__, System_Int32_ToString_VFunc(…). System_Int32_ToString_VFunc(System_ValueType* this) unboxes (without copying) and calls System_Int32_ToString(System_Int32* this).]
    We can manipulate the instance fields, but the changes will be discarded.
  29. The value type / boxing
    My strategy was:
    ◦ Q1: Value type methods access their fields through an unboxed raw pointer (copy-free access).
    ◦ Q2: Interface implementations can use the same procedure. → We have to copy the instance…
    ◦ Q3: Value type virtual methods use the same procedure… → This problem will be fixed…
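The `_VFunc` entries on slides 27 and 28 can be read as thunks that unbox without copying and then forward to the real method. The sketch below illustrates that reading with `GetHashCode` because it returns a plain integer; the VTABLE struct layout, the `BoxedInt32` type (with the reference header left out) and `vfunc_demo` are assumptions, and `this_` stands in for the slides' `this`.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t System_Int32;
typedef struct System_ValueType System_ValueType;

/* Hypothetical VTABLE layout: the offset field plus one virtual slot. */
typedef struct System_Int32_VTABLE_DECL {
    intptr_t offset__;
    int32_t (*GetHashCode)(System_ValueType* this_);
} System_Int32_VTABLE_DECL;

struct System_ValueType {
    const System_Int32_VTABLE_DECL* vptr0__;
};

/* Hypothetical boxed layout; the IL2C_REF_HEADER in front is omitted here. */
typedef struct BoxedInt32 {
    System_ValueType base;
    System_Int32 value;
} BoxedInt32;

/* The real method works on an unboxed raw pointer (copy-free, as in Q1). */
int32_t System_Int32_GetHashCode(System_Int32* this_)
{
    return *this_;
}

/* The thunk stored in the VTABLE: unbox (no copy), then forward the call. */
int32_t System_Int32_GetHashCode_VFunc(System_ValueType* this_)
{
    return System_Int32_GetHashCode(&((BoxedInt32*)this_)->value);
}

const System_Int32_VTABLE_DECL System_Int32_VTABLE = {
    0, System_Int32_GetHashCode_VFunc
};

/* Usage: a virtual call through the boxed reference reaches the raw-pointer method. */
int32_t vfunc_demo(void)
{
    BoxedInt32 boxed = { { &System_Int32_VTABLE }, 42 };
    System_ValueType* obj = &boxed.base;
    return obj->vptr0__->GetHashCode(obj);
}
```

Since the thunk passes a pointer into the box, any field the method changes is changed in the boxed instance itself.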
  31. The enum types
    Q: How would you implement enum types in the C language?
    [Code: C# (IL) translated to C]
  32. The enum types
    Q: How would you implement enum types in the C language?
    [Code: C# (IL) translated to C]
    It has 2 problems.
  33. The enum types
    A1: Enum symbol names are global in C. We have to be able to give the symbols of each enum type different names…
    [C code annotations: the type name is combined with the namespace; each value symbol is combined with the namespace as well]
  34. The enum types
    A2: In the C language, an enum type can’t specify its storage type.
    [C code] consoleapplication1.c(3): error C2059: syntax error: ':'
    In C++, we can use this syntax.
  35. The enum types
    Final result, the IL2C way:
    [C code] IL2C has to calculate the real value of each symbol…
  36. The enum types
    Final result, the IL2C way:
    ◦ Each enum type is NOT derived from the System.Enum type.
    ◦ Enum types and integer types convert implicitly in both directions.
    [C# and IL code: converting from Int32 to Int32EnumType implicitly]
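One way to satisfy both constraints in plain C is a typedef for the storage type plus fully qualified constants with precomputed values. The sketch below only illustrates that idea; the transcript does not show IL2C's generated code, and the `MyApp.Color` enum and the two conversion functions are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C# source:
 *   namespace MyApp { enum Color : byte { Red, Green = 5, Blue } }
 */

/* Problem A2: C cannot write "enum Color : uint8_t", so the storage type
 * becomes a typedef of the underlying integer type. */
typedef uint8_t MyApp_Color;

/* Problem A1: C enum symbols are global, so each symbol carries the namespace
 * and type name. Blue has no explicit value in C#, so the translator has to
 * compute it (previous value + 1). */
enum {
    MyApp_Color_Red = 0,
    MyApp_Color_Green = 5,
    MyApp_Color_Blue = 6
};

/* With a typedef, enum and integer values convert implicitly both ways. */
int32_t color_to_int32(MyApp_Color color)
{
    return color;
}

MyApp_Color int32_to_color(int32_t value)
{
    return (MyApp_Color)value;   /* the cast only documents the narrowing */
}
```

A limitation of this sketch is that C enumeration constants have type `int`, so an enum whose underlying type is wider than `int` would need another form of constant.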
  38. The delegate types
    Delegate types are almost always derived from System.MulticastDelegate.
    ◦ The System.Delegate type has two members, “Method” and “Target”.
    ◦ The derived delegate type has a type-safe method named “Invoke”.
    [Diagram: System.Object → System.Delegate (MethodInfo Method; object Target;) → System.MulticastDelegate → System.EventHandler (void Invoke(object sender, EventArgs e))]
  39. The delegate types
    The “Method” member holds the information about the callee.
    ◦ If the callee is a static method (not an instance method), “Target” is null.
    [Diagram: System.Delegate whose Method refers to the static callee method private static void Form_Clicked(object sender, EventArgs e); there is no target instance, so Target is null]
  40. The delegate types
    ◦ If the callee is an instance method (not a static method), “Target” refers to the instance.
    [Diagram: System.Delegate whose Method refers to the instance callee method private void Form_Clicked(object sender, EventArgs e) of class Bar.Form; Target refers to the instance]
  41. The delegate types
    IL2C emits the invocation code with an instance-detection expression.
    [C# and C code: System_EventHandler_Invoke(…) detects whether “Target” is not NULL]
  42. The delegate types
    An instance method of a value type can be assigned to a delegate, and this is valid.
    [C# code, referring to: public override string System.Int32.ToString()]
  43. The delegate types
    An instance method of a value type can be assigned to a delegate, and this is valid.
    [IL code: boxing, then resolving the virtual method pointer]
  44. The delegate types
    The boxing opcode is very important in this case: the “Target” member is of type System.Object, so it can’t store a native value or a managed pointer.
    [Diagram: the value type System.Int32 is boxed (IL2C_REF_HEADER (heap)); System.Delegate Target (System.Object) refers to the boxed target and Method refers to the instance method public override string ToString()]
  45. The delegate types
    As illustrated, a System.MulticastDelegate can hold multiple delegates in one delegate. The implementation is a list of delegates.
    [C# code: it contains a delegate[]]
  46. The delegate types
    As illustrated, a System.MulticastDelegate can contain multiple delegates in one delegate. The implementation is a list of delegates.
    [Diagram: MulticastDelegate (invocationList, …) → delegate[] ([0], [1], [2], …) → one System.Delegate per element]
  47. The delegate types
    As we know, all of these instances come from the heap…
    [Diagram: the MulticastDelegate (invocationList, …), the delegate[] ([0], [1], [2], …) and each System.Delegate element all carry their own IL2C_REF_HEADER (heap)]
    Too many allocations!!
  48. The delegate types
    The IL2C way: all delegate types are aggregated into a single implementation, “System_Delegate”. It contains the list of delegates, and the list is NOT an array. Most importantly, the delegate storage size is VARIABLE.
    [Diagram: System_Delegate (…) followed by a variable-length invocation table: [0] Target, [0] Method, [1] Target, [1] Method; each Method refers to private static void Form_Clicked(object sender, EventArgs e)]
  49. The delegate types
    This reduces the heap management cost.
    [Diagram: a single IL2C_REF_HEADER (heap) in front of System_Delegate (…) and its invocation table: [0] Target, [0] Method, [1] Target, [1] Method; each Method refers to private static void Form_Clicked(object sender, EventArgs e)]
  50. The delegate types
    Delegate.Combine() and Delegate.Remove() calculate the invocation table and store it in a delegate instance that is allocated only once.
    [C code: Delegate.Combine(…) allocates once and combines the invocation tables]
  51. The delegate types
    Overall, IL2C translates each delegate type to a specialized delegate invoker:
    [C code: for every method in the invocation table, invoke it through its function pointer]
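The variable-size delegate, the single allocation in Combine, and the NULL check on “Target” can be put together in a short C sketch. This is not IL2C's code: `System_Delegate` is named on the slides, but the struct layout, `RawMethod`, the simplified handler signature (an `int*` sender and an `int` argument instead of `object` and `EventArgs`) and all function names are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef void (*RawMethod)(void);           /* type-erased function pointer */

typedef struct InvocationEntry {
    void* target;                          /* NULL when the callee is static */
    RawMethod method;
} InvocationEntry;

/* One heap block per delegate: count plus a variable-length invocation table. */
typedef struct System_Delegate {
    size_t count;
    InvocationEntry table[];               /* C99 flexible array member */
} System_Delegate;

static System_Delegate* delegate_alloc(size_t count)
{
    System_Delegate* d = malloc(sizeof *d + count * sizeof d->table[0]);
    if (d == NULL) abort();
    d->count = count;
    return d;
}

System_Delegate* delegate_new(void* target, RawMethod method)
{
    System_Delegate* d = delegate_alloc(1);
    d->table[0].target = target;
    d->table[0].method = method;
    return d;
}

/* Combine: allocate once, then copy both invocation tables into the block. */
System_Delegate* delegate_combine(const System_Delegate* a, const System_Delegate* b)
{
    System_Delegate* d = delegate_alloc(a->count + b->count);
    memcpy(d->table, a->table, a->count * sizeof a->table[0]);
    memcpy(d->table + a->count, b->table, b->count * sizeof b->table[0]);
    return d;
}

typedef void (*StaticHandler)(int* sender, int e);
typedef void (*InstanceHandler)(void* this_, int* sender, int e);

/* Specialized invoker: walk the table and check whether Target is NULL. */
void EventHandler_Invoke(const System_Delegate* d, int* sender, int e)
{
    for (size_t i = 0; i < d->count; i++) {
        const InvocationEntry* entry = &d->table[i];
        if (entry->target != NULL)
            ((InstanceHandler)entry->method)(entry->target, sender, e);
        else
            ((StaticHandler)entry->method)(sender, e);
    }
}

/* Usage: one static and one instance handler combined into one delegate. */
typedef struct Form { int factor; } Form;

static void Static_Clicked(int* sender, int e) { *sender += e; }
static void Form_Clicked(void* this_, int* sender, int e)
{
    *sender += ((Form*)this_)->factor * e;
}

int delegate_demo(void)
{
    Form form = { 10 };
    int total = 0;
    System_Delegate* a = delegate_new(NULL, (RawMethod)Static_Clicked);
    System_Delegate* b = delegate_new(&form, (RawMethod)Form_Clicked);
    System_Delegate* both = delegate_combine(a, b);
    EventHandler_Invoke(both, &total, 2);  /* adds 2, then 10 * 2 */
    free(a); free(b); free(both);
    return total;
}
```

Each function pointer is cast back to its real signature before the call, which is what makes storing it in a type-erased slot valid C.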
  53. How exceptions work
    Important points about the global unwind:
    ◦ We have to find the matching exception filter (the “catch” type clause).
    ◦ If it is not found, keep looking for filters while crawling the stack frames from top to bottom.
    [C code and diagram: the stack frames of main() and Function1() to Function5(), linked from g_pBeginFrame at the stack frame top down to null at the stack frame bottom]
  54. How exceptions work
    Unwinding resembles crawling the stack frames, but exception handling has to work a bit differently:
    ◦ We have to use the “exception frame” instead of the stack frame.
    ◦ An exception frame is instantiated only for each exception block.
    [Diagram: main() and RaiseNestedBlockLocal() with exception block 1 and exception block 2; g_pTopUnwindTarget, exception frame 2, exception frame 1, null]
  55. How exceptions work
    How do the exception frames work on a global unwind?
    [Diagram: main(), RaiseNestedBlockGlobal() and RaiseNestedBlockCallee() with exception block 1 and exception block 2; g_pTopUnwindTarget, exception frame 2, exception frame 1, null]
  56. How exceptions work
    This is the exception frame and its overall usage.
    [C code and diagram: main(), Function1(), Function2(), Function3(); g_pTopUnwindTarget, exception frame 3, exception frame 2, exception frame 1, null; one function is annotated “This function doesn’t contain any exception block”]
  57. How exceptions work
    IL2C handles an exception as follows:
    1. An exception is raised with a strongly typed instance.
    2. The runtime crawls the exception frames from the top exception frame to the bottom. The “g_pTopUnwindFrame” pointer refers to the top frame.
    3. It calls the exception filter function of each exception frame.
    4. If an exception filter function matches the exception type, IL2C has found a catchable block and transfers the execution context into it (unwind done).
    5. If no catchable block is found by the bottom frame, IL2C raises the “Unhandled exception” state.
  58. How exceptions work
    What is the exception filter?
    [C code annotations: method name; index number of the local exception frame; “catch (Exception)”; the result is the filter number (unique in each filter)]
  59. How exceptions work
    What if an exception block contains multiple catch blocks?
    [C# code: multiple catch clauses in one exception block]
  60. How exceptions work
    What if an exception block contains multiple catch blocks?
    [C code: the filter checks the exception types in series; the result is the filter number]
  61. How exceptions work
    The “filter number” identifies the branch target in the method body. Skeleton of the exception block:
    [C code annotations: use this filter function; the filter number]
  62. How exceptions work
    “il2c_try”, “il2c_catch”, “il2c_end_try” and the other exception handling helpers are declared as C language macros. How do these macros work in detail?
  63. How exceptions work
    [C code annotations: an exception frame is declared for each try block; the current execution context is saved; the setjmp() result is used to branch to the try block or the catch block]
  64. How exceptions work
    What about the exception-raising side?
    [C code annotations: it uses the longjmp() function; the execution context unwinds to the saved one]
  65. How exceptions work
    There are many, many more exception handling topics that this session did not cover:
    ◦ How does the “rethrow” feature work?
    ◦ How does nested rethrow work?
    ◦ How do nested local exception blocks work?
    ◦ How does the finally block work?
    ◦ How do filter and fault blocks work?
    ◦ The setjmp and longjmp way (named sjlj) is slower. Can we improve it another way?
    ◦ How do asynchronous exceptions work? (NullReferenceException, ArithmeticException, etc.)
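The frame crawl, the filter functions and the setjmp()/longjmp() pair described on slides 57 to 64 can be sketched as follows. It is a minimal model, not the il2c_try/il2c_catch macros themselves: `g_pTopUnwindTarget` appears on the slides, while `EXCEPTION_FRAME`, `ExceptionType`, `g_currentException`, `throw_exception` and the demo functions are hypothetical.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for strongly typed exception instances. */
typedef enum { EX_INVALID_OPERATION = 10, EX_FORMAT = 20 } ExceptionType;

/* A filter returns a non-zero filter number when its block can catch `ex`. */
typedef int (*ExceptionFilter)(ExceptionType ex);

typedef struct EXCEPTION_FRAME {
    struct EXCEPTION_FRAME* pNext;   /* next frame toward the bottom */
    ExceptionFilter filter;
    jmp_buf saved;                   /* execution context saved by setjmp() */
} EXCEPTION_FRAME;

EXCEPTION_FRAME* g_pTopUnwindTarget = NULL;
ExceptionType g_currentException;    /* hypothetical: the exception in flight */

void throw_exception(ExceptionType ex)
{
    /* Crawl the exception frames from the top frame to the bottom. */
    for (EXCEPTION_FRAME* f = g_pTopUnwindTarget; f != NULL; f = f->pNext) {
        int filterNumber = f->filter(ex);
        if (filterNumber != 0) {
            g_pTopUnwindTarget = f->pNext;    /* drop this frame and all above it */
            g_currentException = ex;
            longjmp(f->saved, filterNumber);  /* unwind to the saved context */
        }
    }
    abort();                                  /* unhandled exception */
}

static int filter_format_only(ExceptionType ex) { return ex == EX_FORMAT ? 1 : 0; }
static int filter_any(ExceptionType ex) { (void)ex; return 1; }

static void callee(void)
{
    EXCEPTION_FRAME frame;                    /* one frame per try block */
    frame.pNext = g_pTopUnwindTarget;
    frame.filter = filter_format_only;
    g_pTopUnwindTarget = &frame;
    if (setjmp(frame.saved) == 0) {
        /* try block: this exception does not match the local filter,
         * so the crawl continues in the caller's frame (global unwind). */
        throw_exception(EX_INVALID_OPERATION);
    }
    /* a catch block for EX_FORMAT would go here; not reached in this demo */
}

/* Usage: returns the exception caught by the outer block. */
int exception_demo(void)
{
    EXCEPTION_FRAME frame;
    frame.pNext = g_pTopUnwindTarget;
    frame.filter = filter_any;
    g_pTopUnwindTarget = &frame;
    switch (setjmp(frame.saved)) {            /* branch on the filter number */
    case 0:                                   /* try block */
        callee();
        g_pTopUnwindTarget = frame.pNext;     /* normal exit: unlink the frame */
        return 0;
    case 1:                                   /* catch block for filter number 1 */
        return (int)g_currentException;
    default:
        return -1;
    }
}
```

The exception in flight is kept in a global rather than in the frame because a non-volatile local modified between setjmp() and longjmp() has an indeterminate value after the jump.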
  67. How virtual methods work
    The runtime type information:
    [Diagram: IL2C_REF_HEADER (heap; pNext, type, gcMark) followed by System_String (vptr0__, string_body__ “ABCDEFGHIJ\0”). The type field refers to an IL2C_RUNTIME_TYPE_DECL (.rdata) with fields pTypeName (“System.String”, UTF8 string), flags, bodySize, baseType ([System.Object] IL2C_RUNTIME_TYPE_DECL), vptr0 (System_String_VTABLE__, with an arrow labeled “Copy”), markTarget, interfaceCount, …]
  68. How virtual methods work
    [C code: the flags include IL2C_TYPE_REFERENCE (objref), IL2C_TYPE_VALUE (value type), IL2C_TYPE_VARIABLE (variable storage size), …]
    In the C language, sizeof() is not valid for a type with empty storage space, so for such a type the size covered by this field is 0.
  69. How virtual methods work
    How the virtual method table (VTABLE) works:
    [Diagram: IL2C_RUNTIME_TYPE_DECL (…, vptr0, …) and System_String (vptr0__, string_body__) refer to System_String_VTABLE (.rdata): offset__ (0), System_String_Equals(…), System_Object_Finalize(…), System_String_GetHashCode(…), System_String_ToString(…); the GetHashCode entry refers to System_String_GetHashCode(System_String* this)]
  70. How virtual methods work
    How does invoking a virtual method work?
    [C code and diagram: vptr0__ → VirtualBaseType_VTABLE__: offset__ (0), GetStringFromInt32(…), … → VirtualBaseType_GetStringFromInt32(VirtualBaseType* this, int32_t value)]
  71. How virtual methods work
    How does invoking a virtual method work?
    [C code and the same diagram]
    The offset is subtracted from the this pointer: arg0 = this - offset__. But here the “offset__” field is always 0…
  72. How virtual methods work
    How the interface virtual method table works:
    [Diagram: a VirtualNewAndImplementType instance holds vptr0__ at +0 (→ VirtualNewAndImplementType_VTABLE, offset__ 0) and vptr_IInterfaceType1 at +4 (→ VirtualNewAndImplementType_IInterfaceType1_VTABLE, offset__ 4). TYPE [VirtualNewAndImplementType] lists interfaceType (→ TYPE [IInterfaceType1]) and vptrInterface. The interface method void GetStringFromInt32() resolves to VirtualNewAndImplementType_GetStringFromInt32(VirtualNewAndImplementType* this).]
    The interface VTABLE contains the this pointer’s offset. Subtract the offset to adjust the pointer: arg0 = this - offset__.
  73. How virtual methods work
    How does invoking an interface virtual method work?
    [C code and diagram: VirtualNewImplements_IInterfaceType1_VTABLE: offset__ (4), GetStringFromInt32(…), … → VirtualNewAndImplementType_GetStringFromInt32(VirtualNewAndImplementType* this)]
    Exactly the same calculation is used for both the class VTABLE and the interface VTABLE: arg0 = this - offset__.
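The shared calculation arg0 = this - offset__ can be sketched in C as below. The names `vptr0__` and `offset__` come from the slides; the `Impl` type, the `IFoo` interface field, the `GetNumber` method and the helper functions are hypothetical, and the real offset is taken with offsetof() rather than the fixed 4 shown on the slide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical VTABLE shape shared by the class and the interface. */
typedef struct VTABLE {
    intptr_t offset__;   /* distance from the object start to the vptr field */
    int32_t (*GetNumber)(void* this_, int32_t value);
} VTABLE;

/* Hypothetical type that implements one interface. */
typedef struct Impl {
    const VTABLE* vptr0__;     /* class VTABLE, at offset 0 */
    const VTABLE* vptr_IFoo;   /* interface VTABLE, at a non-zero offset */
    int32_t base;
} Impl;

/* The method body always expects a pointer to the start of the object. */
static int32_t Impl_GetNumber(void* this_, int32_t value)
{
    return ((Impl*)this_)->base + value;
}

const VTABLE Impl_VTABLE = { 0, Impl_GetNumber };
const VTABLE Impl_IFoo_VTABLE = { offsetof(Impl, vptr_IFoo), Impl_GetNumber };

/* A class or interface reference is modeled as a pointer to a vptr field.
 * The same adjustment serves both kinds: arg0 = this - offset__. */
int32_t call_GetNumber(void* ref, int32_t value)
{
    const VTABLE* vtable = *(const VTABLE* const*)ref;
    void* arg0 = (char*)ref - vtable->offset__;
    return vtable->GetNumber(arg0, value);
}

/* Usage: call through the class reference and through the interface reference. */
int32_t vtable_demo(void)
{
    Impl obj = { &Impl_VTABLE, &Impl_IFoo_VTABLE, 100 };
    void* asClass = &obj.vptr0__;
    void* asIFoo = &obj.vptr_IFoo;
    return call_GetNumber(asClass, 1) + call_GetNumber(asIFoo, 2);
}
```

For the class reference the subtraction is a no-op because offset__ is 0; for the interface reference it moves the pointer back to the start of the object before the method body runs.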
  74. How virtual methods work
    [C# code]
    1. Find the interface, starting from the most derived type.
    2. List the methods with the same signature, starting from the type found.
    3. Find the most-overridden method that is NOT newslot.