Metaprogramming, Metaobject Protocols, Gradual Type Checks: Optimizing the "Unoptimizable" Using Old Ideas

Stefan Marr
October 20, 2019

Metaobject protocols and type checks: do they have much in common? Perhaps not from a language perspective. However, under the hood of a modern virtual machine, they show very similar behavior and can be optimized in very similar ways.

This talk will go back to the days of Terminator 2, The Naked Gun 2 1/2, and Star Trek VI. We will revisit the early days of just-in-time compilation, the basic insights that are still true, and see how to apply them to metaprogramming techniques of different shapes and forms.

Transcript

  1. Metaprogramming, Metaobject Protocols, Gradual Type Checks
    Optimizing the “Unoptimizable” Using Old Ideas
    Stefan Marr
    Athens, October 2019
    Creative Commons Attribution-ShareAlike 4.0 License

  2. Got a Question?
    Feel free to interrupt me!

  3. The “Unoptimizable”

  4. The Knights on the Quest for Excellent Performance
    Excellent Performance
    Reflection, Metaprogramming
    Gradual Typing
    Metaobject Protocols

  5. Metaprogramming
    Reflection, Introspection, Intercession
    obj.invoke("methodName", [arg1])
    obj.getField("name")
    obj.setField("name", value)
    obj.getDefinedFields()
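
The four operations above map closely onto JavaScript's built-in `Reflect` API. The following is a minimal sketch under that assumption; the `reflect` helper object and the `Counter` class are hypothetical names introduced here for illustration and do not come from the talk.

```javascript
// Hypothetical helpers mirroring the slide's operations with standard reflection calls.
const reflect = {
  invoke: (obj, methodName, args) => Reflect.apply(obj[methodName], obj, args),
  getField: (obj, name) => Reflect.get(obj, name),
  setField: (obj, name, value) => Reflect.set(obj, name, value),
  getDefinedFields: (obj) => Reflect.ownKeys(obj),
};

// Hypothetical example class.
class Counter {
  constructor() { this.count = 0; }
  add(n) { this.count += n; return this.count; }
}

const cnt = new Counter();
reflect.invoke(cnt, "add", [5]);            // same effect as cnt.add(5)
reflect.setField(cnt, "label", "clicks");   // adds a field at run time
const fields = reflect.getDefinedFields(cnt);
const label = reflect.getField(cnt, "label");
```

The method name and the field names are ordinary strings here, which is exactly what makes such code hard to optimize without run-time knowledge.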

  6. Metaprogramming
    meth.invoke(): 1.7x slower
    Dynamic Proxies: 7.5x slower
    Reflection, Metaprogramming
    Excellent Performance
    Don’t Use It Inn

  7. Metaobject Protocols
    LoggingClass extends Metaclass {
      writeToField(obj, fieldName, value) {
        console.log(`${fieldName}: ${value}`)
        obj.setField(fieldName, value)
      }
    }
    Metaobject Protocols
    Excellent Performance

  8. Metaobject Protocols
    LoggingClass extends Metaclass {
      writeToField(obj, fieldName, value) {
        console.log(`${fieldName}: ${value}`)
        obj.setField(fieldName, value)
      }
    }
    Metaobject Protocols
    Excellent Performance
    AOP

  9. Gradual Typing
    async addMessage(user: User, message) {
      const msg = `<img src="/image/${user.profilePicture}">
        ${message}`;
      this.outputElem
        .insertAdjacentHTML(
          'beforeend', msg);
    }
    Excellent Performance
    Gradual Typing

  10. Gradual Typing
    async addMessage(user: User, message) {
      const msg = `<img src="/image/${user.profilePicture}">
        ${message}`;
      this.outputElem
        .insertAdjacentHTML(
          'beforeend', msg);
    }
    Excellent Performance
    Gradual Typing
    Somewhat True… The whole Truth is a little more complex

  11. The Knights Found New Homes
    Excellent Performance
    Reflection, Metaprogramming
    Gradual Typing
    Metaobject Protocols
    AOP
    Land of Engineering Short Cuts

  12. The story could have ended here…

  13. If it weren’t for the 90’s

  14. An “everything has been done in Lisp” Talk
    Smalltalk
    Self

  15. The (Movie) Heroes of ‘91
    Terminator 2: Polymorphic Inline Caches
    The Naked Gun 2 1/2: Just-in-time Compilation
    Star Trek VI: Maps (Hidden Classes)

  16. Key Papers*
    *Lots of necessary work came afterwards, but these papers lay the foundations

  17. POLYMORPHIC INLINE CACHES (PICS)
    A technique for lookup caching.

  18. A Class Hierarchy of Widgets
    class Widget {
      fitsInto(width) {
        return this.width <= width;
      }
    }
    class Button extends Widget {}
    class RadioButton extends Button {}
    fn findAllThatFit(arr, width) {
      const result = [];
      for (const w of arr)
        if (w.fitsInto(width))
          result.append(w)
      return result;
    }

  19. Lookups can be frequent and costly
    class Widget {
      fitsInto(width) {
        return this.width <= width;
      }
    }
    class Button extends Widget {}
    class RadioButton extends Button {}
    fn findAllThatFit(arr, width) {
      const result = [];
      for (const w of arr)
        if (w.fitsInto(width))
          result.append(w)
      return result;
    }
    [Diagram: RadioButton -> superclass -> Button -> superclass -> Widget, which defines fitsInto]
    For each fitsInto call
    hasMethod: 3x
    getSuperclass: 2x
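
The counts above can be reproduced with a small model of an uncached method lookup. This is a sketch only: `Klass`, `lookup`, and the `counts` object are hypothetical names, standing in for whatever a real virtual machine uses internally.

```javascript
// Hypothetical class model: each class has a method table and a superclass link.
const counts = { hasMethod: 0, getSuperclass: 0 };

class Klass {
  constructor(name, superclass, methods = {}) {
    this.name = name;
    this.superclass = superclass;
    this.methods = methods;
  }
  hasMethod(selector) {
    counts.hasMethod++;
    return Object.prototype.hasOwnProperty.call(this.methods, selector);
  }
  getSuperclass() {
    counts.getSuperclass++;
    return this.superclass;
  }
}

const Widget = new Klass("Widget", null, {
  fitsInto(self, width) { return self.width <= width; },
});
const Button = new Klass("Button", Widget);
const RadioButton = new Klass("RadioButton", Button);

// Uncached lookup: walk up the hierarchy until a class defines the selector.
function lookup(cls, selector) {
  while (cls !== null) {
    if (cls.hasMethod(selector)) return cls.methods[selector];
    cls = cls.getSuperclass();
  }
  throw new Error(`method not found: ${selector}`);
}

const method = lookup(RadioButton, "fitsInto");
```

A single lookup starting at `RadioButton` leaves `counts.hasMethod` at 3 and `counts.getSuperclass` at 2, and without caching this work would repeat for every element of the array.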

  20. Solution: Lookup Caching
    w.fitsInto(width) could be various functions, but we don’t need to do the same lookup repeatedly
    [Diagram: cache entries pointing to a method, and to another method in case we see a widget of a different class]
    PIC: check the receiver and jump to the method directly in machine code
    Useful because:
    • Most sends are monomorphic
    • Few are polymorphic
    • And just a couple are megamorphic
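
A polymorphic inline cache can be approximated in plain JavaScript as a small per-call-site list of (class, method) pairs. The sketch below is an illustration under assumptions, not how a real VM emits machine code: `makeCallSite`, `slowLookup`, and the limit of four entries are hypothetical choices.

```javascript
// Assumed slow path: walk the prototype chain, like an uncached lookup.
let slowLookups = 0;
function slowLookup(receiver, selector) {
  slowLookups++;
  let proto = Object.getPrototypeOf(receiver);
  while (proto !== null) {
    if (Object.prototype.hasOwnProperty.call(proto, selector)) return proto[selector];
    proto = Object.getPrototypeOf(proto);
  }
  throw new TypeError(`no method ${selector}`);
}

// One cache per call site; each entry pairs a receiver class with the method found for it.
function makeCallSite(selector, maxEntries = 4) {
  const entries = [];
  return function send(receiver, ...args) {
    const cls = receiver.constructor;
    for (const entry of entries) {
      if (entry.cls === cls) return entry.method.apply(receiver, args);  // cache hit
    }
    const method = slowLookup(receiver, selector);                       // cache miss
    if (entries.length < maxEntries) entries.push({ cls, method });      // beyond the limit: megamorphic, stays slow
    return method.apply(receiver, args);
  };
}

class Widget {
  constructor(width) { this.width = width; }
  fitsInto(width) { return this.width <= width; }
}
class Button extends Widget {}
class RadioButton extends Button {}

const fitsIntoSite = makeCallSite("fitsInto");
const widgets = [new RadioButton(10), new RadioButton(50), new Button(20), new RadioButton(5)];
const fitting = widgets.filter((w) => fitsIntoSite(w, 30));
```

Four sends through `fitsIntoSite` trigger only two slow lookups here, one per receiver class seen at this call site.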

  21. The Terminator PIC Eliminates Potential Variability

  22. JUST IN TIME COMPILATION
    Generating Machine Code at Run Time

  23. Just-in-time Compilation
    • Produces native code, optimized, avoiding the overhead of interpretation
    • At run time, can utilize knowledge about program execution
    • Ahead-of-time compilation, i.e., classic static compilation, can only guess how a program is used
    (Missing slide, added after the talk)

  24. With PICs, we can know
    class Widget {
      fitsInto(width) {
        return this.width <= width;
      }
    }
    class Button extends Widget {}
    class RadioButton extends Button {}
    fn findAllThatFit(arr, width) {
      const result = [];
      for (const w of arr)
        if (w.fitsInto(width))
          result.append(w)
      return result;
    }
    [Annotations: observed types Array, RadioButton, Integer; cache entry RadioButton: Widget.fitsInto]

  25. And Generate Efficient Code
    fn findAllThatFit(arr, width) {
      const result = [];
      for (const w of arr)
        if (w.width(field) <=(int) width)
          result.append(w)
      return result;
    }
    inlined fitsInto
    Specialized to types
    Resulting in Efficient
    Machine Code
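
To make the effect of inlining and type specialization concrete, the sketch below writes out by hand roughly what a compiler could produce once the caches have only seen `RadioButton` receivers and integer widths. The guards and the fallback to the generic function are assumptions added for safety; `findAllThatFitSpecialized` is a hypothetical name, and a real JIT emits machine code rather than JavaScript.

```javascript
class Widget {
  constructor(width) { this.width = width; }
  fitsInto(width) { return this.width <= width; }
}
class Button extends Widget {}
class RadioButton extends Button {}

// Generic version: every iteration performs a full method dispatch.
function findAllThatFit(arr, width) {
  const result = [];
  for (const w of arr) {
    if (w.fitsInto(width)) result.push(w);
  }
  return result;
}

// Hand-written stand-in for the specialized code (assumed profile:
// only RadioButton receivers, integer width).
function findAllThatFitSpecialized(arr, width) {
  if (!Number.isInteger(width)) return findAllThatFit(arr, width);
  const result = [];
  for (const w of arr) {
    // Guard: if the speculation no longer holds, redo the work with the generic code.
    if (w.constructor !== RadioButton) return findAllThatFit(arr, width);
    if (w.width <= width) result.push(w);  // fitsInto inlined: field read plus integer compare
  }
  return result;
}

const radios = [new RadioButton(10), new RadioButton(40)];
const mixed = [new RadioButton(10), new Button(15)];
const fastResult = findAllThatFitSpecialized(radios, 20);
const fallbackResult = findAllThatFitSpecialized(mixed, 20);
```

Restarting with the generic function is only safe in this sketch because the loop has no side effects outside `result`.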

  26. Polymorphic Inline Caches
    • Give us type information
    • Enable method inlining

  27. Our Hero has a lot of Luck and Achieves his Goal
    Relies on Developers not utilizing flexibility
    And, understanding 90’s humor

  28. MAPS, HIDDEN CLASSES, OBJECT SHAPES
    Structural Information for Objects

  29. The Power of Dynamic Languages
    o = {foo: 33}          // Object with 1 field
    o.bar = new Object()   // Object with 2 fields
    o.float = 4.2          // Object with 3 fields
    o.float = "string"     // And you can store anything

  30. Data Representation for Objects
    o = {foo: 33}
    o.bar = new Object()
    o.baz = "string"
    o.float = 4.2
    [Diagram: Obj with numbered slots 1, 2, 3 through 8 holding the field names foo, bar, baz, float and the values 33, "string", 4.2]
    Full Power of Dynamic Languages: Rarely Used

  31. Maps, Hidden Classes, Object Shapes
    o = {foo: 33}
    o.bar = new Object()
    o.baz = "string"
    o.float = 4.2
    [Diagram: Obj stores the values 33, 4.2, "string" and refers to a Shape]
    Shape
      1: foo(int)
      2: baz(ptr)
      3: float(float)
      4: bar(ptr)
    Combined with inline caches, a
    field access is a simple access
    at memory offset
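
The interplay of shapes and inline caches can be sketched in a few lines of JavaScript. Everything here is hypothetical (`Shape`, `Obj`, `makeFieldRead`), and the sketch leaves out the per-field type information (int, ptr, float) that the shape on the slide records.

```javascript
// A shape maps field names to slot indexes; transitions are shared between objects.
class Shape {
  constructor(fields = []) {
    this.fields = fields;             // the field at index i lives in slot i
    this.transitions = new Map();     // field name -> successor shape
  }
  indexOf(name) { return this.fields.indexOf(name); }
  withField(name) {
    if (!this.transitions.has(name)) {
      this.transitions.set(name, new Shape([...this.fields, name]));
    }
    return this.transitions.get(name);
  }
}

const EMPTY_SHAPE = new Shape();

// Objects store only values; the names live in the shape.
class Obj {
  constructor() { this.shape = EMPTY_SHAPE; this.slots = []; }
  write(name, value) {
    let idx = this.shape.indexOf(name);
    if (idx === -1) {
      this.shape = this.shape.withField(name);
      idx = this.shape.indexOf(name);
    }
    this.slots[idx] = value;
  }
}

// Inline cache for one field read: remember the last shape and its slot index.
function makeFieldRead(name) {
  let cachedShape = null;
  let cachedIdx = -1;
  return function read(obj) {
    if (obj.shape !== cachedShape) {          // miss: look up the slot and cache it
      cachedShape = obj.shape;
      cachedIdx = obj.shape.indexOf(name);
    }
    return cachedIdx === -1 ? undefined : obj.slots[cachedIdx];  // hit: plain indexed access
  };
}

const a = new Obj(); a.write("foo", 33); a.write("bar", 1);
const b = new Obj(); b.write("foo", 7);  b.write("bar", 2);
const readBar = makeFieldRead("bar");
const barOfA = readBar(a);
const barOfB = readBar(b);
```

Because `a` and `b` were built with the same sequence of writes, they share one shape, so the second read hits the cache and reduces to a slot access.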

  32. Our Hero brings Structure and Logic to the Chaos

  33. METAPROGRAMMING
    Is there hope for our Knight from the Land of Engineering Shortcuts to find the true treasure?
    Reflection, Metaprogramming
    Excellent Performance

  34. Reflective Method Invocation
    cnt.invoke("+", [1])
    Let’s look at the addition first

  35. Reflective Method Invocation
    cnt.invoke("+", [1])
    Int.+ method
    cnt.+(1)

  36. Optimizing Reflective Method Invocation
    cnt.invoke("+", [1])
    "+" string -> Int.+ method
    "*" string -> Int.* method
    Nesting of Lookup Caches Eliminates Potential Variability
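
Nested lookup caches for a reflective call site can be sketched as a two-level map: the outer level is keyed by the selector string, the inner level by the receiver class. The names `makeInvokeSite`, `slowLookup`, and the `Int` wrapper class are hypothetical, and the unbounded maps stand in for the small, bounded caches a real implementation would use.

```javascript
let lookups = 0;
function slowLookup(receiver, selector) {
  lookups++;
  const method = receiver[selector];
  if (typeof method !== "function") throw new TypeError(`no method ${selector}`);
  return method;
}

// Call site for receiver.invoke(selector, args).
function makeInvokeSite() {
  const bySelector = new Map();              // outer cache: selector string -> inner cache
  return function invoke(receiver, selector, args) {
    let byClass = bySelector.get(selector);
    if (byClass === undefined) {
      byClass = new Map();                   // inner cache: receiver class -> method
      bySelector.set(selector, byClass);
    }
    const cls = receiver.constructor;
    let method = byClass.get(cls);
    if (method === undefined) {
      method = slowLookup(receiver, selector);
      byClass.set(cls, method);
    }
    return method.apply(receiver, args);
  };
}

// Hypothetical integer wrapper whose methods are named "+" and "*".
class Int {
  constructor(value) { this.value = value; }
  "+"(n) { return new Int(this.value + n); }
  "*"(n) { return new Int(this.value * n); }
}

const invokeSite = makeInvokeSite();
let cnt = new Int(0);
for (let i = 0; i < 3; i++) cnt = invokeSite(cnt, "+", [1]);
const product = invokeSite(cnt, "*", [2]);
```

Once both levels are filled, a call through `invokeSite` costs two cache checks, which is the kind of code a compiler can then treat like a direct `cnt.+(1)`.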

  37. Simple Metaprogramming: Zero Overhead
    Reflective & Direct: Identical Machine Code!
    Zero-Overhead Metaprogramming: Reflection and Metaobject Protocols Fast and without Compromises.
    Marr, S., Seaton, C. & Ducasse, S. (2015). PLDI’15
    http://stefan-marr.de/papers/pldi-marr-et-al-zero-overhead-metaprogramming-artifacts/

  38. Ruby Image Processing Code using Metaprogramming
    [Figure: speedup over unoptimized (higher is better) for the benchmarks Compose Color Burn, Color Dodge, Darken, Difference, Exclusion, Hard Light, Hard Mix, Lighten, Linear Burn, Linear Dodge, Linear Light, Multiply, Normal, Overlay, Pin Light, Screen, Soft Light, and Vivid Light]

  39. METAOBJECT PROTOCOLS
    Is there hope for our Knight from the Land of Engineering Shortcuts to find the true treasure?
    Metaobject Protocols
    Excellent Performance

  40. Metaobject Protocols
    WriteLogging extends Metaclass {
      writeToField(obj, fieldName, value) {
        console.log(`${fieldName}: ${value}`)
        obj.setField(fieldName, value)
      }
    }
    Redefine the Language from within the Language
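
JavaScript has no `Metaclass` with a `writeToField` hook, but a `Proxy` with a `set` trap gives a rough analogue of the same idea. The sketch below rests on that substitution; `withWriteLogging` and the `log` array are hypothetical names, and the log is collected in an array rather than printed so the effect is easy to inspect.

```javascript
const log = [];

// Wrap an object so that every field write is intercepted, logged, and then performed.
function withWriteLogging(target) {
  return new Proxy(target, {
    set(obj, fieldName, value, receiver) {
      log.push(`${String(fieldName)}: ${value}`);             // the redefined semantics
      return Reflect.set(obj, fieldName, value, receiver);    // the standard write
    },
  });
}

const point = withWriteLogging({ x: 0 });
point.x = 12;   // intercepted by the set trap
point.y = 7;    // new fields are intercepted as well
```

After the two assignments, `log` holds `"x: 12"` and `"y: 7"`, while the writes themselves still take effect on the underlying object.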

  41. Problem
    obj.field = 12;
    turns into
    writeToField(obj, "field", 12)
    fn writeToField(obj, fieldName, value) {
      console.log(`${fieldName}: ${value}`)
      obj.setField(fieldName, value)
    }
    AOP
    Looks very Hard!

  42. Ownership-based Metaobject Protocol
    Building a Safe Actor Framework
    class ActorDomain extends Domain {
      fn writeToField(obj, fieldIdx, value) {
        if (Domain.current() == this) {
          obj.setField(fieldIdx, value);
        } else {
          throw new IsolationError(obj);
        }
      }
      /* ... */
    }
    http://stefan-marr.de/research/omop/

  43. An Actor Example
    actor.fieldA := 1
    semantics depend on metaobject
    AD.writeToField
    Cache Desired Language Semantics
    Eliminates Potential Variability
    Std write
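
Caching the desired language semantics can be sketched as a write site that remembers which `writeToField` handler belongs to the metaobject class it last saw. This is an illustration under assumptions, not the OMOP implementation: `makeWriteSite`, `newObject`, the `fields` record, and the settable `Domain.current` property are all hypothetical simplifications.

```javascript
class IsolationError extends Error {}

// Standard semantics: just perform the write.
class Domain {
  writeToField(obj, field, value) { obj.fields[field] = value; }
}
Domain.current = null;   // hypothetical global: the domain that is currently executing

// Actor semantics: only the owning domain may write.
class ActorDomain extends Domain {
  writeToField(obj, field, value) {
    if (Domain.current !== this) throw new IsolationError(`foreign write to ${field}`);
    obj.fields[field] = value;
  }
}

const newObject = (domain) => ({ domain, fields: {} });

// Write site for `obj.field := value`: cache the handler per metaobject class.
function makeWriteSite(field) {
  let cachedClass = null;
  let cachedHandler = null;
  return function write(obj, value) {
    const cls = obj.domain.constructor;
    if (cls !== cachedClass) {               // miss: resolve the semantics once
      cachedClass = cls;
      cachedHandler = cls.prototype.writeToField;
    }
    cachedHandler.call(obj.domain, obj, field, value);
  };
}

const actorA = new ActorDomain();
const actorB = new ActorDomain();
const obj = newObject(actorA);
const writeFieldA = makeWriteSite("fieldA");

Domain.current = actorA;
writeFieldA(obj, 1);                         // allowed: the owner writes

Domain.current = actorB;
let rejected = false;
try {
  writeFieldA(obj, 2);                       // rejected: another actor writes
} catch (e) {
  rejected = e instanceof IsolationError;
}
```

With the handler fixed at the write site, a compiler can inline it just like a regular method, which is what removes most of the metaobject overhead.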

  44. OMOP Overhead
    [Figure: two panels showing the runtime ratio to a run without OMOP, per benchmark]
    SOMMT (meta-tracing): Overhead: 4% (min. -1%, max. 19%)
    SOMPE (partial evaluation): Overhead: 9% (min. -7%, max. 38%)
    Benchmarks: Bounce, BubbleSort, Dispatch, Fannkuch, Fibonacci, FieldLoop, IntegerLoop, List, Loop, Permute, QuickSort, Recurse, Storage, Sum, Towers, TreeSort, WhileLoop, DeltaBlue, Mandelbrot, NBody, Richards

  45. Our Heroes
    Eliminates Potential Variability
    Makes it Fast, With a little Luck

  46. GRADUAL TYPING
    Is there hope for our Knight from the Land of Engineering Shortcuts to find the true treasure?
    Gradual Typing
    Excellent Performance

  47. Gradual Typing without Run-Time Semantics
    async addMessage(user: User, message) {
      const msg = `<img src="/image/${user.profilePicture}">
        ${message}`;
      this.outputElem
        .insertAdjacentHTML(
          'beforeend', msg);
    }
    Very Useful in Practice. And rather popular.
    Gradual Typing

  48. Transient Gradual Typing
    type Vehicle = interface {
      registration
      registerTo(_)
    }
    type Department = interface { code }
    var companyCar: Vehicle := object {
      method registerTo(d: Department) { print "{d.code}" }
    }
    companyCar.registerTo(
      object { var name := "R&D" })
    Types are shallow.
    Method names matter,
    but arguments don’t.
    Object only has name,
    no code method
    Assignment to
    registerTo(d: Department) should error
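
A shallow, name-based check of this kind is easy to sketch in JavaScript. The representation below is an assumption made for illustration: a type is just the list of member names it requires, `typeCheck` is a hypothetical helper, and the `registration` value is added so that the object satisfies `Vehicle`.

```javascript
// Hypothetical representation: a type lists the member names an object must have.
const Vehicle = ["registration", "registerTo"];
const Department = ["code"];

// Shallow check: only member names are inspected, not their argument or result types.
function typeCheck(value, type, typeName) {
  for (const member of type) {
    if (!(member in value)) throw new TypeError(`${typeName} requires '${member}'`);
  }
  return value;
}

const companyCar = typeCheck({
  registration: "ABC-123",
  registerTo(d) {
    typeCheck(d, Department, "Department");   // check inserted for the annotated parameter
    return d.code;
  },
}, Vehicle, "Vehicle");

let error = null;
try {
  companyCar.registerTo({ name: "R&D" });     // has name, but no code
} catch (e) {
  error = e;
}
```

The assignment to `companyCar` passes because both member names exist; the mismatch is only detected when `registerTo` runs and its parameter check fails.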

  49. Transient Gradual Typing
    tmp := object {
      method registerTo(d) {
        typeCheck d is Department
        print "{d.code}" }
    }
    typeCheck tmp is Vehicle
    var companyCar = tmp
    companyCar.registerTo(
      object { var name := "R&D" })
    Very simple semantics. Other gradual systems have blame, and are more complex.
    Possibly many, many checks.
    Looks very Hard!
    Gradual Typing

  50. How to get rid of these checks without losing run-time semantics?
    tmp := object {
      method registerTo(d) {
        typeCheck d is Department
        print "{d.code}" }
    }
    typeCheck tmp is Vehicle
    var companyCar = tmp
    companyCar.registerTo(
      object { var name := "R&D" })

  51. Shapes to the Rescue
    Shape
    1: foo(int)
    2: baz(ptr)
    3: float(float)
    4: bar(ptr)
    Shape
    1: foo(int)
    2: baz(ptr)
    3: float(float)
    4: bar(ptr)
    Implicitly
    Compatible to:
    - Type 1
    - Type 2
    1. Check object is compatible
    2. Shape implies compatibility
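
The two steps can be sketched by letting each shape remember the types it has already been checked against. This is a simplified illustration under assumptions: `Shape`, `typeCheck`, `compatibleTypes`, and the `fullChecks` counter are hypothetical names, and objects carry an explicit `shape` property instead of a VM-internal pointer.

```javascript
class Shape {
  constructor(members) {
    this.members = new Set(members);
    this.compatibleTypes = new Set();   // types this shape is already known to satisfy
  }
}

let fullChecks = 0;

function typeCheck(obj, type) {
  const shape = obj.shape;
  if (shape.compatibleTypes.has(type)) return obj;   // step 2: the shape implies compatibility
  fullChecks++;                                      // step 1: check the object once
  for (const member of type.members) {
    if (!shape.members.has(member)) {
      throw new TypeError(`${type.name} requires '${member}'`);
    }
  }
  shape.compatibleTypes.add(type);
  return obj;
}

const Department = { name: "Department", members: ["code"] };
const deptShape = new Shape(["code", "name"]);
const departments = ["R&D", "Sales", "Ops"].map(
  (code) => ({ shape: deptShape, code, name: code }));

departments.forEach((d) => typeCheck(d, Department));
```

Three objects are checked, but the member-by-member comparison runs only once; every later check is a single set lookup on the shape.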

  52. Final optimized code
    tmp := object {
      method registerTo(d) {
        check d hasShape s1
        print "{d.code}" }
    }
    check tmp hasShape s2
    var companyCar = tmp
    companyCar.registerTo(
      object { var name := "R&D" })
    need to do type check only once per lexical location
    s1.code
    s2.registerTo
    JIT Compiler can remove redundant checks

  53. Works Well!
    OOPSLA’17
    ECOOP’19

  54. Transient Typechecks Are (Almost) Free

  55. Our Heroes
    Polymorphic Inline Caches: Eliminates Potential Variability
    Maps, Hidden Classes, Shapes: Provides a Supporting Structure
    Just-in-time Compilation: Makes it Fast, With a little Luck

  56. Things I didn’t talk about
    Failure cases: Deoptimization
    An Efficient Implementation of SELF, a Dynamically-Typed Object-Oriented Language Based on Prototypes. Chambers, C., Ungar, D. & Lee, E. (1989). OOPSLA’89
    Object shapes are useful for other things
    Efficient and Thread-Safe Objects for Dynamically-Typed Languages. B. Daloze, S. Marr, D. Bonetta, and H. Mössenböck. OOPSLA'16
    And many other modern optimizations

  57. Our Knights Made it With Some Help of our 90’s Heroes
    Excellent Performance
    Reflection, Metaprogramming
    Gradual Typing
    Metaobject Protocols

  58. Key Papers*
    *Lots of necessary work came afterwards, but these papers lay the foundations

  59. Research and Literature
    • Efficient Implementation of the Smalltalk-80 System. Deutsch, L. P. & Schiffman, A. M. (1984). POPL’84
    • Optimizing Dynamically-Typed Object-Oriented Languages With Polymorphic Inline Caches. Hölzle, U., Chambers, C. & Ungar, D. (1991). ECOOP’91
    • Zero-Overhead Metaprogramming: Reflection and Metaobject Protocols Fast and without Compromises. Marr, S., Seaton, C. & Ducasse, S. (2015). PLDI’15
    • Optimizing prototypes in V8 https://mathiasbynens.be/notes/prototypes
    • https://mathiasbynens.be/notes/shapes-ics
    • https://mrale.ph/blog/2012/06/03/explaining-js-vms-in-js-inline-caches.html

  60. Research and Literature
    • An Efficient Implementation of SELF, a Dynamically-Typed Object-Oriented Language Based on Prototypes. Chambers, C., Ungar, D. & Lee, E. (1989). OOPSLA’89
    • An Object Storage Model for the Truffle Language Implementation Framework. A. Wöß, C. Wirth, D. Bonetta, C. Seaton, C. Humer, and H. Mössenböck. PPPJ’14.
    • Storage Strategies for Collections in Dynamically Typed Languages. C. F. Bolz, L. Diekmann, and L. Tratt. OOPSLA’13.
    • Memento Mori: Dynamic Allocation-site-based Optimizations. Clifford, D., Payer, H., Stanton, M. & Titzer, B. L. (2015). ISMM’15

  61. Research and Literature
    • Virtual Machine Warmup Blows Hot and Cold. Barrett, E., Bolz-Tereick, C. F., Killick, R., Mount, S. & Tratt, L. (2017). OOPSLA’17
    • Quantifying Performance Changes with Effect Size Confidence Intervals. Kalibera, T. & Jones, R. (2012). Technical Report, University of Kent.
    • Rigorous Benchmarking in Reasonable Time. Kalibera, T. & Jones, R. (2013). ISMM’13
    • How Not to Lie With Statistics: The Correct Way to Summarize Benchmark Results. Fleming, P. J. & Wallace, J. J. (1986). Commun. ACM
    • SIGPLAN Empirical Evaluation Guidelines https://www.sigplan.org/Resources/EmpiricalEvaluation/
    • Systems Benchmarking Crimes, Gernot Heiser https://www.cse.unsw.edu.au/~gernot/benchmarking-crimes.html
    • Benchmarking Crimes: An Emerging Threat in Systems Security. van der Kouwe, E., Andriesse, D., Bos, H., Giuffrida, C. & Heiser, G. (2018). arXiv:1801.02381
    • http://btorpey.github.io/blog/2014/02/18/clock-sources-in-linux/
    • Generating an Artefact From a Benchmarking Setup as Part of CI https://stefan-marr.de/2019/05/artifacts-from-ci/