Make the compiler sweat, chill in runtime by Simon Lindholm

Malmö C++ User Group - Meeting 0x2 - June 26, 2018

https://www.meetup.com/Malmo-C-User-Group/events/251265413/

Modern C++ offers some powerful features for doing work at compile time instead of at runtime. We'll explore some of these features, aiming for a faster, smaller and more correct runtime.

Simon Lindholm is an independent contractor, mostly working with Linux and various embedded platforms.

Ólafur Waage

June 26, 2018

Transcript

  1. Malmö C++ User Group Meeting 0x2

  2. Make the compiler sweat, chill in runtime

  3. i.e.

  4. Doing stuff in compile time

  5. i.e.

  6. ...metaprogramming

  7. Who am I?
    • Started programming BASIC on C64 and Apple II
    • Borland C++ 3.0 ca. 1994
    • Studied at LTH
    • Worked mostly with embedded stuff
  8. Embedded systems
    • Limited memory
    • Limited CPU power
    • Correctness is important
    • Mostly dominated by C
    • Somewhat conservative attitude towards C++ (C with classes...)
  9. The process leading up to compile-time trickery
    • “It would be really nice if it was possible to...”
    • “Is it possible?”
    • “It is!”
    • ... or “...well, sort of”
  10. What’s this talk about?
    • Just go through a couple of these ideas
    • Nothing new under the sun
    • These ideas are presented as inspiration
  11. Like a fashion show

  12. Let’s just dive in...

  13. Example 1: CRC32

  14. CRC32
    • CRC = Cyclic Redundancy Check
    • Many different variations
    • 32 bits is the most common size
    • Robust against burst errors
  15. CRC32: Basic idea [diagram: Alice sends Data + CRC32; Bob runs CRC32() on the received Data and checks whether the result is the same as the received CRC32]
  16. CRC32: Polynomials
    • The algorithm is specified by a polynomial
    • A trillion different standards
    • Ethernet, ZIP, etc.: x^32 + x^26 + x^23 + x^22 + x^16 + x^12 + x^11 + x^10 + x^8 + x^7 + x^5 + x^4 + x^2 + x + 1
    • ...or just 0xEDB88320
    • Can just treat it as a magic number
  17. CRC32: API
    • Basically just want a class:
      – Constructor: CRC32(uint32_t polynomial)
      – Method: add(const uint8_t * data, size_t N)
      – Method: uint32_t result()
  18. CRC32 Implementation 1: Simple & slow

  19. CRC32 - Implementation 1

  20. None
  21. CRC32 - Implementation 1
    • Calls calc_crc_value() for each byte
    • Pure function - output only depends on the input
    • Only 256 possible inputs
    • → Precomputation table!
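The code on the Implementation 1 slides is not part of this transcript. A minimal sketch of what a simple, slow version could look like, assuming the class API listed earlier and the usual reflected CRC-32 conventions (initial value 0xFFFFFFFF, final bit inversion), which the slides do not spell out:

```cpp
#include <cstddef>
#include <cstdint>

// Sketch only: bit-by-bit CRC32, no lookup table.
class CRC32 {
public:
    explicit CRC32(std::uint32_t polynomial) : polynomial_(polynomial) {}

    // Feeds N bytes into the running checksum.
    void add(const std::uint8_t* data, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i)
            crc_ = calc_crc_value((crc_ ^ data[i]) & 0xFFu) ^ (crc_ >> 8);
    }

    std::uint32_t result() const { return crc_ ^ 0xFFFFFFFFu; }

private:
    // Pure function of the polynomial and one byte value (256 possible inputs).
    std::uint32_t calc_crc_value(std::uint32_t byte) const {
        std::uint32_t value = byte;
        for (int bit = 0; bit < 8; ++bit)
            value = (value & 1u) ? (value >> 1) ^ polynomial_ : (value >> 1);
        return value;
    }

    std::uint32_t polynomial_;
    std::uint32_t crc_ = 0xFFFFFFFFu;  // assumed initial value
};
```

With the polynomial 0xEDB88320, feeding the ASCII bytes "123456789" yields the well-known check value 0xCBF43926.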
  22. CRC32 Implementation 2: Precomputation table

  23. CRC32 - Implementation 2

  24. None
  25. CRC32 - Implementation 2
    • Table lookup instead of calc_crc_value() for each byte
    • Takes about 1 KB of memory (256 * 4 bytes)
    • The code to generate the table also takes a small amount of memory
    • Less flexible - can’t choose polynomial at runtime (could get around this...)
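A sketch of the table-lookup idea, under the assumption that one shared table is generated in runtime for a polynomial fixed in the source. The names `kPolynomial`, `CrcTable`, `crc_table()` and `TableCRC32` are made up for this illustration and are not taken from the slides:

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: the polynomial is fixed in the source, not chosen by the caller.
constexpr std::uint32_t kPolynomial = 0xEDB88320u;

struct CrcTable {
    std::uint32_t values[256];

    // Table generation code; this still ends up in the binary.
    CrcTable() {
        for (std::uint32_t byte = 0; byte < 256; ++byte) {
            std::uint32_t value = byte;
            for (int bit = 0; bit < 8; ++bit)
                value = (value & 1u) ? (value >> 1) ^ kPolynomial : (value >> 1);
            values[byte] = value;
        }
    }
};

// Built once in runtime, on first use; occupies 256 * 4 bytes.
inline const CrcTable& crc_table() {
    static const CrcTable table;
    return table;
}

class TableCRC32 {
public:
    void add(const std::uint8_t* data, std::size_t n) {
        const CrcTable& table = crc_table();
        for (std::size_t i = 0; i < n; ++i)
            crc_ = table.values[(crc_ ^ data[i]) & 0xFFu] ^ (crc_ >> 8);
    }

    std::uint32_t result() const { return crc_ ^ 0xFFFFFFFFu; }

private:
    std::uint32_t crc_ = 0xFFFFFFFFu;  // assumed initial value
};
```

The inner loop per byte is now one lookup, one shift and two XORs, at the price of the table and its generator.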
  26. CRC32 Implementation 3: Static precomputation table

  27. CRC32 - Implementation 3 (Identical to version 2)

  28. None
  29. CRC32 - Implementation 3
    • Don’t have to generate the table in runtime
    • Gets rid of the code to generate the table
    • Very inflexible
  30. CRC32 Implementation 4
    • Precomputation table
    • Static
    • Flexible

  31. CRC32 - Implementation 4

  32. make_compile_time_array() ? (seq & gen_seq stolen from Stack Overflow)

  33. make_compile_time_array() ?

  34. make_compile_time_array() ?

  35. make_compile_time_array() ?

  36. make_compile_time_array() ?

  37. CRC32 - Implementation 4

  38. CRC32 - Implementation 4
    • Don’t have to generate the table in runtime
    • Gets rid of the code to generate the table
    • Pretty flexible
    • General strategy for any lookup table (sin, cos, etc.?)
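The make_compile_time_array() slides are not in the transcript. The sketch below shows one way to get a static yet flexible table in C++14, using `std::index_sequence` in the role the slides give to seq/gen_seq. `make_table`, `crc_table` and `StaticCRC32` are hypothetical names, and passing the polynomial as a template parameter is an assumption about how the flexibility is achieved:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <utility>

// CRC of a single byte value; constexpr, so the compiler can evaluate it.
constexpr std::uint32_t calc_crc_value(std::uint32_t polynomial, std::uint32_t byte) {
    std::uint32_t value = byte;
    for (int bit = 0; bit < 8; ++bit)
        value = (value & 1u) ? (value >> 1) ^ polynomial : (value >> 1);
    return value;
}

// Expands the indices 0..N-1 into one initializer per table entry.
template <std::uint32_t Polynomial, std::size_t... Is>
constexpr std::array<std::uint32_t, sizeof...(Is)>
make_table(std::index_sequence<Is...>) {
    return {{ calc_crc_value(Polynomial, Is)... }};
}

// One compile-time table per polynomial (C++14 variable template).
template <std::uint32_t Polynomial>
constexpr std::array<std::uint32_t, 256> crc_table =
    make_table<Polynomial>(std::make_index_sequence<256>{});

// Checked by the compiler: no table generation code runs in runtime.
static_assert(crc_table<0xEDB88320u>[1] == 0x77073096u, "unexpected table entry");

template <std::uint32_t Polynomial>
class StaticCRC32 {
public:
    void add(const std::uint8_t* data, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i)
            crc_ = crc_table<Polynomial>[(crc_ ^ data[i]) & 0xFFu] ^ (crc_ >> 8);
    }

    std::uint32_t result() const { return crc_ ^ 0xFFFFFFFFu; }

private:
    std::uint32_t crc_ = 0xFFFFFFFFu;  // assumed initial value
};
```

The same pattern (a constexpr generator function plus an index pack expansion) would apply to other lookup tables, as long as the generator can be written as a constexpr function.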
  39. Example 2: Safe(r) buffers

  40. C-style buffer handling
    • Pointer arithmetic: uint8_t *
    • memset(), memcpy(), memcmp()
    • Common when doing low-level stuff (raw access to pixel buffer, etc.)
    • Error prone
    • A lot of bounds checking can be done at compile time instead of in runtime
  41. Benefits of compile-time bounds checks
    • Catch errors earlier (compiler error)
    • Can skip those checks in runtime
    • Safer code
    • Faster code
    • Smaller code
  42. We make some buffer classes: FixedConstBuffer<N>, ConstBuffer, Buffer, FixedBuffer<N>

  43. Buffer classes
    • Size known at compile time: FixedConstBuffer<N> (const), FixedBuffer<N> (non-const)
    • Size only known in runtime: ConstBuffer (const), Buffer (non-const)
  44. Buffer classes
    • Size known at compile time: FixedConstBuffer<N> (const), FixedBuffer<N> (non-const)
    • Size only known in runtime: ConstBuffer (const), Buffer (non-const)
    • DynamicBuffer: std::unique_ptr<uint8_t[]>
    • StaticBuffer<N>: uint8_t buf_[N]
  45. Class skeletons

  46. Class skeletons

  47. Class skeletons

  48. Class skeletons (No ownership of data)

  49. Class skeletons

  50. Class skeletons (Owners of data)

  51. The ownership classes
    • StaticBuffer
      – As the name suggests, pretty static
      – Put it on the stack, or as static data
    • DynamicBuffer
      – Essentially just a wrapper of std::unique_ptr<uint8_t[]>
      – So, std::move() if you want to transfer ownership
  52. The wrapper classes Buffer, ConstBuffer, FixedBuffer<>, FixedConstBuffer<>
    • FixedBuffer<> & FixedConstBuffer<>
      – Single pointer (uint8_t* or const uint8_t*)
    • Buffer & ConstBuffer
      – Pointer + std::size_t
    • Hence, they are meant for pass-by-value
    Do this! Not this!
  53. Methods
    • take(I,N) & skip(I): pointer arithmetic
    • set_zero(), fill_with(X): memset()
    • copy(dst, src): memcpy()
    • equals(buf1, buf2): memcmp()
    • operator<<(std::ostream&) [debugging..]
    • ... tons of overloads for these...
  54. Example: StaticBuffer<>

  55. Example: StaticBuffer<>

  56. Example: StaticBuffer<> Compile error!

  57. static_assert()
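The example slides themselves are images and are missing here. As a rough sketch of how a fixed-size view can reject an out-of-range take() at compile time: the class names follow the slides, while the template form `take<I, Count>()`, the `view()` and `data()` members and the `uint8_t` value parameter of `fill_with()` are assumptions made for this illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Non-owning view whose size N is part of the type: just one pointer.
template <std::size_t N>
class FixedBuffer {
public:
    explicit FixedBuffer(std::uint8_t* data) : data_(data) {}

    static constexpr std::size_t size() { return N; }
    std::uint8_t* data() const { return data_; }

    // Sub-view of Count bytes starting at offset I, bounds-checked by the compiler.
    template <std::size_t I, std::size_t Count>
    FixedBuffer<Count> take() const {
        static_assert(I <= N && Count <= N - I, "take() is out of bounds");
        return FixedBuffer<Count>(data_ + I);
    }

    void set_zero() const { std::memset(data_, 0, N); }
    void fill_with(std::uint8_t value) const { std::memset(data_, value, N); }

private:
    std::uint8_t* data_;
};

// Owner of the storage; lives on the stack or as static data.
template <std::size_t N>
class StaticBuffer {
public:
    FixedBuffer<N> view() { return FixedBuffer<N>(buf_); }

private:
    std::uint8_t buf_[N] = {};
};

inline void example() {
    StaticBuffer<16> storage;
    FixedBuffer<16> whole = storage.view();
    whole.take<4, 8>().set_zero();     // fine: bytes 4..11
    // whole.take<12, 8>().set_zero(); // would not compile: static_assert fires
}
```

Because the sizes are template arguments, no bounds check is left to execute in runtime for this path.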

  58. Example: DynamicBuffer

  59. Example: DynamicBuffer Runtime error!

  60. Copying & comparing: Is essentially equivalent to: ...but with proper bounds checking
  61. Copying & comparing
    • If both buffer types are fixed, bounds checking happens at compile time
    • Otherwise, in runtime
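One way to get this split is plain overloading: a template overload for two fixed-size views and a non-template overload for the runtime-sized ones. The sketch below uses simplified stand-in structs, assumes that "in bounds" means the source is no larger than the destination, and assumes that the runtime error is reported with an exception. None of these details are stated on the slides.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>

// Simplified stand-ins for the wrapper classes (assumption: plain aggregates).
template <std::size_t N> struct FixedBuffer      { std::uint8_t* data; };
template <std::size_t N> struct FixedConstBuffer { const std::uint8_t* data; };
struct Buffer      { std::uint8_t* data;       std::size_t size; };
struct ConstBuffer { const std::uint8_t* data; std::size_t size; };

// Both sizes are part of the types: the check costs nothing in runtime.
template <std::size_t DstN, std::size_t SrcN>
void copy(FixedBuffer<DstN> dst, FixedConstBuffer<SrcN> src) {
    static_assert(SrcN <= DstN, "source does not fit in destination");
    std::memcpy(dst.data, src.data, SrcN);
}

// Sizes only known in runtime: the check has to execute.
inline void copy(Buffer dst, ConstBuffer src) {
    if (src.size > dst.size)
        throw std::length_error("source does not fit in destination");
    std::memcpy(dst.data, src.data, src.size);
}
```

With optimizations enabled, the fixed-size overload should reduce to the memcpy() itself, since the size is a constant.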
  62. Assembly instructions ...just a few instructions (with optimizations enabled)

  63. Summary & reflections
    • These buffer classes are by no means perfect
    • Should implement operator overloading so they behave the same as C-style pointers
    • Complete brain-fart to implement copy(), equals() etc. as standalone functions
    • Main point: Most of the bounds checks that can be done in compile time are being done in compile time.
  64. Bonus round: Parsing JSON

  65. Why in the world would you want to?
    • Data you want to integrate into your code somehow
      – Cross-language message specifications
      – Hardware configurations
      – etc.
    • JSON: Common, convenient, simple
  66. Generate C++ code with script?
    • Depending on build system, can be messy to integrate
    • Adds requirement of Python etc. for the build system
    • Actually pretty convenient most of the time
    • *BUT* aesthetically displeasing to have to resort to a separate language to do something that SHOULD be possible in pure C++
  67. What’s really our goal here?
    • Parsing JSON is actually pretty simple. It’s really just:
      – Numbers (12, -34, 0.23, 1.3e-4, etc.)
      – Strings ("foo", "bar", "hey\nho!")
      – Booleans (true / false)
      – Lists [ 1, 2, "foo", [ ] ]
      – Dictionaries { "foo": 129, "bar": [1,2,3] }
      – null
    • ...but we want to do it in compile time now
  68. What’s really our goal here?
    • Let’s not spend too much time on preamble
    • ...just dive in head first and see where we end up
  69. Some limitations...
    • Template parameters are most often types (std::vector<int>)
    • ...but perfectly fine to use values as well (std::array<int,10>)
    • But not all values are welcome: we aren’t allowed to use strings as template arguments, for instance
  70. Some limitations... This is fine

  71. Some limitations... This is fine This is not

  72. Let’s combine a few things
    • Variadic templates
    • User-defined literals
    • ...and unfortunately a C-macro (yuck!)
  73. ...and suddenly we can use strings as template arguments
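The speaker's code for this step is not in the transcript. The sketch below reaches a similar result with a different, standard C++14 technique: a variadic `str<char...>` type plus a macro that expands a string literal into its characters through a local provider struct. It does not use a user-defined literal, and `str`, `make_str` and `META_STR` are names invented for this sketch.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// A string encoded in a type: each character is a template argument.
// (A string literal itself cannot be a template argument, e.g. for a
// hypothetical `template <const char*> struct tag;`, `tag<"hello">` is rejected.)
template <char... Cs>
struct str {
    static constexpr std::size_t size() { return sizeof...(Cs); }
    static std::string to_string() { return std::string{Cs...}; }  // for debugging
};

// Indexes the provider's literal once per index to build the character pack.
template <typename Provider, std::size_t... Is>
constexpr str<Provider::get()[Is]...> make_str(Provider, std::index_sequence<Is...>) {
    return {};
}

// The macro is needed because the literal has to be named inside a constexpr
// function of a fresh local type. Must be used inside a function body.
#define META_STR(literal)                                                \
    ([] {                                                                \
        struct provider {                                                \
            static constexpr const char* get() { return literal; }      \
        };                                                               \
        return make_str(provider{},                                      \
                        std::make_index_sequence<sizeof(literal) - 1>{}); \
    }())
```

Inside a function, `auto hello = META_STR("hello");` gives `hello` the type `str<'h','e','l','l','o'>`, which can then be passed around as an ordinary template argument via `decltype(hello)`.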

  74. We can print the strings easily (useful for debugging) ...but let’s do something a bit more general (because we will end up wanting to print quite a lot of different types)
  75. But let’s make something a bit more general

  76. And now we can print our strings thusly: Before: After: Almost the same, but the $<> notation is trivial to extend to other types - and we’ll have a lot of them
  77. How do we read the JSON data to begin with?
    • Ideally, we would just want to be able to read it as a regular file
    • Not possible in compile time - there are no constexpr I/O functions
    • We can almost use #include, but not quite
    • We can get something reasonably satisfactory using raw string literals
  78. Alt 1. Embed the JSON directly into the C++ code. Looks somewhat OK, but we have to copy/paste our JSON data into our code - not nice...
  79. Alt 2. #include the JSON data. JSON somewhat more independent from the C++ code, but we need it wrapped in R"( )"_js
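For illustration, the first alternative can be as small as the sketch below. It uses a plain raw string literal without the speaker's `_js` suffix, and the constant names and the file name mentioned in the comment are made up:

```cpp
#include <cstddef>

// Alt 1: the JSON text is pasted into the C++ source as a raw string literal,
// so quotes and newlines need no escaping.
constexpr char kConfigJson[] = R"({
    "foo": 129,
    "bar": [1, 2, 3]
})";

constexpr std::size_t kConfigLength = sizeof(kConfigJson) - 1;  // without '\0'

// The text is visible to the compiler, so it can be inspected at compile time.
static_assert(kConfigJson[0] == '{', "JSON should start with an object");
static_assert(kConfigJson[kConfigLength - 1] == '}', "JSON should end the object");

// Alt 2 (not compiled here): if a hypothetical file config.json.inc holds the
// same text including the R"( ... )" wrapper, the initializer can be pulled in:
//     constexpr char kConfigJson[] =
//     #include "config.json.inc"
//     ;
```

The wrapper requirement in Alt 2 is what keeps the data file from being plain JSON.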
  80. meta::list
    • Skipping a ton of details here
    • But let’s look at a general list class/template we’ll be using quite a bit
    • Really just a wrapper for std::tuple, but with a bunch of helpers
  81. meta::list

  82. Some examples of list methods

  83. Of course we have a bunch of similar string methods

  84. Some are really trivial

  85. Some are pretty simple

  86. Some are a bit more complex. Remember seq/gen_seq from CRC32?
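The actual meta::list is not shown in the transcript. A small sketch of what a type list with a few helpers could look like follows; the helper names (`at`, `push_back`, `concat_t`, `take_t`) are guesses, with `take_t` standing in for the "more complex" kind of helper that needs an index sequence:

```cpp
#include <cstddef>
#include <tuple>
#include <utility>

namespace meta {

template <typename... Ts>
struct list {
    using as_tuple = std::tuple<Ts...>;

    // Really trivial helpers.
    static constexpr std::size_t size() { return sizeof...(Ts); }
    template <typename T> using push_back = list<Ts..., T>;
    template <typename T> using push_front = list<T, Ts...>;

    // Element access borrowed from std::tuple.
    template <std::size_t I> using at = std::tuple_element_t<I, as_tuple>;
};

// Pretty simple: concatenation by matching both packs.
template <typename A, typename B> struct concat;
template <typename... As, typename... Bs>
struct concat<list<As...>, list<Bs...>> { using type = list<As..., Bs...>; };
template <typename A, typename B>
using concat_t = typename concat<A, B>::type;

// A bit more complex: pick elements by an index sequence.
template <typename List, typename Indices> struct select;
template <typename List, std::size_t... Is>
struct select<List, std::index_sequence<Is...>> {
    using type = list<typename List::template at<Is>...>;
};

// First N elements of a list.
template <typename List, std::size_t N>
using take_t = typename select<List, std::make_index_sequence<N>>::type;

}  // namespace meta
```

Everything here is resolved during compilation; no object of these types ever has to exist in runtime.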
  87. With the two meta-classes list and str, we can start building a parser
    • The stuff we want to parse out is
      – Numbers (12, -34, 0.23, 1.3e-4, etc.)
      – Strings ("foo", "bar", "hey\nho!")
      – Booleans (true / false)
      – Lists [ 1, 2, "foo", [ ] ]
      – Dictionaries { "foo": 129, "bar": [1,2,3] }
      – null
    • (For simplicity, we won’t support floats - only integers)
  88. Our building blocks for the data will be

  89. Our building blocks for the data will be

  90. Our building blocks for the data will be

  91. Our building blocks for the data will be

  92. Our building blocks for the data will be (Dictionaries will be treated as just lists of key-value pairs)
  93. Mapping JSON <--> types
    • So if we have some JSON data like this:
    • Then we want our resulting type to be something like this:
  94. The most basic parser: Literal

  95. (I think I’ve overdone the Comic Sans joke...)

  96. The most basic parser: Literal

  97. The most basic parser: Literal. Helper to create successful/failed result-types
  98. Next most basic parser: from_func (could do with a better name)
  99. Next most basic parser: from_func (could do with a better name). So, if we only have some constexpr versions of the <cctype> functions...
  100. Next most basic parser: from_func (could do with a better name). Then we can use our from_func as such: And we have equivalents of the simple regexes:
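The from_func slides are images, so the following is only a guess at the shape of the idea: a parser type that is parameterized on a constexpr predicate and consumes one character of a `str<...>` input when the predicate accepts it. The result types `success`/`failure`, the nested `apply` template and the `parse` alias are assumptions made for this sketch.

```cpp
#include <type_traits>

template <char... Cs> struct str {};

// Result types: either nothing, or a parsed value plus the remaining input.
struct failure { static constexpr bool ok = false; };

template <typename Value, typename Rest>
struct success {
    static constexpr bool ok = true;
    using value = Value;
    using rest = Rest;
};

// constexpr stand-ins for the <cctype> functions.
constexpr bool is_digit(char c) { return c >= '0' && c <= '9'; }
constexpr bool is_space(char c) { return c == ' ' || c == '\t' || c == '\n' || c == '\r'; }

// Parser built from a predicate: accepts one character if Predicate(C) holds.
template <bool (*Predicate)(char)>
struct from_func {
    template <typename Input>
    struct apply { using type = failure; };  // empty input

    template <char C, char... Rest>
    struct apply<str<C, Rest...>> {
        using type = std::conditional_t<Predicate(C),
                                        success<str<C>, str<Rest...>>,
                                        failure>;
    };
};

template <typename Parser, typename Input>
using parse = typename Parser::template apply<Input>::type;

using digit = from_func<is_digit>;  // roughly the regex [0-9]
using space = from_func<is_space>;  // roughly the regex \s
```

For example, `parse<digit, str<'7', 'x'>>` is a `success` whose `rest` is `str<'x'>`, while `parse<digit, str<'x'>>` is `failure`.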
  101. A few other basic parsers ...which lets us combine other parsers
  102. A few other basic parsers ...which lets us combine other parsers
  103. A few other basic parsers ...which lets us combine other parsers
  104. A few other basic parsers ...which lets us combine other parsers
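Which combinators the slides show is not recoverable from the transcript. As an illustration of the general idea, here is a sketch of two common ones, alternation and sequencing, on top of a single-character literal parser. All names (`literal`, `either`, `sequence`, `parse`, the result types) are assumptions made for this sketch.

```cpp
#include <type_traits>

template <char... Cs> struct str {};
template <typename... Ts> struct list {};

struct failure { static constexpr bool ok = false; };

template <typename Value, typename Rest>
struct success {
    static constexpr bool ok = true;
    using value = Value;
    using rest = Rest;
};

template <typename Parser, typename Input>
using parse = typename Parser::template apply<Input>::type;

// Matches exactly the character C at the front of the input.
template <char C>
struct literal {
    template <typename Input> struct apply { using type = failure; };
    template <char... Rest>
    struct apply<str<C, Rest...>> { using type = success<str<C>, str<Rest...>>; };
};

// Alternation: the result of A if it succeeds, otherwise the result of B.
template <typename A, typename B>
struct either {
    template <typename Input>
    struct apply {
        using first = parse<A, Input>;
        using type = std::conditional_t<first::ok, first, parse<B, Input>>;
    };
};

// Helper: merge two successful results into one list-valued result.
template <typename ResultA, typename ResultB, bool = ResultB::ok>
struct join { using type = failure; };
template <typename ResultA, typename ResultB>
struct join<ResultA, ResultB, true> {
    using type = success<list<typename ResultA::value, typename ResultB::value>,
                         typename ResultB::rest>;
};

// Helper: run B on what A left over, but only if A succeeded.
template <typename B, typename ResultA, bool = ResultA::ok>
struct then { using type = failure; };
template <typename B, typename ResultA>
struct then<B, ResultA, true> {
    using type = typename join<ResultA, parse<B, typename ResultA::rest>>::type;
};

// Sequencing: A followed by B.
template <typename A, typename B>
struct sequence {
    template <typename Input>
    struct apply { using type = typename then<B, parse<A, Input>>::type; };
};
```

Since parsers are just types, combinators nest freely: `sequence<literal<'a'>, either<literal<'b'>, literal<'c'>>>` accepts "ab" and "ac".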
  105. Putting it all together

  106. Putting it all together

  107. A small test program

  108. Compiling

  109. JSON: In summary
    • It’s sort of doable to parse JSON at compile time
    • This implementation is ridiculously slow
    • Don’t do stuff like this in real projects
    • ...but it does show that you CAN do a lot of things in compile time that might not have been previously possible
  110. That’s all I had

  111. You can find the code on: datakod.se/gitlab

  112. Just a fashion show

  113. Thank you for listening!

  114. Thank you for listening! datakod.se/gitlab