Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Performance optimization with Code as Data in C...

Sponsored · Ship Features Fearlessly Turn features on and off without deploys. Used by thousands of Ruby developers.

Performance optimization with Code as Data in Clojure

Slides from the talk presented at FunctionalConf 2016 on 2016-October-14

Avatar for Shantanu Kumar

Shantanu Kumar

October 14, 2016
Tweet

More Decks by Shantanu Kumar

Other Decks in Technology

Transcript

  1. Who am I? • Principal Engineer at Concur • Author

    of “Clojure High Performance Programming” • Open Source contributor: https://github.com/ kumarshantanu • Using Clojure since early 2009 • @kumarshantanu on Twitter
  2. Vocabulary • Profiling • Latency: Median, Average, 99 Percentile •

    Throughput: Time window, Sustained • System Load
  3. Profiling • Benchmarking • Performance metrics collection • Baseline •

    Simulating load • Sampling • Tracing (Instrumentation)
  4. Code as Data • Opportunity to construct/manipulate code • Macros

    (compile time only) • Eval • Challenges • Debugging, Stack traces • Hard to Compose • JVM’s inlining budget
  5. Macros: String concatenation • Stringer `strcat` (alternative to clojure.core/str) •

    Stringer `strdel` (alternative to clojure.string/join) • In-place `java.lang.StringBuilder` manipulation • Caveats: Macro, Not fns
  6. Macros: Formatted string • Stringer `strfmt` (alternative to clojure.core/format) •

    In-place `java.lang.StringBuilder` manipulation • Faster than Java’s `String.format(..)`! • Caveats: Compile-time format string, Only common formatting specifiers supported
  7. Not macro: Textual table • Stringer `strtbl` (instead of clojure.pprint/print-table)

    • Uses arrays and eager `StringBuilder` manipulation • All optimizations need not leverage code as data!
  8. Macros: Logging • Logging library: clojure/tools.logging • Level macros: Disabled

    levels don’t eval • Cambium (extends clojure/tools.logging) • Same as c.t.l: Disabled levels don’t eval • How it composes: Macros emitting macros
  9. Faster web routing with Eval https://github.com/ kumarshantanu/calfpath Image source: https://www.pexels.com/photo/timelapse-

    photography-of-vehicle-on-concrete-road-near-in-high- rise-building-during-nighttime-169677/
  10. Eval: Web routing • Baseline: Iteration through route data •

    Optimization: Code generation with Eval • Optimization technique: Loop unrolling* • Constraint: Ahead-of-time code generation *https://en.wikipedia.org/wiki/Loop_unrolling
  11. Latency Breakup • Latency measurement using macro • Measure only

    selected points (low overhead) • Thread-local metrics context • Field use: Instrumentation • Field use: Trigger report on threshold exceed
  12. Latency Breakup Chart | :name |:cumulative|:cumul-%|:individual|:indiv-%| :thrown?| |------------------+-----------+--------+-----------+--------+-------------------| |web.post.order |

    357.82 ms|100.00 %| 103.037 ms| 28.80 %| | | biz.item.fetch | 204.407 ms| 57.13 %| 150.723 ms| 42.12 %| | | db.item.fetch | 53.684 ms| 15.00 %| 53.684 ms| 15.00 %| | | queue.post.order| 50.376 ms| 14.08 %| 50.376 ms| 14.08 %|java.lang.Exception|
  13. Benchmarks: Hardware • Intel i7-4770 @ 3.40 GHz processor •

    Quad-core, 64-bit physical machine • L1d cache: 32K, L1i cache: 32K • L2 cache: 256K, L3 cache: 8192K • RAM: 16GB
  14. Benchmarks: Software • OS: CentOS 7.2 (stock kernel 3.10.0-327.el7.x86_64) •

    Java: Oracle JDK 1.8.0_102_b14 • JVM args: “-server -Xms2048m -Xmx2048m” • Clojure 1.8.0 • Criterium 0.4.4, Citius 0.2.3 • Stringer 0.3.0, Calfpath 0.4.0, Espejito 0.1.1