Babashka: a meta-circular Clojure interpreter for the command line @ Strange Loop 2023

Michiel Borkent
September 21, 2023

Babashka is a Clojure interpreter for cross-platform scripting. It is available as a single binary that starts instantly, which makes Clojure a viable replacement for bash scripts. Babashka comes with a handful of libraries out of the box (JSON, command line parsing, etc.) and supports loading libraries from the Clojure ecosystem. The interpreter is written in a meta-circular style, akin to the evaluator in Structure and Interpretation of Computer Programs. It is compiled to a single binary using GraalVM native-image, which is why it starts fast and uses less memory than the JVM, Clojure's original runtime. As such, babashka brings together many exciting technologies to broaden the reach of Clojure even more. This talk explores the high-level use cases of babashka, its impact on the Clojure community, its history, technical implementation details and the author's approach to open source development.

Transcript

  1. Babashka
    A Meta-Circular Clojure Interpreter For The Command Line
    Michiel Borkent
    @borkdude
    2023-09-21

  2. Clojure and me
    • University: functional programming (Miranda), internship (Common Lisp)
    • CS lecturer: Java ... Clojure as a way to learn about the JVM
    • Clojure is a Lisp on the JVM that emphasises FP
    • I tried other languages: Haskell, PureScript, Scala, ... I just love Clojure + can make a living from it ¯\_(ツ)_/¯
    • I want to use Clojure's standard library everywhere

  3. Is Clojure dead (yet)?
    • Making a modest income from Clojure OSS for the past two years
    • Clojurists Together: supporting the OSS Clojure ecosystem
    • Nubank (1200+ Clojure devs), largest online bank in South America
    • Red Planet, Griffin, Penpot, Roam Research, Logseq, Pitch, Appsflyer, Cisco, Walmart, Exoscale, Nextjournal, Metabase, ...
    • Once you know Clojure, you don't ask that many questions on StackOverflow
    • Never breaking changes: ideal for long lived stable projects (as is the JVM)
    • Babashka is a sub-community within the Clojure community with its own conference
    • Long live "dead" languages! 👻

  4. • Native Clojure scripting tool, single binary, no JVM
    • Can be used to replace “the grey areas” of bash
    • Easily installable via script, brew (macOS, Linux), AUR (Linux), scoop (Windows)
    $ time bb -e '(+ 1 2 3)'
    6
    0.00s user 0.00s system 67% cpu 0.013 total

  5. Startup time: clj (JVM) vs bb (native-image)
    [Chart comparing three configurations]
    • bb (native-image) + lots of deps loaded at startup
    • JVM Clojure + lots of deps loaded at startup
    • Vanilla JVM Clojure, no deps

  6. Example script: compile some Java

  7. Example script: pst
    $ pst.clj
    04:58

  8. Cross platform! 😎

  9. Interop done by Clojure lib
    $ pst.clj
    04:58

  10. Included libs 🔋 🔋
    • clojure.{core, edn, java.shell, java.io, pprint, set, string, test, walk, zip}
    • clojure.tools.cli
    • clojure.core.async
    • clojure.data.csv
    • cheshire.core (JSON)
    • clojure.xml
    • cognitect.transit
    • clj-yaml
    • httpkit.server
    • babashka.http-client, babashka.process, babashka.fs
    • many others

  11. Included libs 🔋 🔋
    • Babashka aims to uphold the promise of Clojure: no breaking API changes
    • A script that runs now should run forever
    • Included libraries should hold to this promise as well (Clojure ecosystem culture)
    • No removing of libs (even if they are deprecated)
    • Imagine unix command line tools changing their output every year...
    • Spec-ulation talk by Rich Hickey

  12. (no text on slide)

  13. (no text on slide)

  14. • CLI tools with instant startup! (< 10ms)
    • clj-kondo: linter and static analyzer for Clojure
    • jet: convert between JSON, EDN and Transit
    + native-image

  15. (no text on slide)

  16. jet (jq-ish for Clojure)
    DSL -> scripting
    • Added a query DSL to jet, a GraalVM CLI:
    $ jet --query '(map :id)' <<< '[{:id 1} {:id 2}]'
    [1 2]
    • Extend this DSL to a significant subset of Clojure? Or just use clojure.core/eval ...?

  17. native-image + eval
    The Clojure compiler emits JVM bytecode, but GraalVM native-image transforms bytecode to native code "ahead of time", so clojure.core/eval, which generates bytecode at run time, is not available in a native image

  18. "Constraints force creativity and innovation"
    - Thomas Wuerthinger

  19. Small Clojure Interpreter
    (require '[sci.core :as sci])
    (def f (sci/eval-string "#(+ 1 2 %)"))
    (f 1)
    ;;=> 4
    • Works on JVM / GraalVM native-image / JS
    • Sandboxing
    • Works in CLJS advanced compiled apps
    • Supports almost all of Clojure
    • https://github.com/babashka/sci

  20. SCI: adding libs
    (require '[cheshire.core :as json])

    (def sci-opts
      {:namespaces
       {'cheshire.core
        {'generate-string json/generate-string}}})

    (sci/eval-string
     "(require '[cheshire.core :as json])
      (json/generate-string {:a 1})"
     sci-opts)
    ;;=> "{\"a\":1}"

  21. SCI: meta-circular
    • Implements language features in terms of similar host language features
    • Self-interpreter: interpreted language = host language
    • Function application can be implemented using apply
    • loop in SCI is implemented using loop in Clojure
    • Pure core functions like +, assoc, etc. can just be direct references to host functions
    • Side-effecting functions are often re-implemented to make SCI sandboxed

  22. Meta-circular function application (simplified)

  23. let bindings (simplified)

  24. lambda (simplified)
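
The code for the three "simplified" slides above (function application, let bindings, lambda) is not part of this transcript. As a rough illustration of the meta-circular idea only, here is a minimal sketch of a toy evaluator. It is not SCI's actual code: the names evaluate, eval-let, eval-fn and eval-call are hypothetical, the environment is assumed to be a plain map from symbols to values, and unknown symbols are assumed to fall back to clojure.core.

```clojure
;; Hypothetical toy evaluator; an illustration, not SCI's implementation.
(declare evaluate)

(defn eval-let
  "Evaluates (let [sym expr ...] body), binding sequentially like host let."
  [env [_ bindings body]]
  (let [env' (reduce (fn [e [sym expr]]
                       (assoc e sym (evaluate e expr)))
                     env
                     (partition 2 bindings))]
    (evaluate env' body)))

(defn eval-fn
  "Evaluates (fn [params] body) to a host function that closes over env."
  [env [_ params body]]
  (fn [& args]
    (evaluate (merge env (zipmap params args)) body)))

(defn eval-call
  "Function application: evaluate operator and operands, then use host apply."
  [env [f & args]]
  (apply (evaluate env f) (map #(evaluate env %) args)))

(defn evaluate [env expr]
  (cond
    (symbol? expr)
    (if (contains? env expr)
      (get env expr)
      ;; Pure core functions are direct references to host functions.
      (let [v (ns-resolve 'clojure.core expr)]
        (if (var? v)
          @v
          (throw (ex-info "Unresolved symbol" {:sym expr})))))

    (seq? expr)
    (case (first expr)
      let (eval-let env expr)
      fn  (eval-fn env expr)
      (eval-call env expr))

    ;; Numbers, strings, keywords etc. evaluate to themselves.
    :else expr))

(evaluate {} '(let [a 1 b 2] ((fn [x] (+ a b x)) 3)))
;; => 6
```

Each interpreted feature leans on the matching host feature: interpreted let uses host let and assoc, interpreted fn returns a host fn, and application is host apply.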

  25. SCI's initial (naive) approach
    Parser (string -> s-expr): "(+ 1 x)" -> '(+ 1 x)
    Evaluator (s-expr -> result): '(+ 1 x) -> 3
    Walk the s-expression:
    • Decide that + is a symbol that denotes a function, resolve '+ to clojure.core/+
    • Decide that 1 is a constant and resolve it to itself
    • Decide that symbol x is a local, look up the local binding in ctx
    • Finally, call + on 1 and the resolved value for 'x, e.g. 2
    Lots of work happening over and over which could be done only once...
    (for [x (range 100000)]
      (+ 1 x))

  26. (no text on slide)

  27. Do as much preparation as you can in analysis
    Do as little as possible at evaluation time (avoid repeated analysis)

  28. Analysis -> Node AST
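
To make the analysis step more concrete, here is a minimal sketch that assumes each node is simply a host function from a bindings map to a value. The name analyze and this node representation are hypothetical; SCI's real analyzer and node types differ. Symbol resolution and dispatch on the form happen once, during analysis, so evaluation only has to call the prepared nodes.

```clojure
;; Hypothetical sketch of "analyze once, evaluate many times"; not SCI's code.
(defn analyze
  "Turns an s-expression into a node: a function from a bindings map to a value.
  locals is the set of symbols known to be local at this point."
  [locals expr]
  (cond
    (symbol? expr)
    (if (contains? locals expr)
      ;; Local: the value is only known at run time.
      (fn [bindings] (get bindings expr))
      ;; Global: resolve to the host function now, once.
      (let [v (ns-resolve 'clojure.core expr)]
        (when-not (var? v)
          (throw (ex-info "Unresolved symbol" {:sym expr})))
        (let [f @v]
          (fn [_] f))))

    (seq? expr)
    (let [f-node    (analyze locals (first expr))
          arg-nodes (mapv #(analyze locals %) (rest expr))]
      (fn [bindings]
        (apply (f-node bindings) (map #(% bindings) arg-nodes))))

    ;; Constants resolve to themselves.
    :else (fn [_] expr)))

;; Analysis happens once...
(def node (analyze #{'x} '(+ 1 x)))

;; ...and the resulting node is reused for every iteration.
(mapv #(node {'x %}) (range 3))
;; => [1 2 3]
```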

  29. Performance
    (time
      (loop [val 0 cnt 10000000]
        (if (pos? cnt)
          (recur (inc val) (dec cnt))
          val)))
    "Elapsed time: 575.72325 msecs"
    10000000
    • Performance was not the initial goal, but became a concern later
    • Example: loop of 10M iterations, a conditional, 3 function calls
    • This is fast in compiled code, but SCI is an interpreter
    (Measured on MacBook Air M1)

  30. Comparison: Python loop perf
    (Measured on MacBook Air M1)

  31. SCI loop perf
    16176 ms
    991 ms (15x faster than July 2020)
    (Measured on Linux amd64 machine)

  32. Accessing locals
    https://github.com/babashka/sci/issues/416
    • Hash-map is dynamically sized, implementation easy and thread-safe
    • But fixed-size arrays are 15x faster...
    • Need to pre-allocate + pre-calculate indexes
    • Copy over closed-over values to each new closure (thread safety)
    • Set local value: (assoc bindings 'a 1) -> (aset bindings 0 1)
    • Read a: (get bindings 'a) -> (aget bindings 0)
    • Read c: (get bindings 'c) -> (aget bindings 3)
    (let [a 1, b 2]
      (fn [x]
        (let [c (+ a b x)]
          (inc c))))
    Closed-over locals: a, b
    Function argument: x
    Let binding: c
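
The array-based scheme can be sketched by hand for the example on this slide. This is a sketch under assumptions, not SCI's implementation: the function make-closure is hypothetical, and the indexes (a -> 0, b -> 1, x -> 2, c -> 3) are written out manually here, whereas an analyzer would compute them.

```clojure
;; Hypothetical hand-written version of what an analyzer might produce for
;; (let [a 1, b 2] (fn [x] (let [c (+ a b x)] (inc c)))).
;; Pre-calculated indexes: a -> 0, b -> 1, x -> 2, c -> 3.
(defn make-closure
  "Returns a function equivalent to (fn [x] (let [c (+ a b x)] (inc c)))
  with a and b closed over."
  [a b]
  (let [^objects closed-over (object-array [a b])]
    (fn [x]
      ;; Each invocation gets its own pre-sized array, so concurrent calls
      ;; never share mutable state.
      (let [^objects bindings (object-array 4)]
        (aset bindings 0 (aget closed-over 0))   ; copy closed-over a
        (aset bindings 1 (aget closed-over 1))   ; copy closed-over b
        (aset bindings 2 x)                      ; function argument x
        (aset bindings 3 (+ (aget bindings 0)    ; let binding c
                            (aget bindings 1)
                            (aget bindings 2)))
        (inc (aget bindings 3))))))

(def f (make-closure 1 2))
(f 3)
;; => 7
```

Compared with a hash-map, the array needs its size and every index to be known before evaluation starts, which is why this optimisation depends on the analysis phase.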

  33. fn + loop / recur

  34. Interop (Thread/sleep 100)
    • SCI classes must be configured
    {:classes {'java.lang.Thread Thread}
     :imports {'Thread 'java.lang.Thread}}
    • Graal reflect-config.json keeps classes + metadata around for reflection
    • Methods are looked up via the Java reflection API (cached) and invoked
    • Room for performance improvements by looking up the method at analysis time
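
A cached reflective lookup of this kind can be sketched with the standard java.lang.reflect API. This is only an illustration under assumptions: the names classes, lookup-method and invoke-static are hypothetical, the cache is a plain memoize, and the caller passes the parameter types explicitly, which SCI does not require.

```clojure
;; Hypothetical sketch of cached reflective interop; not SCI's implementation.
(import '(java.lang.reflect Method))

;; Only classes configured here are reachable from interpreted code.
(def classes {'Thread java.lang.Thread})

(def lookup-method
  "Looks up a public method by class, name and parameter types.
  memoize caches the result, so reflection runs once per distinct lookup."
  (memoize
   (fn [^Class klass ^String method-name param-types]
     (.getMethod klass method-name (into-array Class param-types)))))

(defn invoke-static
  "Invokes a static method on a configured class via reflection."
  [class-sym method-name param-types & args]
  (if-let [klass (get classes class-sym)]
    (let [^Method m (lookup-method klass method-name param-types)]
      ;; The receiver is nil because the method is static.
      (.invoke m nil (object-array args)))
    (throw (ex-info "Class not configured" {:class class-sym}))))

;; Roughly what (Thread/sleep 100) turns into:
(invoke-static 'Thread "sleep" [Long/TYPE] 100)
```

Calling invoke-static with a class that is missing from the classes map throws, which mirrors the requirement that classes must be configured before interpreted code can use them.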

  35. Binary size

  36. Babashka pods
    • External binaries that expose namespaces + functions to bb via RPC
    • Moar GraalVM binaries, but can also be built in Golang, Haskell, etc. as long as they return the right data
    • Gives room for experimentation before including more libs
    • Access to other language ecosystems

  37. SCI on JS

  38. clj-kondo hooks using SCI

  39. Building on Clojure + JVM / GraalVM

  40. 1st Babashka-conf

  41. Babashka workshop (Rahul De)

  42. Babooka (Daniel Higginbotham)

  43. Thank you for sponsoring my open source!
    https://opencollective.com/babashka/contribute
    https://github.com/sponsors/borkdude

  44. Conclusion
    Clojure might not be the best language for everything, like scripting
    Clojure is the best language for scripting ;)