Pelemay Backend: A memory-saving, fault-tolerant and distributed collection of Nx compilers and backends for embedded systems

Susumu Yamazaki (ZACKY)

September 07, 2023

Transcript

  1. Pelemay Backend: A memory-saving, fault-tolerant and distributed collection of Nx compilers and backends for embedded systems
    Susumu Yamazaki (ZACKY)


    This work was partially supported by the Asahi Kohsan Group Research Support Program of the Kitakyushu Foundation
    for the Advancement of Industry Science and Technology.
    ©︎
    2023 Susumu Yamazaki


  2. About Susumu Yamazaki (ZACKY)
    • These slides are available on my Speaker Deck page:

    https://speakerdeck.com/zacky1972


    • From Japan 🇯🇵.


    • An organizer of ElixirConf JP.


    • Associate Professor at Univ. of Kitakyushu.


    • My childhood hobby was writing
    science fiction stories!


    • I wanted to write longer stories, like Perry
    Rhodan, but my strength was in writing shorter
    stories…

  3. Background
    • As you know, the de facto standard machine learning frameworks for Elixir are Nx, Axon,
    and their ecosystem.


    • The talk by Sean Moriarity at this ElixirConf showed a positioning strategy for MLOps
    aimed at distributed and parallel computing with multiple GPUs for LLMs!


    • We were greatly inspired by his talk.


    • However, the current focus of Nx, Axon and their ecosystem, especially EXLA, is
    unsuitable for most embedded systems, which lack GPUs.


    • So, we have been developing Pelemay Backend, a lightweight Nx backend
    specialized for embedded systems, since 2022.

  4. Our Positioning Strategy

  5. Lessons Learned from Pelemay Backend 1st ed.
    • We developed Pelemay Backend 1st ed. in 2022.


    • It demonstrates that OpenBLAS can be used as an Nx backend.


    • BLAS stands for Basic Linear Algebra Subprograms, which have been developed and
    refined since the FORTRAN era.


    • OpenBLAS is an open-source implementation of BLAS. For most ISAs, including ARM and
    RISC-V, it uses SIMD or vector instructions, which makes it faster than an
    implementation written in plain C.


    • We implemented a partial builder that compiles only the necessary modules of
    OpenBLAS, and a prototype backend that uses it.
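
The idea of building only the necessary BLAS routines can be sketched as follows. This is a minimal illustration, not the actual partial builder: the operation names, the `ROUTINE_STEM` table, and `required_routines` are hypothetical. Only the routine names themselves (`sgemm`, `daxpy`, and so on) follow the standard BLAS naming convention, in which a type prefix is attached to a routine stem.

```python
# BLAS type prefixes: s = float32, d = float64 (c and z cover complex types).
PREFIX = {"float32": "s", "float64": "d"}

# Hypothetical mapping from tensor operations to BLAS routine stems.
ROUTINE_STEM = {
    "matmul": "gemm",  # general matrix-matrix multiply
    "axpy": "axpy",    # y := alpha * x + y
    "scale": "scal",   # x := alpha * x
    "dot": "dot",      # inner product of two vectors
}


def required_routines(ops, dtypes):
    """Return the sorted BLAS routine names needed for the given ops and dtypes."""
    names = set()
    for op in ops:
        stem = ROUTINE_STEM.get(op)
        if stem is None:
            continue  # not covered by BLAS in this sketch; left to another backend
        for dtype in dtypes:
            names.add(PREFIX[dtype] + stem)
    return sorted(names)


# "reshape" has no BLAS counterpart here, so it contributes nothing to the build.
needed = required_routines(["matmul", "scale", "reshape"], ["float32", "float64"])
```

A builder could then compile only the sources for the routines in `needed` rather than the whole library, which is what keeps the footprint small on an embedded target.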

  6. Lessons Learned from Pelemay Backend 2nd ed.
    • Next, we have been developing Pelemay Backend 2nd ed. since 2023.


    • One of its concepts is a component-based design for maintainability, based on aspect-oriented programming
    (AOP).


    • That is, we will develop a backend generator that decorates a specified base backend with functions
    that run before and after a set of functions in that backend.


    • The set can be specified in the style of AspectJ, an AOP language, and by the function groups used in the
    Nx documentation on HexDocs, for example Aggregates, Backend, Conversion, and so on.


    • The other concept is memory saving. We found that converting an ONNX model of ResNet to Axon and loading
    it requires 9 GB of memory. That is too much to run on an embedded system.


    • However, Sean's talk shows a roadmap toward memory-saving processing for LLMs, so we
    will wait for it to be realized.
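
The backend-generator idea above can be sketched roughly as follows. This is an illustration under stated assumptions, not Pelemay Backend's implementation: `BaseBackend`, `GROUPS`, `matches`, and `decorate_backend` are hypothetical names, plain Python lists stand in for tensors, and a shell-style wildcard stands in for an AspectJ-style pointcut.

```python
import fnmatch

# Hypothetical grouping, loosely modeled on the function groups in the Nx docs.
GROUPS = {
    "Aggregates": ["sum", "product"],
    "Conversion": ["to_list"],
}


class BaseBackend:
    """A stand-in base backend operating on plain Python sequences."""

    def sum(self, xs):
        return sum(xs)  # refers to the built-in sum, not this method

    def product(self, xs):
        result = 1
        for x in xs:
            result *= x
        return result

    def to_list(self, xs):
        return list(xs)


def matches(name, pointcut):
    """Return True if `name` is selected by a group name or a wildcard pattern."""
    if pointcut in GROUPS:
        return name in GROUPS[pointcut]
    return fnmatch.fnmatchcase(name, pointcut)


def decorate_backend(base, pointcut, before=None, after=None):
    """Generate a backend class whose selected functions run before/after advice."""

    def wrap(name, func):
        def wrapper(self, *args, **kwargs):
            if before is not None:
                before(name, args)
            result = func(self, *args, **kwargs)
            if after is not None:
                after(name, result)
            return result

        wrapper.__name__ = name
        return wrapper

    attrs = {}
    for name, func in vars(base).items():
        if callable(func) and not name.startswith("_") and matches(name, pointcut):
            attrs[name] = wrap(name, func)
    # Functions outside the pointcut are inherited from the base unchanged.
    return type("Decorated" + base.__name__, (base,), attrs)


log = []
Traced = decorate_backend(
    BaseBackend,
    "Aggregates",
    before=lambda name, args: log.append(("before", name)),
    after=lambda name, result: log.append(("after", name, result)),
)
backend = Traced()
total = backend.sum([1, 2, 3])       # advised: "sum" is in the Aggregates group
items = backend.to_list((4, 5))      # not advised: belongs to Conversion
```

Selecting functions by group keeps each concern (tracing, memory accounting, chunking) in its own small component instead of scattering it across every backend function.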

  7. What Pelemay Backend focuses on
    • Thus, for now, we will focus on implementing the component-based architecture with OpenBLAS.


    • One module handles only matrix-matrix multiplication.


    • One module handles only the addition of vectors or matrices combined with scalar multiplication.


    • One module handles only scalar multiplication.


    • One module handles only dividing large vectors or matrices into smaller pieces.


    • Other, infrequent operations are delegated to the default backend.


    • Many such simple modules collaborate to carry out a given numerical function.


    • This makes the architecture simpler to maintain than a monolith.


    • This approach accumulates shorter stories into a longer and longer story!
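
The collaboration of small single-purpose modules can be sketched as below. All class names, operation names, and the dispatch table are hypothetical and chosen only for illustration; plain Python lists stand in for tensors, and simple loops stand in for the OpenBLAS kernels.

```python
class DefaultBackend:
    """Stand-in for the default backend: handles operations no module claims."""

    def run(self, op, *args):
        if op == "negate":
            (x,) = args
            return [-v for v in x]
        raise NotImplementedError(op)


class ScalModule:
    """Only scalar multiplication: alpha * x."""
    ops = {"scal"}

    def run(self, op, alpha, x):
        return [alpha * v for v in x]


class AxpyModule:
    """Only addition combined with scalar multiplication: alpha * x + y."""
    ops = {"axpy"}

    def run(self, op, alpha, x, y):
        return [alpha * a + b for a, b in zip(x, y)]


class GemmModule:
    """Only matrix-matrix multiplication on lists of rows."""
    ops = {"matmul"}

    def run(self, op, a, b):
        cols = list(zip(*b))
        return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in a]


class ChunkModule:
    """Only dividing a large vector into smaller pieces."""
    ops = {"chunk"}

    def run(self, op, x, size):
        return [x[i:i + size] for i in range(0, len(x), size)]


class ComposedBackend:
    """Routes each operation to the module that claims it, else to the default."""

    def __init__(self, modules, default):
        self.table = {op: m for m in modules for op in m.ops}
        self.default = default

    def run(self, op, *args):
        return self.table.get(op, self.default).run(op, *args)


backend = ComposedBackend(
    [ScalModule(), AxpyModule(), GemmModule(), ChunkModule()], DefaultBackend()
)

# Modules collaborate: split two vectors into pieces, then run axpy piece by piece.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [10.0] * 5
pieces = zip(backend.run("chunk", x, 2), backend.run("chunk", y, 2))
result = [v for xs, ys in pieces for v in backend.run("axpy", 2.0, xs, ys)]
```

Because each module knows about only one operation, adding or replacing a kernel touches a single small class, while anything unclaimed still works through the default backend.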

  8. Getting the source of Pelemay Backend
    • https://github.com/zeam-vm/pelemay_backend


    • Please look forward to our future progress on these
    accumulated stories!


    • Thank you!