Just-in-Time With Numba

Event: Remote Python Pizza 1.0
Date: 25 April 2020
Location: Remote (@ your couch!)

Facing slow processing times in your numerical codes? In this lightning talk, we will explore JIT compilation in Numba.

Ong Chin Hwee

April 25, 2020

Transcript

  1. Just-in-Time with Numba
    Presented by:
    Ong Chin Hwee (@ongchinhwee)
    25 April 2020
    Remote Python Pizza

  2. About me
    Ong Chin Hwee 王敬惠
    ● Data Engineer @ ST Engineering
    ● Background in aerospace engineering + computational modelling
    ● Contributor to pandas 1.0 release
    ● Mentor team at BigDataX
    @ongchinhwee

  3. Bottlenecks in a data science project
    ● Lack of data / Poor quality data
    ● Data Preprocessing
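    ○ The 80/20 data science dilemma (80% of the time goes to preparing data, 20% to analysis)
    ■ In reality, it’s closer to 90/10
    ○ Slow processing speeds in Python!
    ■ Python code runs on an interpreter instead of being compiled to machine code ahead of time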

  4. Compiled vs Interpreted Languages
    Written Code -> Compiler -> Compiled Code in Target Language -> Linker -> Machine Code (executable) -> Loader -> Execution

  5. Compiled vs Interpreted Languages
    Written Code -> Compiler -> Lower-level bytecode -> Virtual Machine -> Execution
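The "lower-level bytecode" step of the interpreted pipeline can be inspected directly. The following is a minimal sketch using the standard-library dis module; the function add is a made-up example and is not part of the talk.

```python
import dis


def add(a, b):
    """Toy function whose bytecode we want to look at."""
    return a + b


# CPython has already compiled the function body to bytecode;
# dis exposes the instructions that the virtual machine will execute.
instructions = list(dis.get_instructions(add))
opnames = [ins.opname for ins in instructions]

for ins in instructions:
    print(ins.offset, ins.opname, ins.argrepr)
```

The exact instruction names differ between Python versions, so the printed listing is best read as an illustration of the bytecode layer rather than a fixed output.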

  6. What is Just-in-Time?
    Just-In-Time (JIT) compilation
    ● Converts source code (or bytecode) into native machine code at runtime
    ● Is the reason why Java runs on a Virtual Machine (JVM) yet has performance comparable to compiled languages (C/C++, Go, etc.)

  7. Just-in-Time with Numba
    numba module
    ● Just-in-Time (JIT) compiler for Python that converts Python functions into machine code
    ● Can be used by simply applying a decorator (a wrapper) around functions to instruct numba to compile them
    ● Two modes of execution:
    ○ njit (nopython mode: compiles Numba-compatible code that runs without the Python interpreter)
    ○ jit (object mode compilation with “loop-lifting”; since Numba 0.59 this has to be requested with forceobj=True)
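As a rough illustration of the decorator usage described above, here is a minimal sketch. The function sum_of_squares and its input are made up for this example; the try/except fallback is an assumption added so that the snippet still runs (uncompiled) when Numba is not installed. @njit is shorthand for @jit(nopython=True).

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption for portability: without Numba, njit becomes a no-op
    # decorator and the function simply runs as ordinary Python.
    def njit(func):
        return func


@njit
def sum_of_squares(values):
    """Sum of squared elements, written as an explicit loop."""
    total = 0.0
    for v in values:
        total += v * v
    return total


data = np.arange(5, dtype=np.float64)  # 0, 1, 2, 3, 4
print(sum_of_squares(data))            # 0 + 1 + 4 + 9 + 16 = 30.0
```

Explicit loops over NumPy arrays like this one are the kind of code that nopython mode is designed to compile.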

  8. Numba Compiler Architecture
    Lower-level bytecode -> Numba interpreter -> Numba IR -> Type inference -> Typed Numba IR -> Lowering (codegen) -> LLVM IR -> LLVM JIT Compiler -> Machine Code (executable)
    IR: Intermediate Representation

  9. Numba Compiler Architecture
    Lower-level bytecode -> Numba interpreter -> Numba IR -> Type inference -> Typed Numba IR -> Lowering (codegen) -> LLVM IR -> LLVM JIT Compiler -> Machine Code (executable)
    The stages are grouped into a Numba frontend and a Numba backend
    IR: Intermediate Representation
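Some of these pipeline stages can be observed from Python. The sketch below is an illustration under assumptions: the names add and fast_add are made up, and the Numba-specific part is skipped when Numba is not installed. It uses the dispatcher's signatures attribute and inspect_llvm() method to show that type inference and LLVM IR generation happen on the first call.

```python
try:
    from numba import njit
    HAVE_NUMBA = True
except ImportError:
    # Assumption: fall back to plain Python so the snippet still runs.
    HAVE_NUMBA = False


def add(a, b):
    """Toy function used to trigger one compilation."""
    return a + b


if HAVE_NUMBA:
    fast_add = njit(add)
    # The first call runs type inference for these argument types
    # and compiles a specialised version of the function.
    result = fast_add(1, 2)
    n_signatures = len(fast_add.signatures)
    # inspect_llvm() maps each compiled signature to its LLVM IR text.
    for sig, llvm_ir in fast_add.inspect_llvm().items():
        print(sig, "->", len(llvm_ir), "characters of LLVM IR")
else:
    result = add(1, 2)
    n_signatures = 0

print(result, n_signatures)
```

Calling fast_add with a different argument type (for example two floats) would add a second entry to signatures, since each combination of argument types is compiled separately.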

  10. Practical Implementation

  11. Initialize File List in Directory
    import numpy as np
    import os
    import sys
    import time
    DIR = './chest_xray/train/NORMAL/'
    train_normal = [DIR + name for name in os.listdir(DIR)
                    if os.path.isfile(os.path.join(DIR, name))]
    No. of images in ‘train/NORMAL’: 1431

  12. With numba
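    from PIL import Image
    from numba import jit
    # Numba 0.59+ compiles plain @jit in nopython mode, which cannot handle PIL calls,
    # so object mode is requested explicitly here.
    @jit(forceobj=True)
    def image_proc(index):
        '''Convert + resize image'''
        # define_imagepath is a helper that is not shown on the slides
        im = Image.open(define_imagepath(index))
        im = im.convert("RGB")
        im_resized = np.array(im.resize((64,64)))
        return im_resized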

  13. With numba
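    from PIL import Image
    from numba import jit
    # Numba 0.59+ compiles plain @jit in nopython mode, which cannot handle PIL calls,
    # so object mode is requested explicitly here.
    @jit(forceobj=True)
    def image_proc(index):
        '''Convert + resize image'''
        # define_imagepath is a helper that is not shown on the slides
        im = Image.open(define_imagepath(index))
        im = im.convert("RGB")
        im_resized = np.array(im.resize((64,64)))
        return im_resized
    Code runs in object mode (plain @jit fell back to object mode in Numba versions before 0.59)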

  14. With numba
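    # time.clock() was removed in Python 3.8; perf_counter() measures elapsed
    # wall-clock time (use process_time() for CPU time instead)
    start_cpu_time = time.perf_counter()
    # image_proc is the @jit-decorated function from the previous slides
    listcomp_output = np.array([image_proc(x) for x in train_normal])
    end_cpu_time = time.perf_counter()
    total_tpe_time = end_cpu_time - start_cpu_time
    sys.stdout.write(
        'List comprehension completed in {} seconds.\n'.format(total_tpe_time))
    Python-only: 218.1 seconds
    After compilation: 169.6 seconds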

  15. With numba
    import numpy as np
    from numba import njit
    @njit
    def square(a_list):
        '''Calculate square of number in a_list'''
        squared_list = []
        for x in a_list:
            squared_list.append(np.square(x))
        return squared_list

  16. With numba
    import numpy as np
    from numba import njit
    @njit
    def square(a_list):
        '''Calculate square of number in a_list'''
        squared_list = []
        for x in a_list:
            squared_list.append(np.square(x))
        return squared_list
    Code runs in nopython (native machine code) mode (@njit or @jit(nopython=True))

  17. With numba
    a_list = np.array([i for i in range(1,100000)])
    start_cpu_time = time.time()
    listcomp_array_output = square(a_list)
    end_cpu_time = time.time()
    total_tpe_time = end_cpu_time - start_cpu_time
    sys.stdout.write(
        'Elapsed (after compilation) {} seconds.\n'.format(total_tpe_time))
    Python-only: 0.51544 seconds
    After compilation: 0.00585 seconds
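The "after compilation" figure only makes sense if the function has already been called once, because the first call includes the compilation itself. The following is a minimal sketch of that measurement pattern; square_all and timed are made-up helper names, and the fallback decorator is an assumption so the snippet also runs without Numba (in which case both calls are plain Python and take similar time).

```python
import time

import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption for portability: no-op decorator when Numba is missing.
    def njit(func):
        return func


@njit
def square_all(values):
    """Square every element of a 1-D array with an explicit loop."""
    out = np.empty_like(values)
    for i in range(values.shape[0]):
        out[i] = values[i] * values[i]
    return out


def timed(func, arg):
    """Return (result, elapsed wall-clock seconds) for one call."""
    start = time.perf_counter()
    output = func(arg)
    return output, time.perf_counter() - start


values = np.arange(1, 100000, dtype=np.int64)

# First call: with Numba this includes the JIT compilation time.
_, first_call = timed(square_all, values)
# Second call: reuses the machine code compiled during the first call.
result, second_call = timed(square_all, values)

print("first call: {:.5f} s, second call: {:.5f} s".format(first_call, second_call))
```

Reporting the second call separately keeps the one-off compilation cost from being mixed into the steady-state timing.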

  18. Key Takeaways

  19. Just-in-Time with numba
    ● Just-in-Time (JIT) compilation with numba
    ○ converts source code from non-compiled languages into native machine code at runtime
    ○ may not work for some functions/modules - these still run on the interpreter
    ○ adds significant speedups on top of already-optimized numerical code
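One way to act on these takeaways is to keep unsupported steps (such as image loading) in ordinary Python and compile only the numerical kernel. The sketch below is an assumption-laden illustration, not the speaker's code: load_image is a hypothetical stand-in that generates a random 64x64 RGB array instead of reading a file, scale_to_unit is a made-up kernel, and the fallback decorator lets the snippet run without Numba.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption for portability: no-op decorator when Numba is missing.
    def njit(func):
        return func


def load_image():
    """Hypothetical stand-in for an I/O step (e.g. PIL) that stays in the interpreter."""
    rng = np.random.default_rng(0)
    return rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)


@njit
def scale_to_unit(img):
    """Numerical kernel: rescale 8-bit pixel values to the range [0, 1]."""
    out = np.empty(img.shape, dtype=np.float64)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            for c in range(img.shape[2]):
                out[i, j, c] = img[i, j, c] / 255.0
    return out


image = load_image()           # runs as ordinary Python
scaled = scale_to_unit(image)  # compiled by Numba when it is available
print(scaled.shape, scaled.dtype)
```

Splitting the work this way keeps the compiled function limited to array operations that nopython mode supports.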

  20. Reach out to me!
    : ongchinhwee
    : @ongchinhwee
    : hweecat
    : https://ongchinhwee.me
    And check out my slides on:
    hweecat/talk_jit-numba
