Slide 1

Slide 1 text

Kenta Murata, Xica Co., Ltd. (2024.02.03) Calling Julia functions from Streamlit applications How can multithreaded Python scripts interact with Julia safely?

Slide 2

Slide 2 text

introduce(myself) • Speaker: Kenta Murata • Working for Xica Co., Ltd. as Chief Research Officer • Twitter: @KentaMurata • GitHub: @mrkn • OSS Activities (both are inactive due to childcare) • CRuby committer (since 2010) • Apache Arrow committer (since 2019) • Hobbies • Camera, Computer Science, Mathematics, Physics, etc.

Slide 3

Slide 3 text

Contents • Background and Motivation • Calling Julia from Python • How Streamlit runs page scripts • Calling Julia from a Streamlit application • Demonstrations • Conclusion

Slide 4

Slide 4 text

Background and Motivation

Slide 5

Slide 5 text

Background and Motivation What is Streamlit? • A Python library that gives us a way to turn data scripts into shareable web apps.

Slide 6

Slide 6 text

Background and Motivation We need a Streamlit-like way for our Julia solution • We currently develop Pluto.jl notebooks for tasks requiring Julia functionality; however, this approach poses a steeper learning curve for our data analysts. • A Streamlit-like way offers a more intuitive and accessible user experience, especially for those with limited programming expertise. • By adopting a Streamlit-like way, we want to democratize access to Julia’s power, ensuring that advanced computational functionality is readily available to all analysts.

Slide 7

Slide 7 text

Calling Julia from Python

Slide 8

Slide 8 text

Calling Julia from Python How to do it? • There are two ways available: pyjulia and juliacall • pyjulia is the counterpart to PyCall.jl • juliacall is the counterpart to PythonCall.jl • Both libraries use Python’s and Julia’s C APIs to bridge the gap between both languages’ runtime environments

Slide 9

Slide 9 text

Calling Julia from Python Multithreading issue • Neither PyCall/pyjulia nor PythonCall/juliacall supports multithreading • Why? Because Julia’s C API isn’t designed to be called from threads that aren’t managed by Julia’s runtime environment • When Julia is called from a thread that isn’t managed by Julia, a SEGV is triggered

Slide 10

Slide 10 text

Calling Julia from Python How does a SEGV occur? 1. Julia keeps the current task information (which task is running on the current thread) in the TLS (i.e. thread-local storage) 2. Julia’s C API retrieves the current task information from the TLS 3. However, a thread outside of Julia doesn’t have this information in its TLS, so Julia’s C API receives a NULL pointer 4. Julia’s C API attempts to access the memory location pointed to by the NULL pointer, which leads to a SEGV
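The four steps above can be mimicked in pure Python. This is only an analogy, not Julia's implementation: `threading.local` stands in for the TLS slot, a dict stands in for the task structure, and the names `current_task`, `excstack_state`, and `call_from_worker` are made up for illustration. Where the C API would dereference NULL and crash, the Python stand-in raises a `TypeError`.

```python
import threading

# Stand-in for the per-thread slot that holds the current task pointer.
_tls = threading.local()


def current_task():
    # Returns None on a thread that never registered a task, mirroring
    # the NULL pointer the C API would read on a foreign thread.
    return getattr(_tls, "task", None)


def excstack_state():
    task = current_task()
    # Analogue of `ct->excstack`: fails when no task is registered.
    return task["excstack"]


def call_from_worker():
    # Call excstack_state() on a fresh thread and capture the outcome.
    result = {}

    def worker():
        try:
            result["value"] = excstack_state()
        except TypeError as exc:
            result["error"] = exc

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    return result


# "Initialize the runtime" on the current thread only.
_tls.task = {"excstack": []}

main_result = excstack_state()       # works: this thread has a task
worker_result = call_from_worker()   # fails: the worker's slot is empty
```

The thread that performed the initialization can read its task, while the worker thread finds an empty slot, which is the situation a Streamlit script thread is in when it calls into Julia.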

Slide 11

Slide 11 text

Calling Julia from Python
A real example of a SEGV caused by calling Julia from a Streamlit application

* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x18)
    frame #0: 0x000000013eece2cc libjulia-internal-debug.1.10.0.dylib`ijl_excstack_state at rtutils.c:307:28
   304  JL_DLLEXPORT size_t jl_excstack_state(void) JL_NOTSAFEPOINT
   305  {
   306      jl_task_t *ct = jl_current_task;
-> 307      jl_excstack_t *s = ct->excstack;
   308      return s ? s->top : 0;
   309  }
   310
Target 0: (Python) stopped.

#define jl_current_task (container_of(jl_get_pgcstack(), jl_task_t, gcstack))

JL_CONST_FUNC jl_gcframe_t **jl_get_pgcstack(void) JL_NOTSAFEPOINT
{
    return (jl_gcframe_t**)pthread_getspecific(jl_pgcstack_key); // <- retrieve from TLS
}

Slide 12

Slide 12 text

Calling Julia from Python How to call Julia from a multithreaded Python program • Initialize Julia in the main thread • Avoid calling Julia from non-main threads • Call Julia only from the main thread every time • But how do we delegate calls to Julia to the main thread?

Slide 13

Slide 13 text

How Streamlit runs page scripts

Slide 14

Slide 14 text

How Streamlit runs page scripts The internals of Streamlit • Streamlit consists of the following parts: • Web server: Communicates with the web browsers • Runtime: Acts as the mediator between the web server and sessions • Session Manager: Manages sessions corresponding to clients • Session: Communicates with the individual client • Script Runner: Manages the script execution thread

Slide 15

Slide 15 text

How Streamlit runs page scripts Session initialization on the page load 1. A new websocket connection is established between the client and the server 2. A new session corresponding to the websocket connection is created 3. The corresponding script runner is created for the session 4. The script runner starts a thread to run the script running loop 5. The script running loop thread runs the main page script once, then waits for the next “script rerun” request from the client
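Steps 3 to 5 can be pictured with a toy model. The `ScriptRunner` class below is a hypothetical stand-in written for this illustration; it is not Streamlit's actual class and ignores websockets, sessions, and error handling. It only shows the shape of the loop: a dedicated thread runs the script once, then blocks until a rerun request arrives.

```python
import queue
import threading


class ScriptRunner:
    """Toy per-session script runner (hypothetical, not Streamlit's class)."""

    def __init__(self, script):
        self._script = script
        self._requests = queue.Queue()
        self.thread_names = []
        self._thread = threading.Thread(target=self._loop, name="script-thread")

    def start(self):
        self._thread.start()

    def request_rerun(self):
        self._requests.put("rerun")

    def stop(self):
        self._requests.put("stop")
        self._thread.join()

    def _loop(self):
        # Run the page script once, then wait for the next request.
        self._run_script()
        while self._requests.get() == "rerun":
            self._run_script()

    def _run_script(self):
        # Record which thread executes the page script.
        self.thread_names.append(threading.current_thread().name)
        self._script()


runs = []
runner = ScriptRunner(lambda: runs.append("ran"))
runner.start()
runner.request_rerun()
runner.request_rerun()
runner.stop()
```

Every execution of the page script happens on the runner's own thread, never on the thread that created the runner. With one runner per client, this is why any Julia call placed directly in a page script ends up on a non-main thread.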

Slide 16

Slide 16 text

How Streamlit runs page scripts A Streamlit application is a multithreaded application • Streamlit creates a thread for each client to run a page script • The page script may run on multiple threads simultaneously • We cannot call Julia from a Streamlit application without specific workarounds

Slide 17

Slide 17 text

Calling Julia from a Streamlit application

Slide 18

Slide 18 text

Calling Julia from a Streamlit application
Simple approach doesn’t work
• The direct approach below fails with a SEGV

$ cat main.py
import streamlit as st
from julia import Main
st.write(Main.eval("2 + 2"))

$ streamlit run main.py --server.port=8080

  You can now view your Streamlit app in your browser.

  Network URL: http://10.110.5.39:8080
  External URL: http://18.180.104.211:8080

[98859] signal (11.1): Segmentation fault
in expression starting at none:0
Allocations: 2906 (Pool: 2897; Big: 9); GC: 0
Segmentation fault (core dumped)

Slide 19

Slide 19 text

Calling Julia from a Streamlit application How do we prevent calling Julia from non-main threads? [Diagram: Session 1, Session 2, and Session 3, each with its own Script Thread; Julia on the Main Thread; the path from the script threads to the main thread is labeled "How?"]

Slide 20

Slide 20 text

Calling Julia from a Streamlit application Use Streamlit’s event loop running on the main thread • Streamlit uses an async event loop to handle messages from the web server • We can retrieve the event loop with: Runtime.instance()._get_async_objs().eventloop • This event loop runs on the main thread • We can use this event loop to run a coroutine that includes Julia calls on the main thread via the call_soon_threadsafe method
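The delegation idea can be sketched with the standard library alone. The code below is a minimal illustration under stated assumptions, not the project's implementation: `fake_julia_eval` is a plain-Python stand-in for a Julia call, `call_on_loop_thread` is a hypothetical helper name, and the loop is created by `asyncio.run` rather than taken from Streamlit. It also schedules a plain callback instead of a coroutine to keep the example short. In a real page script the loop would be the one retrieved through the expression shown on the slide.

```python
import asyncio
import concurrent.futures
import threading


def fake_julia_eval(code):
    # Stand-in for a Julia call; reports which thread executed it.
    return code, threading.current_thread().name


def call_on_loop_thread(loop, func, *args):
    """Run func(*args) on the loop's thread and block until it finishes."""
    future = concurrent.futures.Future()

    def run():
        try:
            future.set_result(func(*args))
        except BaseException as exc:
            future.set_exception(exc)

    # Thread-safe way to hand a callback to a loop owned by another thread.
    loop.call_soon_threadsafe(run)
    return future.result()


async def main():
    loop = asyncio.get_running_loop()
    # A worker thread plays the role of a script thread. It blocks on the
    # future while the event loop stays free to run the callback.
    return await loop.run_in_executor(
        None, call_on_loop_thread, loop, fake_julia_eval, "2 + 2"
    )


loop_thread_name = threading.current_thread().name
code, executed_on = asyncio.run(main())
```

Although the request originates on a worker thread, the stand-in call executes on the thread that owns the event loop. The caller must never block the loop's own thread while waiting, otherwise the callback can never run.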

Slide 21

Slide 21 text

Demonstrations

Slide 22

Slide 22 text

streamlit-julia-caller project will soon be published on GitHub • This provides the features to call Julia in Streamlit’s page scripts • The julia_eval function evaluates Julia code in the main thread and returns the result • The julia_display function displays a given Julia object on the Streamlit page using methods such as st.write and st.image, which are chosen based on MIME types supported by the object in a manner similar to IJulia’s approach
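MIME-based selection of a display method can be pictured as a preference list. The sketch below is an assumption about the general technique, not the project's code: the `RENDERERS` table, its ordering, and the `choose_renderer` name are invented here, and the renderer names are plain strings so the example runs without Streamlit or Julia.

```python
# Preference order, richest representation first (assumed ordering).
RENDERERS = [
    ("image/png", "st.image"),
    ("text/markdown", "st.markdown"),
    ("text/plain", "st.write"),
]


def choose_renderer(supported_mimes):
    """Pick the first preferred MIME type that the object can produce."""
    for mime, renderer in RENDERERS:
        if mime in supported_mimes:
            return mime, renderer
    raise ValueError("object supports none of the known MIME types")


# A plot-like object that can render as PNG or as plain text.
plot_choice = choose_renderer({"text/plain", "image/png"})
# A number-like object that only has a plain-text form.
number_choice = choose_renderer({"text/plain"})
```

An object that offers several representations gets the richest one the table knows about, and everything else falls back to plain text.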

Slide 23

Slide 23 text

streamlit-julia-caller project Known issues • Multiple sessions aren’t supported yet • juliacall/PythonCall should also be supported • No tests exist

Slide 24

Slide 24 text

Conclusion

Slide 25

Slide 25 text

Conclusion • The multithreading issue with calling Julia from Python is explained • Streamlit’s internal mechanism of page execution is illustrated • Our approach to safely calling Julia from the script execution threads of Streamlit is introduced • Some simple example use cases of our approach are demonstrated • Existing known issues are shown

Slide 26

Slide 26 text

Other projects we are working on • We are also working on a pure Julia Streamlit clone • We attempt to completely reuse Streamlit’s client-side artifacts • We want to make it compatible with the existing Streamlit components • We will talk about this at the next JuliaTokyo even if we fail