CRuby committer. My company permits me to do anything great for the Ruby ecosystem. This year, I am working entirely on building data science tools that can be used with applications written in Ruby.
sualize living data in data science. A 2D table data structure, like a SQL table. In Ruby, we can use data frames with Daru (or pandas via pycall, as described later).
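To make the "2D table, like a SQL table" idea concrete, here is a minimal sketch in plain Ruby. It does not use Daru or pandas and does not reflect their APIs; the column names and values are made up for illustration. Each row is a Hash, and the usual table operations (pick a column, filter rows, group and aggregate) map onto Enumerable methods.

```ruby
# A tiny table: each row is a Hash, each key is a column name.
# The data is hypothetical and only serves the illustration.
rows = [
  { name: 'alice', dept: 'dev', score: 80 },
  { name: 'bob',   dept: 'ops', score: 65 },
  { name: 'carol', dept: 'dev', score: 90 }
]

# SELECT score FROM rows
scores = rows.map { |r| r[:score] }

# SELECT * FROM rows WHERE dept = 'dev'
dev_rows = rows.select { |r| r[:dept] == 'dev' }

# SELECT dept, AVG(score) FROM rows GROUP BY dept
mean_by_dept = rows
  .group_by { |r| r[:dept] }
  .transform_values { |group| group.sum { |r| r[:score] }.to_f / group.size }
```

A real data frame library adds typed columns, indexing, and missing-value handling on top of this basic shape.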
large amount of data [NMatrix#362]. Daru lacks functionality needed for practical data science work. It is poorly documented, so it is difficult to use. Reason for the drawbacks: the small population of developers and users.
supports Apache Arrow. The core developer, Kouhei Sutou, is a member of Apache Arrow's PMC. Drawbacks: too young to use in production. It currently supports only data I/O; data manipulation is not yet supported.
from your Ruby code very naturally. Pycall consists of two parts: a Ruby binding library for libpython.so, and an object-oriented protocol gateway between Ruby and Python.
pandas, matplotlib. Wrapper gems for the following are future work: scikit-learn, seaborn, bokeh, keras, etc. You can use any Python library without a wrapper gem.
by Soren D
Using the scikit-learn machine learning library in Ruby using PyCall
Implementing OCR using a Random Forest Classifier in Ruby
Mai Nguyen's workshop material from the KiwiRuby conference
all data scientists shouldn't want to use Ruby in their jobs. They need the full power of standard data tools like pandas for exploratory data analysis. Ruby and Ruby on Rails are best for writing business web applications.
grate applications written in Ruby with data processing systems written in Python:
1. Referring to the same database directly
2. RPC using serialized data such as JSON
3. Calling directly via pycall
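As a rough illustration of pattern 2 (RPC using serialized data such as JSON), the sketch below shows only the Ruby side of the exchange, using the standard `json` library. The message shape (`method`, `params`, `result`, `error`), the `predict` method name, and the sample values are assumptions made for this example; the transport (HTTP, a queue, a socket) is left out, and the Python service's reply is simulated with a hard-coded string.

```ruby
require 'json'

# Serialize a call for a (hypothetical) Python service.
def build_request(method_name, params)
  JSON.generate({ 'method' => method_name, 'params' => params })
end

# Deserialize the reply and surface remote errors as Ruby exceptions.
def parse_response(body)
  reply = JSON.parse(body)
  raise "remote error: #{reply['error']}" if reply['error']
  reply['result']
end

# The Ruby application prepares the request payload.
request = build_request('predict', { 'features' => [5.1, 3.5, 1.4, 0.2] })

# Stand-in for what the Python side would send back over the wire.
simulated_reply = JSON.generate({ 'result' => { 'label' => 'setosa' } })

result = parse_response(simulated_reply)
```

Compared with pattern 3, this keeps the Ruby and Python processes separate, at the cost of serializing every value that crosses the boundary.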
the core of almost all data tools. pandas 2.0 will employ Apache Arrow as its core. PySpark already uses Apache Arrow to exchange data between Python and Spark. Red Data Tools is important for the future of Ruby's data science ecosystem. You should join the Red Data Tools project if you are interested in Apache Arrow. https://red-data-tools.github.io/
science. I demonstrated an example usage of pycall. I illustrated three patterns for integrating applications written in Ruby with data processing systems written in Python. I talked about the future perspective.