• Build tools for data science.
• Not so much in the business of making models, more of speeding up existing models.
• We provide consulting, training, and support.
• Developers of Pyfora (open source).
www.ufora.com
is hard
• What if my dataset is larger than RAM?
• What if I really want to run my model on all of the data?
• What if I want to fit many models at once (for example, cross-validation or grid search)?
• Why do I need different tools for different scales?
• Cython and C extensions are tricky.
• NumPy is great, but can't do everything.
• Why can't loops be fast? Custom classes? Higher-order functions?
• (Many other Python implementations address this problem: PyPy, Numba, …)
machine code (C-speed) during runtime.
• Automatically parallel: parallelism happens without the user needing to be aware of threads, processes, synchronization, etc.
• All of this while writing ordinary Python.
(And we're open source: https://github.com/ufora/ufora)
that runs on one or more machines in your local network or in the cloud (on your own machines). This runs on Docker.
2. A Python package ( ) that sends code from your local Python process to the backend for compilation and execution (aka the client, or frontend).
Intel(R) Core(TM) i7-2600 quad-core (8 hyperthreaded) CPU, and utilizes all 8 cores. The same program in the local Python interpreter takes 185.95 seconds and uses one core.
◦ is not allowed. is ok (and fast)
◦ , are not allowed.
◦ NumPy arrays are ok, but mutating operations are not allowed.
• No operations can have side effects
◦ E.g., no writing to files.
◦ Side-effectful operations can still happen in the host Python process, while referencing objects in remote calls.
◦ is OK, but it is a no-op.
• All operations are deterministic
◦ E.g., no access to within a remote call.
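The immutability restriction can be pictured with plain Python. The sketch below is illustrative only: it makes no pyfora calls, the function names are hypothetical, and it does not claim to list which constructs pyfora accepts. It contrasts an accumulation that mutates a list in place with equivalents that only build new values.

```python
# Illustrative only: plain Python 3, no pyfora calls. All names are hypothetical.

def squares_mutating(n):
    # Builds the result by mutating a list in place.
    out = []
    for i in range(n):
        out.append(i * i)
    return out


def squares_immutable(n):
    # Builds the result in one expression; no existing object is modified.
    return [i * i for i in range(n)]


def extended(values, extra):
    # Returns a new list rather than changing `values`.
    return values + [extra]


original = [1, 2, 3]
longer = extended(original, 4)
```

Both versions of the squares function return the same list, and `extended` leaves `original` untouched, which is the property the restrictions rely on.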
an exception -- either at runtime or at "parse time" (when we ship code to the server and define it there).
• In general, pyfora should always give the same answer as normal CPython, or it should throw an exception saying it can't handle that code.
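The "same answer or an exception" contract can be sketched with a toy checker built on the standard `ast` module. This is not pyfora's implementation: the `UnsupportedCode` exception, the `check_source` and `run_checked` helpers, and the choice to reject item assignment are all assumptions made for illustration.

```python
# Illustrative only: a toy "parse time" check, not pyfora's implementation.
import ast


class UnsupportedCode(Exception):
    """Raised when the toy checker finds a construct it refuses to run."""


def check_source(source):
    # Walk the syntax tree before executing anything.
    tree = ast.parse(source)
    for node in ast.walk(tree):
        # Assumption for this sketch: item assignment (xs[0] = 1) is rejected.
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Subscript):
                    raise UnsupportedCode(
                        "item assignment on line %d" % node.lineno)
    return tree


def run_checked(source):
    # Either run the code exactly as CPython would, or raise; never guess.
    tree = check_source(source)
    namespace = {}
    exec(compile(tree, "<sketch>", "exec"), namespace)
    return namespace


ok = run_checked("total = sum(x * x for x in range(4))")

try:
    run_checked("xs = [1, 2, 3]\nxs[0] = 10")
    rejected = False
except UnsupportedCode:
    rejected = True
```

Accepted code produces the value CPython would produce, and rejected code fails before any of it runs, so there is no silent difference in behavior.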
parallel (when possible). The pyfora runtime searches for parallel operations as code executes, and these come in the form of independent function calls. For example, if we see
We can execute these in parallel, due to our immutability assumptions.
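A minimal sketch of why independent function calls are safe to run in parallel, written in plain Python 3. It uses an explicit thread pool as a stand-in for the parallelism that pyfora is described as finding automatically; `count_primes` and the chosen ranges are hypothetical and are not taken from the talk.

```python
# Illustrative only: an explicit thread pool standing in for automatic
# parallelism. All names and ranges are hypothetical.
from concurrent.futures import ThreadPoolExecutor


def count_primes(lo, hi):
    # Pure function: the result depends only on the arguments.
    def is_prime(p):
        if p < 2:
            return False
        d = 2
        while d * d <= p:
            if p % d == 0:
                return False
            d += 1
        return True

    return sum(1 for p in range(lo, hi) if is_prime(p))


# Two independent calls: neither reads or writes anything the other touches.
sequential = count_primes(0, 5000) + count_primes(5000, 10000)

with ThreadPoolExecutor(max_workers=2) as pool:
    first = pool.submit(count_primes, 0, 5000)
    second = pool.submit(count_primes, 5000, 10000)
    parallel = first.result() + second.result()
```

Because neither call mutates shared state, the two totals agree regardless of the order in which the calls run. In CPython the global interpreter lock keeps these threads from running Python code simultaneously, so the sketch demonstrates the independence of the calls, not a speedup.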
language, Fora.
• Fora has implicit parallelism and JIT compilation built in.
• Getting people to adopt a new language is not easy.
• So we built a source-to-source compiler: Python -> Fora.
• Now, Fora is our "bitcode".
us on GitHub: https://github.com/ufora/ufora
• Read our docs: http://docs.pyfora.com/
• Follow us on Twitter: @uforainc
• Check out our Google groups: ufora-dev, ufora-user
• Email me: [email protected]
Try it out! Tell us about your experience and what you'd like implemented!