Slide 1

Slide 1 text

DYND: ENABLING COMPLEX ANALYTICS ACROSS THE LANGUAGE BARRIER
IRWIN ZAID, CONTINUUM ANALYTICS

Slide 2

Slide 2 text

Part 1: The What and Why of DyND
Part 2: DyND Snippets

Slide 3

Slide 3 text

PYDATA LONDON 2016
What is DyND?
DyND aims to be a modern library for array-oriented computing

Slide 4

Slide 4 text

What is DyND?
‣ A library, not a language
‣ Helps do computation with arrays
‣ “Modern” means using today's software engineering practices to solve problems people have had for a while

Slide 5

Slide 5 text

What is DyND really?
A type system and a callable object that operate together with an array container, engineered in C++ and bound to dynamic languages like Python

Slide 6

Slide 6 text

What is DyND not?
‣ DyND is not NumPy 2.0
 - NumPy is really good at what it was designed to do: broadcasting computation and stride-based manipulation
 - NumPy may not be “replaceable”; Fortran is still pretty widely used
‣ DyND is not a JIT for Python
 - See Numba for that
‣ DyND is not Boost.MultiArray
 - DyND is dynamic C++; templates are a hidden detail

Slide 7

Slide 7 text

It’s 2016. Why write another array library?
Because there are problems that are not otherwise being solved

Slide 8

Slide 8 text

It’s 2016. Why write another array library?
‣ Example: Missing data
 - Values that may or may not be present
 - The masked arrays of numpy.ma are not sufficient: storing the mask adds overhead, and NumPy is not always consistent in how it treats masked arrays
 - Discussed at length in 2011, remains unsolved in 2016

Slide 9

Slide 9 text

It’s 2016. Why write another array library?
‣ Example: Variable-length strings
 - NumPy can only efficiently represent strings of a predefined length
 - Variable-length strings have to be stored as Python objects

Slide 10

Slide 10 text

It’s 2016. Why write another array library?
‣ Example: Custom types
 - NumPy dtypes are too primitive
 - How does one represent categorical data? Ragged dimensions? GPU data?
 - Cannot define user overloads on ufuncs, e.g. string concatenation

Slide 11

Slide 11 text

It’s 2016. Why write another array library?
‣ Example: NumPy without the “Py”
 - Sometimes we don’t want to use Python
 - Why not have a representation of data that can go between several languages? (R, Julia, JavaScript, …)

Slide 12

Slide 12 text

Four Traits of DyND
‣ Expressive
‣ Generic
‣ Extendable
‣ Pluggable


Slide 13

Slide 13 text

Expressive
‣ DyND implements Datashape as its type system
 - A structured data description language, http://datashape.pydata.org
 - Datashape type: dimension * dtype
   var * int32
   3 * string
   4 * float64
 - Struct type: {x: int32, y: string, z: float64}
 - Tabular data: var * {x: int32, y: string, z: float64}
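The "dimension * dtype" structure above can be made concrete with a small sketch. The helper below is hypothetical (the name `split_datashape` is an assumption for illustration and is not part of DyND or the datashape package); it only separates the leading dimensions from the trailing dtype, treating `*` as a separator when it is not nested inside braces, brackets, or parentheses.

```python
def split_datashape(ds):
    """Split a datashape string into (dimensions, dtype).

    Hypothetical helper for illustration only; it does not validate
    the dimension or dtype names.
    """
    parts, depth, current = [], 0, ""
    for ch in ds:
        if ch in "{[(":
            depth += 1
        elif ch in "}])":
            depth -= 1
        if ch == "*" and depth == 0:
            # A top-level '*' ends one dimension (or the dimension list).
            parts.append(current.strip())
            current = ""
        else:
            current += ch
    parts.append(current.strip())
    # Everything before the last top-level '*' is a dimension.
    return parts[:-1], parts[-1]


print(split_datashape("var * int32"))
print(split_datashape("var * {x: int32, y: string, z: float64}"))
```

Under these assumptions, a bare struct type has an empty dimension list, and tabular data is the same struct with one leading `var` dimension.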

Slide 14

Slide 14 text

Generic
‣ DyND’s type, callable, and array objects are reference-counted smart pointers that dynamically interpret data
‣ Types can be parameterized on other types
 - N * T, var[T], option[T]
‣ Callables can be transformed (in a functional sense) from inner operations to higher-order patterns
 - Define the innermost operation, then build out the behavior you want with predefined generic patterns
 - nd::functional::elwise([](int x, int y) { return x + y; });
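The idea of defining the innermost operation and lifting it with a generic pattern can be sketched in plain Python. This is not DyND's implementation: the `elwise` function below is a hypothetical stand-in that works on nested lists, requires list arguments to have matching lengths at each level, and repeats scalar arguments across every element.

```python
def elwise(func):
    """Lift a scalar function so that it applies elementwise.

    Hypothetical illustration of the pattern; nested Python lists
    stand in for array dimensions.
    """
    def lifted(*args):
        lists = [a for a in args if isinstance(a, list)]
        if not lists:
            # No dimensions left: call the innermost operation.
            return func(*args)
        n = len(lists[0])
        if any(len(a) != n for a in lists):
            raise ValueError("dimension sizes do not match")
        # Peel off one dimension and recurse; scalars are passed through.
        return [
            lifted(*[a[i] if isinstance(a, list) else a for a in args])
            for i in range(n)
        ]
    return lifted


add = elwise(lambda x, y: x + y)
print(add([1, 2, 3], [10, 20, 30]))
print(add([[1, 2], [3, 4]], 10))
```

The lambda only knows how to add two scalars; everything about iterating over dimensions lives in the lifting function, which is the separation the slide describes.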

Slide 15

Slide 15 text

Extendable
‣ Types and callables are first-class objects that users should create directly

Slide 16

Slide 16 text

Extendable
‣ Types and callables are first-class objects that users should create directly

Slide 17

Slide 17 text

Pluggable
‣ DyND supports plug-in libraries
 - Define custom types and callables (or namespaces thereof) directly
‣ Use nd::set("my_amazing_callable", f) for a custom callable or nd::set("my_amazing_namespace", {{"my_amazing_callable", f}, {"my_other_amazing_callable", g}}) for a custom namespace
‣ Callables are dynamically propagated to Python, entirely removing the need for any user wrapper code
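To illustrate what publishing callables by name might look like from the consumer side, here is a minimal Python sketch of a registry. It only mirrors the shape of the two nd::set calls above; the names `set_callable` and `get_callable`, the module-level `registry` dict, and the dotted lookup syntax are all assumptions made for this example and are not DyND's API.

```python
registry = {}


def set_callable(name, value):
    """Publish one callable, or a dict of callables as a namespace.

    Hypothetical helper; it mirrors the two call shapes of nd::set
    shown on the slide but is not DyND's implementation.
    """
    if isinstance(value, dict):
        registry[name] = dict(value)
    elif callable(value):
        registry[name] = value
    else:
        raise TypeError("expected a callable or a dict of callables")


def get_callable(path):
    """Look up 'name' or 'namespace.name' in the registry."""
    obj = registry
    for part in path.split("."):
        obj = obj[part]
    return obj


f = lambda x: x * 2
g = lambda x: x + 1

set_callable("my_amazing_callable", f)
set_callable("my_amazing_namespace", {
    "my_amazing_callable": f,
    "my_other_amazing_callable": g,
})

print(get_callable("my_amazing_callable")(21))
print(get_callable("my_amazing_namespace.my_other_amazing_callable")(41))
```

Because lookup is by name at run time, a binding layer can discover newly published callables without per-function wrapper code, which is the property the last bullet refers to.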

Slide 18

Slide 18 text

Part 1: The What and Why of DyND
Part 2: DyND Snippets

Slide 19

Slide 19 text

Types
‣ Types are instances of simple classes
 - Write a class, get a type
‣ Types expose dynamic features to arrays
 - Either properties, like .real or .imag, or behavior, like .conj()
‣ Types can be kinds or patterns
 - Int, Scalar, Fixed, or Any; Fixed * T or (N * T, T) -> T
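A rough sketch of how a pattern such as Fixed * T could be matched against a concrete type follows. The rules encoded here are assumptions made for illustration: Fixed matches any integer-sized dimension, Int and Scalar match small hard-coded sets of dtype names, Any matches anything, and any other capitalized name (N, T) is a type variable that must bind consistently. The function names are hypothetical, and the sketch handles only flat "dim * dim * dtype" strings, not structs or function signatures.

```python
INT_TYPES = {"int8", "int16", "int32", "int64"}
FLOAT_TYPES = {"float32", "float64"}


def match_token(pattern, concrete, bindings):
    """Match one pattern token against one concrete token."""
    if pattern == "Any":
        return True
    if pattern == "Fixed":
        return concrete.isdigit()
    if pattern == "Int":
        return concrete in INT_TYPES
    if pattern == "Scalar":
        return concrete in INT_TYPES | FLOAT_TYPES
    if pattern[0].isupper():
        # Type variable: bind on first use, then require consistency.
        return bindings.setdefault(pattern, concrete) == concrete
    # Anything else must match literally, e.g. 'var' or 'int32'.
    return pattern == concrete


def match(pattern, concrete):
    """Return the type-variable bindings on a match, else None."""
    p = [t.strip() for t in pattern.split("*")]
    c = [t.strip() for t in concrete.split("*")]
    if len(p) != len(c):
        return None
    bindings = {}
    if all(match_token(pt, ct, bindings) for pt, ct in zip(p, c)):
        return bindings
    return None


print(match("Fixed * T", "3 * int32"))
print(match("Fixed * T", "var * int32"))
print(match("N * T", "4 * float64"))
```

In this sketch a kind such as Fixed constrains what may appear in a position, while a type variable such as T records what actually appeared so that it can be reused elsewhere in a signature.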

Slide 20

Slide 20 text

Metadata and Data
‣ Array metadata can describe data other than strided
 - Offset (tuple or struct), indirect (pointer), ragged (variable-sized dimensions)
‣ Array data is poolable or allocatable in custom memory spaces
 - Variable-sized strings or dimensions; CUDA
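As one illustration of data that is not strided, variable-length strings can be kept in a single pooled byte buffer with a list of offsets describing where each element starts and ends. The sketch below shows that general technique only; it is not a description of DyND's actual memory layout, and the function names are hypothetical.

```python
def pack_strings(strings):
    """Pack strings into one UTF-8 buffer plus a list of offsets.

    Element i occupies data[offsets[i]:offsets[i + 1]].
    Hypothetical helper for illustration only.
    """
    data = bytearray()
    offsets = [0]
    for s in strings:
        data += s.encode("utf-8")
        offsets.append(len(data))
    return bytes(data), offsets


def get_string(data, offsets, i):
    """Read element i back out of the pooled buffer."""
    return data[offsets[i]:offsets[i + 1]].decode("utf-8")


data, offsets = pack_strings(["a", "abc", "", "héllo"])
print(offsets)
print(get_string(data, offsets, 3))
```

No element is padded to a common width and no per-element Python object is needed; the offsets play the role that strides play for fixed-size elements.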

Slide 21

Slide 21 text

Fundamental Types

Slide 22

Slide 22 text

Dimension Types

Slide 23

Slide 23 text

Aggregate Types

Slide 24

Slide 24 text

Option Type

Slide 25

Slide 25 text

Symbolic Types

Slide 26

Slide 26 text

Callables and Functionals
‣ Share functions alongside data
 - Callables are first-class objects that can be dynamically published
‣ Enable user-defined functions with generic patterns
 - Functionals like apply, elwise, reduction, multidispatch, outer, neighborhood, and rolling transform one callable into another
‣ Built-in callables are overloadable
 - Users can define +, -, *, /, … for custom types
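Two of the functionals named above, reduction and multidispatch, can be sketched in plain Python to show what "transforming one callable into another" means. These are hypothetical stand-ins, not DyND's implementations: the reduction here works on a flat list with an explicit identity value, and the dispatcher selects an overload by the exact Python types of its arguments.

```python
from functools import reduce


def reduction(func, identity):
    """Turn a binary scalar function into a reduction over a flat list."""
    def reduced(values):
        return reduce(func, values, identity)
    return reduced


def multidispatch(overloads):
    """Combine several callables into one that dispatches on argument types.

    `overloads` maps a tuple of argument types to the callable to use.
    """
    def dispatcher(*args):
        key = tuple(type(a) for a in args)
        if key not in overloads:
            names = ", ".join(t.__name__ for t in key)
            raise TypeError("no overload for (" + names + ")")
        return overloads[key](*args)
    return dispatcher


total = reduction(lambda x, y: x + y, 0)

plus = multidispatch({
    (int, int): lambda x, y: x + y,
    (float, float): lambda x, y: x + y,
    # A user-supplied overload: '+' on strings as concatenation.
    (str, str): lambda x, y: x + y,
})

print(total([1, 2, 3, 4]))
print(plus(1, 2))
print(plus("ab", "cd"))
```

Adding support for a new type in this sketch means adding one entry to the overload table, which is the kind of user-defined overload the last bullet describes.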

Slide 27

Slide 27 text

Elementwise Functional

Slide 28

Slide 28 text

Reduction Functional

Slide 29

Slide 29 text

Multidispatch Functional

Slide 30

Slide 30 text

Option Operations

Slide 31

Slide 31 text

JSON Processing

Slide 32

Slide 32 text

Thanks to…
Mark Wiebe
Ian Henriksen
Stefan Krah
Irwin Zaid

Slide 33

Slide 33 text

Thanks to…

Slide 34

Slide 34 text

Get DyND!
conda install dynd-python -c dynd/channel/dev