
Data Science, Python and the Functional Programming Revolution


Holger Peters

August 26, 2016


Transcript

  1. Data Science, Python and the Functional Programming Revolution
     Holger Peters, @data_hope
     EuroSciPy 2016, Erlangen
     How to profit from functional principles when doing number-crunching.
  2. Functional programming: An umbrella term
     - programs are built by composing functions
     - expressions instead of statements, evaluation instead of instruction
     - avoid memory mutation
     - pure (side-effect free) functions (see the sketch below)
     - higher-order functions
     Image licensed CC BY 2.0, https://flic.kr/p/wPKKto © flickr user surfergirl30
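     A tiny illustrative sketch of the "pure function" idea from this list (not from the slides; the names are made up):

         # Impure: mutates shared state, so results depend on call history.
         history = []
         def record_and_scale(x):
             history.append(x)  # side effect: mutation of external state
             return 2 * x

         # Pure: the output depends only on the input, with no observable
         # side effects.
         def scale(x):
             return 2 * x

         assert scale(3) == scale(3) == 6  # referentially transparent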
  3. FP: A constant innovator
     Python features originating in FP (a sketch follows this list):
     - garbage collection
     - list comprehensions
     - lazy evaluation, i.e. yield / generator expressions
     - interactive prompt (ipython)
     - lambdas, higher-order functions
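     A minimal sketch of lazy evaluation and higher-order functions in plain Python (illustrative, not from the slides):

         from itertools import islice

         # Lazy evaluation: a generator expression computes squares on demand
         # and never materializes the whole sequence in memory.
         squares = (x * x for x in range(10**12))
         print(list(islice(squares, 5)))  # [0, 1, 4, 9, 16]

         # Higher-order functions and lambdas: map takes a function as argument.
         print(list(map(lambda x: 2 * x, [1, 2, 3])))  # [2, 4, 6]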
  4. Imperative vs. functional Python
     Imperative (statement):

         exponentials = []
         for x in values:
             exponentials.append(exp(x))  # mutation!

     Functional (expression):

         exponentials = [exp(x) for x in values]
         exponentials = map(exp, values)  # higher-order function
         exponentials = np.exp(values)
  5. Imperative vs. functional Python
     Imperative (statements):

         x = np.random.randn(100)
         res = np.empty_like(x)
         mask = x < 0
         res[mask] = 0
         res[~mask] = np.sqrt(x[~mask])

     Functional (expression):

         x = np.random.randn(100)
         res = np.where(x < 0, 0, np.sqrt(x))
  6. FP: A traveler's dictionary

     Imperative                   Functional
     ---------------------------  ----------------------------------
     method, procedure, callable  (pure) function, lambda function
     mutable object               value (immutable)
     execute                      evaluate
     statement                    expression
     loop                         higher-order functions, recursion
     break / continue / goto      continuation, yield

     (The loop row is illustrated in the sketch below.)
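     A small illustration of the "loop vs. higher-order function" row (illustrative code, not from the slides):

         from functools import reduce

         values = [1, 2, 3, 4]

         # Imperative: a loop with a mutable accumulator.
         total = 0
         for v in values:
             total += v

         # Functional: the same reduction as an expression.
         total_fp = reduce(lambda acc, v: acc + v, values, 0)
         assert total == total_fp == 10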
  7. Linear regression
     Prediction: \hat{y} = X\hat{\beta}, i.e. \hat{y}_i = \beta_0 + \sum_j \beta_j x_{ij}
     Fit: \hat{\beta} = (X^T X)^{-1} X^T y = X^+ y, where X^+ = (X^T X)^{-1} X^T is the pseudo-inverse.

         def linreg(X, y):
             return np.linalg.inv(X.T @ X) @ X.T @ y

         def linreg_pinv(X, y):
             return np.linalg.pinv(X) @ y

     Thinking on a more abstract level (algebra) already comes naturally to us! Linear algebra is about expressions. Math is functional.
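     A hedged usage sketch for the slide's linreg_pinv (the data here is made up for illustration):

         import numpy as np

         def linreg_pinv(X, y):
             return np.linalg.pinv(X) @ y

         # Illustrative data: y = 1 + 2*x plus a little noise; a column of
         # ones models the intercept beta_0.
         rng = np.random.default_rng(0)
         x = rng.uniform(0, 1, size=100)
         X = np.column_stack([np.ones_like(x), x])
         y = 1.0 + 2.0 * x + 0.01 * rng.normal(size=100)

         print(linreg_pinv(X, y))  # approximately [1.0, 2.0]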
  8. When FP is the only viable option
     What happens once fundamental assumptions suddenly don't matter any more?
     Image CC © flickr user "greg" https://flic.kr/p/D2Dti
  9. Two common problems with the Scientific Python stack
     The dataset does not fit into RAM: a core assumption of the von Neumann model, random memory access, cannot be upheld! How can I write code for programs distributed over clusters? Can I use out-of-core methods to load data from disk on demand?
     The implementation of an algorithm only uses one core: writing code as "instructions for a CPU", we typically write it in a way that utilizes only one CPU core. Writing parallelized implementations can be challenging. Is there a better way?
  10. Dask: Out-of-core, distributed and parallel computation for numpy/pandas
      Dask works on expressions. Dask partitions data into chunks and evaluates expressions on these chunks using map and reduce. Dask can parallelize the evaluation of expressions. Dask can distribute the evaluation of expressions.
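      A minimal sketch of this expression style (array contents and chunk size are illustrative):

          import dask.array as da
          import numpy as np

          # Build a lazy expression over a chunked array; nothing runs yet.
          x = da.from_array(np.arange(1_000_000, dtype=np.float64),
                            chunks=250_000)
          expr = (x ** 2).sum()

          # Evaluation happens only on .compute(), chunk by chunk, in parallel.
          print(expr.compute())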
  11. How can map and reduce work chunkwise?
      Many reductive operations can be chunked:

          >>> np.sum(np.r_[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
          55
          >>> sum([0, 1, 2, 3, 4, 5]) + sum([6, 7, 8, 9, 10])
          55

      Elementwise mapping operations can be chunked:

          >>> 10. * np.r_[0, 1, 2, 3, 4, 5]
          array([  0.,  10.,  20.,  30.,  40.,  50.])
          >>> np.hstack((10. * np.r_[0, 1, 2],
          ...            10. * np.r_[3, 4, 5]))
          array([  0.,  10.,  20.,  30.,  40.,  50.])
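      A toy chunkwise reduction in plain numpy that makes the map/reduce split explicit (illustrative only; Dask schedules the partial reductions in parallel):

          import numpy as np

          def chunked_sum(x, n_chunks=4):
              chunks = np.array_split(x, n_chunks)
              partials = [chunk.sum() for chunk in chunks]  # map: per chunk
              return sum(partials)                          # reduce: combine

          x = np.arange(11)
          assert chunked_sum(x) == np.sum(x) == 55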
  12. Softmax
      A vector \vec{x} \in \mathbb{R}^n is mapped, each component being constrained between 0 and 1 (so \sigma : \mathbb{R}^n \to (0, 1)^n).
      Example: normalized exponential, a.k.a. the "softmax function":

          \sigma(\vec{x})_i = \frac{\exp(x_i)}{\sum_{j=1}^{n} \exp(x_j)}

      Beware: \exp(x_i) overflows for large x_i (around x_i > 709 in float64). With \hat{x} = \max_j x_j:

          \sigma(\vec{x})_i = \frac{\exp(x_i - \hat{x})}{\sum_{j=1}^{n} \exp(x_j - \hat{x})}
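      A quick demonstration of why the maximum is subtracted (the values here are illustrative):

          import numpy as np

          x = np.array([10.0, 1000.0])

          # Naive softmax overflows: exp(1000.) is inf in float64,
          # so the result contains nan.
          naive = np.exp(x) / np.exp(x).sum()

          # Subtracting the maximum keeps every exponent <= 0.
          stable = np.exp(x - x.max()) / np.exp(x - x.max()).sum()
          print(naive)   # [ 0. nan] (with overflow warnings)
          print(stable)  # [ 0.  1.] (up to rounding)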
  13. Softmax's expression graph

          x_hat = \max_i x_i
          exp_x = \exp(x - \hat{x})
          sum_x = \sum_{i=1}^{n} \exp(x_i - \hat{x})
          \sigma(\vec{x}) = exp_x / sum_x

      In numpy:

          x_exp = np.exp(x - x.max())
          return x_exp / x_exp.sum()
  14. Dask's evaluation graph
      Map and reduce operations can be performed chunkwise; they are parallelizable and serializable.
      Example: graph for softmax with four chunks of data.
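      A hedged sketch of how such a graph can be produced (the .visualize() call needs the optional graphviz dependency; the filename is made up):

          import dask.array as da
          import numpy as np

          x = da.from_array(np.random.randn(1_000_000),
                            chunks=250_000)  # four chunks
          e_x = da.exp(x - x.max())
          softmax = e_x / e_x.sum()

          # Render the task graph; each chunk shows up as its own branch.
          softmax.visualize('softmax-graph.png')
          print(softmax.compute()[:5])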
  15. How fast is parallelization by Dask, compared to alternatives?
      Image licensed CC BY, flickr user ChristianSinclair https://flic.kr/p/aGo5fr
  16. Benchmark setup: array length 1e8, 10 runs per box, in memory (not out-of-core, not distributed), parallelized with 4 cores (8 virtual).
  17. Comparison of softmax implementations

          def softmax_numpy(x):
              e_x = np.exp(x - x.max())
              return e_x / e_x.sum()

          def softmax_dask(x):
              x = da.from_array(x, chunks=int(x.shape[0] / CPU_COUNT), name='x')
              e_x = da.exp(x - x.max())
              return (e_x / e_x.sum()).compute()

          def softmax_numexpr(x):
              mx = ne.evaluate('max(x)')
              e_x = ne.evaluate('exp(x - mx)')
              sum_of_exp = ne.evaluate('sum(e_x)')
              normalized = ne.evaluate('e_x / sum_of_exp')
              return normalized
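      A minimal timing harness for comparing such functions (illustrative; the talk's plots used 1e8 elements and 10 runs per implementation):

          import time
          import numpy as np

          def softmax_numpy(x):
              e_x = np.exp(x - x.max())
              return e_x / e_x.sum()

          def benchmark(fn, x, runs=10):
              # Report the best wall-clock time over several runs.
              timings = []
              for _ in range(runs):
                  start = time.perf_counter()
                  fn(x)
                  timings.append(time.perf_counter() - start)
              return min(timings)

          x = np.random.randn(10**7)  # smaller than 1e8 to keep the demo quick
          print('numpy:', benchmark(softmax_numpy, x))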
  18. Comparison of softmax implementations: Cython with OpenMP

          import numpy as np
          cimport numpy as np
          cimport cython
          from cython.parallel import prange
          from libc.math cimport exp

          @cython.nonecheck(False)
          @cython.cdivision(True)
          @cython.boundscheck(False)
          def softmax_openmp(np.ndarray[np.float64_t, ndim=1] x):
              cdef:
                  int n = x.shape[0]
                  int i
                  np.float64_t s = 0.0
                  double max_x = np.max(x)
                  np.ndarray[np.float64_t, ndim=1] e_x = np.empty(n)
              with nogil:
                  for i in prange(n):
                      e_x[i] = exp(x[i] - max_x)
                      s += e_x[i]
                  for i in prange(n):
                      e_x[i] /= s
              return e_x
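      A hedged sketch of how such an extension might be built (file and module names here are illustrative assumptions):

          # setup.py
          from setuptools import Extension, setup
          from Cython.Build import cythonize
          import numpy as np

          ext = Extension(
              'softmax_openmp_ext',
              sources=['softmax_openmp_ext.pyx'],
              include_dirs=[np.get_include()],
              extra_compile_args=['-fopenmp'],  # let prange run in parallel
              extra_link_args=['-fopenmp'],
          )
          setup(ext_modules=cythonize(ext))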
  19. Learning from this example
      With an API based on functional, pure-Python expressions, Dask gives you great performance without losing any level of abstraction. Cython with OpenMP can be faster, but introduces complexity (natively compiled, low-level code).
      Image: flickr user Jordan Schwartz https://flic.kr/p/9B9HN3
  20. Summary
      It is better to focus on the problem and its algorithmic properties when implementing than on the implementation details of the machine. Functional programming, composing expressions, abstracts nicely over the machine level. We rely on software anyway: let software generate the most efficient implementation from your specification. Ergonomics often trumps slight performance improvements.
      Image licensed CC BY 2.0 © flickr user dfaulder https://flic.kr/p/7z8NMj
  21. "Conventional programming languages are growing ever more enormous, but not stronger. Inherent defects at the most basic level cause them to be both fat and weak: their primitive word-at-a-time style of programming inherited from their common ancestor, the von Neumann computer, their close coupling of semantics to state transitions, their division of programming into a world of expressions and a world of statements, [...], and their lack of useful mathematical properties for reasoning about programs."
      From the abstract of John Backus' "Can Programming Be Liberated from the von Neumann Style? A Functional Style and Its Algebra of Programs", his 1977 ACM Turing Award lecture.