Pythonic functional itertools for your data challenge

Nowadays Python is very likely to be the first choice for developing machine learning or data science applications. The reasons are manifold, but two stand out: the Python language is amazing (⚠️ _opinionated_), and the open source community in the PyData ecosystem is absolutely fantastic (💙 _that's a fact_ [1](https://youtu.be/d9Qm3PPoYNQ?t=800) [2](https://pydata.org/diversity-inclusion/) [3](https://numfocus.org/)). In this context, one of the most remarkable features of the Python language is its support for multiple programming styles (from _imperative_ to _OOP_ and also _functional programming_). Thanks to this versatility, developers are free to choose whichever programming style they prefer.

Functional programming is indeed very fascinating, and it is great for in-demand tasks such as _data filtering_ or _data processing_. This is no criticism of other paradigms, but sometimes the solution to a data problem can be more [naturally expressed](https://gist.github.com/leriomaggio/aef46a144119544df37649e46b51d64c) using a functional approach.

In this talk, we will discuss Python's support for functional programming, understand the meaning of _pure functions_ (and _why mutable function parameters are **always** a bad idea_), and look at the Python modules and patterns that help with this style, namely `itertools`, `functools`, and the `map-reduce` data processing pattern.
As reference data challenges, we will discuss _functional-style_ solutions to [Advent of Code](https://adventofcode.com/) coding puzzles, to make it fun and interactive.

Valerio Maggio

April 21, 2023

Transcript

  1. Pythonic functional
    (iter)tools
    for your data challenges
    [email protected]
    @leriomaggio
    Image Source: https://scryfall.com/card/apc/112/mystic-snake
    All content released under the Wizards of the Coast’s Fan Content Policy
    Valerio Maggio

    View Slide

  2. Still
    • Researcher and Data Scientist


    • ML/DL for BioMedicine


    • Data Scientists Advocate


    • SSI Fellow


    • Python Geek

    • Casual M:TG Player
    me
    pun
    Who
    🧙


    “a short summary of myself in logos”
    @mtg_lotus_vale
    I’m Valerio

    View Slide

  3. Refs & Mentions

    & Stockton to Malone…

    View Slide

  4. This talk is absolutely brilliant!
    And This (this 👇) talk can be considered as a RECAP + COMPLEMENT

    View Slide

  5. Pythonic functional
    (iter)tools
    for your data challenges
    [email protected]
    @leriomaggio
    Image Source: https://scryfall.com/card/vis/155/snake-basket
    All content released under the Wizards of the Coast’s Fan Content Policy
    Clumsy and Convoluted
    Python code
    (that only you can possibly understand (maybe))
    (( ))
    Valerio Maggio

    View Slide

  6. Advent of Code 🎄https://adventofcode.com

    View Slide

  7. from Advent of Code to Coders of Advent
    Private Leaderboards to have fun with stats
    Public shout out to my friends and colleagues!


    You folks rock! 🫶
    Honorable Mentions: @Taifu

    @antigones, @edobld, @davrizzo, @Andrea

    @paulox, @veggero, @greenkey, @akiross,
    @DaBenny, @SaltySpaghetti
    Special Thanks:

    Daniel Holth &

    Michael C. Grant

    View Slide

  8. from Advent of Code to Coders of Advent
    https://github.com/leriomaggio/AoC
    We had only one rule:

    Just use the Standard Library
    Valerio: Why not use functional programming too, sometimes?

    View Slide

  9. Algorithmic

    Adventures
    Image Source: https://scryfall.com/card/unh/108/remodel
    All content released under the Wizards of the Coast’s Fan Content Policy
    What to expect in AoC

    View Slide

  10. Python is fantastic!

    View Slide

  11. Python is fantastic!
    Support for many programming styles
    (including functional programming)
    without enforcing any!

    View Slide

  12. 1. Functions
    Functional Programming
    “(pure) functional programming” is a gradient among various languages
    Our definition: FP has to do with functions over stream of data
    🙃

    View Slide

  13. Functional Programming
    1. Pure functions
    • In FP, input flows through a set of functions
    • No function should have any side effects 🧟
    • Every output must depend only on its input
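
The difference between the two kinds of function can be sketched in a few lines. This is only an illustration: the names `counter`, `impure_scale` and `pure_scale` are made up for this example and do not come from the slides.

```python
# Module-level state, present only to demonstrate a side effect.
counter = {"calls": 0}


def impure_scale(x, factor=2):
    """Impure: returns the right value but also mutates outside state."""
    counter["calls"] += 1  # side effect: something outside the function changes
    return x * factor


def pure_scale(x, factor=2):
    """Pure: the output depends only on the inputs, nothing else is touched."""
    return x * factor


doubled = pure_scale(21)  # always 42 for the same arguments
```

Calling `pure_scale` any number of times leaves the program state untouched, while every call to `impure_scale` changes `counter`, which is exactly what makes it harder to test and reason about.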

    View Slide

  14. Functional Programming
    1. Pure functions & Mutability
    ReLU function
    max(0, x)
    ⚠ Side effect
    🫶 List Comprehension
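
The code shown on this slide is not part of the transcript, so the following is a reconstruction under assumptions: one ReLU that mutates its argument in place (the side effect) and one that builds a new list with a list comprehension. The function names are hypothetical.

```python
def relu_in_place(values):
    """Applies max(0, x) by overwriting the caller's list: a side effect."""
    for index, value in enumerate(values):
        values[index] = max(0, value)
    return values


def relu(values):
    """Applies max(0, x) and returns a new list; the input is left untouched."""
    return [max(0, value) for value in values]


original = [-2, 3, -1, 4]
activated = relu(original)         # original is unchanged here

mutated_input = [-2, 3, -1, 4]
relu_in_place(mutated_input)       # the caller's list is now different
```

The list comprehension version can be called on the same data repeatedly with the same result, which is the property the slide associates with pure functions.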

    View Slide

  15. Functional Programming
    1.5 Higher-Order Functions
    Functions are first-class citizens in FP & Lambda functions
    • FP languages require that functions can:
    1. Take another function as an argument
    2. Return another function to its caller
    • A higher-order function takes one or more functions as input and/or returns a new function.
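
A small sketch of both abilities, using lambdas as the functions being passed around. The helpers `compose` and `apply_n_times` are hypothetical names chosen for this example.

```python
def compose(f, g):
    """Takes two functions and returns a new one computing f(g(x))."""
    return lambda x: f(g(x))


def apply_n_times(f, n):
    """Returns a function that applies f to its argument n times."""
    def repeated(x):
        for _ in range(n):
            x = f(x)
        return x
    return repeated


increment = lambda x: x + 1
double = lambda x: x * 2

double_then_increment = compose(increment, double)  # x -> 2x + 1
add_three = apply_n_times(increment, 3)             # x -> x + 3
```

Here `compose` both receives functions and returns one, so it covers the two requirements listed on the slide at once.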

    View Slide

  16. map
    Iterable Stream of Data
    [Diagram: map(f, <stream>) yields f(item) for each item of the stream]
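
As a minimal illustration of the diagram, `map` applies a function lazily to each element of a stream. The `relu` helper and the sample values are assumptions made for this sketch.

```python
def relu(x):
    return max(0, x)


stream = iter([-2, -1, 0, 1, 2])
activated = map(relu, stream)   # nothing is computed yet: map is lazy

first = next(activated)         # computes relu(-2) only
remaining = list(activated)     # consumes the rest of the stream
```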

    View Slide

  17. filter
    filter(f, <stream>)
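
A short sketch of `filter` on a small stream; the sample readings are invented for the example. Passing `None` as the function keeps the truthy items.

```python
readings = [3, -1, 0, 7, -5, 2]

# Keep only the items for which the predicate returns True.
positive_values = list(filter(lambda x: x > 0, readings))

# With None as the function, filter keeps every truthy item (drops the 0).
truthy_values = list(filter(None, readings))
```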

    View Slide

  18. Image Source: https://scryfall.com/card/4ed/34/land-tax
    All content released under the Wizards of the Coast’s Fan Content Policy
    functools
    Higher-order functions and
    operations on callable objects

    View Slide

  19. Partial function application (“currying”)
    functools.partial
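
A brief sketch of `functools.partial`, which freezes some arguments of a callable and returns a new callable. The `scale` function is a hypothetical helper for this example.

```python
from functools import partial

# Freeze a keyword argument of a built-in.
parse_binary = partial(int, base=2)


def scale(factor, x):
    return factor * x


# Freeze the first positional argument.
double = partial(scale, 2)

doubled = list(map(double, [1, 2, 3]))
```

Strictly speaking, partial application fixes some arguments in one step, whereas currying turns an n-argument function into a chain of one-argument functions; the slide puts "currying" in quotes for that reason.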

    View Slide

  20. functools.reduce
    Iterable Stream of Data
    [Diagram: reduce(f, <stream>) = f(f(a, b), c), with a, b, c standing for the items of the stream]
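
The nesting performed by `reduce` can be made visible with a function that just records its own calls as text. The `show` helper is an assumption made for this sketch.

```python
from functools import reduce
import operator


def show(left, right):
    """Builds a string describing the call instead of computing a value."""
    return f"f({left}, {right})"


nesting = reduce(show, ["a", "b", "c", "d"])
total = reduce(operator.add, [1, 2, 3, 4])
product = reduce(operator.mul, [1, 2, 3, 4], 1)  # 1 is the initial value
```

`nesting` evaluates to the string `f(f(f(a, b), c), d)`: the result of each call becomes the left argument of the next one.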

    View Slide

  21. 1. Functions
    Functional Programming
    “(pure) functional programming” is a gradient among languages
    Our definition: FP has to do with functions over stream of data
    2. Iterators and Iterables!

    View Slide

  22. Functional Programming
    2. Iterators vs Iterables (vs Generators)
    • Iterator: an object representing a stream of data (key: Laziness)


    • Returns one element at a time - next(it)


    • If there are no more elements, a StopIteration exception is raised.


    • Iterable: an object is an iterable if you can get an iterator from it
    • Generator: a convenient function to return an iterator over a stream of data


    • Function: returns a value (i.e. data)


    • Generator: returns an iterator which yields values
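
The three terms can be told apart with a few lines of standard Python. The `countdown` generator and the sample list are invented for this sketch.

```python
colours = ["red", "green", "blue"]   # an iterable: iter() works on it

iterator = iter(colours)             # an iterator over that iterable
first = next(iterator)
second = next(iterator)
third = next(iterator)

try:
    next(iterator)                   # no more elements
    exhausted = False
except StopIteration:
    exhausted = True


def countdown(start):
    """Generator function: calling it returns an iterator that yields values."""
    while start > 0:
        yield start
        start -= 1


launch = countdown(3)                # no code in the body has run yet
```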

    View Slide

  23. Functional Programming
    2. Iterators vs Iterables (vs Generators)
    Common operations on an iterator’s output:
    • Element wise ops (e.g. ReLU function)
    • Selection and filtering
    listcomps & genexps (borrowed from Haskell)
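
Both operations map directly onto comprehension syntax. A minimal sketch with made-up data:

```python
data = [-3, -1, 0, 2, 5]

# Element-wise operation with a list comprehension (eager, builds a list).
activated = [max(0, x) for x in data]

# Selection with a generator expression (lazy, yields items on demand).
positives = (x for x in data if x > 0)
positive_values = list(positives)

# Both at once: square only the positive items.
squared_positives = [x * x for x in data if x > 0]
```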

    View Slide

  24. Functional Programming
    2. Iterators vs Iterables (vs Generators)
    • Iterators are stateful
    • Once you’ve consumed the iterator, it’s gone. You’d need a new one!
    many built-ins return iterators (e.g. map, filter, enumerate, reversed, zip)
    ⚠ range is not an Iterator!
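
A quick sketch of what "consumed" means in practice, using `map` and `zip` on invented data:

```python
squares = map(lambda x: x * x, [1, 2, 3])

first_pass = list(squares)    # consumes the iterator
second_pass = list(squares)   # nothing left: the iterator is exhausted

pairs = zip("ab", [1, 2])
is_own_iterator = iter(pairs) is pairs  # iterators return themselves from iter()
```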

    View Slide

  25. Functional Programming
    ⚠ range is not an Iterator!
    Cannot call next()
    Can call len()
    Is never consumed
    Is subscriptable
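
Each of the four claims can be checked directly in the interpreter. A small sketch:

```python
numbers = range(5)

try:
    next(numbers)               # range does not implement __next__
    supports_next = True
except TypeError:
    supports_next = False

length = len(numbers)           # ranges know their length
third = numbers[2]              # and support indexing

first_pass = list(numbers)
second_pass = list(numbers)     # same content again: never consumed

iterator = iter(numbers)        # an iterator can still be obtained from it
first_item = next(iterator)
```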

    View Slide

  26. The range_object is a lazy iterable
    • The distinction may matter for your program!


    1.You always expect an Iterator to be consumed


    2.You always expect to call next() on Iterators

    View Slide

  27. Functional built-ins

    View Slide

  28. Functional built-ins

    View Slide

  29. Functional built-ins

    View Slide

  30. itertools
    Image Source: https://scryfall.com/card/wth/130/harvest-wurm
    All content released under the Wizards of the Coast’s Fan Content Policy
    Functions creating iterators for efficient looping

    View Slide

  31. itertools.accumulate
    Iterable Stream of Data
    [Diagram: accumulate(<stream>, f) yields a, f(a, b), f(f(a, b), c), … for items a, b, c, … of the stream]
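
A minimal sketch of `itertools.accumulate`, which is like `reduce` but yields every intermediate result. Note that in the standard library the iterable is the first argument and the function the second. The sample values are made up.

```python
from itertools import accumulate
import operator

values = [3, 1, 4, 1, 5]

running_sum = list(accumulate(values, operator.add))
running_max = list(accumulate(values, max))

# The optional initial value (Python 3.8+) is emitted first.
with_initial = list(accumulate(values, operator.add, initial=100))
```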

    View Slide

  32. Itertools
    Functions creating iterators for efficient looping

    View Slide

  33. Itertools
    Functions creating iterators for efficient looping
    Very efficient over large streams of data
    In itertools since Py3.10

    View Slide

  34. Itertools
    Functions creating iterators for efficient looping
    Will be in itertools in Py3.12

    View Slide

  35. Coding

    Challenges
    Image Source: https://scryfall.com/card/mmq/61/brainstorm
    All content released under the Wizards of the Coast’s Fan Content Policy
    Putting (almost) everything in
    practice with AoC + new 💎

    View Slide

  36. Warm up
    from Sonar Sweep - adventofcode.com/2021/day/1 - part 1
    Pairwise in 3.10
    Iterator does not have len
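
The slide's code is not in the transcript; one possible functional solution, assuming the puzzle asks how many measurements are larger than the previous one, could look like this. The function name `count_increases` is hypothetical, and the fallback `pairwise` is only there for interpreters older than 3.10.

```python
from itertools import tee

try:
    from itertools import pairwise  # available from Python 3.10
except ImportError:
    def pairwise(iterable):
        """Fallback: yields (a, b), (b, c), ... using tee and zip."""
        first, second = tee(iterable)
        next(second, None)
        return zip(first, second)


def count_increases(depths):
    # A generator has no len(), so count by summing ones instead.
    return sum(1 for previous, current in pairwise(depths) if current > previous)


sample = [199, 200, 208, 210, 200, 207, 240, 269, 260, 263]
increases = count_increases(iter(sample))  # works on a plain iterator too
```

On the sample data from the puzzle page this counts 7 increases.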

    View Slide

  37. Warm up
    from Sonar Sweep - adventofcode.com/2021/day/1 - part 2
    more-itertools
    Lookup
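
Part 2 compares sums over a three-measurement sliding window. Since the talk's rule was to stay within the standard library, the sketch below builds the window with `collections.deque`, following the sliding-window recipe from the itertools documentation; the third-party more-itertools package offers a ready-made `windowed` function for the same job. The function names here are assumptions.

```python
from collections import deque
from itertools import islice


def sliding_window(iterable, size):
    """Yields overlapping tuples of `size` consecutive items."""
    iterator = iter(iterable)
    window = deque(islice(iterator, size - 1), maxlen=size)
    for item in iterator:
        window.append(item)      # the oldest item falls out automatically
        yield tuple(window)


def count_window_increases(depths, size=3):
    sums = [sum(window) for window in sliding_window(depths, size)]
    return sum(1 for previous, current in zip(sums, sums[1:]) if current > previous)


sample = [199, 200, 208, 210, 200, 207, 240, 269, 260, 263]
window_increases = count_window_increases(sample)
```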

    View Slide

  38. Reduce at its best
    from Binary Diagnostic- adventofcode.com/2021/day/3 - part 1
    We need to slice the data vertically!

    View Slide

  39. Reduce at its best
    from Binary Diagnostic- adventofcode.com/2021/day/3 - part 1
    Walrus Operator
    Slice vertically
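
A possible reconstruction of the idea, not the code from the slide: `zip(*rows)` slices the report vertically into columns, the walrus operator keeps the count of ones for reuse, and `reduce` folds the resulting bits into an integer. The function names are hypothetical.

```python
from functools import reduce


def bits_to_int(bits):
    """Folds a sequence of 0/1 values into an integer, most significant bit first."""
    return reduce(lambda acc, bit: acc * 2 + bit, bits, 0)


def power_consumption(rows):
    gamma_bits = [
        # 1 if ones are the majority in this column, else 0
        int((ones := column.count("1")) > len(column) - ones)
        for column in zip(*rows)          # vertical slices of the report
    ]
    gamma = bits_to_int(gamma_bits)
    epsilon = bits_to_int(1 - bit for bit in gamma_bits)
    return gamma * epsilon


sample = [
    "00100", "11110", "10110", "10111", "10101", "01111",
    "00111", "11100", "10000", "11001", "00010", "01010",
]
result = power_consumption(sample)
```

For the sample report the most common bits are `10110` (22) and the least common `01001` (9), giving 198.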

    View Slide

  40. Moving on an infinite 2d Grid
    from Rope Bridge - adventofcode.com/2022/day/9 - part 1
    Versors coords: (-1, -1), (-1, 0), …
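
One way to express the movement functionally, offered as a sketch rather than the slide's solution: every move is expanded into unit steps (versors), `accumulate` turns the steps into head positions, and a second `accumulate` makes the tail follow. All names below are assumptions.

```python
from itertools import accumulate, chain, repeat

VERSORS = {"R": (1, 0), "L": (-1, 0), "U": (0, 1), "D": (0, -1)}


def sign(n):
    return (n > 0) - (n < 0)


def head_positions(moves):
    """Expands (direction, count) moves into the sequence of head positions."""
    steps = chain.from_iterable(repeat(VERSORS[d], n) for d, n in moves)
    return accumulate(
        steps, lambda pos, step: (pos[0] + step[0], pos[1] + step[1]), initial=(0, 0)
    )


def follow(tail, head):
    """Moves the tail one step towards the head unless they already touch."""
    dx, dy = head[0] - tail[0], head[1] - tail[1]
    if max(abs(dx), abs(dy)) <= 1:
        return tail
    return (tail[0] + sign(dx), tail[1] + sign(dy))


def visited_by_tail(moves):
    return len(set(accumulate(head_positions(moves), follow, initial=(0, 0))))


sample = [("R", 4), ("U", 4), ("L", 3), ("D", 1), ("R", 4), ("D", 1), ("L", 5), ("R", 2)]
visited = visited_by_tail(sample)
```

Because positions are plain tuples collected in a set, the grid never needs to be allocated, which is what makes an infinite grid tractable.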

    View Slide

  41. Moving on a finite 2d Grid
    from Smoke Basin - adventofcode.com/2021/day/9 - part 1
    itertools.starmap(f, iterable) → f(*args)
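
`starmap` lives in `itertools` and calls `f(*args)` for every tuple in an iterable. As a sketch on a finite grid, assuming the task at adventofcode.com/2021/day/9 (summing height + 1 over all cells lower than their in-bounds neighbours), it could be used like this. The helper names are hypothetical.

```python
from itertools import product, starmap

DIRECTIONS = ((-1, 0), (1, 0), (0, -1), (0, 1))


def neighbours(grid, row, col):
    """Yields the values of the adjacent cells that fall inside the grid."""
    for d_row, d_col in DIRECTIONS:
        r, c = row + d_row, col + d_col
        if 0 <= r < len(grid) and 0 <= c < len(grid[0]):
            yield grid[r][c]


def risk_level(grid, row, col):
    height = grid[row][col]
    is_low_point = all(height < other for other in neighbours(grid, row, col))
    return height + 1 if is_low_point else 0


def total_risk(grid):
    cells = product(range(len(grid)), range(len(grid[0])))  # (row, col) tuples
    return sum(starmap(lambda row, col: risk_level(grid, row, col), cells))


sample = [
    [int(ch) for ch in line]
    for line in ("2199943210", "3987894921", "9856789892", "8767896789", "9899965678")
]
risk = total_risk(sample)
```

Unlike the infinite case, the bounds check in `neighbours` is what keeps the walk inside the grid.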

    View Slide

  42. Moving on a finite 2d Grid (cont.)
    from Smoke Basin - adventofcode.com/2021/day/9 - part 1

    View Slide

  43. The key_fn is the key
    from Distress Signal - adventofcode.com/2022/day/13 - part 2

    View Slide

  44. The key_fn is the key (cont.)
    from Distress Signal - adventofcode.com/2022/day/13 - part 2
    functools.cmp_to_key(f)
    Quicksort, Timsort, Powersort - Algorithmic ideas, engineering tricks, and
    trivia behind CPython's new sorting algorithm

    Sebastian Wild | Saturday, 255ABC 3:15
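
`functools.cmp_to_key` wraps an old-style comparison function (negative, zero, positive) so it can be used as a sort key. The sketch below assumes the nested-list ordering rules of the 2022 day 13 puzzle; the `compare` function and the small packet list are written for this example and are not the slide's code.

```python
from functools import cmp_to_key


def compare(left, right):
    """Returns -1, 0 or 1, comparing ints directly and lists item by item."""
    if isinstance(left, int) and isinstance(right, int):
        return (left > right) - (left < right)
    if isinstance(left, int):
        left = [left]            # promote a lone int to a one-item list
    if isinstance(right, int):
        right = [right]
    for l_item, r_item in zip(left, right):
        if (result := compare(l_item, r_item)) != 0:
            return result
    # All shared items are equal: the shorter list sorts first.
    return (len(left) > len(right)) - (len(left) < len(right))


packets = [[1, 1, 5, 1, 1], [[1], [2, 3, 4]], [9], [[8, 7, 6]], [], [[2]], [[6]]]
ordered = sorted(packets, key=cmp_to_key(compare))
decoder_key = (ordered.index([[2]]) + 1) * (ordered.index([[6]]) + 1)
```

The comparison logic stays a plain recursive function, and `cmp_to_key` is the only adapter needed to plug it into `sorted`.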

    View Slide

  45. Why Functional Programming ?
    • High level: focus on the result rather than explicitly specifying the steps to get it.
    • Transparent: The behaviour of a pure function depends only on its inputs and outputs, without intermediary values.
    • That eliminates the possibility of side effects, which facilitates debugging.
    • Parallelizable: Executions with no side effects can more easily run in parallel with one another.
    • Data / Coding challenges ?
    • FP offers an alternative way to express a solution that is more natural

    View Slide

  46. Neural Networks as Higher-Order Functions
    NN -> linear(relu(linear(…(relu(linear(X))))))
    Image: Jose-Luis Olivares/MIT
    linear ~> sum(itertools.starmap(operator.mul, zip(W_i, x_j)))
    https://github.com/python/cpython/issues/100485
    math.sumprod PR by Raymond Hettinger
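
The composition on this slide can be sketched with tiny pure functions. This is a toy illustration with made-up weights, not a training-ready network; `dot`, `linear`, `relu` and `compose` are names assumed for the example. The dot product uses the `starmap` expression from the slide (`math.sumprod` would do the same on Python 3.12+).

```python
from functools import reduce
from itertools import starmap
import operator


def dot(weights, inputs):
    return sum(starmap(operator.mul, zip(weights, inputs)))


def linear(weight_rows, biases):
    """Returns a layer function: one dot product plus bias per output unit."""
    return lambda x: [dot(row, x) + bias for row, bias in zip(weight_rows, biases)]


def relu(x):
    return [max(0, value) for value in x]


def compose(*layers):
    """Chains layers left to right into a single function."""
    return lambda x: reduce(lambda acc, layer: layer(acc), layers, x)


network = compose(
    linear([[1, -1], [-1, 1]], [0, 0]),
    relu,
    linear([[1, 1]], [0.5]),
)
output = network([2, 1])
```

With these weights the first layer gives `[1, -1]`, ReLU clips it to `[1, 0]`, and the last layer returns `[1.5]`.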

    View Slide

  47. Functional Programming in Python By Example
    By Erik Welch
    https://bit.ly/functional-python-lightning

    View Slide

  48. References
    • Functional Programming HOWTO | https://docs.python.org/3/howto/functional.html


    • Joel Grus: Learning Data Science Using Functional Python | https://www.youtube.com/watch?v=ThS4juptJjQ


    • "The Joy of Functional Programming (for Data Science)" with Hadley Wickham | https://www.youtube.com/watch?v=bzUmK0Y07ck


    • Functional Programming in Python | https://realpython.com/python-functional-programming/


    • Tour of Python Itertools | https://martinheinz.dev/blog/16


    • (How to Write a (Lisp) Interpreter (in Python)) | https://norvig.com/lispy.html


    • (An ((Even Better) Lisp) Interpreter (in Python)) | https://norvig.com/lispy2.html


    • https://github.com/norvig/pytudes/blob/main/py/lis.py

    View Slide

  49. Thank you very much

    for your kind attention
    Valerio Maggio
    [email protected]
    @leriomaggio

    View Slide