Functional Programming with Python by Dr.-Ing. Mike Müller

PyCon 2013

March 17, 2013

Transcript

  1. Functional Programming with Python
    PyCon US 2013
    Santa Clara, March 15, 2013
    Author: Dr.-Ing. Mike Müller
    E-Mail: [email protected]

  2. Introduction
    • Functional programming has a long history
    • Lisp 1958
    • Renaissance: F#, Haskell, Erlang …
    • Used in industry
    • Trading
    • Algorithmic
    • Telecommunication (Concurrency)

  3. Features of Functional Programming
    • Everything is a function
    • Pure functions without side effects
    • Immutable data structures
    • Preserve state in functions
    • Recursion instead of loops / iteration

  4. Advantages of Functional Programming
    • Absence of side effects can make your programs more robust
    • Programs tend to be more modular and typically come in
    smaller building blocks
    • Easier to test - a call with the same parameters always
    returns the same result
    • Focus on algorithms
    • Conceptual fit with parallel / concurrent programming
    • Live updates - Install new release while running

  5. Disadvantages of Functional Programming
    • Solutions to the same problem can look very different from
    procedural / object-oriented ones
    • Finding good developers can be hard
    • Not equally useful for all types of problems
    • Input/output are side effects and need special treatment
    • Recursion is "an order of magnitude more complex" than loops
    / iteration
    • Immutable data structures may increase run times

  6. Python's Functional Features - Overview
    • Pure functions (sort of)
    • Closures - hold state in functions
    • Functions as objects and decorators
    • Immutable data types
    • Lazy evaluation - generators
    • List (dictionary, set) comprehensions
    • functools, itertools, lambda, map, filter
    • Recursion - try to avoid, recursion limit has a reason

  7. Pure Functions - No Side Effects
    • No side effect, return value only
    • "shallow copy" problem
    def do_pure(data):
        """Return copy times two.
        """
        return data * 2
    • An overloaded * that modifies data or causes other side
    effects would make this function impure
    • No guarantee of purity
    • Pure functions by convention
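
    The "shallow copy" problem mentioned above can be sketched as follows. This is a minimal illustration in Python 3 (the slides use Python 2); the function names double_rows and shallow_then_modify are hypothetical and not part of the talk.

```python
def double_rows(rows):
    """Return a new nested list with every value doubled; the input is untouched."""
    return [[value * 2 for value in row] for row in rows]


def shallow_then_modify(rows):
    """Looks pure, but the shallow copy still shares the inner lists."""
    result = list(rows)    # copies only the outer list
    result[0].append(99)   # mutates an inner list the caller still owns
    return result


data = [[1, 2], [3]]
doubled = double_rows(data)         # data is unchanged at this point
leaky = shallow_then_modify(data)   # data[0] has now silently grown
```

    After the second call, data[0] contains the extra element even though the function never touched the outer list itself.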

  8. Side Effects
    • Side effects are common
    def do_side_effect(my_list):
        """Modify list appending 100.
        """
        my_list.append(100)
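
    For contrast, a side-effect-free counterpart could return a new list instead of modifying its argument. This is only a sketch; without_side_effect is a hypothetical name introduced for illustration.

```python
def do_side_effect(my_list):
    """Modify the caller's list in place by appending 100."""
    my_list.append(100)


def without_side_effect(my_list):
    """Return a new list with 100 appended; the argument is left alone."""
    return my_list + [100]


original = [1, 2]
extended = without_side_effect(original)  # original is still [1, 2] here
snapshot = list(original)                 # remember the state before mutation
do_side_effect(original)                  # original itself is changed now
```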

  9. Functions are Objects
    def func1():
        return 1

    def func2():
        return 2
    >>> my_funcs = {'a': func1, 'b': func2}
    >>> my_funcs['a']()
    1
    >>> my_funcs['b']()
    2
    • Everything is an object

  10. Closures and "Currying"
    >>> def outer(outer_arg):
    ...     def inner(inner_arg):
    ...         return inner_arg + outer_arg
    ...     return inner
    ...
    >>> func = outer(10)
    >>> func(5)
    15
    >>> func.__closure__
    (<cell at 0x...: int object at 0x...>,)
    >>> func.__closure__[0]
    <cell at 0x...: int object at 0x...>
    >>> func.__closure__[0].cell_contents
    10
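
    Closures can also hold state that changes between calls. The sketch below assumes Python 3, where the nonlocal statement allows rebinding a variable of the enclosing function; make_counter is a hypothetical name, not from the talk.

```python
def make_counter(start=0):
    """Return a function that counts upwards from start, one step per call."""
    count = start

    def counter():
        nonlocal count   # rebind the variable held in the closure cell
        count += 1
        return count

    return counter


first = make_counter()
second = make_counter(10)
results = [first(), first(), second(), first()]
```

    Each counter keeps its own cell, so first and second do not interfere with each other.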

  11. Partial Functions
    • Module functools offers some tools for the functional
    approach
    >>> import functools
    >>> def func(a, b, c):
    ...     return a, b, c
    ...
    >>> p_func = functools.partial(func, 10)
    >>> p_func(3, 4)
    (10, 3, 4)
    >>> p_func = functools.partial(func, 10, 12)
    >>> p_func(3)
    (10, 12, 3)
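
    functools.partial can also fix keyword arguments, and the resulting object exposes what was frozen. A small sketch (Python 3; parse_binary is a hypothetical name):

```python
import functools

# Freeze the keyword argument base=2 of the built-in int().
parse_binary = functools.partial(int, base=2)

value = parse_binary('1010')

# A partial object records the wrapped callable and the frozen arguments.
wrapped = parse_binary.func
frozen_keywords = parse_binary.keywords
```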

  12. Recursion
    def loop(n):
        for x in xrange(int(n)):
            a = 1 + 1

    def recurse(n):
        if n <= 0:
            return
        a = 1 + 1
        recurse(int(n) - 1)

  13. Recursion - Time it in IPython
    %timeit loop(1e3)
    10000 loops, best of 3: 48 us per loop
    %timeit recurse(1e3)
    1000 loops, best of 3: 687 us per loop
    • sys.setrecursionlimit(int(1e6)) and
    %timeit recurse(1e5) segfaulted my IPython kernel
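
    Outside IPython, a comparable measurement can be sketched with the standard timeit module. This version assumes Python 3 (range instead of xrange), uses a smaller depth than the slide, and the absolute numbers depend on the machine. The helper hits_limit is hypothetical; it shows that exceeding the default recursion limit raises RecursionError instead of crashing.

```python
import sys
import timeit


def loop(n):
    for _ in range(n):
        a = 1 + 1


def recurse(n):
    if n <= 0:
        return
    a = 1 + 1
    recurse(n - 1)


# Time both variants for the same amount of work.
loop_time = timeit.timeit(lambda: loop(200), number=200)
recurse_time = timeit.timeit(lambda: recurse(200), number=200)


def hits_limit(depth):
    """Return True if recursing to depth exceeds the interpreter's limit."""
    try:
        recurse(depth)
    except RecursionError:
        return True
    return False


too_deep = hits_limit(sys.getrecursionlimit() + 100)
```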

  14. Lambda
    • Allows very limited anonymous functions
    • Expressions only, no statements
    • Past discussion to exclude it from Python 3
    • Useful for callbacks
    def use_callback(callback, arg):
        return callback(arg)
    >>> use_callback(lambda arg: arg * 2, 10)
    20

  15. Lambda - Not Essential
    • Always possible to add two extra lines
    • Write a function with name and docstring
    def double(arg):
        """Double the argument.
        """
        return arg * 2
    >>> use_callback(double, 10)
    20

  16. List Comprehensions instead of map
    • Typical use of map
    >>> map(lambda arg: arg * 2, range(2, 6))
    [4, 6, 8, 10]
    • Replace with list comprehension
    >>> [x * 2 for x in range(2, 6)]
    [4, 6, 8, 10]

  17. List Comprehensions instead of filter
    • Typical use of filter
    >>> filter(lambda x: x > 10, range(5, 16))
    [11, 12, 13, 14, 15]
    • Replace with list comprehension
    >>> [x for x in range(5, 16) if x > 10]
    [11, 12, 13, 14, 15]
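
    The outputs on the last two slides are from Python 2, where map and filter return lists. In Python 3 both return lazy iterators, so a list has to be requested explicitly. A short sketch of the Python 3 behaviour:

```python
# In Python 3, map() and filter() produce iterators, not lists.
doubled_lazy = map(lambda arg: arg * 2, range(2, 6))
filtered_lazy = filter(lambda x: x > 10, range(5, 16))

doubled = list(doubled_lazy)     # consume the iterator to get a list
filtered = list(filtered_lazy)

# The list comprehensions give the same results without lambda.
doubled_comp = [x * 2 for x in range(2, 6)]
filtered_comp = [x for x in range(5, 16) if x > 10]
```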

  18. Decorators
    • Application of closures
    >>> import functools
    >>> def decorator(func):
    ...     @functools.wraps(func)
    ...     def new_func(*args, **kwargs):
    ...         print 'decorator was here'
    ...         return func(*args, **kwargs)
    ...     return new_func
    ...
    >>> @decorator
    ... def add(a, b):
    ...     return a + b
    ...
    >>> add(2, 3)
    decorator was here
    5
    >>>
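
    What functools.wraps contributes can be made visible with a decorator that counts calls. This is a Python 3 sketch; count_calls and the calls attribute are hypothetical names chosen for illustration.

```python
import functools


def count_calls(func):
    """Decorator that counts how often the wrapped function is called."""
    @functools.wraps(func)   # copy __name__, __doc__ etc. from func
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper


@count_calls
def add(a, b):
    """Return the sum of a and b."""
    return a + b


result = add(2, 3)
```

    Without functools.wraps, add.__name__ would report 'wrapper' and the docstring would be lost.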

  19. Immutable Data Types - Tuples instead of Lists
    >>> my_list = range(10)
    >>> my_list
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    >>> my_tuple = tuple(my_list)
    >>> my_tuple
    (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
    • Contradicts the usage recommendation
    • Lists == elements of the same kind
    • Tuple == "named" elements
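
    The "named elements" use of tuples is what collections.namedtuple supports directly. A minimal sketch (Python 3; the Point type is a hypothetical example):

```python
from collections import namedtuple

# An immutable record type with named fields.
Point = namedtuple('Point', ['x', 'y'])

origin = Point(x=0, y=0)
moved = origin._replace(x=10)   # returns a new tuple, origin is unchanged

try:
    origin.x = 5
    assignment_rejected = False
except AttributeError:
    assignment_rejected = True
```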

  20. Immutable Data Types - Freeze Sets
    >>> my_set = set(range(5))
    >>> my_set
    set([0, 1, 2, 3, 4])
    >>> my_frozenset = frozenset(my_set)
    >>> my_frozenset
    frozenset([0, 1, 2, 3, 4])
    • Can be used as dictionary keys
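
    A short sketch of the dictionary-key use (Python 3; the prices data is made up for illustration). A frozenset is hashable and compares by content, while a plain set is rejected as a key.

```python
# A frozenset works as a key; element order does not matter.
prices = {frozenset(['apple', 'pear']): 3.5}
lookup = prices[frozenset(['pear', 'apple'])]

# A mutable set is unhashable and cannot be used as a key.
try:
    {set(['apple']): 1}
    set_is_hashable = True
except TypeError:
    set_is_hashable = False
```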

  21. Not Only Functional
    • Pure functional programs can be difficult to implement
    • Combine with procedural and object-oriented program parts
    • Choose the right tool for the task at hand
    • Develop a feeling for where a functional approach can be
    beneficial

  22. Avoid Side Effects
    class MyClass(object):
        """Example for init-only definitions.
        """
        def __init__(self):
            self.attr1 = self._make_attr1()
            self.attr2 = self._make_attr2()

        @staticmethod
        def _make_attr1():
            """Do many things to create attr1.
            """
            attr1 = []
            # skipping many lines
            return attr1
        ....

  23. Avoid Side Effects
    • Set all attributes in __init__ (PyLint will remind you)
    • An actually useful application of static methods
    • Fewer side effects than setting attributes outside __init__
    • Your beloved classes and instances are still there
    • Inheritance without overriding __init__ and using super,
    child class implements own _make_attr1()
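
    The inheritance point can be sketched like this: the child class does not override __init__ or call super, it only supplies its own _make_attr1(). Base and Child are hypothetical class names (Python 3).

```python
class Base:
    """All attributes are created in __init__ via static factory methods."""

    def __init__(self):
        self.attr1 = self._make_attr1()

    @staticmethod
    def _make_attr1():
        return [1, 2, 3]


class Child(Base):
    """Reuses Base.__init__ and only replaces how attr1 is built."""

    @staticmethod
    def _make_attr1():
        return ['a', 'b']


base_attr = Base().attr1
child_attr = Child().attr1
```

    Because self._make_attr1() is looked up on the instance's class, the child's version is found first.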

  24. Freeze Classes
    class Reader(object):
        def __init__(self):
            self.data = self._read()

        @staticmethod
        def _read():
            """Return tuple of tuple of read data.
            """
            data = []
            with open('data.txt') as fobj:
                for line in fobj:
                    data.append(tuple(line.split()))
            return tuple(data)
    • Mutable data structures are useful for reading data
    • "Freeze" to get read-only version
    • No future, unwanted modifications possible

  25. Freeze Classes - One Liner Version
    • Still kind of readable
    class Reader(object):
        def __init__(self):
            self.data = self._read()

        @staticmethod
        def _read():
            """Return tuple of tuple of read data.
            """
            return tuple(tuple(line.split()) for line in open('data.txt'))

  26. Stepwise Freezing and Thawing I
    class FrozenUnFrozen(object):
        def __init__(self):
            self.__repr = {}
            self.__frozen = False

        def __getitem__(self, key):
            return self.__repr[key]

        def __setitem__(self, key, value):
            if self.__frozen:
                raise KeyError('Cannot change key %r' % key)
            self.__repr[key] = value

        def freeze(self):
            self.__frozen = True

        def unfreeze(self):
            self.__frozen = False

  27. Stepwise Freezing and Thawing II
    >>> fuf = FrozenUnFrozen()
    >>> fuf['a'] = 100
    >>> fuf['a']
    100
    >>> fuf.freeze()
    >>> fuf['a'] = 100
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "../freeze.py", line 9, in __setitem__
        raise KeyError('Cannot change key %r' % key)
    KeyError: "Cannot change key 'a'"
    >>> fuf['a']
    100
    >>> fuf.unfreeze()
    >>> fuf['a'] = 100
    >>>

  28. Use Case for Freezing
    • Legacy code: Where are data modified?
    • Complex systems: Detect unwanted modifications
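
    In Python 3.3 and later, the standard library offers a related tool that the talk does not use: types.MappingProxyType wraps a dictionary in a read-only view. The sketch below is an alternative under that assumption, not the FrozenUnFrozen approach from the slides; the variable names are hypothetical.

```python
from types import MappingProxyType

_settings = {'a': 100}                 # the mutable, "thawed" dictionary
frozen = MappingProxyType(_settings)   # read-only view handed to other code

try:
    frozen['a'] = 1
    write_rejected = False
except TypeError:
    write_rejected = True

# Writes through the underlying dict are still possible and show up in the view.
_settings['b'] = 200
```

    Handing out only the proxy makes unwanted modifications fail at the point where they happen.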

  29. Immutable Data Structures - Counter Arguments
    • Some algorithms may be difficult to implement
    • Can be rather inefficient - repeated re-allocation of memory
    • Antipattern: string concatenation
    >>> s += 'text'
    • Try this in Jython and (standard-)PyPy
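
    The usual alternative to repeated += on strings is to collect the pieces and join them once. A minimal sketch (Python 3; the function names are hypothetical):

```python
def concat_plus(parts):
    """Build the result with +=; may create a new string on every step."""
    result = ''
    for part in parts:
        result += part
    return result


def concat_join(parts):
    """Build the result in one pass with str.join."""
    return ''.join(parts)


parts = ['text'] * 1000
joined = concat_join(parts)
```

    Both functions return the same string; only the amount of intermediate copying can differ between interpreters.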

  30. Lazy Evaluation
    • Iterators and generators
    >>> [x * 2 for x in xrange(5)]
    [0, 2, 4, 6, 8]
    >>> (x * 2 for x in xrange(5))
    <generator object <genexpr> at 0x00F1E878>
    >>> sum(x * x for x in xrange(10))
    285
    • Saves memory and possibly CPU time
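
    The memory saving can be made visible with sys.getsizeof, and the CPU saving with next(), which stops at the first match. A Python 3 sketch (range is already lazy there); exact byte counts vary by interpreter.

```python
import sys

squares_list = [x * x for x in range(100000)]   # all values stored at once
squares_gen = (x * x for x in range(100000))    # values produced on demand

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)

# next() pulls only as many items as needed to find the first match.
first_big = next(x for x in squares_gen if x > 50)
```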

  31. Itertools - "Lazy Programmers are Good Programmers"
    • Module itertools offers tools for working with iterators
    >>> import itertools as it
    >>> it.izip('abc', 'xyz')
    <itertools.izip object at 0x...>
    >>> list(it.izip('abc', 'xyz'))
    [('a', 'x'), ('b', 'y'), ('c', 'z')]
    >>> list(it.islice(iter(range(10)), None, 8, 2))
    [0, 2, 4, 6]
    >>> range(10)[:8:2]
    [0, 2, 4, 6]
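
    In Python 3, izip no longer exists because the built-in zip is itself lazy. A sketch of the same ideas with Python 3 names, plus two itertools functions that work on infinite iterators:

```python
import itertools as it

pairs = zip('abc', 'xyz')          # lazy in Python 3, like izip in Python 2
pairs_list = list(pairs)

# islice can take a finite piece out of an infinite iterator.
evens = list(it.islice(it.count(0, 2), 4))

# takewhile stops consuming as soon as the condition fails.
small_squares = list(it.takewhile(lambda v: v < 30,
                                  (x * x for x in it.count())))
```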

  32. Pipelining - Chaining Commands
    • Generators make good pipelines
    • Useful for workflow problems
    • Example parsing of a log file

  33. Pipelining - Example
    lines = read_forever(open(file_name))
    filtered_lines = filter_comments(lines)
    numbers = get_number(filtered_lines)
    sum_ = 0
    for number in numbers:
        sum_ += number
    print('sum: %d' % sum_)
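
    The pipeline above never terminates because read_forever waits for new lines. For experimenting, a finite variant can be sketched by feeding a plain list of lines into the same two generator stages (Python 3; the sample log_lines are made up, and the stage bodies follow the versions shown later in the deck).

```python
def filter_comments(lines):
    """Yield only lines that do not start with #."""
    for line in lines:
        if not line.strip().startswith('#'):
            yield line


def get_number(lines):
    """Convert the last field of each line to an integer."""
    for line in lines:
        yield int(line.split()[-1])


log_lines = ['35\n', '29\n', '# comment\n', '54\n']

# Each stage pulls from the previous one; nothing runs until sum() asks.
numbers = get_number(filter_comments(log_lines))
total = sum(numbers)
```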

  34. Coroutines - Push
    • Generators "pull" the data
    • Coroutines == generators with send()
    • Coroutines "push" the data
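
    A minimal sketch of the push style (Python 3; running_total is a hypothetical name): the caller sends values into the generator, and each send() returns the value of the next yield.

```python
def running_total():
    """Coroutine that receives numbers and yields the sum so far."""
    total = 0
    while True:
        value = yield total   # hand out the total, wait for the next value
        total += value


acc = running_total()
initial = next(acc)       # advance to the first yield before sending
after_ten = acc.send(10)
after_five = acc.send(5)
```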

  35. Coroutines - Example
    # read_forever > filter_comments > get_number > TARGETS
    read_forever(open(file_name), filter_comments(get_number(TARGETS)))

  36. Conclusions
    • Python offers useful functional features
    • But it is not a purely functional language
    • For some tasks the functional approach works very well
    • For some others much less so
    • Combine and switch back and forth with OO and procedural
    style
    • "Stay pythonic, be pragmatic"

  37. Generators - Pull
    • Log file:
    35
    29
    75
    36
    28
    54
    # comment
    54
    56

  38. Generators - Pull - Import
    """Use generators to sum log file data on the fly.
    """
    import sys
    import time

  39. Generators - Pull - Read File
    def read_forever(fobj):
        """Read from a file as long as there are lines.
        Wait for the other process to write more lines.
        """
        while True:
            line = fobj.readline()
            if not line:
                time.sleep(0.1)
                continue
            yield line

  40. Generators - Pull - Filter Out Comment Lines
    def filter_comments(lines):
        """Filter out all lines starting with #.
        """
        for line in lines:
            if not line.strip().startswith('#'):
                yield line

  41. Generators - Pull - Convert Numbers
    def get_number(lines):
        """Read the number in the line and convert it to an integer.
        """
        for line in lines:
            yield int(line.split()[-1])

  42. Generators - Pull - Initialize the Process I
    def show_sum(file_name='out.txt'):
        """Start all the generators and calculate the sum continuously.
        """
        lines = read_forever(open(file_name))
        filtered_lines = filter_comments(lines)
        numbers = get_number(filtered_lines)
        sum_ = 0
        try:
            for number in numbers:
                sum_ += number
                sys.stdout.write('sum: %d\r' % sum_)
                sys.stdout.flush()
        except KeyboardInterrupt:
            print 'sum:', sum_

  43. Generators - Pull - Initialize the Process II
    if __name__ == '__main__':
        import sys
        show_sum(sys.argv[1])

  44. Coroutines - Push
    • Log file:
    ERROR: 78
    DEBUG: 72
    WARN: 99
    CRITICAL: 97
    FATAL: 40
    FATAL: 33
    CRITICAL: 34
    ERROR: 18
    ERROR: 89
    ERROR: 46

  45. Coroutines - Push - Imports
    """Use coroutines to sum log file data with different log levels.
    """
    import functools
    import sys
    import time

  46. Coroutines - Push - Initialize with a Decorator
    def init_coroutine(func):
        @functools.wraps(func)
        def init(*args, **kwargs):
            gen = func(*args, **kwargs)
            next(gen)
            return gen
        return init
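
    Why the decorator calls next(gen) can be sketched as follows: a generator that has not yet reached its first yield cannot receive a value, so an early send() raises TypeError. The coroutine echo_collector is a hypothetical example (Python 3).

```python
def echo_collector(collected):
    """Coroutine that appends every received value to collected."""
    while True:
        value = yield
        collected.append(value)


collected = []

# Sending to a just-created generator fails: it is not at a yield yet.
unprimed = echo_collector(collected)
try:
    unprimed.send('too early')
    early_send_failed = False
except TypeError:
    early_send_failed = True

# After next() the coroutine waits at its first yield and accepts values.
primed = echo_collector(collected)
next(primed)
primed.send('ok')
```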

  47. Coroutines - Push - Read the File
    def read_forever(fobj, target):
        """Read from a file as long as there are lines.
        Wait for the other process to write more lines.
        Send the lines to `target`.
        """
        while True:
            line = fobj.readline()
            if not line:
                time.sleep(0.1)
                continue
            target.send(line)

  48. Coroutines - Push - Filter Out Comments
    @init_coroutine
    def filter_comments(target):
        """Filter out all lines starting with #.
        """
        while True:
            line = yield
            if not line.strip().startswith('#'):
                target.send(line)

  49. Coroutines - Push - Convert Numbers
    @init_coroutine
    def get_number(targets):
        """Read the number in the line and convert it to an integer.
        Use the level read from the line to choose the target.
        """
        while True:
            line = yield
            level, number = line.split(':')
            number = int(number)
            targets[level].send(number)

  50. Coroutines - Push - Consumer I
    # Consumers for different cases.
    @init_coroutine
    def fatal():
        """Handle fatal errors."""
        sum_ = 0
        while True:
            value = yield
            sum_ += value
            sys.stdout.write('FATAL sum: %7d\n' % sum_)
            sys.stdout.flush()

  51. Coroutines - Push - Consumer II
    @init_coroutine
    def critical():
        """Handle critical errors."""
        sum_ = 0
        while True:
            value = yield
            sum_ += value
            sys.stdout.write('CRITICAL sum: %7d\n' % sum_)

  52. Coroutines - Push - Consumer III
    @init_coroutine
    def error():
        """Handle normal errors."""
        sum_ = 0
        while True:
            value = yield
            sum_ += value
            sys.stdout.write('ERROR sum: %7d\n' % sum_)

  53. Coroutines - Push - Consumer IV
    @init_coroutine
    def warn():
        """Handle warnings."""
        sum_ = 0
        while True:
            value = yield
            sum_ += value
            sys.stdout.write('WARN sum: %7d\n' % sum_)

  54. Coroutines - Push - Consumer V
    @init_coroutine
    def debug():
        """Handle debug messages."""
        sum_ = 0
        while True:
            value = (yield)
            sum_ += value
            sys.stdout.write('DEBUG sum: %7d\n' % sum_)

  55. Coroutines - Push - All Consumers
    TARGETS = {'CRITICAL': critical(),
               'DEBUG': debug(),
               'ERROR': error(),
               'FATAL': fatal(),
               'WARN': warn()}

  56. Coroutines - Push - Initialize
    def show_sum(file_name='out.txt'):
        """Start the pipeline.
        """
        # read_forever > filter_comments > get_number > TARGETS
        read_forever(open(file_name), filter_comments(get_number(TARGETS)))

    if __name__ == '__main__':
        show_sum(sys.argv[1])
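
    Since read_forever never returns, the push pipeline is hard to try out interactively. The sketch below is a finite Python 3 variant under a few assumptions: a hypothetical feed() function replaces read_forever, a hypothetical summing_sink() stores totals in a dictionary instead of printing them, and the level is stripped of whitespace before the lookup. The other stages follow the slides.

```python
import functools


def init_coroutine(func):
    """Decorator that advances a new coroutine to its first yield."""
    @functools.wraps(func)
    def init(*args, **kwargs):
        gen = func(*args, **kwargs)
        next(gen)
        return gen
    return init


@init_coroutine
def filter_comments(target):
    while True:
        line = yield
        if not line.strip().startswith('#'):
            target.send(line)


@init_coroutine
def get_number(targets):
    while True:
        line = yield
        level, number = line.split(':')
        targets[level.strip()].send(int(number))


@init_coroutine
def summing_sink(totals, level):
    """Hypothetical consumer: keep a running sum per level in totals."""
    totals[level] = 0
    while True:
        value = yield
        totals[level] += value


def feed(lines, target):
    """Hypothetical finite replacement for read_forever."""
    for line in lines:
        target.send(line)


totals = {}
targets = {level: summing_sink(totals, level) for level in ('ERROR', 'FATAL')}
pipeline = filter_comments(get_number(targets))
feed(['ERROR: 78\n', '# comment\n', 'FATAL: 40\n', 'ERROR: 18\n'], pipeline)
```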

  57. Conclusions
    • Python offers useful functional features
    • But it is not a purely functional language
    • For some tasks the functional approach works very well
    • For some others much less so
    • Combine and switch back and forth with OO and procedural
    style
    • "Stay pythonic, be pragmatic"
