Functional Programming with Python by Dr.-Ing. Mike Müller

PyCon 2013
March 17, 2013

Transcript

  1. Functional Programming with Python

    PyCon US 2013, Santa Clara, March 15, 2013
    Author: Dr.-Ing. Mike Müller
    E-Mail: mmueller@python-academy.de
  2. Introduction • Functional programming has a long history • Lisp

    1958 • Renaissance: F#, Haskell, Erlang … • Used in industry • Trading • Algorithmic • Telecommunication (Concurrency)
  3. Features of Functional Programming • Everything is a function •

    Pure functions without side effects • Immutable data structures • Preserve state in functions • Recursion instead of loops / iteration
  4. Advantages of Functional Programming

    • Absence of side effects can make your programs more robust
    • Programs tend to be more modular and typically come in smaller building blocks
    • Better testable - a call with the same parameters always returns the same result
    • Focus on algorithms
    • Conceptual fit with parallel / concurrent programming
    • Live updates - install a new release while running
  5. Disadvantages of Functional Programming • Solutions to the same problem

    can look very different than procedural / object-oriented ones • Finding good developers can be hard • Not equally useful for all types of problems • Input/output are side effects and need special treatment • Recursion is "an order of magnitude more complex" than loops / iteration • Immutable data structures may increase run times
  6. Python's Functional Features - Overview • Pure functions (sort of)

    • Closures - hold state in functions • Functions as objects and decorators • Immutable data types • Lazy evaluation - generators • List (dictionary, set) comprehensions • functools, itertools, lambda, map, filter • Recursion - try to avoid, recursion limit has a reason
  7. Pure Functions - No Side Effects

    • No side effect, return value only
    • "shallow copy" problem

        def do_pure(data):
            """Return copy times two.
            """
            return data * 2

    • An overloaded * that modifies data or causes other side effects would make this function impure
    • No guarantee of purity
    • Pure functions by convention
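
The "shallow copy" problem named on this slide can be made concrete with a small sketch. It is written in Python 3 syntax (the slides use Python 2), and the variable names `numbers`, `nested` and `result` are illustrative, not from the talk. The point is that `data * 2` builds a new outer list, but for a nested list the new list still shares its inner objects with the input.

```python
def do_pure(data):
    """Return copy times two."""
    return data * 2


# Flat list: the input stays untouched, the result is a new list.
numbers = [1, 2]
doubled = do_pure(numbers)

# Nested list: the outer list is new, but the inner lists are shared.
nested = [[1], [2]]
result = do_pure(nested)
result[0].append(99)  # also changes nested[0], because it is the same object

print(numbers, doubled)
print(nested, result)
```

So the function is only pure by convention: callers that mutate the returned inner objects still reach back into the original data.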
  8. Side Effects

    • Side effects are common

        def do_side_effect(my_list):
            """Modify list appending 100.
            """
            my_list.append(100)
  9. Functions are Objects def func1(): return 1 def func2(): return

    2 >>> my_funcs = {'a': func1, 'b': func2} >>> my_funcs['a']() 1 >>> my_funcs['b']() 2 • Everything is an object
  10. Closures and "Currying"

        >>> def outer(outer_arg):
        ...     def inner(inner_arg):
        ...         return inner_arg + outer_arg
        ...     return inner
        ...
        >>> func = outer(10)
        >>> func(5)
        15
        >>> func.__closure__
        (<cell at 0x10286f558: int object at 0x100313170>,)
        >>> func.__closure__[0]
        <cell at 0x10286f558: int object at 0x100313170>
        >>> func.__closure__[0].cell_contents
        10
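
The overview slide describes closures as a way to hold state in functions. A minimal sketch of that idea follows, assuming Python 3, where the `nonlocal` statement allows the inner function to rebind the enclosed variable (Python 2, as used in the slides, has no `nonlocal`). The names `make_counter` and `increment` are hypothetical and do not appear in the talk.

```python
def make_counter():
    """Return a function that counts its own calls."""
    count = 0

    def increment():
        nonlocal count  # rebind the variable held in the closure cell
        count += 1
        return count

    return increment


counter = make_counter()
first = counter()
second = counter()

# Each call to make_counter() creates a fresh, independent closure.
independent = make_counter()()

print(first, second, independent)
print(counter.__closure__[0].cell_contents)
```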
  11. Partial Functions

    • Module functools offers some tools for the functional approach

        >>> import functools
        >>> def func(a, b, c):
        ...     return a, b, c
        ...
        >>> p_func = functools.partial(func, 10)
        >>> p_func(3, 4)
        (10, 3, 4)
        >>> p_func = functools.partial(func, 10, 12)
        >>> p_func(3)
        (10, 12, 3)
  12. Recursion

        def loop(n):
            for x in xrange(int(n)):
                a = 1 + 1

        def recurse(n):
            if n <= 0:
                return
            a = 1 + 1
            recurse(int(n) - 1)
  13. Recursion - Time it in IPython

        %timeit loop(1e3)
        10000 loops, best of 3: 48 us per loop

        %timeit recurse(1e3)
        1000 loops, best of 3: 687 us per loop

    • sys.setrecursionlimit(int(1e6)) and %timeit recurse(1e5) segfaulted my IPython kernel
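
The comparison can be repeated outside IPython with the standard `timeit` module. The following is a rough sketch in Python 3 syntax (`range` instead of `xrange`); the depth of 500 and the repeat count of 200 are arbitrary choices made here to stay below the default recursion limit, and the measured times will differ from the numbers on the slide. The last part shows what happens at the recursion limit when it is left at its default: Python raises `RecursionError` instead of crashing.

```python
import sys
import timeit


def loop(n):
    for _ in range(int(n)):
        a = 1 + 1


def recurse(n):
    if n <= 0:
        return
    a = 1 + 1
    recurse(int(n) - 1)


# Time both variants with the same amount of work.
loop_time = timeit.timeit(lambda: loop(500), number=200)
recurse_time = timeit.timeit(lambda: recurse(500), number=200)
print('loop:    %.6f s' % loop_time)
print('recurse: %.6f s' % recurse_time)

# Going deeper than the recursion limit is stopped by the interpreter.
limit_hit = False
try:
    recurse(sys.getrecursionlimit() + 100)
except RecursionError:
    limit_hit = True
print('recursion limit hit:', limit_hit)
```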
  14. Lambda • Allows very limited anonymous functions • Expressions only,

    no statements • Past discussion to exclude it from Python 3 • Useful for callbacks def use_callback(callback, arg): return callback(arg) >>> use_callback(lambda arg: arg * 2, 10) 20
  15. Lambda - Not Essential

    • Always possible to add two extra lines
    • Write a function with name and docstring

        def double(arg):
            """Double the argument.
            """
            return arg * 2

        >>> use_callback(double, 10)
        20
  16. List Comprehensions instead of map • Typical use of map

    >>> map(lambda arg: arg * 2, range(2, 6)) [4, 6, 8, 10] • Replace with list comprehension >>> [x * 2 for x in range(2, 6)] [4, 6, 8, 10]
  17. List Comprehensions instead of filter • Typical use of filter

    >>> filter(lambda x: x > 10, range(5, 16)) [11, 12, 13, 14, 15] • Replace with list comprehension >>> [x for x in range(5, 16) if x > 10] [11, 12, 13, 14, 15]
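
The two sessions above show Python 2, where `map` and `filter` return lists. Under Python 3 both return lazy iterators, so the comparison with list comprehensions looks slightly different. A short sketch, with variable names chosen here for illustration:

```python
# map and filter are lazy in Python 3; wrap them in list() to see the values.
doubled_lazy = map(lambda arg: arg * 2, range(2, 6))
doubled = list(doubled_lazy)
exhausted = list(doubled_lazy)  # the iterator is used up after one pass

large = list(filter(lambda x: x > 10, range(5, 16)))

# The list comprehensions from the slides work unchanged.
doubled_comp = [x * 2 for x in range(2, 6)]
large_comp = [x for x in range(5, 16) if x > 10]

print(doubled, exhausted)
print(large)
print(doubled_comp == doubled, large_comp == large)
```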
  18. Decorators

    • Application of closures

        >>> import functools
        >>> def decorator(func):
        ...     @functools.wraps(func)
        ...     def new_func(*args, **kwargs):
        ...         print 'decorator was here'
        ...         return func(*args, **kwargs)
        ...     return new_func
        ...
        >>> @decorator
        ... def add(a, b):
        ...     return a + b
        ...
        >>> add(2, 3)
        decorator was here
        5
  19. Immutable Data Types - Tuples instead of Lists >>> my_list

    = range(10) >>> my_list [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] >>> my_tuple = tuple(my_list) >>> my_tuple (0, 1, 2, 3, 4, 5, 6, 7, 8, 9) • Contradicts the usage recommendation • Lists == elements of the same kind • Tuple == "named" elements
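
The remark that tuples stand for "named" elements can be taken literally with `collections.namedtuple` from the standard library. This is an illustration added here, not something the slide shows; the record type `Point` and its fields are hypothetical. The sketch (Python 3 syntax) shows that such a record is still immutable and that a changed copy is created instead of modifying in place.

```python
from collections import namedtuple

# Hypothetical record type with two named fields.
Point = namedtuple('Point', ['x', 'y'])

p = Point(x=1, y=2)

# Attribute assignment is rejected, just like item assignment on a tuple.
assignment_failed = False
try:
    p.x = 10
except AttributeError:
    assignment_failed = True

# _replace returns a new tuple and leaves the original untouched.
moved = p._replace(x=10)

print(p, moved, assignment_failed)
```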
  20. Immutable Data Types - Freeze Sets >>> my_set = set(range(5))

    >>> my_set set([0, 1, 2, 3, 4]) >>> my_frozenset = frozenset(my_set) >>> my_frozenset frozenset([0, 1, 2, 3, 4]) • Can be used as dictionary keys
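
A brief sketch of the dictionary-key point, in Python 3 syntax. The dictionary name `lookup` and its values are made up for the example. A `frozenset` is hashable and can serve as a key, while a plain `set` is rejected with a `TypeError`.

```python
my_frozenset = frozenset(range(5))

# A frozenset is hashable, so it works as a dictionary key.
lookup = {my_frozenset: 'first five'}

# Equal frozensets hash equally, regardless of construction order.
found = lookup[frozenset([4, 3, 2, 1, 0])]

# A mutable set is not hashable and cannot be used as a key.
set_rejected = False
try:
    {set(range(5)): 'first five'}
except TypeError:
    set_rejected = True

print(found, set_rejected)
```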
  21. Not Only Functional

    • Pure functional programs can be difficult to implement
    • Combine with procedural and object-oriented program parts
    • Choose the right tool for the task at hand
    • Develop a feeling for where a functional approach can be beneficial
  22. Avoid Side Effects class MyClass(object): """Example for init-only definitions. """

    def __init__(self): self.attr1 = self._make_attr1() self.attr2 = self._make_attr2() @staticmethod def _make_attr1(): """Do many things to create attr1. """ attr1 = [] # skipping many lines return attr1 ....
  23. Avoid Side Effects • Set all attributes in __init__ (PyLint

    will remind you) • Actual useful application of static methods • Fewer side effects than setting attributes outside __init__ • Your beloved classes and instances are still there • Inheritance without overriding __init__ and using super, child class implements own _make_attr1()
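
The last bullet, inheritance without overriding `__init__`, could look roughly like the following sketch in Python 3 syntax. The bodies of `_make_attr1` and `_make_attr2` and the class `Child` are placeholders invented for the example, since the slide skips the real ones. Because `__init__` calls the static methods through `self`, the child's version is picked up automatically.

```python
class MyClass:
    """Example for init-only definitions."""

    def __init__(self):
        # All attributes are created here, each by a side-effect-free helper.
        self.attr1 = self._make_attr1()
        self.attr2 = self._make_attr2()

    @staticmethod
    def _make_attr1():
        return [1, 2, 3]  # placeholder for "many lines"

    @staticmethod
    def _make_attr2():
        return {'key': 'value'}  # placeholder


class Child(MyClass):
    """Reuses the parent's __init__ and only swaps one helper."""

    @staticmethod
    def _make_attr1():
        return [10, 20, 30]


parent = MyClass()
child = Child()
print(parent.attr1, child.attr1, child.attr2)
```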
  24. Freeze Classes class Reader(object): def __init__(self): self.data = self._read() @staticmethod

    def _read(): """Return tuple of tuple of read data. """ data = [] with open('data.txt') as fobj: for line in fobj: data.append(tuple(line.split())) return tuple(data) • Mutable data structures are useful for reading data • "Freeze" to get read-only version • No future, unwanted modifications possible
  25. Freeze Classes - One Liner Version • Still kind of

    readable class Reader(object): def __init__(self): self.data = self._read() @staticmethod def _read(): """Return tuple of tuple of read data. """ return tuple(tuple(line.split()) for line in open('data.txt'))
  26. Stepwise Freezing and Thawing I

        class FrozenUnFrozen(object):

            def __init__(self):
                self.__repr = {}
                self.__frozen = False

            def __getitem__(self, key):
                return self.__repr[key]

            def __setitem__(self, key, value):
                if self.__frozen:
                    raise KeyError('Cannot change key %r' % key)
                self.__repr[key] = value

            def freeze(self):
                self.__frozen = True

            def unfreeze(self):
                self.__frozen = False
  27. Stepwise Freezing and Thawing II

        >>> fuf = FrozenUnFrozen()
        >>> fuf['a'] = 100
        >>> fuf['a']
        100
        >>> fuf.freeze()
        >>> fuf['a'] = 100
        Traceback (most recent call last):
          File "<interactive input>", line 1, in <module>
          File "../freeze.py", line 9, in __setitem__
            raise KeyError('Cannot change key %r' % key)
        KeyError: "Cannot change key 'a'"
        >>> fuf['a']
        100
        >>> fuf.unfreeze()
        >>> fuf['a'] = 100
  28. Use Case for Freezing • Legacy code: Where are data

    modified? • Complex systems: Detect unwanted modifications
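
For the "detect unwanted modifications" use case, the standard library also offers a ready-made read-only view, `types.MappingProxyType`. This is a different technique from the `FrozenUnFrozen` class in the slides and is shown here only as a possible alternative; the dictionary `settings` is a made-up example. The owner keeps the real dictionary and hands out the proxy, so any write attempt by other code fails loudly.

```python
from types import MappingProxyType

settings = {'a': 100}                 # the owner keeps the mutable dict
frozen = MappingProxyType(settings)   # other code only gets this view

# Writing through the view raises an exception and so reveals the culprit.
write_blocked = False
try:
    frozen['a'] = 200
except TypeError:
    write_blocked = True

# The owner can still change the data; the view reflects the change.
settings['b'] = 1

print(frozen['a'], 'b' in frozen, write_blocked)
```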
  29. Immutable Data Structures - Counter Arguments

    • Some algorithms may be difficult to implement
    • Can be rather inefficient - repeated re-allocation of memory
    • Antipattern: string concatenation

        >>> s += 'text'

    • Try this in Jython and (standard-)PyPy
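
The string concatenation antipattern and its usual replacement can be sketched as follows (Python 3 syntax; the function names `build_concat` and `build_join` are invented for the example). Since strings are immutable, each `+=` may create a new string and copy the old content. CPython can often avoid the copy, but that is an implementation detail that other interpreters may not share, which is presumably why the slide suggests trying it elsewhere. Collecting the parts and joining once avoids relying on it.

```python
def build_concat(parts):
    """Antipattern: grow an immutable string piece by piece."""
    s = ''
    for part in parts:
        s += part  # may copy everything built so far
    return s


def build_join(parts):
    """Preferred: collect the parts and join them once."""
    return ''.join(parts)


parts = ['text'] * 1000
same_result = build_concat(parts) == build_join(parts)
print(same_result, len(build_join(parts)))
```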
  30. Lazy Evaluation • Iterators and generators >>> [x * 2

    for x in xrange(5)] [0, 2, 4, 6, 8] >>> (x * 2 for x in xrange(5)) <generator object <genexpr> at 0x00F1E878> >>> sum(x *x for x in xrange(10)) 285 • Saves memory and possibly CPU time
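
What "lazy" means here can be observed directly. The sketch below uses Python 3 syntax (`range` instead of `xrange`) and a helper `double` that records its calls in a list named `calls`; both names are introduced only for this demonstration. The generator expression does no work until a value is requested.

```python
calls = []


def double(x):
    calls.append(x)  # record that the computation really happened
    return x * 2


lazy = (double(x) for x in range(5))
calls_before = list(calls)      # still empty: nothing was computed yet

first = next(lazy)              # computes exactly one value
calls_after_first = list(calls)

rest = list(lazy)               # computes the remaining values

# A generator expression can be passed straight to sum().
total = sum(x * x for x in range(10))

print(calls_before, first, calls_after_first, rest, total)
```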
  31. Itertools - "Lazy Programmers are Good Programmers" • Module itertools

    offers tools for the work with iterators >>> it.izip('abc', 'xyz') <itertools.izip object at 0x00FA9F80> >>> list(it.izip('abc', 'xyz')) [('a', 'x'), ('b', 'y'), ('c', 'z')] >>> list(it.islice(iter(range(10)), None, 8, 2)) [0, 2, 4, 6] >>> range(10)[:8:2] [0, 2, 4, 6]
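
The session above assumes `import itertools as it` and runs under Python 2. In Python 3, `itertools.izip` no longer exists because the built-in `zip` is already lazy, while `islice` is unchanged. A short sketch of the Python 3 equivalents follows; the use of `it.count()` as an endless source is an addition made here to show why slicing an iterator is useful.

```python
import itertools as it

# zip is lazy in Python 3 and replaces it.izip.
pairs = zip('abc', 'xyz')
pair_list = list(pairs)

# islice works as in the slide.
sliced = list(it.islice(iter(range(10)), None, 8, 2))

# Unlike list slicing, islice also works on an endless iterator.
from_endless = list(it.islice(it.count(), 0, 8, 2))

print(pair_list)
print(sliced, from_endless)
```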
  32. Pipelining - Chaining Commands • Generators make good pipelines •

    Useful for workflow problems • Example parsing of a log file
  33. Pipelining - Example lines = read_forever(open(file_name)) filtered_lines = filter_comments(lines) numbers

    = get_number(filtered_lines) sum_ = 0 for number in numbers: sum_ += number print('sum: %d' % sum_)
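
The pipeline on this slide depends on functions that are defined later in the deck and on a file that keeps growing. A self-contained variant that can be run as is might look like the sketch below (Python 3 syntax). The function `read_lines` and the `sample` data are stand-ins invented here in place of `read_forever` and a real log file; `filter_comments` and `get_number` follow the definitions given in the later slides.

```python
def read_lines(lines):
    """Finite stand-in for read_forever: yield the given lines once."""
    for line in lines:
        yield line


def filter_comments(lines):
    """Filter out all lines starting with #."""
    for line in lines:
        if not line.strip().startswith('#'):
            yield line


def get_number(lines):
    """Read the number in the line and convert it to an integer."""
    for line in lines:
        yield int(line.split()[-1])


sample = ['35\n', '29\n', '# comment\n', '75\n']

# Each stage pulls from the previous one; nothing runs until the loop starts.
lines = read_lines(sample)
filtered_lines = filter_comments(lines)
numbers = get_number(filtered_lines)

sum_ = 0
for number in numbers:
    sum_ += number
print('sum: %d' % sum_)
```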
  34. Coroutines - Push • Generators "pull" the data • Coroutines

    == generators with send() • Coroutines "push" the data
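
A minimal sketch of a generator used as a coroutine, in Python 3 syntax. The function `running_sum` and the list `results` are hypothetical names chosen for the example. The caller pushes values in with `send()`, and the coroutine has to be advanced to its first `yield` with `next()` before it can receive anything.

```python
def running_sum(results):
    """Receive numbers via send() and record the running total."""
    total = 0
    while True:
        value = yield          # wait here until a value is pushed in
        total += value
        results.append(total)


results = []
coro = running_sum(results)
next(coro)                     # advance to the first yield ("priming")

for value in (35, 29, 75):
    coro.send(value)           # push data into the coroutine

coro.close()
print(results)
```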
  35. Coroutines - Example # read_forever > filter_comments > get_number >

    TARGETS read_forever(open(file_name), filter_comments(get_number(TARGETS)))
  36. Conclusions • Python offers useful functional features • But it

    is no pure functional language • For some tasks the functional approach works very well • For some others much less • Combine and switch back and forth with oo and procedural style • "Stay pythonic, be pragmatic"
  37. Generators - Pull • Log file: 35 29 75 36

    28 54 # comment 54 56
  38. Generators - Pull - Import """Use generators to sum log

    file data on the fly. """ import sys import time
  39. Generators - Pull - Read File

        def read_forever(fobj):
            """Read from a file as long as there are lines.

            Wait for the other process to write more lines.
            """
            while True:
                line = fobj.readline()
                if not line:
                    time.sleep(0.1)
                    continue
                yield line
  40. Generators - Pull - Filter Out Comment Lines def filter_comments(lines):

    """Filter out all lines starting with #. """ for line in lines: if not line.strip().startswith('#'): yield line
  41. Generators - Pull - Convert Numbers def get_number(lines): """Read the

    number in the line and convert it to an integer. """ for line in lines: yield int(line.split()[-1])
  42. Generators - Pull - Initialize the Process I def show_sum(file_name='out.txt'):

    """Start all the generators and calculate the sum continuously. """ lines = read_forever(open(file_name)) filtered_lines = filter_comments(lines) numbers = get_number(filtered_lines) sum_ = 0 try: for number in numbers: sum_ += number sys.stdout.write('sum: %d\r' % sum_) sys.stdout.flush() except KeyboardInterrupt: print 'sum:', sum_
  43. Generators - Pull - Initialize the Process II if __name__

    == '__main__': import sys show_sum(sys.argv[1])
  44. Coroutines - Push • Log file: ERROR: 78 DEBUG: 72

    WARN: 99 CRITICAL: 97 FATAL: 40 FATAL: 33 CRITICAL: 34 ERROR: 18 ERROR: 89 ERROR: 46
  45. Coroutines - Push - Imports """Use coroutines to sum log

    file data with different log levels. """ import functools import sys import time
  46. Coroutines - Push - Initialize with a Decorator

        def init_coroutine(func):
            @functools.wraps(func)
            def init(*args, **kwargs):
                gen = func(*args, **kwargs)
                next(gen)
                return gen
            return init
  47. Coroutines - Push - Read the File

        def read_forever(fobj, target):
            """Read from a file as long as there are lines.

            Wait for the other process to write more lines.
            Send the lines to `target`.
            """
            while True:
                line = fobj.readline()
                if not line:
                    time.sleep(0.1)
                    continue
                target.send(line)
  48. Coroutines - Push - Filter Out Comments @init_coroutine def filter_comments(target):

    """Filter out all lines starting with #. """ while True: line = yield if not line.strip().startswith('#'): target.send(line)
  49. Coroutines - Push - Convert Numbers

        @init_coroutine
        def get_number(targets):
            """Read the number in the line and convert it to an integer.

            Use the level read from the line to choose the target.
            """
            while True:
                line = yield
                level, number = line.split(':')
                number = int(number)
                targets[level].send(number)
  50. Coroutines - Push - Consumer I # Consumers for different

    cases. @init_coroutine def fatal(): """Handle fatal errors.""" sum_ = 0 while True: value = yield sum_ += value sys.stdout.write('FATAL sum: %7d\n' % sum_) sys.stdout.flush()
  51. Coroutines - Push - Consumer II @init_coroutine def critical(): """Handle

    critical errors.""" sum_ = 0 while True: value = yield sum_ += value sys.stdout.write('CRITICAL sum: %7d\n' % sum_)
  52. Coroutines - Push - Consumer III @init_coroutine def error(): """Handle

    normal errors.""" sum_ = 0 while True: value = yield sum_ += value sys.stdout.write('ERROR sum: %7d\n' % sum_)
  53. Coroutines - Push - Consumer IV @init_coroutine def warn(): """Handle

    warnings.""" sum_ = 0 while True: value = yield sum_ += value sys.stdout.write('WARN sum: %7d\n' % sum_)
  54. Coroutines - Push - Consumer V @init_coroutine def debug(): """Handle

    debug messages.""" sum_ = 0 while True: value = (yield) sum_ += value sys.stdout.write('DEBUG sum: %7d\n' % sum_)
  55. Coroutines - Push - All Consumers TARGETS = {'CRITICAL': critical(),

    'DEBUG': debug(), 'ERROR': error(), 'FATAL': fatal(), 'WARN': warn()}
  56. Coroutines - Push - Initialize

        def show_sum(file_name='out.txt'):
            """Start the pipeline.
            """
            # read_forever > filter_comments > get_number > TARGETS
            read_forever(open(file_name), filter_comments(get_number(TARGETS)))

        if __name__ == '__main__':
            show_sum(sys.argv[1])
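
The complete push pipeline needs a log file that another process keeps writing. For experimenting, a finite in-memory variant can be sketched as below (Python 3 syntax). The `feed` function, the single `collect` consumer and the `sample` lines are simplifications invented here: `feed` replaces `read_forever`, and `collect` replaces the five separate consumers by storing the sums in a dictionary instead of printing them. `init_coroutine`, `filter_comments` and `get_number` follow the slides.

```python
import functools


def init_coroutine(func):
    """Decorator that advances a new coroutine to its first yield."""
    @functools.wraps(func)
    def init(*args, **kwargs):
        gen = func(*args, **kwargs)
        next(gen)
        return gen
    return init


@init_coroutine
def filter_comments(target):
    """Filter out all lines starting with #."""
    while True:
        line = yield
        if not line.strip().startswith('#'):
            target.send(line)


@init_coroutine
def get_number(targets):
    """Convert the number and send it to the target for its level."""
    while True:
        line = yield
        level, number = line.split(':')
        targets[level].send(int(number))


@init_coroutine
def collect(level, sums):
    """Hypothetical consumer: keep one running sum per level in `sums`."""
    while True:
        value = yield
        sums[level] = sums.get(level, 0) + value


def feed(lines, target):
    """Finite stand-in for read_forever: push each line into the pipeline."""
    for line in lines:
        target.send(line)


sums = {}
levels = ('CRITICAL', 'DEBUG', 'ERROR', 'FATAL', 'WARN')
targets = {level: collect(level, sums) for level in levels}

sample = ['ERROR: 78\n', 'DEBUG: 72\n', '# comment\n',
          'FATAL: 40\n', 'FATAL: 33\n', 'ERROR: 18\n']

# feed > filter_comments > get_number > targets
feed(sample, filter_comments(get_number(targets)))
print(sums)
```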
  57. Conclusions • Python offers useful functional features • But it

    is no pure functional language • For some tasks the functional approach works very well • For some others much less • Combine and switch back and forth with oo and procedural style • "Stay pythonic, be pragmatic"