PyConZA 2014: "Monkeying around with Twisted" by Richard Spiers

Pycon ZA
October 03, 2014

Twisted is an open source framework for writing network-based services. It uses an asynchronous, event-driven model that allows the rapid development of custom network protocols. While Twisted makes implementing network services much easier, it comes with its own set of challenges. One of these is tracking the context of multiple concurrent requests. In a traditional threaded web server, the thread ID can be used to identify a particular request and tag its log entries accordingly. This does not work in Twisted, because (generally speaking) it handles all requests on a single thread.
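
To illustrate the last point, here is a minimal sketch (Python 3, standard library only) of why a thread ID cannot tell requests apart when everything runs on one thread. The `run_event_loop` function and the request names are hypothetical stand-ins for a reactor and real requests; this is not Twisted code.

```python
import threading
from collections import deque


def run_event_loop(events):
    """Toy stand-in for a reactor: handle queued events one by one on the calling thread."""
    queue = deque(events)
    records = []
    while queue:
        request_id, step = queue.popleft()
        # A thread-based logger would tag the log entry with this identifier.
        records.append((request_id, step, threading.get_ident()))
    return records


# Two requests whose processing steps interleave, as they would under an event loop.
events = [
    ("req-A", "parse"),
    ("req-B", "parse"),
    ("req-A", "respond"),
    ("req-B", "respond"),
]
records = run_event_loop(events)
thread_ids = {thread_id for _, _, thread_id in records}
request_ids = {request_id for request_id, _, _ in records}
```

Both requests end up with the same thread identifier, so the identifier carries no information about which request a log entry belongs to.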

We have chosen to tackle this issue, amongst others, by monkey patching some of the Twisted subsystems.
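
As a rough illustration of the idea, the sketch below (Python 3) monkey patches a scheduling method so that each callback runs under the context that was active when it was scheduled. `ToyScheduler`, `Context` and `add_context_to_scheduler` are hypothetical names invented for this example; the talk applies the same shape to Twisted's `reactor.callLater`, which is not used here.

```python
class Context:
    """Holds the current context; one slot suffices when everything runs on one thread."""
    current = None


class ToyScheduler:
    """Hypothetical stand-in for a reactor with a callLater-style method."""

    def __init__(self):
        self.pending = []

    def call_later(self, func, *args, **kwargs):
        self.pending.append((func, args, kwargs))

    def run(self):
        while self.pending:
            func, args, kwargs = self.pending.pop(0)
            func(*args, **kwargs)


def add_context_to_scheduler(scheduler):
    """Replace scheduler.call_later with a wrapper that carries the context along."""
    original_call_later = scheduler.call_later

    def run_with_context(context, func, args, kwargs):
        # Save the current context, apply the captured one, and always restore.
        old_context = Context.current
        Context.current = context
        try:
            return func(*args, **kwargs)
        finally:
            Context.current = old_context

    def call_later_with_context(func, *args, **kwargs):
        # Capture the context at scheduling time, not at execution time.
        return original_call_later(
            run_with_context, Context.current, func, args, kwargs)

    scheduler.call_later = call_later_with_context


scheduler = ToyScheduler()
add_context_to_scheduler(scheduler)

seen = []
Context.current = "request-1"
scheduler.call_later(lambda: seen.append(Context.current))
Context.current = "request-2"
scheduler.call_later(lambda: seen.append(Context.current))
Context.current = None
scheduler.run()
```

Callers keep using `scheduler.call_later` unchanged, which is the main appeal of patching over asking every call site to pass a context explicitly.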

This talk introduces Twisted as a viable option for programming network-based services. No previous Twisted knowledge is required, and the concepts introduced will be explained through real-world examples of RESTful controllers implemented in Twisted. This primer will be followed by a discussion of our solution to the context tracking problem, as well as some of the other areas in which we have found monkey patching to be beneficial.

To summarise, this talk will:

* introduce the core concepts in Twisted
* discuss how we track related actions across multiple disparate services and nodes
* tag our log entries with identifying context
* handle time jumps on underlying infrastructure
* implement clean unit tests for different environments

The talk is aimed at a general Python audience so no previous experience with Twisted is required. However, we believe that our monkey patching approach will be of interest to people currently using Twisted as well.
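
The log-tagging item in the summary above might look roughly like the following sketch (Python 3, standard library only). It uses a `logging.Filter` to attach the context, which is a swapped-in technique chosen for brevity; the slides instead subclass `logging.Logger` and override `_log`. The `Context` holder, `ContextFilter`, `ListHandler` and the logger name are assumptions made for this example.

```python
import logging


class Context:
    """Hypothetical holder for the context of the request currently being handled."""
    current = None


class ContextFilter(logging.Filter):
    """Attach the current context to every record that does not already carry one."""

    def filter(self, record):
        if not hasattr(record, "context"):
            record.context = Context.current
        return True


class ListHandler(logging.Handler):
    """Collect formatted messages in memory so the result is easy to inspect."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(self.format(record))


handler = ListHandler()
handler.setFormatter(logging.Formatter("[%(context)s] %(message)s"))
handler.addFilter(ContextFilter())

log = logging.getLogger("context_demo")
log.setLevel(logging.INFO)
log.propagate = False
log.addHandler(handler)

Context.current = "req-42"
log.info("instance launched")
Context.current = None
log.info("idle")
# An explicit context passed via `extra` wins over the ambient one.
log.info("explicit", extra={"context": "req-7"})
```

Because the filter only fills in a missing `context` attribute, code that already knows its context can still pass it explicitly.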

Transcript

  1. Monkeying around with Twisted 1

  2. 2 Cloud

  3. 3 Nimbula Cloud

  4. 4 Cloud Nimbula Cape Town

  5. 5 Cloud Nimbula Python Cape Town

  6. 6 Cloud Nimbula Python Cape Town Twisted

  7. 7 Talk Overview: Asynchronous Programming, Twisted, Tracking Context

  8. 8 Blocking Reactor, Time Jumps, Testing

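  9. 9 [Diagram: Task 1, Task 2 and Task 3 under three models: Single Thread, Multiple Threads, Event Driven]

  10. 10 [Diagram: Task 1, Task 2 and Task 3 under three models: Single Thread, Multiple Threads, Event Driven]

  11. 11 [Diagram: Task 1, Task 2 and Task 3 under three models: Single Thread, Multiple Threads, Event Driven]
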
  12. 12 Examples: Twisted, EventMachine, Node.js, Nginx

  13. 13 Advantages: Avoids polling; Scales well for IO; Better suited for some domains

  14. 14 Disadvantages: Risk of spaghetti code; Risk of blocking event loop

  15. 15 Disadvantages: Harder to debug; First class functions useful

  16. 16

  17. 17 Why Twisted? Allows responsive behaviour with multiple connections, without threads

  18. 18 Attractive qualities: Written in Python; Large protocol support; Cross-platform

  19. 19 Attractive qualities: Flexible and extensible; Integration friendly base

  20. 20 Reactor Loop, Transports, Protocols

  21. 21 Reactor Loop: Waits for events; Events can be IO / time etc

  22. 22 Transports: Write (non blocking); TCP, UDP…; Connection between endpoints

  23. 23 Protocols: dataReceived; HTTP, SSH…; Actual protocol details

  24. 24 EXAMPLE TIME

  25. 25

          import random
          from twisted.internet import protocol, reactor

          class PyconZAServer(protocol.Protocol):
              def dataReceived(self, data):
                  # Reply with a randomly chosen talk, then close the connection.
                  self.transport.write(random.choice(self.factory.talks))
                  self.transport.loseConnection()

  26. 26

          class PyconZAServerFactory(protocol.Factory):
              protocol = PyconZAServer

              def __init__(self):
                  # Load one talk per line from the data file.
                  self.talks = []
                  for line in open('PyconZA-talks'):
                      self.talks.append(line)

  27. 27

          reactor.listenTCP(8000, PyconZAServerFactory())
          reactor.run()

  28. 28

          from twisted.internet import protocol, reactor
          from twisted.internet.protocol import ClientFactory

          class PyconZAClient(protocol.Protocol):
              def dataReceived(self, data):
                  print data  # Python 2 print statement, as on the slide

              def connectionMade(self):
                  self.transport.write("One talk please?")

  29. 29

          class PyconZAClientFactory(ClientFactory):
              protocol = PyconZAClient

              def clientConnectionLost(self, connector, reason):
                  # Reconnect two seconds after the connection is lost.
                  reactor.callLater(2, connector.connect)

  30. 30

          reactor.connectTCP('127.0.0.1', 8000, PyconZAClientFactory())
          reactor.run()

  31. 31 Time to make things a bit more complicated!

  32. 32 Deferreds: Representation of something that will have a result in the FUTURE

  33. 33 Deferreds: Used to manage callback chain

  34. 34 With deferreds

          mydeferred = doSomeIO()
          mydeferred.addCallback(Sfunction)
          mydeferred.addErrback(Efunction)

  35. 35 [Diagram of the callback chain; labels: Success, Error]

  36. 36 Time to make things a bit simpler!

  37. 37 @inlineCallbacks: Magical decorator that allows you to program in a linear fashion

  38. 38 Previously

          def search_remote_file(remote_file):
              d = dload(remote_file)
              d.addCallback(ready)
              d.addCallbacks(result, error)
              d.addBoth(clean_up)

  39. 39 With @inlineCallbacks

          @inlineCallbacks
          def search_remote_file(remote_file):
              tmp_file = yield dload(remote_file)
              result = search_file(tmp_file)
              ....

  40. 40 Tracking Context

  41. 41 Context: Save current context; Apply new context; Do something; Apply old context

  42. 42 Our context

          class Context(object):
              current = {}

              @classmethod
              def get_current_context(cls):
                  return cls.current.get(thread.get_ident(), None)

              @classmethod
              def set_current_context(cls, context):
                  cls.current[thread.get_ident()] = context

  43. 43 Context

          @contextmanager
          def this_context(context):
              old_context = get_context()
              set_context(context)
              try:
                  yield
              finally:
                  set_context(old_context)

  44. 44 Monkey Patching Time!

  45. 45 Context

          # Keep a reference to the original class before it is replaced.
          Deferred = defer.Deferred
          ...

          class DeferredWithContext(Deferred):
              def __init__(self, *args, **kwargs):
                  # Remember the context that was active at creation time.
                  self._context = base.get_context()
                  Deferred.__init__(self, *args, **kwargs)

  46. 46 Context

              def callback(self, result):
                  old_context = base.get_context()
                  base.set_context(self._context)
                  try:
                      Deferred.callback(self, result)
                  finally:
                      base.set_context(old_context)

  47. 47 Context

              def errback(self, fail=None):
                  old_context = base.get_context()
                  base.set_context(self._context)
                  try:
                      Deferred.errback(self, fail)
                  finally:
                      base.set_context(old_context)

          defer.Deferred = DeferredWithContext

  48. 48 Context

          def _func_with_context(context, f, args, kwargs):
              old_context = base.get_context()
              base.set_context(context)
              try:
                  return f(*args, **kwargs)
              finally:
                  base.set_context(old_context)

  49. 49 Context

          def add_context_to_reactor(reactor):
              # Keep the original so the wrapper can delegate to it.
              callLater = reactor.callLater

              # Nested so that it can see the saved callLater.
              def callLaterWithContext(sec, f, *args, **kwargs):
                  context = base.get_context()
                  return callLater(sec, _func_with_context, context, f, args, kwargs)

              reactor.callLater = callLaterWithContext

  50. 50 Context: Apply context on receipt of request (decoding HTTP request, consuming AMQP message). Now can insert context into requests (as part of HTTP request, as part of AMQP msg)
  51. 51 This allows us to log with context

  52. 52 Logging Context

          class _Logger(logging.Logger):
              def _log(self, level, msg, *args, **kwargs):
                  extra = kwargs.setdefault('extra', {})
                  if 'context' not in extra:
                      extra['context'] = getattr(get_context(), 'id', None)
                  return logging.Logger._log(self, level, msg, *args, **kwargs)

  53. 53 Stalling the reactor: Blocking the reactor is bad! Jumps in time are bad!

  54. 54 Back to the event loop!

          for event in events:
              event.process()

      Actually:

          self.doIteration(t)

  55. 55 Several different reactors are provided by Twisted: select, poll, epoll, IOCP, GTK+ 2.0, Tkinter, wxPython, Win32, CoreFoundation

  56. 56 General Idea: Set of readers or writers. Sets containing FileDescriptor instances which will be checked for read events or writability

          self._reads = set()
          self._writes = set()

  57. 57 General Idea: methods to add or remove these readers / writers. Add a FileDescriptor instance for notification of data available to write or read

          addReader
          addWriter

  58. 58 General Idea: methods doing a read or write

          doRead
          doWrite

          for selectables, method, fdset in ((r, "doRead", self._reads),
                                             (w, "doWrite", self._writes)):

  59. 59 patch_reactor_for_time_travel (and detect blockages..)

  60. 60 Targets to patch: addReader, addWriter, doIteration, callLater

  61. 61 Targets to patch

          def addReader(reader):
              blockingIOTimer.monitorBlocking(reader)
              return oldAddReader(reader)

  62. 62 Targets to patch

          def addWriter(writer):
              blockingIOTimer.monitorBlocking(writer)
              return oldAddWriter(writer)

  63. 63 Targets to patch

          def doIteration(timeout):
              before = reactor.seconds()
              blockingIOTimer.reset()
              timeout = min(timeout, MAX_DELAY)
              oldIteration(timeout)
              sleep_end = blockingIOTimer.timeOfFirstCall()
              now = reactor.seconds()

  64. 64 Targets to patch

              if not sleep_end:
                  sleep_end = now
              if sleep_end < before:
                  resetTimers(sleep_end, before)
              elif MAX_JUMP and (sleep_end - before > MAX_JUMP):
                  resetTimers(sleep_end, before)
              elif now - before > MAX_BLOCKING:
                  log.warn('Total IO processing time blocked reactor for %s seconds',
                           now - before)

  65. 65 Implementation

          blockingIOTimer = TimedReaderWriter(reactor, log)

  66. 66 TimedReaderWriter

          class TimedReaderWriter:
              def monitorBlocking(self, reader_or_writer):
                  originalDoRead = getattr(reader_or_writer, 'doRead', None)
                  originalDoWrite = getattr(reader_or_writer, 'doWrite', None)

  67. 67 TimedReaderWriter

                  class_ = reader_or_writer.__class__
                  ref = weakref.ref(reader_or_writer)

  68. 68 TimedReaderWriter

                  def _doClassRead():
                      instance = ref()
                      if instance is not None:
                          return self._timeCall(
                              class_.doRead.__get__(instance, class_))

                  if needsRead:
                      def _doInstanceRead():
                          return self._timeCall(originalDoRead)

  69. 69 TimedReaderWriter

                      _doRead = _doInstanceRead
                      class_method = getattr(class_, 'doRead', None)
                      class_func = getattr(class_method, 'im_func', None)
                      instance_func = getattr(reader_or_writer.doRead, 'im_func', None)
                      if class_func and instance_func and class_func == instance_func:
                          _doRead = _doClassRead

  70. 70 TimedReaderWriter

                      reader_or_writer.doRead = _doRead
                      reader_or_writer.doRead.monitored = True

      Same process followed for writes

  71. 71 TimedReaderWriter: This implementation now allows us to: Monitor for blocking calls; Handle time jumps

  72. 72 Monitor for blocking reactor

          def _timeCall(self, func, *args, **kwargs):
              start = self.reactor.seconds()
              if not self.first_event:
                  self.first_event = start
              try:
                  result = func(*args, **kwargs)
                  return result

  73. 73 Monitor for blocking reactor

              finally:
                  end = self.reactor.seconds()
                  if (end - start > BLOCKING):
                      self.log.warn('Potential blocking call discovered '
                                    '(took %s seconds) while executing %s',
                                    end - start, func)

  74. 74 Monitor for blocking reactor

          @contextmanager
          def timedCallContext(func, args, kwargs):
              before = reactor.seconds()
              try:
                  yield
              finally:
                  now = reactor.seconds()
                  if now - before > BLOCKING:

  75. 75 Monitor for blocking reactor

                      module = getattr(func, '__module__', '__main__')
                      func_name = getattr(func, '__name__', repr(func))
                      log.warn('DelayedCall blocked reactor for %s seconds: %s.%s(%s%s%s)',
                               now - before, module, func_name,
                               ', '.join([repr(x) for x in args]),
                               ', ' if (bool(kwargs) and bool(args)) else '',
                               ', '.join(['%s=%s' % (k, repr(v)) for k, v in kwargs.items()]))

  76. 76 Handle time jumps

          def resetTimers(now, before):
              delta = now - before
              if delta < 0:
                  log.warn("Travelled back in time by %s seconds", -delta)
              else:
                  log.warn("Travelled forward in time by %s seconds", delta)

  77. 77 Handle time jumps

              for call in calls:
                  call.time = call.time + delta

  78. 78 Implementing cleaner monkey patching for tests

  79. 79 MonkeyPatch()

          def __init__(self, globalContext, localContext, patchDict):
              self._originalDict = {}
              self._patchDict = patchDict
              self._context = {}
              for k, v in globalContext.iteritems():
                  self._context[k] = v
              for k, v in localContext.iteritems():
                  self._context[k] = v

  80. 80 MonkeyPatch()

          def __enter__(self):
              self._patch()

          def __exit__(self, type=None, value=None, traceback=None):
              self._unpatch()

  81. 81 MonkeyPatch()

          def _patch(self):
              for identifier, value in self._patchDict.iteritems():
                  self._originalDict[identifier] = self._getObjValue(identifier)
                  self._setObjValue(identifier, value)

  82. 82 MonkeyPatch()

          def _unpatch(self):
              for identifier, value in self._originalDict.iteritems():
                  self._setObjValue(identifier, value)
              self._originalDict = {}

  83. 83 MonkeyPatch() Usage

          patch = {}
          patch['moduleA.submodule.VALUE'] = new_value
          patch['moduleB.submodule.VALUE'] = new_value
          ...
          with MonkeyPatch(globals(), locals(), patch):
              # Monkey patched scope
              ....
          # Monkey patching reverted

  84. 84 Thank you!
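
The `MonkeyPatch()` slides do not show how `_getObjValue` and `_setObjValue` resolve a dotted identifier. The following is a minimal Python 3 sketch of one way that could work, assuming the first name in the identifier is looked up in the supplied globals and locals and the rest is followed with `getattr`. The `_resolve_parent` helper and the `settings` object are hypothetical and only serve the example.

```python
import types


class MonkeyPatch:
    """Temporarily replace attributes named by dotted identifiers; restore them on exit."""

    def __init__(self, global_context, local_context, patch_dict):
        self._original = {}
        self._patch_dict = patch_dict
        # Locals shadow globals, mirroring normal name lookup.
        self._context = dict(global_context)
        self._context.update(local_context)

    def _resolve_parent(self, identifier):
        # "a.b.VALUE" -> (object reached via a.b, "VALUE"); needs at least one dot.
        *path, attribute = identifier.split(".")
        obj = self._context[path[0]]
        for name in path[1:]:
            obj = getattr(obj, name)
        return obj, attribute

    def _get_obj_value(self, identifier):
        parent, attribute = self._resolve_parent(identifier)
        return getattr(parent, attribute)

    def _set_obj_value(self, identifier, value):
        parent, attribute = self._resolve_parent(identifier)
        setattr(parent, attribute, value)

    def __enter__(self):
        for identifier, value in self._patch_dict.items():
            self._original[identifier] = self._get_obj_value(identifier)
            self._set_obj_value(identifier, value)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Restore originals even if the body raised; do not swallow exceptions.
        for identifier, value in self._original.items():
            self._set_obj_value(identifier, value)
        self._original = {}
        return False


# Hypothetical module-like object standing in for a real settings module.
settings = types.SimpleNamespace(retry=types.SimpleNamespace(MAX_DELAY=30))

with MonkeyPatch(globals(), locals(), {"settings.retry.MAX_DELAY": 1}):
    patched_value = settings.retry.MAX_DELAY
restored_value = settings.retry.MAX_DELAY
```

Restoring in `__exit__` means a failing test cannot leak a patched value into the tests that run after it.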