The integrator's guide to duct-taping

Slides of the presentation that I gave at EuroPython 2012: https://speakerdeck.com/u/simonedeponti/p/git-as-a-rifle

Simone Deponti

July 05, 2012
Transcript

  1. Me

    Developing in Python since 2005
    @simonedeponti
    Works at Abstract
    Slightly paranoid about security and clean code
  2. What's in this talk

    Disclaimer: the image shown is not representative of the contained product
    A high-level overview of integration problems
    A lot of quotes
    Not a lot of code examples [1]
    (Hopefully funny) jokes
    [1] For the joy of your optometrist
  3. The restaurant at the end of development

    Integrating suboptimal components, or doing so in a suboptimal way, because you can't do otherwise.
    The Guide is definitive. Reality is frequently inaccurate.
  4. Guest selection

    The perfect guest:
    Has the best API
    Is under your control
    Doesn't want you to copy or send around big chunks of data
    The first nonabsolute number is the number of people for whom the table is reserved. This will vary during the course of the first three telephone calls to the restaurant, and then bear no apparent relation to the number of people who actually turn up…
  5. Am I duct-taping already?

    Using Python libraries isn't integrating. But sometimes you'll end up duct-taping them too.
    The civilized way to do it is wrapping.
    "The statistical likelihood," continued the autopilot primly, "is that other civilizations will arise. There will one day be lemon-soaked paper napkins. Till then there will be a short delay. Please return to your seat."
  6. Inappropriate clothing

    Not respecting the scope

        class Cache(object):
            def __init__(self, **params):
                self.params = params

            def __get__(self, instance, owner):
                return Connection(**self.params)

    Hiding things

        try:
            data = load_from_file(path)
        except IOError:
            raise UserError("'%s' does not exist" % path)
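
The "Hiding things" snippet reports every IOError as a missing file, even when the real cause is something else, such as a permission problem. A minimal sketch of one alternative follows, assuming Python 3 (which the slides predate). `UserError` and `load_from_file` are hypothetical stand-ins for the names on the slide, and `load_for_user` is a name invented here.

```python
class UserError(Exception):
    """Hypothetical user-facing error, standing in for the slide's UserError."""


def load_from_file(path):
    # Hypothetical loader: simply reads the file as text.
    with open(path) as handle:
        return handle.read()


def load_for_user(path):
    try:
        return load_from_file(path)
    except OSError as exc:  # IOError is an alias of OSError in Python 3
        # Report what actually happened instead of guessing "does not exist",
        # and chain the original exception so the traceback keeps it.
        raise UserError("cannot read '%s': %s" % (path, exc.strerror)) from exc
```

With `raise ... from exc`, the original exception stays available as `__cause__` and is printed in the traceback, so nothing is hidden from whoever has to debug it.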
  7. Enter the Babel fish(es)

    Things to check:
    Multithreading and multiprocessing
    Unneeded duplication of data structures
    Are errors intelligible?
    Meanwhile, the poor Babel fish, by effectively removing all barriers to communication between different races and cultures, has caused more and bloodier wars than anything else in the history of creation.
  8. External processes

    Spawning an external application to do some work and get back the result.
    Immediate advantages:
    "It's a UNIX system... I know this!"
    "it is slightly cheaper; and it has the words DON'T PANIC inscribed in large friendly letters on its cover"
  9. Later disadvantages

    You don't know what they're up to
    Their execution time might be unpredictable
    Every time they die, Conan Doyle has a new plot idea
    All you really need to know for the moment is that the universe is a lot more complicated than you might think, even if you start from a position of thinking it's pretty damn complicated in the first place
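
One common way to bound an unpredictable execution time is to give the child process a deadline. The sketch below assumes Python 3.5 or later, where `subprocess.run()` accepts a `timeout`. The function name `run_with_timeout` and the two sample commands are hypothetical.

```python
import subprocess
import sys


def run_with_timeout(args, timeout_seconds):
    """Run a command, returning (returncode, stdout) or None on timeout."""
    try:
        completed = subprocess.run(
            args,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child and waits for it before re-raising.
        return None
    return completed.returncode, completed.stdout


# Hypothetical commands: one sleeps longer than we are willing to wait.
slow = [sys.executable, "-c", "import time; time.sleep(5)"]
fast = [sys.executable, "-c", "print('done')"]
```

Returning `None` on timeout is only one possible design. Re-raising a domain-specific error may fit better when the caller needs to tell a timeout apart from other failures.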
  10. Going asynchronous

    While in theory it promises to solve some of the problems above, it also introduces others.
    Use celery or any other equivalent framework.
    Some people, when confronted Now they with a problem, have think two problems. "I'll use threads".
    https://twitter.com/SteveStreza/status/176863405385326593
  11. The importance of being earnest

    When something goes wrong in production, you're left with
    Angry people
    Limited time
    A traceback
    A logfile
    “Funny,” he intoned funereally, “how just when you think life can’t possibly get any worse it suddenly does.”
  12. Debugging and testing

    Mock every call, and also test for common problems (i.e. also mock errors)
    Always log: arguments, stdout, stderr, environment, pid, effective user and effective group
    "We apologize for the inconvenience."
    God's Final Message to His Creation, written in letters of fire on the side of the Quentulus Quazgar Mountains.
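
The logging checklist on this slide could be sketched as a small wrapper around `subprocess.Popen`. This is an illustration, not the author's code: the logger name `external` and the function `run_logged` are assumptions.

```python
import logging
import os
import subprocess
import sys

log = logging.getLogger("external")


def run_logged(args, env=None):
    """Run a command and log every item the slide lists."""
    env = dict(os.environ) if env is None else env
    proc = subprocess.Popen(
        args, env=env, stdout=subprocess.PIPE, stderr=subprocess.PIPE
    )
    stdout, stderr = proc.communicate()
    # os.geteuid/os.getegid exist only on UNIX; fall back to None elsewhere.
    euid = os.geteuid() if hasattr(os, "geteuid") else None
    egid = os.getegid() if hasattr(os, "getegid") else None
    log.info(
        "args=%r pid=%d returncode=%d euid=%r egid=%r env=%r stdout=%r stderr=%r",
        args, proc.pid, proc.returncode, euid, egid, env, stdout, stderr,
    )
    return proc.returncode, stdout, stderr
```

A call such as `run_logged([sys.executable, "-c", "print('hi')"])` returns the exit code and both output streams. The environment often contains secrets, so a real deployment would probably filter it before logging.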
  13. Performance

    Careful with that PIPE, Eugene
    Buffering can make the difference
    It uses os.fork() on UNIX, see man 2 fork for a list of things that can go wrong
    Windows handles things quite differently
    "Yeah," said the voice from under the table, "you go to pieces so fast people get hit by the shrapnel."
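
One reading of "Careful with that PIPE" is the classic pipe deadlock: if the parent waits for the child to exit without reading its output, a child that writes more than the OS pipe buffer holds will block forever. A minimal sketch of the safe pattern, with a hypothetical child command and function name:

```python
import subprocess
import sys

# Hypothetical child that writes about 1 MB to stdout, far more than a
# typical OS pipe buffer holds.
CHILD = [sys.executable, "-c", "import sys; sys.stdout.write('x' * 1000000)"]


def read_all_output(args):
    proc = subprocess.Popen(args, stdout=subprocess.PIPE)
    # communicate() keeps reading until EOF and then waits for the child.
    # Calling proc.wait() first could deadlock: the child would block on a
    # full pipe while the parent blocks waiting for it to exit.
    stdout, _ = proc.communicate()
    return proc.returncode, stdout
```

`communicate()` buffers the whole output in memory, which is fine for a sketch. Very large outputs may call for reading the pipe incrementally or redirecting to a file.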
  14. Security concerns

    Popen can be as dangerous as eval()
    File descriptors are inherited by default
    Always call a binary that's decently protected against tampering
    Only an absolute idiot would be sitting where he was, so he was winning already.
    A common mistake that people make when trying to design something completely foolproof is to underestimate the ingenuity of complete fools.
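
The first two points can be illustrated with a short sketch, assuming Python 3.7 or later. The `ECHO` child command and the function `run_with_argument` are hypothetical. Passing the command as a list with `shell=False` means user input reaches the child as a single argument and is never parsed by a shell.

```python
import subprocess
import sys

# Hypothetical child program: echoes its first argument back.
ECHO = [sys.executable, "-c", "import sys; print(sys.argv[1])"]


def run_with_argument(command, user_value):
    """Pass user input as one argument, never through a shell."""
    proc = subprocess.Popen(
        command + [user_value],
        shell=False,     # no shell, so metacharacters are not interpreted
        close_fds=True,  # the Python 3 default; it was False in Python 2
        stdout=subprocess.PIPE,
    )
    stdout, _ = proc.communicate()
    return stdout.decode().strip()
```

Calling `run_with_argument(ECHO, "a; echo injected")` hands the whole string to the child unchanged. With `shell=True` and string concatenation, the part after the semicolon would have run as a second command.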
  15. Services

    Multidimensional problem
    Who manages it
    How remote is the service
    How integrated is the service
    How asynchronous is its interface
    Who "owns" it
  16. Common problems

    The most common problems arise from
    Network
    Interface changes, even if subtle (corner cases)
    Limitations of synchronous protocols
    Lack of integration with the security system
    There is an art, it says, or rather, a knack to flying. The knack lies in learning how to throw yourself at the ground and miss. […] Clearly, it is this second part, the missing, which presents the difficulties.
  17. If you "own" it

    You can ensure
    Predictable load patterns
    Interface doesn't change unexpectedly, and can change if needed
    Rare outages, and prompt notification
    Authentication and authorization through basic protocols
  18. But beware

    There are still pitfalls
    Security (network vs sockets)
    Tradeoffs can be complex (resource availability vs locality)
    "it is very easy to be blinded to the essential uselessness of them by the sense of achievement you get from getting them to work at all."
  19. Stuff that's not yours

    Probably the most common case of integration, and the one most likely to result in duct-taping.
    Areas to check
    Security
    Stability
    Performance
    Availability
  20. Security

    Keep in mind
    Always authenticate all players
    Use proven, widespread technologies
    Resist the urge to rely on hacks
        ◦ Whitelists
        ◦ Shared secrets
        ◦ Security by obscurity
  21. Single sign-ons

    Are rumored to be useful in killing vampires and other creatures
    What exactly is a single sign-on
    Select the best protocol/solution given all requirements
        • SSO
            ◦ Kerberos
            ◦ Shibboleth (SAML)
        • Consolidated login
            ◦ OpenID
            ◦ LDAP
  22. Reducing attack surface

    Fewer doors, more security
    If possible, always treat incoming data as untrusted (i.e. as user data)
    Keep a careful eye on:
        • Data being sent to you
        • Sensitive data you might be sending
    Conceal any sensitive data you might need to send, or make it less sensitive
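
Treating incoming data as untrusted usually means validating it before use and keeping only what was checked. The sketch below assumes a made-up JSON payload of the form `{"id": int, "qty": int}`. The function name, field names and the quantity limit are all hypothetical.

```python
import json


def parse_order(raw_bytes):
    """Validate a hypothetical payload of the form {"id": int, "qty": int}."""
    try:
        payload = json.loads(raw_bytes.decode("utf-8"))
    except (UnicodeDecodeError, ValueError):
        raise ValueError("payload is not valid UTF-8 JSON")
    if not isinstance(payload, dict):
        raise ValueError("payload must be a JSON object")
    order_id = payload.get("id")
    quantity = payload.get("qty")
    # bool is a subclass of int, so reject it explicitly.
    for value in (order_id, quantity):
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError("id and qty must be integers")
    if not 0 < quantity <= 1000:
        raise ValueError("qty out of range")
    # Return only the fields we checked; drop anything else the peer sent.
    return {"id": order_id, "qty": quantity}
```

Unexpected extra fields are silently discarded here. Rejecting them outright is a stricter option that may suit some integrations better.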
  23. Attacks to ward off

    Always try to break into the system using
    CSRF
    XSS
    Man-in-the-middle
    All types of injection combined with one of the above
  24. Availability

    If you're using synchronous, unmediated protocols, service availability affects
    Your own availability
    Your stability
    Your performance
    On no account allow a Vogon to read poetry at you.
  25. Working around availability

    Synchronous availability problems can be mediated by
    Use of queue managers and retry protocols
    Caching
    Use of local mirrors and synchronization protocols
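
Two of the workarounds above, retrying and caching, can be combined in a thin client wrapper. This is a simplified in-process sketch, not a queue manager. `CachedClient` and the `fetch(key)` callable it wraps are hypothetical, and the wrapped call is assumed to raise `ConnectionError` when the service is down.

```python
import time


class CachedClient:
    """Wrap a hypothetical fetch(key) callable with retries and a cache."""

    def __init__(self, fetch, retries=3, delay=0.1):
        self._fetch = fetch
        self._retries = retries
        self._delay = delay
        self._cache = {}

    def get(self, key):
        for attempt in range(self._retries):
            try:
                value = self._fetch(key)
            except ConnectionError:
                # Back off a little longer after each failed attempt.
                time.sleep(self._delay * (2 ** attempt))
            else:
                self._cache[key] = value
                return value
        if key in self._cache:
            # Service still down: serve the last known (possibly stale) value.
            return self._cache[key]
        raise ConnectionError(
            "service unavailable and no cached value for %r" % (key,)
        )
```

Serving stale data is a tradeoff: it keeps the caller available at the cost of freshness, so it only fits data that tolerates being slightly out of date.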
  26. Debugging and testing

    Mock and log
    Log stuff when it exits your code and comes in (not before or after processing)
    Create effective (and easily extendable) mocks in your tests
    Mock both successful events and errors
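
Mocking both outcomes might look like the following sketch, which uses the standard library's `unittest.mock` (Python 3.3 or later). The function `fetch_status` and the client method `get_status` are hypothetical stand-ins for whatever service wrapper is under test.

```python
from unittest import mock


def fetch_status(client):
    """Code under test: ask a hypothetical service client for its status."""
    try:
        return client.get_status()
    except ConnectionError:
        return "unavailable"


def test_success():
    client = mock.Mock()
    client.get_status.return_value = "ok"
    assert fetch_status(client) == "ok"


def test_service_down():
    client = mock.Mock()
    # side_effect makes the mocked call raise instead of returning.
    client.get_status.side_effect = ConnectionError("connection refused")
    assert fetch_status(client) == "unavailable"
```

Setting `return_value` covers the successful event and `side_effect` covers the error, so the same mock object serves both tests.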
  27. Summary

    Duct-taping can be hard to do and maintain: therefore you must
    Always use the least expensive solution
    Maintain a stricter discipline
    Always maintain full knowledge of the stack
    "You're starting to sound like Marvin." "Marvin is the clearest thinker I know."
  28. Citations, thanks etc

    Douglas Adams - The Hitchhiker's Guide to the Galaxy
    @SteveStreza
    My customers (anonymized)
    Steven Spielberg
    Pink Floyd
    Oscar Wilde