Distributed Domain Destruction - Adventures in building distributed systems

For the last three years I have been building, maintaining and fighting a few projects that make use of distributed computing, parallel processing, message brokers, queues and workers. This is one of those "from the trenches" talks, where I will regale you with tales about the series of unfortunate events that may happen as your application grows in complexity: disk space fluctuations, the importance of logging, NoSQL problems, reordering code execution for performance gains, and short-sighted, albeit logical, architectural decisions that will cost you in the long run.

Come hear about the agony you will experience when it starts falling apart, and the thrill you will feel when everything is running juuust right.

Vranac Srdjan

May 13, 2017

Transcript

  1. business owner, developer, consultant, mercenary, writing terrible code that performs exceptionally, wrangling elePHPants and Pythons, obsessed with process automation, interested in continuous integration and delivery, clean code, testing, best practices and distributed systems 2 — Srdjan Vranac, Code4Hire, CODEstantine 2.0
  2. THE SIMPLE DEFINITION It is a system where you can distribute processing of costly tasks to other workers. By costly tasks I mean anything ranging from heavy computation and CPU utilization to long-running processes. In the context of web applications, basically anything that can't be done within a request is a candidate for distribution/background processing.
  3. A MORE REALISTIC DEFINITION A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable. — Leslie B. Lamport, 1987
  4. ...BECAUSE LOGS AND COUNTERS ARE LYING TO YOU
  5. WHAT HAPPENED TO 40 MILLION RECORDS, WHERE HAVE THEY GONE?
  6. ARE WE WRITING DATA TO THE DYNAMODB?
  7. THE LOGS ARE CLEAN, THE METRICS ARE GREEN
  8. WHAT THE HELL DID I JUST READ?
  9. I CAN HAZ ERROR HANDLING NAO? LOL NO
  10. THIS WORKER IS FAST, THAT WORKER IS SLOW, WHICH ONE WILL CRASH THE DATABASE, NOBODY KNOWS
  11. MAKE THIS APP GREAT AGAIN AND GET THE CLIENT TO PAY FOR IT
  12. BUT WHY ISN'T THERE A SEPARATE LOG PARTITION?
  13. WHAT DO YOU MEAN WE CAN'T SCALE?
  14. 18 HOURS TO LOCATE, 20 SECONDS TO FIX
  15. 16 CPUS, 122 GB OF RAM
  16. BARCODE SCANNERS AND KEYBOARDS DO NOT MIX
  17. THANK YOU! SRDJAN VRANAC // [email protected] // @VRANAC
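
The simple definition on the second transcript slide (hand costly work to other workers so that a web request does not have to wait for it) can be illustrated with a small sketch. This is not code from the talk. It assumes an in-process `queue.Queue` as a stand-in for a real message broker, and the names `costly_task`, `handle_request` and `run_demo` are hypothetical, chosen only for illustration.

```python
import queue
import threading

# Assumption: an in-process queue stands in for a real message broker.
task_queue = queue.Queue()
results = {}
results_lock = threading.Lock()


def costly_task(n):
    """Hypothetical placeholder for heavy computation or a long-running job."""
    return sum(i * i for i in range(n))


def worker():
    """Pull tasks off the queue until a None sentinel arrives."""
    while True:
        item = task_queue.get()
        if item is None:
            task_queue.task_done()
            break
        task_id, n = item
        try:
            value = costly_task(n)
            with results_lock:
                results[task_id] = value
        finally:
            # Mark the task finished even if costly_task raised.
            task_queue.task_done()


def handle_request(task_id, n):
    """Hypothetical request handler: enqueue the work and return at once."""
    task_queue.put((task_id, n))
    return {"status": "accepted", "task_id": task_id}


def run_demo(num_workers=2):
    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    # Each "request" returns immediately; the work happens in the workers.
    responses = [handle_request(i, 1000 * (i + 1)) for i in range(4)]
    task_queue.join()  # wait until every queued task has been processed
    for _ in threads:
        task_queue.put(None)  # one shutdown sentinel per worker
    for t in threads:
        t.join()
    return responses, dict(results)


if __name__ == "__main__":
    responses, done = run_demo()
    print(responses)
    print(done)
```

In a real deployment the queue would live in a separate broker process and the workers on other machines, which is where the failure modes listed in the transcript start to appear.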