Lessons learned from building serverless, distributed architecture

Presented at DevOps Days India 2017

Jalem Raj Rohit

September 15, 2017

Transcript

  1. Introduction
    - I am Jalem Raj Rohit. I work on DevOps and machine learning full-time.
    - I moderate the DevOps and Data Science sites of Stack Overflow.
    - I contribute to random OSS projects.
  2. Setting the context
    - Serverless, distributed system for processing ML workloads
    - Up to 900 servers every run
    - Batch architecture
  3. Always return your lambda functions
    - The cost of Lambda functions can go from ‘meh’ to ‘OMFG’ really quickly
    - A function that has not returned is considered a failure by Lambda, which keeps retrying it [5 times]
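The point about always returning can be sketched as a handler that produces a result on every code path. This is only a minimal illustration, assuming the Python runtime's usual `handler(event, context)` entry point. The `process_record` helper, the `records` key in the event and the 207 status for partial failures are all hypothetical choices made for this example, not details from the talk.

```python
import json


def process_record(record):
    """Hypothetical unit of work; replace with the real processing step."""
    if "id" not in record:
        raise ValueError("record has no id")
    return record["id"]


def handler(event, context):
    """Entry point that returns a result on every code path."""
    processed, failed = [], []
    for record in event.get("records", []):
        try:
            processed.append(process_record(record))
        except ValueError as exc:
            # Record the failure instead of letting the exception escape,
            # which would mark the whole invocation as failed.
            failed.append(str(exc))
    return {
        "statusCode": 200 if not failed else 207,
        "body": json.dumps({"processed": processed, "failed": failed}),
    }
```

Swallowing errors like this trades automatic retries for explicit reporting, so the caller has to inspect the `failed` list. Whether that trade is right depends on the workload.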
  4. Monitoring and logging
    - Monitoring a serverless system is very tricky
    - Adding the distributed systems paradigm to it doesn’t really help
    - Having a hosted server for monitoring serverless systems?
  5. Monitoring and logging (cont...)
    - Monitor the orchestration rather than trying to monitor all the servers
    - Use the cloud provider’s dashboard as much as possible
    - For logging, the closest thing to a best practice is to zip the log file and send it to a data store before the server termination task runs
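The log-archiving step can be sketched roughly as follows. This is an assumption-laden illustration: it uses gzip rather than a zip archive, and a local directory (`store_dir`) stands in for the data store, which the talk does not name. The function name `archive_log` is hypothetical.

```python
import gzip
import shutil
from pathlib import Path


def archive_log(log_path, store_dir):
    """Gzip log_path into store_dir and return the archive path."""
    log_path = Path(log_path)
    store_dir = Path(store_dir)  # stand-in for the real data store
    store_dir.mkdir(parents=True, exist_ok=True)
    archive_path = store_dir / (log_path.name + ".gz")
    # Stream the file so large logs are not loaded into memory at once.
    with open(log_path, "rb") as src, gzip.open(archive_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return archive_path
```

In a real run this would be called as the last step before the termination task, with the copy to `store_dir` replaced by an upload to whichever store the team uses.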
  6. Super high scalability
    - Super high scalability at a fraction of the cost
    - Can be made to scale seamlessly with demand
  7. Self-healing
    - Debugging a lost or faulty file in a distributed system is like finding a needle in a haystack
    - Thus, the system should be self-healing
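One way to read "self-healing" for lost or faulty files is to check each expected output and regenerate it when the check fails. The sketch below assumes that interpretation. `ensure_output`, `produce` and `is_valid` are hypothetical names introduced here, and the talk does not describe the mechanism actually used.

```python
from pathlib import Path


def ensure_output(path, produce, is_valid, max_attempts=3):
    """Regenerate `path` until it exists and passes `is_valid`.

    Returns how many times `produce` had to run (0 if already healthy).
    """
    path = Path(path)
    for attempt in range(max_attempts + 1):
        if path.exists() and is_valid(path):
            return attempt
        if attempt == max_attempts:
            break  # out of retries; report instead of looping forever
        produce(path)
    raise RuntimeError(f"could not heal {path} after {max_attempts} attempts")
```

The validity check is left to the caller, for example a size check or a checksum comparison, because what counts as a "faulty" file is workload-specific.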
  8. Load Balancing
    - Improper or poorly done load balancing defeats the whole purpose of having a distributed system
    - Have proper load balancing techniques or algorithms in place wherever data is being ingested
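As one example of a load balancing algorithm for batch ingestion, the sketch below hands each job, largest first, to whichever worker currently has the least work. The talk does not say which technique was used, so this greedy least-loaded scheme and the names `assign_least_loaded`, `job_sizes` and `worker_count` are assumptions for illustration.

```python
import heapq


def assign_least_loaded(job_sizes, worker_count):
    """Greedily give each job (largest first) to the least-loaded worker."""
    # Heap of (current_load, worker_index); the smallest load is popped first.
    heap = [(0, worker) for worker in range(worker_count)]
    heapq.heapify(heap)
    assignment = {worker: [] for worker in range(worker_count)}
    for job, size in sorted(job_sizes.items(), key=lambda kv: kv[1], reverse=True):
        load, worker = heapq.heappop(heap)
        assignment[worker].append(job)
        heapq.heappush(heap, (load + size, worker))
    return assignment
```

For instance, `assign_least_loaded({"a": 8, "b": 7, "c": 6, "d": 5, "e": 4}, 2)` places jobs `a`, `d`, `e` on worker 0 and `b`, `c` on worker 1, for loads of 17 and 13.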
  9. Compliance Automation
    - A boon for teams with very strict compliance requirements
    - No need to worry about the number of systems in production
    - Tag-based and boundary-based detection
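Tag-based detection could look something like the sketch below, which reports every resource that lacks a required tag. The resource dictionaries, the `REQUIRED_TAGS` policy and the function name are hypothetical. In practice the inventory would come from the cloud provider's API rather than a hand-written list.

```python
REQUIRED_TAGS = frozenset({"owner", "environment"})  # hypothetical policy


def find_noncompliant(resources, required_tags=REQUIRED_TAGS):
    """Map each resource id to the sorted list of required tags it lacks."""
    report = {}
    for resource in resources:
        missing = set(required_tags) - set(resource.get("tags", {}))
        if missing:
            report[resource["id"]] = sorted(missing)
    return report
```

Because the check runs over an inventory rather than per machine, it behaves the same for 9 servers as for 900.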
  10. Horrors of debugging/fixing serverless distributed systems
    - These systems run in a nohup mode
    - All the servers get terminated once the orchestration is completed
    - So, if you are late in killing the process, you need to start all over again from the beginning
  11. Horrors of debugging/fixing serverless distributed systems
    - Watching the tail of the log file would save a lot of headaches
    - The more distributed the workload is, the bigger the hell it is for the developer
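Watching the tail of a log can be done with `tail -f`, or programmatically by polling for newly appended lines. The sketch below shows the polling variant, assuming a UTF-8 log that is only ever appended to. The helper name `read_new_lines` is hypothetical.

```python
def read_new_lines(path, offset=0):
    """Return (lines, new_offset) for complete lines appended after offset."""
    with open(path, "rb") as fh:
        fh.seek(offset)
        chunk = fh.read()
    # Only consume up to the last newline so a half-written line is
    # picked up in full on the next call.
    end = chunk.rfind(b"\n") + 1
    lines = chunk[:end].decode("utf-8", errors="replace").splitlines()
    return lines, offset + end
```

Calling this in a loop with a short sleep, and feeding the returned offset back in, gives a simple watcher that can raise an alert or kill the run as soon as an error line shows up.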