
Lessons learned from building serverless, distributed architecture

Presented at DevOps Days India 2017

Jalem Raj Rohit

September 15, 2017

Transcript

  1. Lessons learned from building serverless, distributed architecture

  2. Introduction: I am Jalem Raj Rohit.
    - Works on DevOps and Machine Learning full-time
    - Moderates the DevOps and DataScience sites of StackOverflow
    - Contributes to random OSS projects
  3. Setting the context
    - Serverless, distributed system for processing ML workloads
    - Up to 900 servers every run
    - Batch architecture
  4. LESSONS LEARNED

  5. LESSON #1 Always return your Lambda functions

  6. Always return your Lambda functions
    - The cost of Lambda functions can go from ‘meh’ to ‘OMFG’ really quickly
    - Lambda treats a function that has not returned as a failure and keeps retrying it [5 times]
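A minimal sketch of the point above, assuming a Python runtime: the handler returns a value on every code path, including the error path, so the invocation is not reported as failed and retried. The `process` helper and the event shape are hypothetical, not taken from the talk.

```python
import json


def process(record):
    # Hypothetical unit of work; stands in for the real ML task.
    return {"id": record["id"], "status": "done"}


def handler(event, context):
    """Lambda-style entry point that returns on every code path."""
    try:
        results = [process(r) for r in event.get("records", [])]
    except Exception as exc:
        # Returning here ends the invocation cleanly instead of letting the
        # error propagate; the failure is reported in the payload.
        return {"statusCode": 500, "body": json.dumps({"error": str(exc)})}
    return {"statusCode": 200, "body": json.dumps(results)}
```

Swallowing the exception is a trade-off: it avoids paid retries, but the caller then has to inspect `statusCode` to notice the failure.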
  7. LESSON #2 Monitoring and Logging is still an unconquered beast

  8. Monitoring and logging
    - Monitoring a serverless system is very tricky
    - Adding the distributed systems paradigm to it doesn’t really help
    - Having a hosted server for monitoring serverless systems?
  9. (No text on this slide)
  10. Monitoring and logging (cont...)
    - Monitor the orchestration rather than trying to monitor all the servers
    - Use the cloud provider’s dashboard as much as possible
    - For logging, the closest thing to a best practice is to zip the log file and send it to a data store before the server termination task runs
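The logging advice above could look roughly like this. It is a sketch under stated assumptions: `upload` and `terminate` are hypothetical callables supplied by the caller (for example, a wrapper around an object-store client and the instance shutdown call), and gzip stands in for "zip".

```python
import gzip
import shutil
from pathlib import Path


def compress_log(log_path):
    """Gzip a log file next to the original and return the archive path."""
    src = Path(log_path)
    dst = src.with_name(src.name + ".gz")
    with src.open("rb") as f_in, gzip.open(dst, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)
    return dst


def ship_logs_then_terminate(log_path, upload, terminate):
    """Compress and upload the log, then terminate the server.

    `upload` and `terminate` are hypothetical callables, not a real API.
    """
    archive = compress_log(log_path)
    try:
        upload(archive)
    finally:
        # Terminate even if the upload fails, so a broken upload
        # does not leave a paid server running.
        terminate()
    return archive
```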
  11. LESSON #3 Super-high scalability with relative ease

  12. Super high scalability
    - Super high scalability at a fraction of the cost
    - Can be made to scale seamlessly with demand
  13. LESSON #4 If it is a distributed serverless system, it needs to be self-healing
  14. Self-healing
    - Debugging a lost or faulty file in a distributed system is like finding a needle in a haystack
    - Thus, self-healing
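One way to read "self-healing" is sketched below: after each round, check which expected outputs are missing or invalid and re-run only those tasks. This is an illustration, not the speaker's implementation; `run_task`, `is_valid`, and `max_rounds` are hypothetical names.

```python
def find_bad_outputs(expected, produced, is_valid):
    """Return the task IDs whose output is missing or fails validation."""
    return [k for k in expected if k not in produced or not is_valid(produced[k])]


def run_with_healing(expected, run_task, is_valid, max_rounds=3):
    """Run every task, then re-run only the missing or faulty ones."""
    produced = {}
    pending = list(expected)
    for _ in range(max_rounds):
        for task_id in pending:
            try:
                produced[task_id] = run_task(task_id)
            except Exception:
                # A crashed task leaves no output; the check below catches it.
                produced.pop(task_id, None)
        pending = find_bad_outputs(expected, produced, is_valid)
        if not pending:
            break
    # `pending` lists whatever still failed after the last round.
    return produced, pending
```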
  15. LESSON #5 Having a distributed system doesn’t necessarily mean the load is distributed equally
  16. Load Balancing
    - Improper or poorly done load balancing defeats the whole purpose of having distributed systems
    - Have proper load balancing techniques or algorithms in place wherever data is getting ingested
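The talk does not name a specific algorithm. As one illustration, here is a greedy "largest job first, least-loaded worker" assignment, assuming job sizes are known up front; `job_sizes` and `n_workers` are hypothetical inputs.

```python
import heapq


def assign_jobs(job_sizes, n_workers):
    """Assign each job to the currently least-loaded worker, largest jobs first."""
    heap = [(0, w) for w in range(n_workers)]  # (total load, worker index)
    heapq.heapify(heap)
    assignment = {w: [] for w in range(n_workers)}
    ordered = sorted(job_sizes.items(), key=lambda kv: kv[1], reverse=True)
    for job, size in ordered:
        load, worker = heapq.heappop(heap)  # worker with the smallest load
        assignment[worker].append(job)
        heapq.heappush(heap, (load + size, worker))
    return assignment
```

Compared with splitting jobs by count, this keeps the total size per worker closer together when job sizes vary a lot.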
  17. LESSON #6 Compliance automation is good. Let’s do more of it
  18. Compliance Automation
    - A boon for teams with very strict compliance requirements
    - No need to worry about the number of systems in production
    - Tag-based and boundary-based detection
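Tag-based detection can be sketched as a check over an inventory of instances. The required tag set and the inventory format below are assumptions made for illustration; boundary-based detection is not covered here.

```python
REQUIRED_TAGS = {"owner", "project", "environment"}  # hypothetical policy


def missing_tags(instance, required=REQUIRED_TAGS):
    """Return the required tag keys that an instance does not carry."""
    return sorted(required - set(instance.get("tags", {})))


def non_compliant(instances, required=REQUIRED_TAGS):
    """Map instance ID to its missing tag keys, for violating instances only."""
    report = {}
    for inst in instances:
        gaps = missing_tags(inst, required)
        if gaps:
            report[inst["id"]] = gaps
    return report
```

Because the check runs over the whole inventory, it behaves the same for 9 servers or 900.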
  19. LESSON #7 Debugging and fixing serverless distributed systems is extremely difficult
  20. Horrors of debugging/fixing serverless distributed systems
    - These systems run in a nohup mode
    - All the servers get terminated once the orchestration is completed
    - So, if you are late in killing the process, you need to start all over again from the beginning
  21. Horrors of debugging/fixing serverless distributed systems
    - Watching the tail of the log file saves a lot of headaches
    - The more distributed the workload is, the bigger the hell it is for the developer
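Watching the tail of a log can also be scripted. This is a small sketch, assuming plain-text logs and an `ERROR` marker (both assumptions); it reads the file once and keeps only the last `n` lines in memory.

```python
from collections import deque


def tail(path, n=20):
    """Return the last n lines of a text file."""
    with open(path, "r", errors="replace") as fh:
        # deque with maxlen discards older lines as new ones are read.
        return [line.rstrip("\n") for line in deque(fh, maxlen=n)]


def last_error(path, n=200, marker="ERROR"):
    """Return the most recent line containing `marker` among the last n lines."""
    for line in reversed(tail(path, n)):
        if marker in line:
            return line
    return None
```

Polling `last_error` while the orchestration runs gives a chance to kill a bad run before the servers are terminated.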
  22. LESSON #8 Every distributed systems engineer deserves a hug

  23. THANK YOU
    - GitHub: Dawny33
    - Home: jrajrohit.me