10 Traumas in 10 minutes

A lightning talk walking through 10 traumas of developing applications for the cloud

Amy Palamountain

October 26, 2012

Transcript

  1. Hi

  2. @ammeep

  3. None
  4. 10 Traumas in 10 minutes

  5. 1: Deployments are painful

  6. automated deployments are hard in the cloud

  7. automated deployments are hard in the cloud: dirty lies

  8. automated build, automated test, automated deploy

  9. automate provisioning: API

  10. automate network setup: API

  11. automate deployment: API

  12. automate all the things

  13. 2: i’m married to my cloud

  14. AbstractProxyBooleanProvider: avoid a marriage of inconvenience, just don’t commit, be abstract

  15. abstract message queues, abstract storage providers, abstract auto scaling

  16. if you don’t own it, abstract away the gory details
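One way to picture "abstract message queues" is a small interface that application code depends on, with provider-specific adapters behind it. The sketch below is illustrative only: `MessageQueue`, `InMemoryQueue`, and `enqueue_order` are hypothetical names, not taken from the talk or from any cloud SDK.

```python
from collections import deque
from typing import Optional, Protocol


class MessageQueue(Protocol):
    """Hypothetical interface the application codes against."""

    def send(self, body: str) -> None: ...

    def receive(self) -> Optional[str]: ...


class InMemoryQueue:
    """Local stand-in; a cloud-backed adapter would expose the same two methods."""

    def __init__(self) -> None:
        self._items = deque()

    def send(self, body: str) -> None:
        self._items.append(body)

    def receive(self) -> Optional[str]:
        # Return the oldest message, or None when the queue is empty.
        return self._items.popleft() if self._items else None


def enqueue_order(queue: MessageQueue, order_id: int) -> None:
    # Application code sees only the interface, never a vendor SDK.
    queue.send(f"order:{order_id}")
```

Swapping providers then means writing one new adapter rather than touching every call site.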

  17. 3: the definition of insanity

  18. Oh HAI, please may i retrieve my data? sure ... i’ll hold...
  19. hello? ... ok holding ... may i have my data?

  20. data? what? ........

  21. None
  22. Are we there yet? Are we there yet? Are we there yet? Are we there yet? Are we there yet? Are we there yet? Are we there yet? Are we there yet? transient faults: not all are born equal
  23. Understand the nature of the transient fault; perhaps try exponential back off
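A minimal sketch of exponential back off, assuming the caller can tell transient faults apart from permanent ones. `TransientError`, `retry_with_backoff`, and the default delays are hypothetical choices for illustration; the random jitter is an addition not mentioned on the slides.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for faults that are worth retrying."""


def retry_with_backoff(operation, max_attempts=5, base_delay=0.1,
                       max_delay=5.0, sleep=time.sleep):
    """Call operation(), retrying transient faults with growing waits."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the fault to the caller
            # The ceiling doubles each attempt (0.1, 0.2, 0.4, ...), capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Random jitter (assumed here) keeps many clients from retrying in lockstep.
            sleep(random.uniform(0, ceiling))
```

Any other exception propagates immediately, which matches the point that not all faults are born equal: only the transient kind is worth asking again.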
  24. 4: the limitations of storage

  25. Transactions can be capped, but can we still have infinite storage capacity and infinite throughput?
  26. many ways to skin a cat: When is it accessed? Who is accessing it? What is accessed?

  27. Partition your data, spread out your data, cache your data
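As a rough illustration of "partition your data", a stable hash can spread keys across a fixed number of partitions. `partition_for` is a hypothetical helper; real storage services apply their own partitioning schemes, so treat this purely as a sketch of the idea.

```python
import hashlib


def partition_for(key: str, partitions: int) -> int:
    """Map a key to a partition index in [0, partitions)."""
    # A cryptographic hash gives the same result in every process,
    # unlike Python's built-in hash(), which is salted per run for strings.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partitions
```

The same key always lands on the same partition, while different keys spread out across all of them.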
  28. 5: we’ve got to be web scale

  29. web scale, cloud scale: what does that even mean?

  30. but... twitter uses it, it must be good

  31. choice of technology is not a function of fashion

  32. it’s ok to be relational if it works for you & your data
  33. 6: go daddy goes down

  34. DNS is a single point of failure; avoiding this is a tricky problem
  35. have a simple back up plan: use more than one domain
  36. 7: our app is slow

  37. hogging resources, 100% CPU, blocking, locking, opening cursors never closing, interrupt, obstructing, stalling, no connection pooling, n+1, leaking memory, performance ++, deadlocks
  38. Just scale up: not optimised for node performance

  39. Just scale out: not optimised for cost

  40. Bottleneck, cost, performance: tuning is important

  41. 8: chaos monkey

  42. patches, upgrades, VM shuffling, DNS changes, shared hardware, slow node, storage transient failures, queue transient failures, physical location, time, provisioning failure, autoscaling, capacity, latency
  43. patches, upgrades, VM shuffling, DNS changes, shared hardware, slow node, storage transient failures, queue transient failures, physical location, time, provisioning failure, autoscaling, capacity, latency: VOLATILE
  44. accept that it will fail and plan for it

  45. simulate chaos: get that monkey working for you

  46. 9: you have lupus

  47. it’s 2 am, the phone rings, your app is crashing, servers are unresponsive
  48. and it gets worse: YOU CAN’T TELL WHAT IS GOING ON
  49. centrally store logs

  50. be able to correlate user activity

  51. diagnostics need to be a first class concern
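One possible way to make log lines correlatable is to stamp every record with a per-request identifier and emit structured output that a central collector can index. The sketch below uses Python's standard `logging` module; `JsonFormatter`, `request_logger`, and the `correlation_id` field name are hypothetical, and shipping the lines to a central store is assumed to happen elsewhere.

```python
import json
import logging
import uuid


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "message": record.getMessage(),
            # Present when the record was logged through request_logger().
            "correlation_id": getattr(record, "correlation_id", None),
        })


def request_logger(base, correlation_id=None):
    """Wrap a logger so every line carries the same correlation id."""
    # LoggerAdapter passes this dict as `extra`, so its keys become record attributes.
    extra = {"correlation_id": correlation_id or uuid.uuid4().hex}
    return logging.LoggerAdapter(base, extra)
```

Creating one adapter per incoming request and passing it down the call chain lets all activity for that request be pulled out of the central store with a single query.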

  52. 10: you are going to fail

  53. accept it: You’re not going to get it right, not the first time and maybe not the time after that
  54. change is coming

  55. be prepared to adapt your designs responsibly

  56. 10: phew, that’s it

  57. thanks for listening @ammeep