Microservices architecture pitfalls

This talk covers common pitfalls encountered while building a scalable microservice architecture.

Mateusz Gajewski

March 07, 2015

Transcript

  1. Microservices architecture pitfalls. WJUG meeting, March 2015. Mateusz Gajewski, Solutions Architect @ Allegro. Twitter: @wendigo
  2. About me. Given: I started working at Allegro in 2009 (5 mln AO, 50 devs). When: Allegro reached 40 mln AO, 400 devs. Then: I am a Solutions Architect.
  3. Agenda: • Microservices, microservices, microservices! ;) • Some challenges & pitfalls: • Architectural, • Operational, • Organisational
  4. Let’s go back in time to the year 2012

  6. Back then we wanted: • agile development, • scalability, • resilience, • lower costs, • hybrid cloud.
  7. Basically, SOA + JVM was the answer!
  8. But our system was too BIG & too complex to do it with existing enterprise solutions
  9. s/Enterprise/OSS/g Solutions ;)
  10. We started to do *buzzword*
  11. And now, literally everyone is doing microservices!!??
  12. Microservices by Fowler: lots of *buzzwords*. http://martinfowler.com/articles/microservices.html

  13. SOA ≈ microservices?
  14. microservices architecture ≈ fine-grained SOA − enterprise (commercial) sh*t ≈ highly scalable, distributed system
  15. Distributed systems: • concurrency of components, • independent failure of components, • lack of a global clock.
  16. The Eight Fallacies of Distributed Computing, by Peter Deutsch, 1991
  17. #1: Network is reliable
  18. #2: Latency is zero
  19. #3: Bandwidth is infinite
  20. #4: Network is secure
  21. #5: Topology doesn’t change
  22. #6: There is one administrator
  23. #7: Transport cost is zero
  24. #8: Network is homogeneous
  25. distributed systems are hard → microservices are much harder ;)
  26. What have we learnt?
  27. Act I: architectural constraints
  28. CAP is not just a theorem, it’s the reality working against us
  29. bye, bye ACID semantics
  30. Long live BASE guarantees! Basically Available, Soft state, Eventually consistent
  31. distributed transactions add complexity
  32. it’s far cheaper to do compensation
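
The compensation idea from slide 32 can be illustrated with a minimal sketch. It is not taken from the talk: the function `run_with_compensation` and the order-processing steps are hypothetical names chosen for illustration. Each step pairs an action with an undo operation, and when a later step fails, the steps already completed are undone in reverse order instead of being wrapped in a distributed transaction.

```python
from typing import Callable, List, Tuple

# Hypothetical type: a step is an (action, compensation) pair.
Step = Tuple[Callable[[], None], Callable[[], None]]


def run_with_compensation(steps: List[Step]) -> bool:
    """Run actions in order; on failure, undo completed steps in reverse."""
    completed: List[Callable[[], None]] = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            # Compensate only the steps that actually finished.
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True


log: List[str] = []


def fail_shipping() -> None:
    # Simulates a downstream service that is unavailable.
    raise RuntimeError("shipping service unavailable")


ok = run_with_compensation([
    (lambda: log.append("reserve stock"), lambda: log.append("release stock")),
    (lambda: log.append("charge card"), lambda: log.append("refund card")),
    (fail_shipping, lambda: log.append("cancel shipping")),
])
print(ok, log)
```

In a real system the compensations would themselves be remote calls that can fail, so they would need retries and idempotency; the sketch leaves that out.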

  33. http://bravenewgeek.com/you-cannot-have-exactly-once-delivery/
  34. you need idempotent APIs and event sinks
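
Since exactly-once delivery is not available, a consumer has to tolerate redelivery. The following is a minimal sketch of an idempotent event sink, assuming the producer attaches a unique id to every event; the class `IdempotentSink` and the account example are hypothetical and not from the talk. The set of seen ids lives in memory here, whereas a real sink would presumably keep it in durable storage and update it atomically with the state change.

```python
from typing import Dict, Set


class IdempotentSink:
    """Applies each event at most once, keyed by a producer-assigned id."""

    def __init__(self) -> None:
        self.seen: Set[str] = set()          # ids of events already applied
        self.balances: Dict[str, int] = {}   # state mutated by events

    def handle(self, event_id: str, account: str, amount: int) -> bool:
        """Return True if the event was applied, False if it was a duplicate."""
        if event_id in self.seen:
            return False  # redelivered event: ignore, state is unchanged
        self.balances[account] = self.balances.get(account, 0) + amount
        self.seen.add(event_id)
        return True


sink = IdempotentSink()
results = [
    sink.handle("evt-1", "alice", 100),
    sink.handle("evt-1", "alice", 100),  # same event delivered twice
    sink.handle("evt-2", "alice", -30),
]
print(results, sink.balances)
```

With this shape, at-least-once delivery plus deduplication gives the same end state as if each event had arrived exactly once.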

  35. choreography > orchestration
  36. So we’ve built Hermes, a.k.a. the circulatory system
  37. network can be congested!
  38. REST+JSON on top of HTTP/1.1 is fine
  39. REST+JSON on top of HTTP/2 with TLS is finer
  40. we don’t rely on the network anymore: net splits in public clouds happen all the time!
  41. we adopted an antifragile organization

  43. powerful tandem: Reactive programming + Circuit breaker pattern
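
As an illustration of the circuit breaker pattern named on slide 43, here is a hand-rolled sketch. It does not reproduce any particular library or the implementation used in the talk; the class names, the thresholds, and the injectable `clock` parameter are assumptions made for the example. After a number of consecutive failures the breaker opens and rejects calls immediately, and once the reset interval has passed it lets one trial call through.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, max_failures: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock  # injectable so the demo can control time
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, fn: Callable[[], T]) -> T:
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; failing fast")
            # Reset interval elapsed: half-open, allow one trial call.
        try:
            result = fn()
        except Exception:
            self.failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result


now = [0.0]  # fake clock for the demo
breaker = CircuitBreaker(max_failures=2, reset_after=10.0, clock=lambda: now[0])


def flaky() -> str:
    raise ConnectionError("downstream unavailable")


outcomes = []
for _ in range(3):
    try:
        breaker.call(flaky)
    except CircuitOpenError:
        outcomes.append("rejected")
    except ConnectionError:
        outcomes.append("failed")

now[0] = 11.0  # past the reset interval: the next call is a trial
outcomes.append(breaker.call(lambda: "ok"))
print(outcomes)
```

The third call never reaches the failing dependency, which is the point of the pattern: a struggling service is given room to recover while callers fail fast.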

  44. you need to support non-native old services, clients and systems
  46. conclusion: constant architecture improvement
  47. Act II: operational troubles
  48. creating a new service should be instant!

  50. automation with gradle & axions

  52. so now we’ve got over 1800 repositories grouped under 250 projects
  53. all with CI, code quality checks, security checks, integrated with sonar & artefact repository
  54. but what about service upgrades?
  55. we initially built our own service stack … and it was ok, for a while
  56. now we are extending spring-boot with the so-called andamio project
  57. rapid deployments integrated with the CI/CD environment and canary tests are a must-have
  58. war files ▾ scp + puppet ▾ golden images ▾ docker (immutable images) ▾
  59. frequency of changes → automated monitoring, logging & operational insights
  60. graphite statsd cabot tessera kibana logstash zabbix newrelic selena pingdom …
  61. Monitoring As A Service + SLA Monitoring +
  62. we need to build real-time anomaly detection soon

  63. Act III: organizational shift
  64. strategic DDD is good for splitting up a monolith
  65. but leave tactical DDD up to the teams
  66. huge polyglot hangover
  67. acquiring distributed skills
  68. you build it, you run it
  69. coupling avoidance
  70. please don’t audit me
  71. distributed (micro) data curation
  72. So after two years…

  74. Final thoughts

  78. Thanks! Any questions? Visit our blog: allegrotech.io. Follow us on Twitter: @allegrotechblog. Check our OSS projects: github.com/allegro. And our meetup group: meetup.com/allegrotech