Experiences with Microservices at Tuenti

Tuenti
June 26, 2015

This is the talk that Aarón Fas and Andrés Viedma presented at JBcnConf 2015, describing their experience at Tuenti with a distributed microservices architecture.

Transcript

  1. Experiences with Microservices at Tuenti
    Aarón Fas
    Andrés Viedma

  2. Microservices?
    I know what you’re
    probably thinking...

  6. Who did you say these guys are?
    Andrés Viedma
    @andres_viedma
    Aarón Fas
    @aaronfc
    Java dinosaur
    Useless gadgets buyer

  7. About

  8. From Social Network...

  9. From Social Network...

  10. To Mobile Operator
    (full MVNO)

  11. The PHP Monolith
    One single source repository
    PHP???

  12. Do you need a release?
    Take a ticket and wait...

  13. Microservices

  14. Microservices… again… (and take a shot)
    ❖ Distributed, independently deployable
    components
    ❖ Well defined interfaces
    ❖ Simple communication interface (HTTP?)
    ❖ Each service has its own DB
    ❖ Each service has its own source repository

  15. Microservices… again… (and take a shot)
    ❖ Distributed, independently deployable
    components
    ❖ Well defined interfaces
    ❖ Simple communication interface (HTTP?)
    ❖ Each service has its own DB
    ❖ Each service has its own source repository
    THAT IS SOA!!!

  16. Microservices… again… (and take a shot)
    ❖ Distributed, independently deployable
    components
    ❖ Well defined interfaces
    ❖ Simple communication interface (HTTP?)
    ❖ Each service has its own DB
    ❖ Each service has its own source repository
    Is that important enough to
    deserve a new name???

  17. Mixing technologies
    ❖ Allows using different languages
    ❖ Different platform versions
    ❖ Incremental technology changes / evolution

  18. Separation of responsibilities
    ❖ Forces separation of responsibilities
    ➢ Subsystems with well defined facades
    ➢ Different source repositories

  19. Separation of responsibilities
    ❖ Forces separation of responsibilities
    ➢ Subsystems with well defined facades
    ➢ Different source repositories
    YOU DON’T NEED MICROSERVICES!
    USE JARS!!!

  20. Continuous deployment
    «Our highest priority is to satisfy the customer
    through early and continuous delivery
    of valuable software.»
    «The best architectures, requirements, and designs
    emerge from self-organizing teams.»
    -- Principles of the Agile Manifesto

  21. Continuous deployment
    «Our highest priority is to satisfy the customer
    through early and continuous delivery
    of valuable software.»
    «The best architectures, requirements, and designs
    emerge from self-organizing teams.»
    -- Principles of the Agile Manifesto
    1 Service => 1 Team?
    Better than continuous delivery: continuous deployment!
    Team responsible for the deployments?

  22. Beware! High costs
    ❖ No transactions!
    ➢ Distributed tx?
    ❖ Requires a much more complex infrastructure
    ❖ Difficult integration testing

  23. For us: Seemed like a good idea
    ❖ We have small self-organized teams =>
    Continuous deployment is a reality
    ❖ We wanted Java, we had PHP
    ❖ Strong SRE / DevOps team
    ❖ Our software was intended mainly to access
    3rd parties => transactions not possible anyway

  24. Communications protocol

  25. Existing libraries
    ❖ No PHP implementation
    ➢ Avro, Etch, Netflix stack
    ❖ Only serialization
    ➢ Protocol buffers
    ❖ Didn’t exist or were too new
    ➢ Cap’n Proto, gRPC
    ❖ Thrift?
    ➢ Good option, but a lot of PHP boilerplate

  26. TService
    ❖ Own abstraction layer - RPC based
    ❖ Basic implementation: JSON-RPC
    ❖ Interface Definition Language (IDL)
    ❖ Generates Java / PHP / Erlang:
    ➢ Interchange objects
    ➢ Client
    ➢ Server stub

  27. TService IDL
    /**
     * Manages the transfer of balance between subscriptions.
     * @version 1
     */
    interface BalanceTransferService {
        /** Transfer money from one subscription to another one. */
        String transfer(Donation donation) throws NoSuchSubscriptionException;
        (...)
    }
    /** Donation between two subscriptions. */
    class Donation {
        /** Id of the donor */
        long from;
        /** Amount of money to transfer */
        int amount;
        (...)
    }
    class NoSuchSubscriptionException extends Exception {
        int code = 100;
    }
    Java???
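
The TService slides name JSON-RPC as the basic transport but do not show the wire format. As a rough sketch only, assuming the standard JSON-RPC 2.0 envelope, a call to `transfer` might be serialized as below. The class `JsonRpcRequest`, the method name string `BalanceTransferService.transfer`, and the flattened `params` are illustrative assumptions, not TService's actual encoding.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Builds a JSON-RPC 2.0 request envelope by hand (hypothetical helper, no JSON library). */
class JsonRpcRequest {
    private final long id;
    private final String method;
    private final Map<String, Object> params;

    JsonRpcRequest(long id, String method, Map<String, Object> params) {
        this.id = id;
        this.method = method;
        this.params = params;
    }

    /** Serializes the request; parameter values may be String, Number or Boolean only. */
    String toJson() {
        StringBuilder sb = new StringBuilder();
        sb.append("{\"jsonrpc\":\"2.0\",\"method\":").append(quote(method)).append(",\"params\":{");
        boolean first = true;
        for (Map.Entry<String, Object> e : params.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            first = false;
            sb.append(quote(e.getKey())).append(':');
            Object v = e.getValue();
            sb.append(v instanceof String ? quote((String) v) : String.valueOf(v));
        }
        return sb.append("},\"id\":").append(id).append('}').toString();
    }

    /** Escapes backslashes and double quotes only; control characters are not handled. */
    private static String quote(String s) {
        return '"' + s.replace("\\", "\\\\").replace("\"", "\\\"") + '"';
    }

    public static void main(String[] args) {
        // Assumed mapping of the Donation fields "from" and "amount" onto named params.
        Map<String, Object> donation = new LinkedHashMap<>();
        donation.put("from", 42L);
        donation.put("amount", 500);
        System.out.println(
                new JsonRpcRequest(1, "BalanceTransferService.transfer", donation).toJson());
    }
}
```

A generated client would hide this envelope behind the `BalanceTransferService` interface, so callers only see a normal method call.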

  28. TService Versioning
    Interface v1
    Service
    Client 1
    Client 2
    (compatible changes)
    ● New methods
    ● New fields in objects
    ● New parameters in
    methods
    ● Delete methods /
    parameters / fields
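
Changes of this kind stay compatible only if each side tolerates data it does not know about. A minimal sketch of that "tolerant reader" idea follows, assuming the payload has already been decoded into a map. `DonationV1`, `fromMap` and the `currency` field are hypothetical names for illustration, not generated TService code.

```java
import java.util.HashMap;
import java.util.Map;

/** Interchange object as an older client knows it: only "from" and "amount". */
class DonationV1 {
    final long from;
    final int amount;

    DonationV1(long from, int amount) {
        this.from = from;
        this.amount = amount;
    }

    /**
     * Tolerant reader: unknown keys are ignored and missing keys fall back
     * to a default, so the other side can add or drop fields safely.
     */
    static DonationV1 fromMap(Map<String, Object> raw) {
        long from = ((Number) raw.getOrDefault("from", 0L)).longValue();
        int amount = ((Number) raw.getOrDefault("amount", 0)).intValue();
        return new DonationV1(from, amount);
    }

    public static void main(String[] args) {
        Map<String, Object> newer = new HashMap<>();
        newer.put("from", 7L);
        newer.put("amount", 300);
        newer.put("currency", "EUR"); // hypothetical field added by a newer service
        DonationV1 d = fromMap(newer);
        System.out.println(d.from + " donates " + d.amount);
    }
}
```

Changes that cannot be absorbed this way are the ones that would call for a new interface version.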

  29. TService Versioning
    Interface v1
    Service
    Interface v2
    Client 1
    Client 2
    (compatible changes)

  30. TService Versioning
    Interface v1
    Service
    Interface v2
    Client 1
    Client 2
    (compatible changes)

  31. Java Platform

  32. Technology stack

  33. XConfig
    ❖ Own configuration system
    ❖ Based on YAML files
    ❖ Git repository
    ❖ Overriding system: by env, common / service
    ❖ Hot reloading
    ➢ Everything adjusts to changes: even DB pools!
    ➢ No restart required
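
XConfig itself is not shown in the slides. The sketch below only illustrates the two ideas listed, layered overriding and hot reloading, under stated assumptions: layers are plain maps (YAML parsing and the Git checkout are left out), later layers win, and components register a callback to react to changes. `LayeredConfig`, `onChange` and the key `db.pool.size` are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicReference;

class LayeredConfig {
    /** Immutable snapshot of the merged layers, swapped atomically on reload. */
    private final AtomicReference<Map<String, String>> snapshot =
            new AtomicReference<>(Collections.<String, String>emptyMap());
    /** Callbacks run after every reload, e.g. to resize a connection pool. */
    private final List<Runnable> listeners = new CopyOnWriteArrayList<>();

    void onChange(Runnable listener) {
        listeners.add(listener);
    }

    /** Merges the layers in order, so later layers override earlier ones. */
    void reload(List<Map<String, String>> layers) {
        Map<String, String> merged = new HashMap<>();
        for (Map<String, String> layer : layers) {
            merged.putAll(layer);
        }
        snapshot.set(Collections.unmodifiableMap(merged));
        for (Runnable listener : listeners) {
            listener.run();
        }
    }

    String get(String key, String defaultValue) {
        return snapshot.get().getOrDefault(key, defaultValue);
    }

    public static void main(String[] args) {
        LayeredConfig config = new LayeredConfig();
        config.onChange(() ->
                System.out.println("db.pool.size is now " + config.get("db.pool.size", "?")));

        Map<String, String> common = new HashMap<>();
        common.put("db.pool.size", "10");
        Map<String, String> production = new HashMap<>();
        production.put("db.pool.size", "50");

        config.reload(Arrays.asList(common));             // common value only
        config.reload(Arrays.asList(common, production)); // environment layer overrides it
    }
}
```

Readers always see a complete snapshot because the merged map is replaced in one step instead of being edited in place.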

  34. Async jobs
    TService request processing
    Enqueue job
    Queued jobs
    Executor thread pool

  35. Async jobs
    TService request processing
    Enqueue job
    Queued jobs
    Executor thread pool
    Cron jobs
    Cron jobs scheduled in config
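
The diagram can be approximated with the JDK's own executors. This is a sketch under assumptions, not the platform's code: the pool size, the queue capacity and the fixed-period schedule (standing in for a cron expression read from config) are all made up for illustration, as is the class name `AsyncJobs`.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class AsyncJobs {
    /** Executor thread pool fed by a bounded queue of pending jobs. */
    private final ThreadPoolExecutor workers = new ThreadPoolExecutor(
            4, 4, 0L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<Runnable>(1000));
    /** Timer thread that enqueues recurring jobs into the same pool. */
    private final ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();

    /**
     * Called from request processing; returns as soon as the job is queued.
     * Throws RejectedExecutionException if the queue is full.
     */
    void enqueue(Runnable job) {
        workers.execute(job);
    }

    /** Fixed-period stand-in for a cron schedule that would come from config. */
    void schedule(Runnable job, long periodMillis) {
        timer.scheduleAtFixedRate(() -> enqueue(job), periodMillis, periodMillis,
                TimeUnit.MILLISECONDS);
    }

    /** Stops the timer, lets queued jobs finish, and reports whether they all did. */
    boolean shutdown() {
        timer.shutdownNow();
        workers.shutdown();
        try {
            return workers.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        AsyncJobs jobs = new AsyncJobs();
        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < 10; i++) {
            jobs.enqueue(done::incrementAndGet);
        }
        jobs.shutdown();
        System.out.println("jobs executed: " + done.get());
    }
}
```

Routing scheduled jobs through the same queue means request-triggered and timed work share one pool and one set of limits.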

  36. Feature disabling
    ❖ Activation / deactivation of features by config
    ➢ Is the new development risky?
    ➢ Is the rest of services / environment ready for the
    change?
    ❖ Partial activation of a feature for a % of users
    ➢ Incremental activation of an optional risky change
    ➢ A / B tests
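
One common way to activate a feature for a percentage of users is to hash each user into a stable bucket from 0 to 99. The slides do not say how Tuenti did it, so the following is only a sketch of that general technique; `FeatureToggle` and its methods are hypothetical names, and in a real setup the percentage would come from config.

```java
class FeatureToggle {
    /** 0 disables the feature for everyone, 100 enables it for everyone. */
    private volatile int percentage;

    FeatureToggle(int percentage) {
        setPercentage(percentage);
    }

    /** Can be called again at runtime, e.g. when the configuration is reloaded. */
    void setPercentage(int percentage) {
        if (percentage < 0 || percentage > 100) {
            throw new IllegalArgumentException("percentage must be between 0 and 100");
        }
        this.percentage = percentage;
    }

    /**
     * A user always maps to the same bucket, so raising the percentage
     * only adds users and never flips an enabled user back off.
     */
    boolean isEnabledFor(long userId) {
        int bucket = Math.floorMod(Long.hashCode(userId), 100);
        return bucket < percentage;
    }

    public static void main(String[] args) {
        FeatureToggle newTransferFlow = new FeatureToggle(25);
        int enabled = 0;
        for (long userId = 0; userId < 1000; userId++) {
            if (newTransferFlow.isEnabledFor(userId)) {
                enabled++;
            }
        }
        System.out.println("enabled for " + enabled + " of 1000 users");
    }
}
```

Setting the percentage to 0 or 100 covers the plain on/off case from the first bullet.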

  37. Integration tests
    ❖ Custom JUnit runner
    ➢ Bootstraps the platform
    ➢ Cleans / restarts the local database
    ➢ Allows the use of @Inject in tests
    ➢ Allows overriding in dependency injection => inject
    mocks of the other services
    ❖ Uses special, “development” XConfig repo

  38. Monitoring

  39. Monitoring, a priority
    ❖ What is happening or has happened?
    ➢ Logs
    ➢ Metrics
    ➢ Alarms
    ❖ Distributed architectures are much more
    difficult to track

  40. And basically because...

  41. Let’s talk about logs

  42. Logging
    ❖ Logging library in Java?
    ➢ Log4j
    ❖ We needed full details
    ➢ Filters to expand/simplify information logged
    ➢ Multiple appenders logged into distinct storages

  43. Logging
    ❖ Overview of appenders
    log.info(...);
    Logger
    MySQL Appender, Logstash Appender, Hadoop Appender

  44. Logging
    ❖ Following a call’s path (TService calls logging)
    ServiceA: GlobalID = 100, RequestID = 1
    ServiceB: GlobalID = 100, RequestID = 2
    ServiceC: GlobalID = 100, RequestID = 3
    Benefits
    ● Locate in/out for calls
    ● Get all interactions
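
The two identifiers can be modelled as a small context object that travels with every call: the global id is copied unchanged, and each hop gets its own request id. This is an illustrative sketch only. `CallContext` and its methods are hypothetical, and a process-local counter stands in for whatever id generation the real platform uses.

```java
import java.util.concurrent.atomic.AtomicLong;

class CallContext {
    /** Process-local counter; a real system would need ids unique across services. */
    private static final AtomicLong NEXT_REQUEST_ID = new AtomicLong(1);

    final long globalId;
    final long requestId;

    private CallContext(long globalId, long requestId) {
        this.globalId = globalId;
        this.requestId = requestId;
    }

    /** Entry point: an external request starts a new call chain. */
    static CallContext start(long globalId) {
        return new CallContext(globalId, NEXT_REQUEST_ID.getAndIncrement());
    }

    /** Outgoing call to another service: same global id, fresh request id. */
    CallContext forDownstreamCall() {
        return new CallContext(globalId, NEXT_REQUEST_ID.getAndIncrement());
    }

    /** Prefix to attach to every log line written while handling this call. */
    String logPrefix() {
        return "[GlobalID=" + globalId + " RequestID=" + requestId + "]";
    }

    public static void main(String[] args) {
        CallContext serviceA = CallContext.start(100);
        CallContext serviceB = serviceA.forDownstreamCall();
        CallContext serviceC = serviceB.forDownstreamCall();
        System.out.println(serviceA.logPrefix() + " ServiceA handling request");
        System.out.println(serviceB.logPrefix() + " ServiceB handling request");
        System.out.println(serviceC.logPrefix() + " ServiceC handling request");
    }
}
```

Filtering logs by the global id returns every interaction in the chain, while the request id isolates a single hop.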

  45. Logging
    ❖ Kibana dashboard
    What does it look like?

  46. Change query

  47. Customize filters

  48. Log types by color

  49. Full log details

  50. Let’s talk about metrics

  51. Metrics
    ❖ We love graphs
    ➢ As easy as possible to track new metrics
    ❖ Do not reinvent the wheel
    ➢ Already using StatsD/Graphite on PHP side
    ❖ What are we tracking?
    ➢ Basic monitoring metrics added by the platform
    ➢ Metrics from Tomcat JMX
    ➢ Metrics related to business
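
Part of why StatsD makes new metrics easy to track is its wire format: one short text line per sample, sent over UDP. The sketch below builds counter and timer lines in that format. `StatsdClient`, the metric names and the use of port 8125 on localhost are assumptions for illustration; the slides do not show the platform's metrics API.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

class StatsdClient {
    private final InetAddress host;
    private final int port;

    StatsdClient(InetAddress host, int port) {
        this.host = host;
        this.port = port;
    }

    /** Counter sample: "<name>:<delta>|c". */
    static String counter(String name, long delta) {
        return name + ":" + delta + "|c";
    }

    /** Timer sample in milliseconds: "<name>:<millis>|ms". */
    static String timing(String name, long millis) {
        return name + ":" + millis + "|ms";
    }

    /** Fire-and-forget UDP send; failures are ignored so metrics never break a request. */
    void send(String line) {
        byte[] payload = line.getBytes(StandardCharsets.UTF_8);
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.send(new DatagramPacket(payload, payload.length, host, port));
        } catch (IOException e) {
            // Ignored on purpose: losing a sample is better than failing the caller.
        }
    }

    public static void main(String[] args) {
        // Assumes a StatsD daemon on localhost:8125; without one the packets are simply lost.
        StatsdClient client = new StatsdClient(InetAddress.getLoopbackAddress(), 8125);
        client.send(counter("balance.transfer.ok", 1));
        client.send(timing("balance.transfer.time", 37));
    }
}
```

Opening a socket per sample keeps the sketch short; a long-lived client would normally reuse one socket.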

  52. Metrics
    ❖ Multiple graph dashboards tested
    ➢ Default Graphite one
    ➢ Grafana

  53. Graphite’s is a little ugly...

  54. Grafana is prettier

  55. Layout customized

  56. Much better UI to create graphs

  57. Let’s talk about alarms

  58. Alarms
    ❖ Graphs are ok, but we don’t have people 24x7
    staring at them.
    ➢ We need notifications
    ❖ Different things to monitor
    ➢ SQL queries
    ➢ Graphite metrics
    ➢ HTTP requests
    ➢ ...

  59. Alarms
    ❖ Created our own alarms system
    ➢ Multiple data sources and easily extensible
    ➢ Quick editing of conditions
    ➢ Observers for alarms
    ❖ We ended up using mainly
    ➢ MySQL and Graphite data sources
    ➢ Java Expression Language on config checkers
    ➢ Email notifications

  60. … and then, we found Cabot
    Separated by service

  61. Cabot overview
    Multiple integrations

  62. Cabot overview
    Service status overview

  63. Cabot overview
    Graphite checks

  64. Cabot overview (Creating new check)
    Set graphite metric

  65. Cabot overview (Creating new check)
    Check data

  66. Cabot overview (Creating new check)
    Set check type/value

  67. Cabot overview (Creating new check)
    Set importance

  68. Cabot
    ❖ Benefits of using Cabot
    ➢ Friendlier UI than config files
    ➢ No dependency on the service monitored
    ➢ Open source, with many integrations

  69. Alarms
    ❖ Where are we heading now?
    ➢ Now moving most Graphite alarms to Cabot
    ➢ Replacing thresholds with dynamic expectations (Holt-Winters)
    ❖ It is still the main alarms platform being used

  70. That’s all about monitoring

  71. Some lessons learned

  73. GO ASYNC!!!

  74. Don’t get blocked for too long
    ❖ Concurrent requests: don’t wait for free threads
    ➢ Own rate-limit mechanism
    ➢ Tune container thread pool size
    ➢ Tune database pool (and other possible blocking pools)
    ❖ Tune client timeouts
    ➢ It may depend on the called service / operation
    ➢ It may depend on the caller
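
"Don't wait for free threads" can be read as: reject excess work immediately instead of queueing behind a busy pool. One way to sketch that is a semaphore with a non-blocking acquire. This is not the rate-limit mechanism from the talk, whose design is not shown; `ConcurrencyLimiter` and its `call` method are hypothetical.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

class ConcurrencyLimiter {
    private final Semaphore permits;

    ConcurrencyLimiter(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    /**
     * Runs the action if a slot is free; otherwise returns the fallback
     * at once instead of blocking the calling thread.
     */
    <T> T call(Supplier<T> action, T fallback) {
        if (!permits.tryAcquire()) {
            return fallback;
        }
        try {
            return action.get();
        } finally {
            permits.release();
        }
    }

    public static void main(String[] args) {
        ConcurrencyLimiter limiter = new ConcurrencyLimiter(1);
        // The inner call runs while the outer one still holds the only slot.
        String result = limiter.call(
                () -> limiter.call(() -> "inner ran", "inner rejected"),
                "outer rejected");
        System.out.println(result);
    }
}
```

The limit would typically sit a little below the container's thread pool size so that rejection happens before threads run out.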

  75. Asynchronous logging
    log.info(...)
    Logger
    MySQL Appender, Logstash Appender, Hadoop Appender

  76. Asynchronous logging
    log.info(...)
    Async Logger (ring buffer)
    Logger
    MySQL Appender, Logstash Appender, Hadoop Appender
    When the ring buffer is full… WAIT! (Not configurable!)

  77. Asynchronous logging
    log.info(...)
    Async Logger (ring buffer)
    Logger
    Async Appender (one per appender): MySQL Appender, Logstash Appender, Hadoop Appender
    When the ring buffer is full… WAIT! (Not configurable!)
    When an async appender is full, messages are discarded
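
The two behaviours on a full buffer, waiting versus discarding, correspond to the blocking and non-blocking insert operations of a bounded queue. The sketch below shows both with a plain `ArrayBlockingQueue`. It is not Log4j code, and `BoundedLogBuffer` is a hypothetical class used only to make the trade-off concrete.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

class BoundedLogBuffer {
    private final BlockingQueue<String> queue;
    private final AtomicLong discarded = new AtomicLong();

    BoundedLogBuffer(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    /** Blocking policy: the caller waits until the consumer frees a slot. */
    void appendOrWait(String message) throws InterruptedException {
        queue.put(message);
    }

    /** Discarding policy: returns false and drops the message when the buffer is full. */
    boolean appendOrDiscard(String message) {
        boolean accepted = queue.offer(message);
        if (!accepted) {
            discarded.incrementAndGet();
        }
        return accepted;
    }

    /** Consumer side: next buffered message, or null if the buffer is empty. */
    String poll() {
        return queue.poll();
    }

    long discardedCount() {
        return discarded.get();
    }

    public static void main(String[] args) {
        BoundedLogBuffer buffer = new BoundedLogBuffer(2);
        buffer.appendOrDiscard("first");
        buffer.appendOrDiscard("second");
        buffer.appendOrDiscard("third"); // buffer is full, so this one is dropped
        System.out.println("discarded: " + buffer.discardedCount());
    }
}
```

Waiting protects the log at the cost of request latency; discarding protects latency at the cost of losing log lines under load.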

  78. Asynchronous operations
    ❖ Getters
    ➢ Make them fast (sacrifice consistency)
    ➢ Cache
    ➢ Use default values
    ❖ Setters
    ➢ No operation result
    ➢ Wait for a notification of operation finished
    ➢ Query the status of the change

  79. Message queues
    ❖ Operation queues
    ➢ Retry system
    ➢ Persistent queues
    ❖ Publish / subscribe model (pub/sub)
    ➢ Event driven
    ➢ Reactive programming

  80. Circuit breaker
    ❖ From the client, consider the status of the
    service
    ➢ Previous calls
    ➢ Health checks
    ❖ If it’s degraded, don’t call it (open the circuit)
    ➢ Return a default response
    ➢ Enqueue the operation for later retry
    ➢ Throw an error
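
A minimal circuit breaker can be sketched in a few lines. In the usual terminology the breaker is "open" while calls are being skipped and "closed" while they flow normally. The version below tracks only previous calls (no health checks) and returns a default response; the threshold, the cool-down period and the class name `CircuitBreaker` are assumptions for illustration. Libraries such as Hystrix, mentioned later in the talk, offer far more complete implementations.

```java
import java.util.function.Supplier;

class CircuitBreaker {
    private final int failureThreshold;
    private final long openMillis;
    private int consecutiveFailures;
    private long openedAt = -1; // -1 means closed: calls are allowed

    CircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    /** Calls the service unless the circuit is open; failures yield the fallback. */
    <T> T call(Supplier<T> remoteCall, T fallback) {
        if (!allowRequest()) {
            return fallback;
        }
        try {
            T result = remoteCall.get();
            recordSuccess();
            return result;
        } catch (RuntimeException e) {
            recordFailure();
            return fallback;
        }
    }

    /** Closed circuits always allow calls; open ones allow a trial after the cool-down. */
    synchronized boolean allowRequest() {
        if (openedAt < 0) {
            return true;
        }
        return System.currentTimeMillis() - openedAt >= openMillis;
    }

    private synchronized void recordSuccess() {
        consecutiveFailures = 0;
        openedAt = -1;
    }

    private synchronized void recordFailure() {
        consecutiveFailures++;
        if (consecutiveFailures >= failureThreshold) {
            openedAt = System.currentTimeMillis();
        }
    }

    public static void main(String[] args) {
        CircuitBreaker breaker = new CircuitBreaker(2, 60_000);
        Supplier<String> failing = () -> {
            throw new IllegalStateException("service down");
        };
        breaker.call(failing, "default");
        breaker.call(failing, "default");
        // The circuit is now open, so the healthy call below is not even attempted.
        System.out.println(breaker.call(() -> "live response", "default"));
    }
}
```

The other two options on the slide fit the same shape: enqueue the operation or throw instead of returning `fallback`.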

  81. Do It Yourself

  82. Many implementations available
    ❖ Communication layer
    ➢ gRPC, Cap’n Proto, Thrift…
    ➢ REST, JSON…
    ❖ Services platform
    ➢ Spring Boot, Dropwizard, Spark, Ninja, Jodd…
    ❖ Netflix stack
    ➢ Hystrix, Ribbon…

  83. Make your own combination!
    (it can’t be that difficult…)

  84. Aarón Fas
    @aaronfc
    Andrés Viedma
    @andres_viedma
    Thanks!
    Questions?
