Experiences with Microservices at Tuenti

.Experiences with Microservices at Tuenti
Aarón Fas, Andrés Viedma

Microservices? I know what you’re probably thinking...

Who did you say these guys are?
Andrés Viedma (@andres_viedma): Java dinosaur
Aarón Fas (@aaronfc): Useless gadgets buyer

From Social Network...

To Mobile Operator (full MVNO)

The PHP Monolith
❖ One single source repository
❖ PHP???

Do you need a release? Take a ticket and wait...

.Microservices

Microservices… again… (and take a shot)
❖ Distributed, independently deployable components
❖ Well defined interfaces
❖ Simple communication interface (HTTP?)
❖ Each service has its own DB
❖ Each service has its own source repository

THAT IS SOA!!!

Is that important enough to deserve a new name???

Mixing technologies
❖ Allows using different languages
❖ Different platform versions
❖ Incremental technology changes / evolution

Separation of responsibilities
❖ Forces separation of responsibilities
  ➢ Subsystems with well defined facades
  ➢ Different source repositories

YOU DON’T NEED MICROSERVICES! USE JARS!!!

Continuous deployment
«Our highest priority is to satisfy the customer through early and continuous delivery of valuable software.»
«The best architectures, requirements, and designs emerge from self-organizing teams.»
-- Principles of the Agile Manifesto

Continuous deployment
❖ 1 Service => 1 Team?
❖ Better than continuous delivery: continuous deployment!
❖ Team responsible for the deployments?

Beware! High costs
❖ No transactions!
  ➢ Distributed tx?
❖ Requires a much more complex infrastructure
❖ Difficult integration testing

For us: Seemed like a good idea
❖ We have small self-organized teams => Continuous deployment is a reality
❖ We wanted Java, we had PHP
❖ Strong SRE / DevOps team
❖ Our software was intended mainly to access 3rd parties => transactions not possible anyway

.Communications protocol

Existing libraries
❖ No PHP implementation
  ➢ Avro, Etch, Netflix stack
❖ Only serialization
  ➢ Protocol buffers
❖ Didn’t exist or were too new
  ➢ Cap’n Proto, gRPC
❖ Thrift?
  ➢ Good option, but a lot of PHP boilerplate

TService
❖ Own abstraction layer - RPC based
❖ Basic implementation: JSON-RPC
❖ Interface Definition Language (IDL)
❖ Generates Java / PHP / Erlang:
  ➢ Interchange objects
  ➢ Client
  ➢ Server stub

TService IDL

    /**
     * Manages the transfer of balance between subscriptions.
     * @version 1
     */
    interface BalanceTransferService {
        /** Transfer money from one subscription to another one. */
        String transfer(Donation donation) throws NoSuchSubscriptionException;
        (...)
    }

    /** Donation between two subscriptions. */
    class Donation {
        /** Id of the donor */
        long from;
        /** Amount of money to transfer */
        int amount;
        (...)
    }

    class NoSuchSubscriptionException extends Exception {
        int code = 100;
    }

Java???

TService Versioning
[Diagram labels: Interface v1, Service, Client 1, Client 2 (compatible changes)]
• New methods
• New fields in objects
• New parameters in methods
• Delete methods / parameters / fields

TService Versioning
[Diagram labels: Interface v1, Interface v2, Service, Client 1, Client 2 (compatible changes)]

.Java Platform

Technology stack

XConfig
❖ Own configuration system
❖ YAML files based
❖ Git repository
❖ Overriding system: by env, common / service
❖ Hot reloading
  ➢ Everything adjusts to changes: even DB pools!
  ➢ No restart required
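The overriding and hot reloading bullets above can be pictured with a small sketch. XConfig is an internal Tuenti system and the slides do not show its API, so everything here is an assumption: the class name `LayeredConfig`, the precedence order (common, then service, then environment) and the keys used in the usage note are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical layered configuration; not the real XConfig API. */
final class LayeredConfig {
    // Readers always see one complete, immutable snapshot.
    private final AtomicReference<Map<String, String>> snapshot =
            new AtomicReference<>(Map.of());

    /** Merges layers in order; later layers override earlier ones. */
    @SafeVarargs
    static Map<String, String> merge(Map<String, String>... layers) {
        Map<String, String> merged = new HashMap<>();
        for (Map<String, String> layer : layers) {
            merged.putAll(layer);
        }
        return Map.copyOf(merged);
    }

    /** Hot reload: swap the whole snapshot atomically, no restart needed. */
    void reload(Map<String, String> common,
                Map<String, String> service,
                Map<String, String> env) {
        snapshot.set(merge(common, service, env));
    }

    /** Returns the current value, or the default when the key is absent. */
    String get(String key, String defaultValue) {
        return snapshot.get().getOrDefault(key, defaultValue);
    }
}
```

With a common layer of `db.pool.size=10` and a service layer of `db.pool.size=20`, `get("db.pool.size", "0")` returns `"20"`. Components that must react to a reload (a DB pool, for example) would additionally need some change notification, which this sketch leaves out.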

Async jobs
TService request processing -> Enqueue job -> Queued jobs -> Executor thread pool

Cron jobs
❖ Cron job scheduling in config
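The enqueue / queued jobs / executor thread pool flow can be sketched with the standard `java.util.concurrent` executors. This is only an illustration of the idea, not the platform's actual job system; the class name `AsyncJobs` and its methods are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Hypothetical async job runner backed by a fixed thread pool. */
final class AsyncJobs {
    private final ExecutorService executor;

    AsyncJobs(int threads) {
        // A fixed pool keeps pending jobs in an internal unbounded queue.
        this.executor = Executors.newFixedThreadPool(threads);
    }

    /** Called from request processing: enqueue the job and return at once. */
    Future<?> enqueue(Runnable job) {
        return executor.submit(job);
    }

    /** Stops accepting jobs and waits for the queued ones to finish. */
    boolean shutdown(long timeoutMillis) {
        executor.shutdown();
        try {
            return executor.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return false;
        }
    }
}
```

The request thread only pays for the `submit` call; the work itself runs later on a pool thread. An unbounded queue is the simplest choice but can grow without limit under sustained overload.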

Feature disabling
❖ Activation / deactivation of features by config
  ➢ Is the new development risky?
  ➢ Is the rest of services / environment ready for the change?
❖ Partial activation of a feature for a % of users
  ➢ Incremental activation of an optional risky change
  ➢ A / B tests
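Partial activation for a percentage of users is commonly done by hashing the user into a stable bucket. The sketch below shows that idea under assumptions: the slides do not say how Tuenti assigns users, and `FeatureToggle` and its fields are hypothetical names.

```java
import java.util.Objects;

/** Hypothetical percentage-based feature toggle; names are illustrative. */
final class FeatureToggle {
    private final String name;
    private final boolean enabled;
    private final int percentage; // share of users (0..100) that get the feature

    FeatureToggle(String name, boolean enabled, int percentage) {
        if (percentage < 0 || percentage > 100) {
            throw new IllegalArgumentException("percentage must be in 0..100");
        }
        this.name = name;
        this.enabled = enabled;
        this.percentage = percentage;
    }

    /** The same user always lands in the same bucket for a given feature. */
    boolean isActiveFor(long userId) {
        if (!enabled) {
            return false;
        }
        // floorMod keeps the bucket in 0..99 even for negative hash values.
        int bucket = Math.floorMod(Objects.hash(name, userId), 100);
        return bucket < percentage;
    }
}
```

Including the feature name in the hash means different features pick different subsets of users, and raising the percentage only adds users, so nobody who already had the feature loses it.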

Integration tests
❖ Custom JUnit runner
  ➢ Bootstraps the platform
  ➢ Cleans / restarts the local database
  ➢ Allows the use of @Inject in tests
  ➢ Allows overriding in dependency injection => inject mocks of the other services
❖ Uses special, “development” XConfig repo

.Monitoring

Monitoring, a priority
❖ What is happening or has happened?
  ➢ Logs
  ➢ Metrics
  ➢ Alarms
❖ Distributed architectures are much more difficult to track

And basically because...

.Let’s talk about logs

Logging
❖ Logging library in Java?
  ➢ Log4j
❖ We needed full details
  ➢ Filters to expand/simplify information logged
  ➢ Multiple appenders logged into distinct storages

Logging
❖ Overview of appenders
log.info(...); -> Logger -> MySQL Appender, LogStash Appender, Hadoop Appender

Logging
❖ Following a call’s path (TService calls logging)
  ➢ ServiceA: GlobalID = 100, RequestID = 1
  ➢ ServiceB: GlobalID = 100, RequestID = 2
  ➢ ServiceC: GlobalID = 100, RequestID = 3
Benefits
• Locate in/out for calls
• Get all interactions
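The GlobalID / RequestID pair can be sketched as a small context object: the global ID is created once and copied to every downstream call, while each hop gets its own request ID. How TService actually generates and transports these IDs is not shown in the slides, so the class `CallContext`, the in-process counter and the log prefix format are assumptions for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical call context carrying the two correlation ids. */
final class CallContext {
    // Stand-in for a real id generator; a single JVM counter is an assumption.
    private static final AtomicLong NEXT_REQUEST_ID = new AtomicLong(1);

    final long globalId;  // shared by every call in the chain
    final long requestId; // unique per hop

    private CallContext(long globalId, long requestId) {
        this.globalId = globalId;
        this.requestId = requestId;
    }

    /** The first service in the chain starts the context. */
    static CallContext start(long globalId) {
        return new CallContext(globalId, NEXT_REQUEST_ID.getAndIncrement());
    }

    /** For an outgoing call: same global id, fresh request id. */
    CallContext forOutgoingCall() {
        return new CallContext(globalId, NEXT_REQUEST_ID.getAndIncrement());
    }

    /** Prefix to attach to every log line written while serving the call. */
    String logPrefix() {
        return "GlobalID=" + globalId + " RequestID=" + requestId;
    }
}
```

Searching the logs by global ID then returns every interaction triggered by one original request, and the request ID separates the individual hops.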

Logging ❖ Kibana dashboard What does it look like?

Change query

Customize filters

Log types by color

Full log details

.Let’s talk about metrics

Metrics
❖ We love graphs
  ➢ As easy as possible to track new metrics
❖ Do not reinvent the wheel
  ➢ Already using StatsD/Graphite on PHP side
❖ What are we tracking?
  ➢ Basic monitoring metrics added by the platform
  ➢ Metrics from Tomcat JMX
  ➢ Metrics related to business

Metrics
❖ Multiple graph dashboards tested
  ➢ Default Graphite one
  ➢ Grafana

Graphite’s is a little ugly...

Grafana is prettier

Layout customized

Much better UI to create graphs

.Let’s talk about alarms

Alarms
❖ Graphs are ok, but we don’t have people 24x7 staring at them.
  ➢ We need notifications
❖ Different things to monitor
  ➢ SQL queries
  ➢ Graphite metrics
  ➢ HTTP requests
  ➢ ...

Alarms
❖ Created our own alarms system
  ➢ Multiple data sources and easily extensible
  ➢ Quick editing of conditions
  ➢ Observers for alarms
❖ We ended up using mainly
  ➢ MySQL and Graphite data sources
  ➢ Java Expression Language on config checkers
  ➢ Email notifications

… and then, we found Cabot Separated by service

Cabot overview Multiple integrations

Cabot overview Service status overview

Cabot overview Graphite checks

Cabot overview (Creating new check) Set graphite metric

Cabot overview (Creating new check) Check data

Cabot overview (Creating new check) Set check type/value

Cabot overview (Creating new check) Set importance

Cabot
❖ Benefits of using Cabot
  ➢ Friendlier UI than config files
  ➢ No dependency on the service monitored
  ➢ Open source and many integrations

Alarms
❖ Where are we heading now?
  ➢ Now moving most Graphite alarms to Cabot
  ➢ Replacing thresholds with dynamic expectations (Holt-Winters)
❖ Our own system is still the main alarms platform being used

.That’s all about monitoring

.Some Lessons learned

GO ASYNC!!!

Don’t get blocked for too long
❖ Concurrent requests: don’t wait for free threads
  ➢ Own rate limit mechanism
  ➢ Tune container thread pool size
  ➢ Tune database pool (and other possible blocking pools)
❖ Tune client timeouts
  ➢ It may depend on called service / operation
  ➢ It may depend on the caller
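One way to avoid waiting for free threads is to cap the number of calls in flight and reject the rest immediately. The sketch below does this with a `Semaphore`. It is a generic illustration, not Tuenti's rate limit mechanism, and `ConcurrencyLimiter` is a hypothetical name.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Hypothetical limiter that rejects excess calls instead of queueing them. */
final class ConcurrencyLimiter {
    private final Semaphore permits;

    ConcurrencyLimiter(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    /**
     * Runs the action if a permit is free; otherwise returns the rejection
     * result right away, without blocking the calling thread.
     */
    <T> T call(Supplier<T> action, Supplier<T> rejected) {
        if (!permits.tryAcquire()) {
            return rejected.get();
        }
        try {
            return action.get();
        } finally {
            permits.release(); // always give the permit back
        }
    }
}
```

The key call is `tryAcquire()`, which returns `false` at once when no permit is available, unlike `acquire()`, which would block. The limit would normally be tuned together with the container thread pool and the database pool.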

Asynchronous logging
log.info(...) -> Logger -> Appender MySQL, Appender Logstash, Appender Hadoop

Asynchronous logging, with an Async Logger
log.info(...) -> Async Logger (ring buffer) -> Logger -> Appender MySQL, Appender Logstash, Appender Hadoop
❖ When the ring buffer is full… WAIT! Not configurable!

Asynchronous logging, with Async Appenders
log.info(...) -> Async Logger (ring buffer) -> Logger -> one Async Appender in front of each of Appender MySQL, Appender Logstash, Appender Hadoop
❖ When the ring buffer is full… WAIT! Not configurable!
❖ When an async appender is full, messages are discarded
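The difference between waiting and discarding when a buffer is full maps onto two methods of a bounded `BlockingQueue`: `put` blocks the caller, `offer` returns `false` immediately. The sketch below uses `offer` to show the discard policy in isolation. It does not use or reproduce the Log4j API; `DroppingLogBuffer` is a hypothetical class.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical bounded log buffer that drops messages when full. */
final class DroppingLogBuffer {
    private final BlockingQueue<String> queue;
    private final AtomicLong dropped = new AtomicLong();

    DroppingLogBuffer(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    /** Never blocks: when the buffer is full the message is counted and lost. */
    boolean append(String message) {
        boolean accepted = queue.offer(message);
        if (!accepted) {
            dropped.incrementAndGet();
        }
        return accepted;
    }

    /** Called by the background writer; returns null when the buffer is empty. */
    String poll() {
        return queue.poll();
    }

    long droppedCount() {
        return dropped.get();
    }
}
```

Counting the dropped messages and exposing that count as a metric at least makes the loss visible, which matters when the logs are also used for tracing calls.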

Asynchronous operations
❖ Getters
  ➢ Make them fast (sacrifice consistency)
  ➢ Cache
  ➢ Use default values
❖ Setters
  ➢ No operation result
  ➢ Wait for a notification of operation finished
  ➢ Query the status of the change
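The advice for getters (cache, default values, sacrifice consistency) can be sketched as a value holder whose read path never touches the remote service. This is a generic illustration under assumptions, not code from the platform; `CachedValue` and its methods are hypothetical.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Hypothetical cached getter with a default for the cold-cache case. */
final class CachedValue<T> {
    private final AtomicReference<T> cached = new AtomicReference<>();
    private final Supplier<T> loader;
    private final T defaultValue;

    CachedValue(Supplier<T> loader, T defaultValue) {
        this.loader = loader;
        this.defaultValue = defaultValue;
    }

    /** Fast path: returns the last known value, or the default if none yet. */
    T get() {
        T value = cached.get();
        return value != null ? value : defaultValue;
    }

    /** Slow path, meant to run off the request thread (e.g. in a periodic job). */
    void refresh() {
        try {
            cached.set(loader.get());
        } catch (RuntimeException e) {
            // Keep serving the stale value; consistency is traded for speed.
        }
    }
}
```

Callers may see a stale or default value for a while, which is the consistency being sacrificed. Whether that is acceptable depends on the data.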

Message queues
❖ Operation queues
  ➢ Retry system
  ➢ Persistent queues
❖ Publish / subscribe model (pub/sub)
  ➢ Event driven
  ➢ Reactive programming
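The retry idea behind an operation queue can be sketched in a few lines: failed operations go back to the queue until they succeed or run out of attempts. This sketch is in-memory and single-threaded, so it deliberately leaves out the persistence mentioned above; `RetryQueue`, `drainOnce` and the attempt limit are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.BooleanSupplier;

/** Hypothetical in-memory operation queue with bounded retries. */
final class RetryQueue {
    private static final class Entry {
        final BooleanSupplier operation; // returns true on success
        int attempts = 0;

        Entry(BooleanSupplier operation) {
            this.operation = operation;
        }
    }

    private final Deque<Entry> pending = new ArrayDeque<>();
    private final int maxAttempts;

    RetryQueue(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    void enqueue(BooleanSupplier operation) {
        pending.addLast(new Entry(operation));
    }

    /**
     * Runs every pending operation once. Failed operations are re-enqueued
     * until maxAttempts is reached. Returns how many were given up on.
     */
    int drainOnce() {
        int givenUp = 0;
        int size = pending.size(); // only process what was queued at the start
        for (int i = 0; i < size; i++) {
            Entry entry = pending.pollFirst();
            entry.attempts++;
            if (entry.operation.getAsBoolean()) {
                continue; // success: drop it from the queue
            }
            if (entry.attempts < maxAttempts) {
                pending.addLast(entry);
            } else {
                givenUp++;
            }
        }
        return givenUp;
    }

    int pendingCount() {
        return pending.size();
    }
}
```

A production version would store the entries durably, wait between attempts, and route exhausted operations somewhere they can be inspected.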

Circuit breaker
❖ From the client, consider the status of the service
  ➢ Previous calls
  ➢ Health checks
❖ If it’s degraded, don’t call it (open the circuit)
  ➢ Return a default response
  ➢ Enqueue the operation for later retry
  ➢ Throw an error
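A minimal circuit breaker based on previous calls can be sketched as follows. It trips after a number of consecutive failures, skips the remote call while tripped (the state usually called "open"), and lets calls through again after a cool-down. The thresholds, the class name and the choice to return a default response are assumptions for illustration; libraries such as Hystrix implement the pattern far more completely.

```java
import java.util.function.Supplier;

/** Hypothetical minimal circuit breaker driven by consecutive failures. */
final class CircuitBreaker {
    private final int failureThreshold;
    private final long coolDownMillis;
    private int consecutiveFailures = 0;
    private boolean open = false; // open = tripped, calls are skipped
    private long openedAt = 0L;

    CircuitBreaker(int failureThreshold, long coolDownMillis) {
        this.failureThreshold = failureThreshold;
        this.coolDownMillis = coolDownMillis;
    }

    /** The current time is passed in so the behaviour is easy to test. */
    <T> T call(Supplier<T> remoteCall, T defaultResponse, long nowMillis) {
        if (!allowRequest(nowMillis)) {
            return defaultResponse; // degraded: do not call the service
        }
        try {
            T result = remoteCall.get();
            recordSuccess();
            return result;
        } catch (RuntimeException e) {
            recordFailure(nowMillis);
            return defaultResponse;
        }
    }

    private synchronized boolean allowRequest(long nowMillis) {
        // After the cool-down, trial calls are let through again.
        return !open || nowMillis - openedAt >= coolDownMillis;
    }

    private synchronized void recordSuccess() {
        consecutiveFailures = 0;
        open = false;
    }

    private synchronized void recordFailure(long nowMillis) {
        consecutiveFailures++;
        if (consecutiveFailures >= failureThreshold) {
            open = true;
            openedAt = nowMillis; // a failed trial call restarts the cool-down
        }
    }
}
```

In real use the caller would pass `System.currentTimeMillis()` as the time. This simplification may let several trial calls through at once after the cool-down; stricter implementations allow only one.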

.Do It Yourself

Many implementations available
❖ Communication layer
  ➢ gRPC, Cap’n Proto, Thrift…
  ➢ REST, JSON…
❖ Services platform
  ➢ Spring Boot, Dropwizard, Spark, Ninja, Jodd…
❖ Netflix stack
  ➢ Hystrix, Ribbon…

Make your own combination! (it can’t be so difficult…)

.Thanks! Questions?
Aarón Fas (@aaronfc), Andrés Viedma (@andres_viedma)
