From 1 to 20 million users: the technical story of BlaBlaCar

IPC Spring Berlin 2015

Matthieu Moquet

June 06, 2015

Transcript

  1. From 1 to 20 million users: the technical story of BlaBlaCar

  2. Matthieu Moquet (@MattKetmo), web engineer at BlaBlaCar

  3. Leading ride-sharing service. Our goal is to become the 1st travel platform.
  4. Why this talk?
    ❖ History of the BlaBlaCar platform
    ❖ Overview of our main technical choices
    ❖ Understanding our culture & methodologies
  5. v1.0 2005–2006

  6. The prototype

  7. v2.0 (2008) – Plain PHP

  8. 2008 2010 2011 2012 v2.0

  9. None
  10. None
  11. Homemade framework + No code convention + Messy = Not maintainable in the long term
  12. $ wc -l lib.trip.php gives 3678. Longest method: 1000+ lines

  13. None
  14. No tests

  15. Going further will be a pain. Hiring will be hard. We need to change things.
  16. Refactor all the code

  17. Symfony 2.0

  18. Because community matters

  19. Attract new talented & motivated people

  20. Embrace best practices: Separation of Concerns, Dependency Injection, Integration & Unit tests, etc.
  21. Update product design

  22. None
  23. Agility is our best strength

  24. « The big will not always eat the small, but the fast ones will overtake the slow ones » (Eberhard von Kuenheim)
  25. 10+ deployments per day

  26. DEVOPS… before it was cool: « if you break it, you fix it »
  27. Automation: automate as much as possible to reduce the time between dev and prod
  28. Simple development workflow Master Branch

  29. Deploying to a staging environment is as simple as: git push origin feat-awesome
  30. Staging env. for QA & product owners https://feat-awesome.staging.blablacar.com

  31. Continuous Integration https://github.com/alexandresalome/behat-launcher

  32. None
  33. Localization: update translations on the fly

  34. [Diagram: Backoffice, Databases, Frontend server; arrows labeled "Update translations" and "Sync translations"]

  35. [Diagram: Backoffice, CDN, Mobile app; arrows labeled "Update translations" and "Download latest translations when the app boots"]
  36. openl10n.io

  37. Progressive Rollout
    ❖ Open new countries with v3 one by one
    ❖ It took about 2.5 years to run v3 everywhere
    ❖ Today we can deploy new features for a set of users (by attributes or random)
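One way to picture "a set of users (by attributes or random)" is a feature flag that first filters on an attribute and then assigns each user to a stable pseudo-random bucket. The following is only a sketch under assumptions: the flag layout (`name`, `countries`, `percent`), the user fields, and the hashing scheme are hypothetical and not taken from the talk.

```python
import hashlib


def bucket(user_id: int, feature: str) -> int:
    """Map a user to a stable bucket in [0, 100) for a given feature name."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def is_enabled(feature: dict, user: dict) -> bool:
    """Filter by attribute first, then by percentage of users."""
    countries = feature.get("countries")  # hypothetical attribute filter
    if countries is not None and user["country"] not in countries:
        return False
    # The same user always lands in the same bucket, so the rollout is sticky.
    return bucket(user["id"], feature["name"]) < feature.get("percent", 100)


# Hypothetical flag: 20% of French and German users.
new_search = {"name": "new_search", "countries": {"FR", "DE"}, "percent": 20}
print(is_enabled(new_search, {"id": 1337, "country": "FR"}))
```

Hashing the feature name together with the user id keeps buckets independent between features, so the same users are not always the first to receive every change.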
  38. Monitoring

  39. ELK

  40. Monitoring

  41. Departure to new horizons…

  42. Opportunity to change backend But we still keep the same

    primary database
  43. Database updates: sync replication & no master SPOF

  44. Better archiving strategy

  45. Photo storage: don’t store static BLOBs in MySQL. Use elastic file storage (MySQL → AWS S3).
  46. Elasticsearch
    ❖ Horizontally scalable
    ❖ Geo type index
    ❖ Aggregations

  47. Varnish: fast reverse proxy cache
    ❖ 30% HIT/MISS
    ❖ Firemode to handle high traffic (TTL)
    ❖ Be careful with authenticated user blocks (JavaScript is your friend)
  48. SPDY (the basis for HTTP/2): fast API calls for (compliant) mobile clients, via ngx_http_spdy_module
  49. Asynchronous jobs Command Command Handlers Commands Queueing Ack
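The labels on this slide (command, queueing, command handlers, ack) suggest the usual worker loop: a command stays pending until its handler succeeds and acknowledges it. Below is a minimal in-memory sketch of that idea. `InMemoryQueue` is a hypothetical stand-in for a real broker, and none of the names come from the talk or from any library.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Command:
    name: str      # e.g. "mail", "sms", "push"
    payload: dict


class InMemoryQueue:
    """Hypothetical broker stand-in: a message stays pending until acked."""

    def __init__(self):
        self.ready = deque()
        self.unacked = {}
        self._next_tag = 0

    def publish(self, command):
        self.ready.append(command)

    def get(self):
        if not self.ready:
            return None
        self._next_tag += 1
        command = self.ready.popleft()
        self.unacked[self._next_tag] = command
        return self._next_tag, command

    def ack(self, tag):
        del self.unacked[tag]

    def nack(self, tag):
        # Put the message back so that it can be retried later.
        self.ready.append(self.unacked.pop(tag))


def run_worker(queue, handlers):
    """Consume until the queue is empty; return the number of acked commands."""
    done = 0
    while (item := queue.get()) is not None:
        tag, command = item
        try:
            handlers[command.name](command.payload)
        except Exception:
            queue.nack(tag)
            break  # this sketch stops instead of retrying forever
        else:
            queue.ack(tag)
            done += 1
    return done


sent = []
queue = InMemoryQueue()
queue.publish(Command("mail", {"to": "alice@example.com"}))
queue.publish(Command("sms", {"to": "+33600000000"}))
processed = run_worker(queue, {"mail": sent.append, "sms": sent.append})
```

Acknowledging only after the handler returns means a crashed worker never loses a command; the price is that handlers may see the same command twice and should be idempotent.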

  50. workers/: mail, sms, push, image-resize, indexer, cache-invalidation, elasticsearch-indexation, trip-publication, ... (github.com/swarrot)
  51. Horizontal scalability

  52. Event dispatcher: every business event is dispatched into RabbitMQ. Easy to watch the events (in real time or in batches)
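To illustrate why dispatching every business event makes them easy to watch, here is a small in-process sketch in which one consumer counts events as they arrive and another collects them for later batch processing. The `EventDispatcher` class and its `"*"` wildcard are hypothetical simplifications; in the talk the transport is RabbitMQ, whose routing rules are richer than this.

```python
from collections import Counter, defaultdict


class EventDispatcher:
    """Hypothetical in-process dispatcher: exact event names plus a '*' catch-all."""

    def __init__(self):
        self._listeners = defaultdict(list)

    def subscribe(self, name, listener):
        self._listeners[name].append(listener)

    def dispatch(self, name, data):
        # Deliver to listeners of this exact event, then to catch-all listeners.
        for listener in self._listeners.get(name, []) + self._listeners.get("*", []):
            listener(name, data)


dispatcher = EventDispatcher()
counts = Counter()   # "real-time" view: updated on every event
batch = []           # "batch" view: accumulated and processed later

dispatcher.subscribe("*", lambda name, data: counts.update([name]))
dispatcher.subscribe("user.post_trip", lambda name, data: batch.append(data))

dispatcher.dispatch("user.register", {"user_id": 1})
dispatcher.dispatch("user.post_trip", {"user_id": 1, "from": "Paris", "to": "Berlin"})
```

The producer does not know who is listening, so a new consumer (a dashboard, an indexer, a warehouse loader) can be added without touching the code that emits the event.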
  53. Data Warehouse: log every business event into Hadoop (user.register, user.edit_bio, user.left_rating, user.post_trip, ...)
  54. Real-time dashboard: log every business event into Elasticsearch (user.register, user.edit_bio, user.left_rating, user.post_trip, ...)
  55. Such performances. Many users. Wow.

  56. How to serve the product worldwide?

  57. Datacenter centric: one datastore to rule them all? It ain’t gonna work.
  58. Monolithics Micro-Services

  59. Today we are mainly monolithic, but we would love to use more micro-services
    ❖ Easier to deploy
    ❖ Development workflow
    ❖ Legacy database
    ❖ Smaller teams
    ❖ Faster deployments
    ❖ Easier to scale out
  60. Enter the Gateway. Principles (Clean Architecture):
    ❖ Decouple models (not db)
    ❖ Isolate business & data accesses (in the monolith, to better decouple into micro-services)
    ❖ Restrictive rules
  61. Gateway [Diagram: Business (RegisterUser, PublishTrip, PostMessage, UpdateBio, PostRating) and Data (Users, Messages, Trips)]
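A rough sketch of the separation these two slides describe: a business use case such as RegisterUser talks only to a gateway interface, and the storage behind it can be swapped. Everything here except the `RegisterUser` name is an assumption made for illustration (`UserGateway`, `InMemoryUserGateway`, the `User` fields, and the duplicate-email rule).

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class User:
    id: int
    email: str


class UserGateway(Protocol):
    """Data-access boundary: the business code only sees this interface."""

    def find_by_email(self, email: str) -> Optional[User]: ...

    def save(self, email: str) -> User: ...


class InMemoryUserGateway:
    """Hypothetical implementation; a SQL table or an HTTP service could replace it."""

    def __init__(self):
        self._users = {}

    def find_by_email(self, email):
        return self._users.get(email)

    def save(self, email):
        user = User(id=len(self._users) + 1, email=email)
        self._users[email] = user
        return user


class RegisterUser:
    """Business use case: knows the rules, not the storage."""

    def __init__(self, users: UserGateway):
        self._users = users

    def __call__(self, email: str) -> User:
        if self._users.find_by_email(email) is not None:
            raise ValueError("email already registered")
        return self._users.save(email)


register = RegisterUser(InMemoryUserGateway())
alice = register("alice@example.com")
```

Because the use case depends only on the interface, moving the data behind it into a separate service later does not require changing the business rule.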
  62. But this is a long term project…

  63. In the short term we invested in caching

  64. Points of presence & Varnish

  65. Mobile First

  66. But how to cache the API?
    GET /api/trips?from=Paris&to=Berlin
    Authorization: Bearer 7c82e855b0415f27bd92d
    HTTP/1.1 200 OK
    { "trips": [...] }
  67. A reverse proxy is useless if only the app knows the authorizations [Diagram: Reverse Proxy; Access Token (Client, User, Scopes)]
  68. Let the reverse proxy check authorization directly. But there is a latency problem [Diagram: Reverse Proxy]
  69. Use a stateless token, so we don’t need a database anymore [Diagram: Reverse Proxy]
  70. JSON Web Token

  71. What mobile apps send
    GET /api/trips?from=Paris&to=Berlin
    Authorization: Bearer eyJhbGciOiJIUzI1NiI...

  72. What backend servers receive
    GET /api/trips?from=Paris&to=Berlin
    X-Auth-User: 1337
    X-Auth-Client: android
    X-Auth-Scope: user_info,messages
  73. Token flow
    ❖ App gets an Access Token from the origin
    ❖ App submits its request with the Access Token
    ❖ Reverse proxy transforms the Access Token header into custom X-Auth headers
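The transformation step can be sketched as follows: verify the token signature without any database lookup, then rewrite it into the `X-Auth-*` headers shown earlier. This is only an illustration under assumptions. The claim names (`sub`, `client`, `scope`) and the shared `SECRET` are hypothetical, the sketch checks only an HS256 signature (no expiry or audience checks), and in the talk this logic lives in the reverse proxy rather than in Python.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-secret"  # hypothetical key shared by the issuer and the proxy


def b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _mac(signing_input: str) -> bytes:
    return hmac.new(SECRET, signing_input.encode(), hashlib.sha256).digest()


def sign(claims: dict) -> str:
    """Issue an HS256-signed token (what the origin would do)."""
    header = b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url_encode(json.dumps(claims).encode())
    return f"{header}.{payload}.{b64url_encode(_mac(f'{header}.{payload}'))}"


def to_auth_headers(authorization: str) -> dict:
    """Turn 'Authorization: Bearer <token>' into X-Auth-* headers, or raise ValueError."""
    scheme, _, token = authorization.partition(" ")
    if scheme != "Bearer":
        raise ValueError("unsupported scheme")
    try:
        header, payload, signature = token.split(".")
    except ValueError:
        raise ValueError("malformed token") from None
    if json.loads(b64url_decode(header)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    # Constant-time comparison of the expected and received signatures.
    if not hmac.compare_digest(_mac(f"{header}.{payload}"), b64url_decode(signature)):
        raise ValueError("bad signature")
    claims = json.loads(b64url_decode(payload))
    return {
        "X-Auth-User": str(claims["sub"]),
        "X-Auth-Client": claims["client"],
        "X-Auth-Scope": ",".join(claims["scope"]),
    }


token = sign({"sub": 1337, "client": "android", "scope": ["user_info", "messages"]})
headers = to_auth_headers("Bearer " + token)
```

Since everything needed to authorize the request is inside the signed token, the proxy can decide locally, which is what removes the latency of a round trip to a token database.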
  74. Cacheable response
    HTTP/1.1 200 OK
    Content-Type: application/json
    Vary: X-Auth-Scope
    { "trips": [ ... ] }
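The effect of `Vary: X-Auth-Scope` is that the cache key includes the value of that request header, so two users with the same scopes can share one cached response. A toy cache illustrating this, under simplifying assumptions (the `VaryCache` class is hypothetical, header names are matched case-sensitively, and there is no TTL or invalidation):

```python
class VaryCache:
    """Toy HTTP cache keyed by URL plus the request headers named in Vary."""

    def __init__(self):
        self._vary = {}     # url -> header names the stored response varies on
        self._entries = {}  # (url, header values) -> response body

    def _key(self, url, request_headers, vary):
        return (url, tuple(request_headers.get(name, "") for name in vary))

    def get(self, url, request_headers):
        vary = self._vary.get(url)
        if vary is None:
            return None
        return self._entries.get(self._key(url, request_headers, vary))

    def put(self, url, request_headers, response_headers, body):
        names = response_headers.get("Vary", "").split(",")
        vary = tuple(name.strip() for name in names if name.strip())
        self._vary[url] = vary
        self._entries[self._key(url, request_headers, vary)] = body


cache = VaryCache()
url = "/api/trips?from=Paris&to=Berlin"
cache.put(
    url,
    {"X-Auth-User": "1337", "X-Auth-Scope": "user_info,messages"},
    {"Vary": "X-Auth-Scope"},
    '{"trips": []}',
)

# Another user with the same scopes gets a HIT; different scopes give a MISS.
hit = cache.get(url, {"X-Auth-User": "42", "X-Auth-Scope": "user_info,messages"})
miss = cache.get(url, {"X-Auth-User": "42", "X-Auth-Scope": "user_info"})
```

Varying on the scope rather than on the user is what keeps the hit rate useful; it is only safe for responses whose content does not depend on who the user is.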
  75. Long term solution to deliver data close to the client

  76. ❖ Manage massive amounts of data
    ❖ High Availability
    ❖ Multi Datacenter replication
  77. ‣ Know the read requests before creating your data models
    ‣ Create as many tables (KeySpaces) as you have views
    ‣ Denormalize the data (no join allowed)
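The "one table per view, no join" rule can be pictured with plain dictionaries: every write is copied into each query-specific table, and every read hits exactly one partition. This is a conceptual sketch only; the table names, columns, and partition keys are hypothetical and do not come from the talk.

```python
from collections import defaultdict

# One "table" per read request, each keyed by the partition key of that request.
trips_by_route = defaultdict(list)   # view: search trips from A to B
trips_by_driver = defaultdict(list)  # view: list the trips of one driver


def publish_trip(trip_id, driver_id, driver_name, from_city, to_city, date):
    """Write the same denormalized row into every table that serves a view."""
    row = {
        "trip_id": trip_id,
        "driver_id": driver_id,
        "driver_name": driver_name,  # duplicated so that reads never need a join
        "from": from_city,
        "to": to_city,
        "date": date,
    }
    trips_by_route[(from_city, to_city)].append(dict(row))
    trips_by_driver[driver_id].append(dict(row))


def search(from_city, to_city):
    """A read touches exactly one partition of one table."""
    return trips_by_route.get((from_city, to_city), [])


publish_trip(1, 1337, "Alice", "Paris", "Berlin", "2015-06-06")
publish_trip(2, 1337, "Alice", "Berlin", "Paris", "2015-06-08")
results = search("Paris", "Berlin")
```

Writes become more expensive and the copies must be kept in sync, which is the trade made in exchange for cheap, predictable reads.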
  78. CQRS & Event Sourcing
    ❖ Separate Read & Write
    ❖ Eventual consistency
    ❖ But hard to do with legacy software / database
    See talk PHPTour 2015 at moquet.net
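A minimal sketch of the read/write separation, using the "user posts a message" case from the following slides: commands only append events to a log, and a separate read model catches up later, which is where eventual consistency comes from. All names (`post_message`, `Inbox`, the `message.posted` event) are hypothetical.

```python
from collections import defaultdict

events = []  # write side: append-only event log


def post_message(sender, recipient, text):
    """Command: record what happened; do not update any read model here."""
    events.append({"type": "message.posted", "from": sender, "to": recipient, "text": text})


class Inbox:
    """Read side: a projection rebuilt from the event log."""

    def __init__(self):
        self.by_user = defaultdict(list)
        self.position = 0  # how far into the log this projection has read

    def catch_up(self, log):
        for event in log[self.position:]:
            if event["type"] == "message.posted":
                self.by_user[event["to"]].append(event["text"])
        self.position = len(log)


inbox = Inbox()
post_message("alice", "bob", "Paris to Berlin on Saturday?")
stale = list(inbox.by_user.get("bob", []))   # read model has not caught up yet
inbox.catch_up(events)
fresh = list(inbox.by_user.get("bob", []))   # now it reflects the event
```

Because the projection is derived entirely from the log, a new read model can be added later and filled by replaying the events from position zero.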
  79. [Diagram: "User post message" → Endpoint; Cassandra (C*) nodes holding Primary Data and Read Data]
  80. Messaging (micro) service Expose an HTTP API

  81. 2008 2012 2013 2014 2015 2016 2 6 15 30+

    … ?
  82. blablatech.com We’re hiring Follow us

  83. Thank you! Slides available at moquet.net/talks/ipc-2015-blablacar. Leave feedback at @MattKetmo