From 1 to 20 million users: the technical story of BlaBlaCar

IPC Spring Berlin 2015

Matthieu Moquet

June 06, 2015

Transcript

  1. From 1 to 20 million users
    the technical story of BlaBlaCar

  2. Matthieu Moquet
    @MattKetmo
    web engineer at BlaBlaCar

  3. Leading ride-sharing service
    Our goal is to become the 1st travel platform

  4. Why this talk?
    ❖ History of the BlaBlaCar platform
    ❖ Overview of our main technical choices
    ❖ Understanding of our culture & methodologies

  5. v1.0
    2005–2006

  6. The prototype

  7. v2.0 (2008) – Plain PHP

  8. 2008 2010 2011 2012
    v2.0

  11. Homemade framework
    +
    No code convention
    +
    Messy
    =
    Not maintainable in the long term

  12. $ wc -l lib.trip.php
    3678
    Longest method: 1000+ lines

  14. No tests

  15. Going further will be a pain
    Hiring will be hard
    We need to change things

  16. Refactor all the code

  17. Symfony 2.0

  18. Because community matters

  19. Attract new talented & motivated people

  20. Embrace best practices
    Separation of Concerns
    Dependency Injection
    Integration & Unit tests
    etc.

  21. Update product design

  23. Agility
    is our best strength

  24. « The big will not always eat the small,
    but the fast ones will overtake the slow ones »
    — Eberhard von Kuenheim

  25. 10+
    deployments per day

  26. DEVOPS… before it was cool
    « if you break it you fix it »

  27. Automation
    Automate as much as possible to
    reduce the time between dev and prod

  28. Simple development workflow
    Master
    Branch

  29. Deploying to the staging environment is as simple as
    git push origin feat-awesome

  30. Staging env. for QA & product owners
    https://feat-awesome.staging.blablacar.com
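
A minimal sketch of how a pushed branch name could map to its staging URL. Only the example URL comes from the slide; the `staging_url` helper and its normalisation rule are assumptions made for illustration, not BlaBlaCar's actual tooling.

```python
import re

STAGING_DOMAIN = "staging.blablacar.com"  # taken from the slide's example URL


def staging_url(branch: str) -> str:
    """Map a git branch name to a per-branch staging URL (assumed rule)."""
    # DNS labels only allow letters, digits and hyphens, so normalise the rest.
    label = re.sub(r"[^a-z0-9-]+", "-", branch.lower()).strip("-")
    if not label:
        raise ValueError(f"cannot derive a hostname from branch {branch!r}")
    return f"https://{label}.{STAGING_DOMAIN}"
```

With this rule, `staging_url("feat-awesome")` yields the URL shown on the slide.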

  31. Continuous Integration
    https://github.com/alexandresalome/behat-launcher

  33. Localization
    Update translations on the fly

  34. Frontend server Databases
    Backoffice
    Sync translations
    Update translations

  35. Download latest translations
    when the app boots
    Update translations
    Mobile app
    CDN
    Backoffice

  36. openl10n.io

  37. Progressive Rollout
    ❖ Open new countries with v3 one by one
    ❖ It took about 2.5 years to run v3 everywhere
    ❖ Today we can deploy new features for a set of users (by attributes or at random)
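
Targeting a feature at a set of users, by attribute or at random, could look roughly like the sketch below. The `Feature` class, the country attribute and the hash-based bucketing are assumptions chosen for illustration; the slides do not describe the real mechanism.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Feature:
    name: str
    countries: set = field(default_factory=set)  # attribute-based targeting
    percentage: int = 0                           # share of users picked pseudo-randomly (0..100)


def bucket(feature_name: str, user_id: int) -> int:
    """Stable bucket in [0, 100): a given user always lands in the same bucket."""
    digest = hashlib.sha256(f"{feature_name}:{user_id}".encode()).digest()
    return int.from_bytes(digest[:4], "big") % 100


def is_enabled(feature: Feature, user_id: int, country: str) -> bool:
    """Enable the feature for targeted countries, else for a fixed share of users."""
    if country in feature.countries:
        return True
    return bucket(feature.name, user_id) < feature.percentage
```

Hashing the feature name together with the user id keeps the choice stable across requests while giving each feature an independent sample of users.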

  38. Monitoring

  39. ELK

  40. Monitoring

  41. Departure to new horizons…

  42. Opportunity to change backend
    But we still keep the same primary database

  43. Database Updates
    Sync replication & No Master SPOF

  44. Better archiving strategy

  45. Photo Storage
    Don’t store static BLOBs in MySQL.
    Use an elastic filesystem storage.
    MySQL → AWS S3

  46. elasticsearch
    ❖ Horizontally scalable
    ❖ Geo type index
    ❖ Aggregations
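
As an illustration of the geo index and aggregations mentioned above, the sketch below builds an Elasticsearch request body that combines a `geo_distance` filter with a `terms` aggregation. The field names `departure` (assumed to be mapped as `geo_point`) and `arrival_city` are hypothetical, as is the `trips_near` helper.

```python
import json


def trips_near(lat: float, lon: float, radius_km: int = 20) -> dict:
    """Build a search body: trips departing within radius_km of a point."""
    return {
        "query": {
            "bool": {
                "filter": {
                    "geo_distance": {
                        "distance": f"{radius_km}km",
                        # hypothetical field, assumed to be mapped as geo_point
                        "departure": {"lat": lat, "lon": lon},
                    }
                }
            }
        },
        # count matching trips per destination (hypothetical field)
        "aggs": {"by_arrival_city": {"terms": {"field": "arrival_city"}}},
    }


body = json.dumps(trips_near(48.8566, 2.3522))  # JSON payload for a search request
```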

  47. Varnish
    Fast reverse proxy cache
    30% HIT/MISS
    Firemode to handle high traffic (TTL)
    Be careful with authenticated user blocks
    (JavaScript is your friend)

  48. SPDY (precursor of HTTP/2)
    Fast API calls for (compliant) mobile clients
    ngx_http_spdy_module

  49. Asynchronous jobs
    Command → Commands Queueing → Command Handlers → Ack

  50. workers/
       mail
       sms
       push
       image-resize
       indexer
       cache-invalidation
       elasticsearch-indexation
       trip-publication
       ...
    github.com/swarrot
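
The command, queue, handler and ack flow behind these workers can be sketched as follows. This is an in-memory stand-in written for illustration: `InMemoryBroker` and its retry policy are assumptions, and it does not use the swarrot library or a real message broker.

```python
import queue


class InMemoryBroker:
    """Stand-in for a message broker: failed commands are requeued, then dropped."""

    def __init__(self):
        self._queue = queue.Queue()

    def publish(self, command: dict) -> None:
        self._queue.put(command)

    def consume(self, handlers: dict, max_attempts: int = 3) -> list:
        """Run each queued command through its handler; return the dead commands."""
        dead = []
        while not self._queue.empty():
            command = self._queue.get()
            try:
                handlers[command["type"]](command["payload"])
                # Reaching this point plays the role of the "ack".
            except Exception:
                command["attempts"] = command.get("attempts", 0) + 1
                if command["attempts"] < max_attempts:
                    self._queue.put(command)  # no ack: try again later
                else:
                    dead.append(command)      # give up after max_attempts
        return dead
```

A worker such as `mail` would then be one entry in the `handlers` mapping, for example `broker.consume({"mail": send_mail})` with a hypothetical `send_mail` function.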

  51. Horizontal scalability

  52. Event dispatcher
    Every business event is dispatched into RabbitMQ
    Easy to watch the events (in real time or in batches)
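
A minimal in-process sketch of the idea: every business event goes through one dispatcher, and any number of consumers can watch it. The `EventDispatcher` class is hypothetical, and the shell-style patterns only approximate RabbitMQ topic routing, which has its own wildcard rules.

```python
from fnmatch import fnmatchcase


class EventDispatcher:
    """Fan out named events to every subscriber whose pattern matches."""

    def __init__(self):
        self._subscribers = []  # list of (pattern, callback)

    def subscribe(self, pattern: str, callback) -> None:
        self._subscribers.append((pattern, callback))

    def dispatch(self, name: str, payload: dict) -> None:
        for pattern, callback in self._subscribers:
            if fnmatchcase(name, pattern):  # case-sensitive shell-style match
                callback(name, payload)
```

One subscriber on `user.*` could archive everything while another on a single event name feeds a live counter, without the publishing code knowing about either.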

  53. Data Warehouse
    Log every business event into Hadoop
    user.register
    user.edit_bio
    user.left_rating
    user.post_trip
    ...

  54. Real-Time dashboard
    Log every business event into Elasticsearch
    user.register
    user.edit_bio
    user.left_rating
    user.post_trip
    ...
    user.register
    user.post_trip
    ...

  55. Such performances.
    Many users.
    Wow.

  56. How to serve the product worldwide?

  57. Datacenter centric
    One datastore to rule them all? It ain’t gonna work.

  58. Monolith
    Micro-Services

  59. Today we are mainly a Monolith
    Easier to deploy
    Development workflow
    Legacy database
    But we would love to use more Micro-Services
    Smaller teams
    Faster deployments
    Easier to scale out

  60. Enter the Gateway
    Principles (Clean Architecture)
    Decouple models (not db)
    Isolate business & data accesses
    (in the Monolith to better decouple into micro-services)
    Restrictive rules

  61. Gateway
    RegisterUser
    PublishTrip PostMessage
    UpdateBio
    PostRating
    Users Messages Trips
    Business
    Data
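
The split between business actions and data access can be sketched as below, taking `UpdateBio` from the slide as the example. The `UserGateway` interface, the in-memory implementation and the length rule are assumptions for illustration, not the actual BlaBlaCar code.

```python
from typing import Protocol


class UserGateway(Protocol):
    """Data side: the only thing the business code knows about storage."""

    def get_bio(self, user_id: int) -> str: ...
    def save_bio(self, user_id: int, bio: str) -> None: ...


class UpdateBio:
    """Business side: holds the rules, delegates persistence to the gateway."""

    MAX_LENGTH = 500  # hypothetical business rule

    def __init__(self, users: UserGateway):
        self._users = users

    def __call__(self, user_id: int, bio: str) -> None:
        bio = bio.strip()
        if len(bio) > self.MAX_LENGTH:
            raise ValueError("bio is too long")
        self._users.save_bio(user_id, bio)


class InMemoryUserGateway:
    """One possible gateway; a SQL or HTTP-backed one could replace it."""

    def __init__(self):
        self._bios = {}

    def get_bio(self, user_id: int) -> str:
        return self._bios.get(user_id, "")

    def save_bio(self, user_id: int, bio: str) -> None:
        self._bios[user_id] = bio
```

Because `UpdateBio` only depends on the gateway interface, the storage behind it could later move from the legacy database to a separate service without touching the business rule.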

  62. But this is a long term project…

  63. In the short term we invested in caching

  64. Points of presence & Varnish

  65. Mobile First

  66. But how to cache the API?
    GET /api/trips?from=Paris&to=Berlin
    Authorization: Bearer 7c82e855b0415f27bd92d
    HTTP/1.1 200 OK
    {
      "trips": [...]
    }

  67. Reverse proxy is useless if only the app knows the authorizations
    Reverse Proxy
    Client
    User
    Scopes
    Access Token

  68. Let the reverse proxy check authorization directly
    Reverse Proxy
    But there is a latency problem

  69. Use a stateless token
    Reverse Proxy
    So we don’t need a database anymore

  70. JSON Web Token

  71. What mobile apps send
    GET /api/trips?from=Paris&to=Berlin
    Authorization: Bearer eyJhbGciOiJIUzI1NiI...

  72. What backend servers receive
    GET /api/trips?from=Paris&to=Berlin
    X-Auth-User: 1337
    X-Auth-Client: android
    X-Auth-Scope: user_info,messages

  73. App gets an Access Token from the origin
    App submits request with Access Token
    Reverse Proxy transforms Access Token header into custom X-Auth headers
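
The token-to-header transformation can be sketched with the standard library alone. This assumes an HS256-signed JSON Web Token (consistent with the `eyJhbGciOiJIUzI1NiI...` prefix shown earlier) and hypothetical claim names `sub`, `client` and `scope`; the real proxy logic and claim layout are not given in the slides.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def _b64url_decode(part: str) -> bytes:
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))


def sign_token(claims: dict, secret: bytes) -> str:
    """Origin side: issue an HS256-signed token."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}, separators=(",", ":")).encode())
    body = _b64url_encode(json.dumps(claims, separators=(",", ":")).encode())
    sig = hmac.new(secret, f"{header}.{body}".encode(), hashlib.sha256).digest()
    return f"{header}.{body}.{_b64url_encode(sig)}"


def auth_headers(authorization: str, secret: bytes, now=None) -> dict:
    """Proxy side: verify the token without any database, then emit X-Auth headers."""
    scheme, _, token = authorization.partition(" ")
    if scheme != "Bearer":
        raise ValueError("expected a Bearer token")
    header, body, sig = token.split(".")  # ValueError if not three parts
    if json.loads(_b64url_decode(header)).get("alg") != "HS256":
        raise ValueError("unsupported algorithm")
    expected = hmac.new(secret, f"{header}.{body}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(sig)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(body))
    if "exp" in claims and claims["exp"] <= (time.time() if now is None else now):
        raise ValueError("token expired")
    return {
        "X-Auth-User": str(claims["sub"]),           # hypothetical claim names
        "X-Auth-Client": claims["client"],
        "X-Auth-Scope": ",".join(claims["scope"]),
    }
```

The proxy only needs the shared secret to check the signature, which is what removes the database lookup from the request path.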

  74. Cacheable response
    HTTP/1.1 200 OK
    Content-Type: application/json
    Vary: X-Auth-Scope
    { "trips": [ ... ] }
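
The effect of `Vary: X-Auth-Scope` on a cache can be illustrated with a toy lookup table: requests that differ only in `X-Auth-User` share one cached entry, while a different scope gets its own. `VaryCache` is a hypothetical stand-in, far simpler than Varnish, and it assumes header names are passed with consistent casing.

```python
class VaryCache:
    """Toy HTTP cache keyed on method, URL and the headers named in Vary."""

    def __init__(self):
        self._vary = {}     # (method, url) -> header names listed in Vary
        self._entries = {}  # full cache key -> cached body

    @staticmethod
    def _key(method, url, request_headers, vary):
        return (method, url, tuple(request_headers.get(name, "") for name in vary))

    def store(self, method, url, request_headers, response_headers, body):
        vary = tuple(h.strip() for h in response_headers.get("Vary", "").split(",") if h.strip())
        self._vary[(method, url)] = vary
        self._entries[self._key(method, url, request_headers, vary)] = body

    def lookup(self, method, url, request_headers):
        vary = self._vary.get((method, url))
        if vary is None:
            return None  # nothing cached for this URL yet
        return self._entries.get(self._key(method, url, request_headers, vary))
```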

  75. Long term solution to deliver data close to the client

  76. ❖ Manage massive amounts of data
    ❖ High Availability
    ❖ Multi Datacenter replication

  77. ‣ Know the read requests before creating your data models
    ‣ Create as many tables (column families) as you have views
    ‣ Denormalize the data (no join allowed)
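
The three rules above can be illustrated with an in-memory stand-in: one structure per read query, written together and duplicated on purpose. `MessageStore` and its two "tables" are hypothetical; a real deployment would use Cassandra tables with matching partition keys instead of Python dicts.

```python
from collections import defaultdict


class MessageStore:
    """Two denormalised 'tables', one per read query (in-memory stand-ins)."""

    def __init__(self):
        self.messages_by_thread = defaultdict(list)  # query 1: all messages of a thread
        self.threads_by_user = defaultdict(dict)     # query 2: a user's inbox

    def post_message(self, thread_id, sender, recipient, text, sent_at):
        row = {"sender": sender, "text": text, "sent_at": sent_at}
        self.messages_by_thread[thread_id].append(row)
        # No joins: copy the latest message into each participant's inbox view.
        for user in (sender, recipient):
            self.threads_by_user[user][thread_id] = row

    def thread(self, thread_id):
        return list(self.messages_by_thread[thread_id])

    def inbox(self, user):
        return dict(self.threads_by_user[user])
```

Each read is a single lookup by key, at the cost of writing the same data to several places.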

  78. CQRS & Event Sourcing
    ❖ Separate Read & Write
    ❖ Eventual consistency
    ❖ But hard to do with legacy software / database
    See the PHPTour 2015 talk at moquet.net
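
A minimal sketch of the pattern, assuming hypothetical event names and classes: the write side only appends events, and a separate read model catches up later, which is where the eventual consistency comes from.

```python
class TripWriteModel:
    """Write side: an append-only event log is the source of truth."""

    def __init__(self):
        self.events = []

    def publish_trip(self, trip_id, origin, destination):
        self.events.append(("trip.published", {"id": trip_id, "from": origin, "to": destination}))

    def cancel_trip(self, trip_id):
        self.events.append(("trip.cancelled", {"id": trip_id}))


class TripSearchProjection:
    """Read side: a view shaped for one query, rebuilt from the events."""

    def __init__(self):
        self.by_route = {}   # (origin, destination) -> set of trip ids
        self._route_of = {}  # trip id -> route, needed to handle cancellations
        self.position = 0    # number of events already applied

    def catch_up(self, events):
        for name, data in events[self.position:]:
            if name == "trip.published":
                route = (data["from"], data["to"])
                self._route_of[data["id"]] = route
                self.by_route.setdefault(route, set()).add(data["id"])
            elif name == "trip.cancelled":
                route = self._route_of.pop(data["id"], None)
                if route is not None:
                    self.by_route[route].discard(data["id"])
        self.position = len(events)
```

Until `catch_up` runs, the read model still serves the previous state, so readers may briefly see stale results.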

  79. C* C* C*
    Primary
    Data
    C*
    Read Data
    User post message
    Endpoint

  80. Messaging (micro) service
    Expose an HTTP API

  81. [Chart] Years: 2008, 2012, 2013, 2014, 2015, 2016. Values: 2, 6, 15, 30+, ?

  82. blablatech.com
    We’re hiring
    Follow us

  83. Thank You
    Slides available at
    moquet.net/talks/ipc-2015-blablacar
    Leave feedback at @MattKetmo