Hailo Tech Platform

A technical deep dive presented to prospective investors in New York in July 2014. Covered the entire Hailo technology stack and explained how it had evolved.

Dave Gardner

July 01, 2014

Transcript

  1. Hailo Tech Platform
    David Gardner, Chief Architect

  2. What this talk will cover
    1. Philosophy and inspiration behind Hailo’s technology
    2. High level architecture overview
    3. Web, mobile and backend technology stacks and tools
    4. Microservice architecture and infrastructure stack
    5. Build and release process
    6. Automated testing

  3. Philosophy and inspiration

  4. Let people build things, quickly
    • Anti-fragile - embrace failure
    • Cloud native
    • Safety net - testing automation
    • Freedom and responsibility

  5. [Diagram contrasting "Utopia" and "Dystopia". Labels: Static, Better, Cheaper, Sooner, Dynamic, Broken, Inefficient]

  6. • Independent self-organising teams covering a range of skills
    • Embedded QA and data science
    • Scrum process with 2 week sprints
    • Focus on speed to market followed by data-driven iteration
    [Team diagram roles: Product Owner, Scrum Master, Android Engineer, iOS Engineer, Web Engineer, Backend Engineer, Data Analyst]

  7. High level architecture

  8. Hailo architecture
    • Native iOS and Android clients for drivers and passengers (4 apps)
    • H2 platform for running backend services and desktop/mobile web apps
    • Fully cloud-based hosting on AWS
    • “Microservices” architecture
    • RabbitMQ message bus transporting protobuf encoded messages

  9. [Architecture diagram: two regions, us-east-1 and eu-west-1, each showing the same components]
    • ELB
    • Go “Thin” API
    • RabbitMQ Message Bus (federated clusters per AZ)
    • Go and Java services
    • Cassandra (C*) nodes
    Numbered callouts 1, 2 and 3 are expanded on slides 11, 12 and 13.

  10. Designed for global

  11. Callout 1: ELB and Go “Thin” API
    • Elastic Load Balancer terminates SSL connections and balances between instances in a region
    • Rule-based router built into “thin API” can send traffic to old and new backends
    • “API” Service acts as a translation layer between legacy interfaces and newer protobuf-defined interfaces
    [Diagram labels: ELB; Go “Thin” API; H1 Driver API; H2 “API” Service; H2 Orch. Service; H2 Core Service; v1-api-driver-london.elasticride.com; api-driver-london.elasticride.com. Connection labels: http, Message Bus. Legend: RMQ = Hailo’s RabbitMQ Message Bus]
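
The rule-based routing described on this slide can be illustrated with a minimal Go sketch. It is not Hailo's implementation: the `Rule` and `Router` types, the prefix-matching strategy and the `/v1/driver/login` path are assumptions made for illustration. Only the idea of sending some traffic to the old (H1) backend and some to the new (H2) one comes from the slide.

```go
package main

import (
	"fmt"
	"strings"
)

// Backend identifies which generation of the platform serves a request.
type Backend string

const (
	BackendH1 Backend = "h1" // legacy backend
	BackendH2 Backend = "h2" // newer backend
)

// Rule is a hypothetical routing rule: requests whose path starts with
// PathPrefix are sent to Backend.
type Rule struct {
	PathPrefix string
	Backend    Backend
}

// Router evaluates its rules in order and falls back to a default backend.
type Router struct {
	Rules   []Rule
	Default Backend
}

// Route returns the backend chosen for the given request path.
func (r Router) Route(path string) Backend {
	for _, rule := range r.Rules {
		if strings.HasPrefix(path, rule.PathPrefix) {
			return rule.Backend
		}
	}
	return r.Default
}

func main() {
	router := Router{
		Rules:   []Rule{{PathPrefix: "/v1/customer/neardrivers", Backend: BackendH2}},
		Default: BackendH1,
	}
	fmt.Println(router.Route("/v1/customer/neardrivers")) // matched by the rule
	fmt.Println(router.Route("/v1/driver/login"))         // hypothetical path, falls through to the default
}
```

Keeping the rules as data rather than code would let endpoints be moved from the old backend to the new one one at a time, which fits the migration the slide describes.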

  12. Callout 2: RabbitMQ message bus
    Services always connect to localhost.
    HAProxy sends to the same AZ, unless that AZ is down, in which case it “fails over” to a different AZ.
    RMQ runs in clusters of 2, within each AZ
    Each exchange is federated to the other AZs
    [Diagram labels: Service, haproxy, three RMQ clusters of two nodes each]
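
The failover rule above (prefer the local AZ, otherwise use another one) is configured in HAProxy in the real system. The Go sketch below only models the selection logic; the `Cluster` type, the `pickCluster` function and the AZ names are hypothetical.

```go
package main

import "fmt"

// Cluster is a hypothetical description of one RabbitMQ cluster.
type Cluster struct {
	AZ      string
	Healthy bool
}

// pickCluster prefers a healthy cluster in the local AZ and otherwise
// fails over to the first healthy cluster found in any other AZ.
// The boolean is false when no healthy cluster exists.
func pickCluster(localAZ string, clusters []Cluster) (Cluster, bool) {
	for _, c := range clusters {
		if c.AZ == localAZ && c.Healthy {
			return c, true
		}
	}
	for _, c := range clusters {
		if c.Healthy {
			return c, true
		}
	}
	return Cluster{}, false
}

func main() {
	clusters := []Cluster{
		{AZ: "az-a", Healthy: false}, // local AZ is down
		{AZ: "az-b", Healthy: true},
		{AZ: "az-c", Healthy: true},
	}
	if c, ok := pickCluster("az-a", clusters); ok {
		fmt.Println("sending to", c.AZ)
	}
}
```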

  13. Callout 3: inside a service
    [Diagram labels: Handler, Logic, Storage]
    go-platform-layer: Library for building services that talk via RMQ
    go-service-layer: Self-configuring external service adapters
    Services get for free:
    • Service discovery
    • Monitoring
    • Authentication/authorisation
    • Provisioning
    • AB testing
    • Self-configuring connectivity to third-party services

  14. Cassandra provides inter-region active-active replication
    [Diagram regions: ap-southeast-1, us-east-1, eu-west-1]

  15. Web technology

  16. Web technologies and languages
    • JavaScript
    • Node, Angular, React, Backbone, Require, Grunt, Bower, Mocha, QUnit, Phantom (plus many more client libs)
    • Ruby
    • SASS, Jekyll
    • Hailo Web Platform
    • Fully integrated with the H2 build and deployment system via Jenkins CI

  17. Hailo Web Platform
    • RPC over HTTP web API, Websocket API for event streaming
    • JS library to authenticate with and use the Hailo APIs
    • Fully mobile custom UI framework using SASS
    • Web modules and components libraries for UI widgets, maps and graphing
    • Automated CI build and test, plus one-click deploy to any environment

  18. Hailo Web Platform, continued
    • Internationalization framework, integrated with Crowdin
    • Client error logging
    • User browser tracking
    • A/B testing and reporting framework
    • Webapp manifests for cross-app deep linking
    • Homescreen for webapp discoverability

  19. 35 web projects, ranging from public H4B to log UI

  20. • Hailo Web UI is our version of “bootstrap” and makes it easy to build web projects with a common look and feel
    • Hailo web toolkit provides common libraries for making API calls and managing session state
    • Designed to be reactive – scaling up from mobile to desktop clients

  21. (no transcribed text)

  22. Hailo Web UI project enables “full stack” initiative

  23. Mobile technology

  24. Mobile stack
    • Java for Android
    • Objective-C and C for iOS
    • Some components built in C++ and shared between platforms
    • Eclipse and Xcode for software development
    • Cucumber and Calabash for testing
    • Integrated with Jenkins CI for packaging and beta-deploy

  25. Backend technology

  26. Backend stack
    • Mainly Go with some use of Java
    • Various open source middleware for distributed storage, coordination, search and caching
    • Sublime Text for software development
    • Fully integrated with the H2 build and deployment system via Jenkins CI

  27. Backend open source middleware
    • Apache Cassandra: multi-region distributed database
    • Apache ZooKeeper: per-region distributed coordination
    • RabbitMQ: per-region message bus
    • NSQ: distributed durable message queue
    • Memcached: per-region in-memory KV store
    • Elasticsearch: per-region distributed search index

  28. [Diagram: services arranged in three tiers (API TIER, ORCHESTRATION TIER, CORE TIER), with the example endpoint /v1/customer/neardrivers]
    Services shown: ETA Service, Routing Service, Phone Service, Profile Service, State Service, Charge Service, Near Drivers Service, Tow Truck Service, Restaurant Service, Place Service

  29. Microservice SOA

  30. Infrastructure

  31. High level infrastructure architecture
    • Everything runs out of 2 AWS regions (EU-WEST-1 and US-EAST-1)
    • META VPC in each region which hosts shared services, and terminates our client VPNs
    • Each "environment" also has its own VPC (LVE, STG, TST). These are peered to the META VPCs
    • Each VPC has 3 sets of everything, in line with the idea of "QUORUM" - we could lose one of anything (instance, subnet, AZ) but we'd still have more than 50% of our full capacity available

  32. What makes up a VPC
    • NAT gateway
    • "External" subnet with 512 IP addresses available (used for things that require an external IP address)
    • "Internal" subnet with 512 IP addresses available
    • "Secure" subnet with 512 addresses available (used for things that need to communicate over the site-to-site VPN)
    • We've also got a spare 512 addresses in each AZ in case we need them, and the ability to allocate up to 8192 addresses per VPC
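
The address budget on this slide can be checked with a little arithmetic. The Go sketch below assumes that the external, internal, secure and spare blocks each exist once per AZ and that a VPC spans three AZs ("3 sets of everything"). Those two assumptions are read into the slides, not stated outright. The block sizes 512 and 8192 are from the slide, and the helper `prefixLen` is illustrative.

```go
package main

import "fmt"

// prefixLen returns the IPv4 CIDR prefix length of a block that holds
// `addresses` addresses. The argument must be a power of two.
func prefixLen(addresses int) int {
	bits := 0
	for n := addresses; n > 1; n >>= 1 {
		bits++
	}
	return 32 - bits
}

func main() {
	const (
		subnetSize   = 512  // from the slide
		vpcSize      = 8192 // from the slide
		subnetsPerAZ = 4    // assumption: external, internal, secure, spare
		azCount      = 3    // assumption: "3 sets of everything"
	)
	perAZ := subnetSize * subnetsPerAZ
	used := perAZ * azCount
	fmt.Printf("subnet: /%d, VPC: /%d\n", prefixLen(subnetSize), prefixLen(vpcSize))
	fmt.Printf("per AZ: %d, all AZs: %d, unallocated: %d\n", perAZ, used, vpcSize-used)
}
```

Under those assumptions a 512-address subnet is a /23, the VPC is a /19, and 6144 of the 8192 addresses are spoken for, leaving 2048 unallocated.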

  33. (no transcribed text)

  34. (no transcribed text)

  35. Detailed security separation within regions

  36. Resilient site-to-site VPN links

  37. Push button environment launch

  38. AWS API + AMI + Puppet = fully initialised environment

  39. Build and release process

  40. 1. Branch
    2. Write code + tests
    3. Push code: Jenkins automatically builds branch
    4. Create Pull Request: status from CI fed back to PR UI
    5. Review + merge
    6. Deploy to staging
    7. Automated QA: app UAT, robomon, load testing
    8. Deploy to production
    9. Monitor status

  41. Code added via pull requests, reviewed via GitHub

  42. Jenkins runs CI for all projects

  43. Janky provides HipChat integration

  44. UI for developers to deploy backend and web projects

  45. UI for monitoring continuity and health after release

  46. All infrastructure and deployment changes tracked

  47. Automated testing

  48. Testing
    • Automated User Acceptance Testing (UAT) for each app build
    • Automated test suites for backend services
    • Autonomous agents for testing backend services under load (robot drivers and passengers)
    • Integrated failure testing to assert anti-fragility

  49. Cucumber used to define BDD test cases

  50. Calabash used to run the tests on multiple devices

  51. Backend testing at scale, Thursday 24th July
    • 50,000 completed jobs
    • 15,000 jobs/hour peak
    • 20,000,000 driver location updates
    • 1,600 updates/second
    • 12,000 drivers on shift
    This is a quiet day

  52. Testing blurs into monitoring

  53. Kerguelen Island hosts our production test fleet

  54. H2 re-platforming

  55. Original ambitions
    • Provide a simple framework for us to build an efficient, resilient, second generation Hailo
    • Allow Hailo to scale the business along three axes: adding features to our current business, adding cities and brand new stuff
    • Solve pain points in our current architecture
    • Be productive

  56. + features + cities
    + brand new + productive
    - pain

  57. Servers needed per-city (+ cities)
    [Diagram of the existing, pre-H2 architecture across eu-west-1 and us-east-1. Components shown: ELB, PHP Cust API, PHP Driver API, Java Hailo Engine, MySQL, C* (Cassandra), PHP Cust service, PHP Credits service, Java Pay service]

  58. City configuration in many places (+ cities)
    [Same architecture diagram as slide 57, annotated with where city configuration lives: PHP array, YAML, Config service, plists built into app, XML]

  59. Coordination changing three apps (+ features)
    [Same architecture diagram as slide 57]

  60. Unclear responsibilities (+ features)
    Eg: payment or cancellation
    [Same architecture diagram as slide 57]

  61. Broad but inflexible services (+ features, + brand new)
    [Same architecture diagram as slide 57]

  62. Separate push deployment models (+ productive)
    [Same architecture diagram as slide 57, annotated with deployment tools: rsync, conan/cap]

  63. Different auth models (- pain)
    [Same architecture diagram as slide 57, annotated with auth models in slide order: none; none; IP whitelist plus token; turned off; IP whitelist plus login service]

  64. SPOFs (- pain)
    [Same architecture diagram as slide 57]

  65. Load balancing broken and complex (- pain)
    [Diagram of load-balancing layers across us-east-1 and eu-west-1. Labels: PHP Cust API (three instances), ELB, PHP Cust HAP, PHP Credits HAP, Java Pay HAP, Phone, API, HAProxy, Service]

  66. Key features
    • Lose PHP, adopt Go – gains in efficiency of developer time and compute resource
    • Eliminate all SPOFs – adopt a cloud native approach to build a working whole out of ephemeral and often broken parts
    • Scale engineering output in line with additional resource – services with few, clearly defined responsibilities reduce friction
    • Increase reusability – develop features by composing fine-grained services that are agnostic to Hailo’s current operation

  67. Discovery Service
    [One of four services shown: Discovery Service, Binding Service, Config Service, Login Service]
    • Keeps track of every running instance of a service within a single region
    • Stores this information as ephemeral nodes in ZooKeeper, keeping a watch to ensure strong consistency between all instances of the discovery service within a region
    • Sends heartbeats to services periodically via RMQ and removes dead instances
    • Self-healing system, because instances that don’t receive heartbeats will try to reconnect and, failing that, die

  68. Binding Service
    • Creates bindings within RabbitMQ for all running services, leveraging information in the discovery service
    • Reacts to services coming up and down by creating and destroying bindings
    • Bindings establish a connection between an exchange and a queue
    • Stores and manages control plane data in order to provide advanced bindings such as “send 10% of traffic to this particular version of the service”
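
The “send 10% of traffic to this particular version” example comes down to a weighted choice. The Go sketch below shows only that arithmetic. How the binding service actually expresses weights through RabbitMQ bindings is not described on the slide, and the `WeightedVersion` type, the `pick` function and the version names are hypothetical.

```go
package main

import (
	"fmt"
	"math/rand"
)

// WeightedVersion is a hypothetical control-plane entry: Weight is the
// percentage of traffic that should reach Version.
type WeightedVersion struct {
	Version string
	Weight  int
}

// pick maps a roll in [0, 100) onto the versions in proportion to their
// weights. The weights are assumed to sum to 100 and the slice to be non-empty.
func pick(versions []WeightedVersion, roll int) string {
	for _, v := range versions {
		if roll < v.Weight {
			return v.Version
		}
		roll -= v.Weight
	}
	return versions[len(versions)-1].Version
}

func main() {
	versions := []WeightedVersion{{"v2-canary", 10}, {"v1-stable", 90}}
	counts := map[string]int{}
	for i := 0; i < 10000; i++ {
		counts[pick(versions, rand.Intn(100))]++
	}
	fmt.Println(counts) // roughly a 10/90 split
}
```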

  69. Config Service
    • Stores application configuration data as JSON
    • Able to store JSON under any arbitrary key
    • Can combine many keys, on request, to serve up “compiled” config
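
One way to picture “compiled” config is as a merge of several JSON documents. The Go sketch below assumes that later keys override earlier ones and that nested objects are merged key by key. The slide does not say how conflicts are resolved, so those semantics, the `compile` and `merge` helpers, and the sample keys and values are all illustrative.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// merge copies src into dst, recursing into nested objects so that later
// documents override earlier ones key by key.
func merge(dst, src map[string]any) {
	for k, v := range src {
		srcMap, srcIsMap := v.(map[string]any)
		dstMap, dstIsMap := dst[k].(map[string]any)
		if srcIsMap && dstIsMap {
			merge(dstMap, srcMap)
			continue
		}
		dst[k] = v
	}
}

// compile looks up each key in the store, in order, and merges the JSON
// documents found there into a single document.
func compile(store map[string]string, keys ...string) (map[string]any, error) {
	out := map[string]any{}
	for _, key := range keys {
		raw, ok := store[key]
		if !ok {
			continue // assumption: missing keys are skipped
		}
		var doc map[string]any
		if err := json.Unmarshal([]byte(raw), &doc); err != nil {
			return nil, fmt.Errorf("config %q: %w", key, err)
		}
		merge(out, doc)
	}
	return out, nil
}

func main() {
	// Hypothetical keys and values.
	store := map[string]string{
		"base":   `{"currency": "USD", "eta": {"maxMinutes": 15}}`,
		"london": `{"currency": "GBP"}`,
	}
	cfg, err := compile(store, "base", "london")
	if err != nil {
		panic(err)
	}
	out, _ := json.Marshal(cfg)
	fmt.Println(string(out))
}
```

With these inputs the program prints `{"currency":"GBP","eta":{"maxMinutes":15}}`: the city-level key overrides the currency while the base ETA setting is kept.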

  70. Login Service
    • Credential and session/token store for all applications
    • The only thing that is able to issue and sign (with private key) tokens
    • Applications can exchange a session ID for a token, which they can then use to establish authorisation
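
The session-for-token exchange can be sketched in Go with the standard library's Ed25519 signatures. The slide only says tokens are signed with a private key. The choice of Ed25519, the `payload.signature` token format, the `LoginService` type and the `Verify` helper are assumptions for illustration, and a real token would also carry an expiry and permissions.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
)

// LoginService is a hypothetical stand-in. It alone holds the private key,
// plus a session store mapping session ID -> user ID.
type LoginService struct {
	priv     ed25519.PrivateKey
	sessions map[string]string
}

// NewLoginService generates a key pair and returns the service together
// with the public key that other services would use for verification.
func NewLoginService(sessions map[string]string) (*LoginService, ed25519.PublicKey, error) {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	return &LoginService{priv: priv, sessions: sessions}, pub, nil
}

// Exchange swaps a known session ID for a signed token "payload.signature".
func (s *LoginService) Exchange(sessionID string) (string, error) {
	user, ok := s.sessions[sessionID]
	if !ok {
		return "", errors.New("unknown session")
	}
	payload := []byte("user=" + user)
	sig := ed25519.Sign(s.priv, payload)
	enc := base64.RawURLEncoding
	return enc.EncodeToString(payload) + "." + enc.EncodeToString(sig), nil
}

// Verify checks a token using only the public key and returns its payload.
func Verify(pub ed25519.PublicKey, token string) (string, bool) {
	parts := strings.Split(token, ".")
	if len(parts) != 2 {
		return "", false
	}
	enc := base64.RawURLEncoding
	payload, err1 := enc.DecodeString(parts[0])
	sig, err2 := enc.DecodeString(parts[1])
	if err1 != nil || err2 != nil || !ed25519.Verify(pub, payload, sig) {
		return "", false
	}
	return string(payload), true
}

func main() {
	svc, pub, err := NewLoginService(map[string]string{"session-123": "alice"})
	if err != nil {
		panic(err)
	}
	token, err := svc.Exchange("session-123")
	if err != nil {
		panic(err)
	}
	claims, ok := Verify(pub, token)
	fmt.Println(claims, ok)
}
```

Because verification needs only the public key, a design like this would let any service check a token without holding the signing key, which is consistent with the login service being the only issuer.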
