Building a global startup with Cassandra

Dave Gardner
September 12, 2014

Slides from my talk at the Cassandra Summit 2014 about our journey at Hailo from a monolithic application and DB to a distributed application and DB. Lots of information about our Go and Java microservice architecture and the process and challenges of migration.

Transcript

  1. Building a
    global startup
    David Gardner, Chief Architect at Hailo #CassandraSummit

  2. CASSANDRASUMMIT2014

  6. Factors driving C* adoption
    •  Resilience
    •  Geographical replication
    •  Cover future growth

  7. C* adoption timeline
    [Timeline figure. Dates marked: November 2011, September 2012, March 2014, August.
    Milestones: London launch; North America launch; Atlanta launch; Ireland migration.
    Phases: MySQL primary storage; introduction of C*; C* only cities launched.]

  8. Starting a startup

  10. From MySQL to Cassandra
    •  Introduction of C*
    •  C* only cities launched
    •  MySQL primary storage

  11. H1
    [Architecture diagram, eu-west-1. Components shown: two load balancers; PHP Cust API (x3); PHP Driver API (x3); Java Hailo Engine; MySQL (x2); Redis.]

  12. H1.5
    [Architecture diagram. Regions: eu-west-1, us-east-1. Components shown: ELB (x4); PHP Cust API (x3); PHP Driver API (x3); Java Hailo Engine (x3); MySQL (x3); C* (x3); PHP Cust service; PHP Credits service; Java Pay service.]

  13. H2
    [Architecture diagram. Each region (us-east-1, eu-west-1) shows: ELB; Go “Thin” API; RabbitMQ Message Bus (federated clusters per AZ); Go Service (x2); Java Service; C* (x3).]

  15. Testing at Hailo
    •  15,000 jobs/hour
    •  3,000 req/sec
    •  20,000 drivers on shift

  16. Clusters: Economy, Premium (each in us-east-1 and eu-west-1)
    C* 2.0.8, i2.xlarge, 8GB heap, 800GB storage (1 SSD ephemeral)
    C* 2.0.8, m1.4xlarge, 8GB heap, 1.7TB storage (4x400GB ephemeral)

  17. Premium
    •  Read heavy (10:1)
    •  Avoid issues caused by compactions
    Economy
    •  Write heavy (10:1)
    •  Secondary use cases

  18. CASSANDRA-4430

  19. Entity storage
    Basic CRUD for passengers, drivers, jobs
    Time series
    Logs of actions during jobs, logins
    Search
    For management portals
    Analytics
    Counting events

  20. 1/ Entity storage

  22. H1 #php #mysql
    $sql = "INSERT INTO
    `customers`
    (`email`, `password`,
    `created`, `status`)
    VALUES
    ('$e', '$p', '$t', '$s')";
    * parameter sanitisation snipped

  23. Functionality
    •  IDs generated by MySQL auto-increment
    •  Index constraints (phone, email) checked automatically by unique MySQL indexes
    •  ACID consistency, familiar
    H1 #php #mysql

  24. ID Generation using Snowflake derivative
    77669839702851584
    •  41 bits: timestamp (millisecond precision, bespoke epoch)
    •  10 bits: configured machine ID
    •  12 bits: sequence number
    H2 #golang #cassandra
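
The bit layout on this slide can be sketched in Go as follows. This is a minimal illustration, not Hailo's generator: the epoch value, the type and function names, and the choice to spin until the next millisecond when the sequence is exhausted are all assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

const (
	machineBits  = 10
	sequenceBits = 12
	maxMachineID = 1<<machineBits - 1
	maxSequence  = 1<<sequenceBits - 1
)

// customEpoch is a hypothetical bespoke epoch; the talk does not state the real one.
var customEpoch = time.Date(2012, 1, 1, 0, 0, 0, 0, time.UTC)

// Generator hands out 63-bit IDs: 41 bits of time, 10 of machine ID, 12 of sequence.
type Generator struct {
	mu        sync.Mutex
	machineID int64
	lastMs    int64
	seq       int64
}

// NewGenerator validates that the machine ID fits in 10 bits.
func NewGenerator(machineID int64) (*Generator, error) {
	if machineID < 0 || machineID > maxMachineID {
		return nil, errors.New("machine ID out of range")
	}
	return &Generator{machineID: machineID, lastMs: -1}, nil
}

func sinceEpochMs() int64 { return time.Since(customEpoch).Milliseconds() }

// Next returns an ID that is strictly greater than the previous one from this generator.
func (g *Generator) Next() (int64, error) {
	g.mu.Lock()
	defer g.mu.Unlock()
	ms := sinceEpochMs()
	if ms < g.lastMs {
		return 0, errors.New("clock moved backwards")
	}
	if ms == g.lastMs {
		g.seq = (g.seq + 1) & maxSequence
		if g.seq == 0 { // 4096 IDs already issued in this millisecond: wait for the next one
			for ms <= g.lastMs {
				ms = sinceEpochMs()
			}
		}
	} else {
		g.seq = 0
	}
	g.lastMs = ms
	return ms<<(machineBits+sequenceBits) | g.machineID<<sequenceBits | g.seq, nil
}

// Decompose splits an ID back into its three fields.
func Decompose(id int64) (ms, machine, seq int64) {
	return id >> (machineBits + sequenceBits), (id >> sequenceBits) & maxMachineID, id & maxSequence
}

func main() {
	gen, err := NewGenerator(42) // 42 is an arbitrary machine ID for the demo
	if err != nil {
		panic(err)
	}
	for i := 0; i < 3; i++ {
		id, err := gen.Next()
		if err != nil {
			panic(err)
		}
		ms, machine, seq := Decompose(id)
		fmt.Printf("id=%d ms=%d machine=%d seq=%d\n", id, ms, machine, seq)
	}
}
```

Because the timestamp occupies the high bits, IDs from one generator sort by creation time, and no coordination between machines is needed as long as each has a distinct machine ID.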

  25. Index checking using flawed ZK/C* combo
    lock, err := lockUser(user)
    if err != nil {
        return errors.ISE(..)
    }
    // defer the unlock only once the lock is known to be held
    defer lock.Unlock()
    H2 #golang #cassandra

  26. Ideal solution: C* CAS with CQL
    INSERT INTO users
    (email, name)
    VALUES ('[email protected]',
    'Jane Doe')
    IF NOT EXISTS

  27. H1 #php #mysql
    $sql = "UPDATE `customers`
    SET `email`='$e'
    WHERE id = '1234'";
    * parameter sanitisation snipped

  28. CQL or Thrift, the same rule applies
    •  Update individual columns that equate to user actions
    •  Example: "set auto tip" should mutate one column
    H2 #cassandra
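
A toy model of why narrow, per-column writes matter, assuming the usual C* behaviour that each column is reconciled independently and the newest write timestamp wins. The `Row` type, column names and values are invented for the illustration; nothing here talks to a real cluster.

```go
package main

import "fmt"

// cell is one column value plus the write timestamp used to resolve conflicts.
type cell struct {
	value string
	ts    int64
}

// Row is a toy stand-in for a C* row: column name -> cell.
type Row map[string]cell

// Apply merges a mutation into the row. Each column is reconciled on its own:
// the write with the newest timestamp wins (ties are ignored here for brevity).
func (r Row) Apply(columns map[string]string, ts int64) {
	for name, v := range columns {
		if cur, ok := r[name]; !ok || ts > cur.ts {
			r[name] = cell{value: v, ts: ts}
		}
	}
}

func newCustomer() Row {
	r := Row{}
	r.Apply(map[string]string{"email": "old@example.com", "auto_tip": "10"}, 1)
	return r
}

// NarrowWrites has each user action mutate only the column it concerns.
func NarrowWrites() Row {
	r := newCustomer()
	r.Apply(map[string]string{"auto_tip": "15"}, 2)           // action: set auto tip
	r.Apply(map[string]string{"email": "new@example.com"}, 3) // action: change email
	return r
}

// WholeEntityWrites has the second writer save every column from a stale copy.
func WholeEntityWrites() Row {
	r := newCustomer()
	r.Apply(map[string]string{"auto_tip": "15"}, 2)
	r.Apply(map[string]string{"email": "new@example.com", "auto_tip": "10"}, 3) // stale auto_tip
	return r
}

func main() {
	fmt.Println(NarrowWrites()["auto_tip"].value)      // prints 15: both changes survive
	fmt.Println(WholeEntityWrites()["auto_tip"].value) // prints 10: the auto tip change is lost
}
```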

  29. 2/ Time series

  31. H1 #php #mysql
    $sql = "SELECT * FROM `quotes`
    WHERE `customer`='$c'
    ORDER BY `timestamp` DESC
    LIMIT $o,$n";
    * parameter sanitisation snipped

  32. H2 #cassandra
    iter := ts.ReversedIterator(
        start,
        end,
        lastId,
        customerId,
    )

  33. H2 #cassandra
    for iter.Next() {
        job := &Job{}
        if err := iter.Item().Unmarshal(job); err != nil {
            return nil, "", err
        }
        jobs = append(jobs, job)
        if len(jobs) >= count {
            break
        }
    }
    return jobs, iter.Last(), nil

  34. Time series library
    •  Buckets by configurable row time range
    •  Supports an index for sparse datasets to track which time ranges exist (can iterate all)
    H2 #cassandra

  35. Ideal solution: CQL
    CREATE TABLE temperature (
        weatherstation_id text,
        event_time timestamp,
        temperature text,
        PRIMARY KEY (weatherstation_id, event_time)
    );

  36. Ideal solution: CQL
    •  Better data model (discrete columns)
    •  Lacks row time partitioning
    •  Ideally would still wrap in a nice library

  37. 3/ Search

  39. Functionality
    •  Search to power admin portals for company management
    •  Ideally not on the critical path for getting people Hailos to where they want to be

  40. H1 #php #mysql
    $sql = "SELECT * FROM
    customers WHERE (
    firstname LIKE '$keywords%'
    OR lastname LIKE '$keywords%'
    OR email LIKE '$keywords%'
    OR phone LIKE '%$keywords%')";
    * parameter sanitisation snipped

  41. H2 #elasticsearch
    [Architecture diagram. Each region (us-east-1, eu-west-1) shows: Customer Search Service; Search Service; Event Federation Service; Regional Elastic Search.]

  42. Implementation
    •  Primary data store is C*
    •  Microservice listens to changes and asynchronously indexes documents
    •  Documents federated between regions to independent Elastic Search clusters

  43. 4/ Analytics

  45. H1
    SELECT
    COUNT(*) as numJobs,
    SUM(jobs.fare + jobs.tip) as totalFare,
    AVG(jobs.cleared - jobs.created) AS avgJobTimeInSeconds,
    SUM(jobs.cleared - jobs.created) AS timePOBInSeconds,
    MAX(jobs.distance) AS distanceLongestJobInMetres,
    (SELECT pickup_sector FROM (
    SELECT COUNT(*) as numPickups, pickup_sector FROM jobs
    WHERE jobs.cleared >= $from AND jobs.cleared < $to
    AND jobs.driver=$driverId
    AND pickup_sector IS NOT NULL AND pickup_sector != ''
    GROUP BY pickup_sector
    ORDER BY numPickups DESC
    LIMIT 1
    ) AS cpp) as mostFrequentPickupSector
    FROM jobs
    WHERE jobs.cleared >= $from AND jobs.cleared < $to AND jobs.driver = $driverId
    #php #mysql #truestory #omg

  46. H2 #nsq #redshift #cassandra
    [Diagram. Generation: Service, Service, Service. Event firehose (NSQ). Processing: Redshift loader service; S3 archive service; HOB activity service.]

  47. Raw events
    {
        "eventType": "point",
        "timestamp": "123456789",
        "driver": "LON1234",
        "lat":
        "lng":
    }
    In memory batched rollup
    for {
        get event from firehose
        add to map[time.Time]map[string]*HyperLogLog
        if batch full || max flush time {
            flush batch to C* and complete in NSQ
        }
    }
    C* storage
    ROW: metric name + time component
    COL: time component + random batch UID
    VAL: binary HLL data

  48. Counting distinct things with C*
    •  HyperLogLog data structure bounds memory and gives probabilistic answers
    •  Idempotent flush to C* plus global replication
    •  Can make lots of columns

  49. Ideal solution: split out analytics
    •  Turn things that happen into “events”
    •  Put events into NSQ or Kafka
    •  Use events in different ways to suit different use cases

  51. Operational and organisational challenges

  52. Zero downtime operational changes
    [Timeline figure, 2012 to 2014. Entries as extracted, in order: first cluster; phpcassa; v1.0.9; eu-west; stats cluster; Astyanax; v1.1; us-east, eu-west; expand into ap-northeast-1; split into premium/economy clusters; v2.0; us-east, eu-west; deprecate stats cluster; adopt Go and gossie client; expand into us-east-1.]

  53. Meet the experts
    Since the C* Summit 2013 we have added dedicated C* support
    •  Julien Campan, Cassandra Analyst
    •  Chris Hoolihan, Infrastructure Architect

  54. Welp! My client has been abandoned

  55. Huzzah! @mattstump to the rescue!

  56. Welp! 16.6GB maximum row size

  58. Monolithic DB
    Monolithic app
    Monolithic team
    Distributed DB
    Distributed app
    Distributed team

  59. Wrapping up

  61. Today, I would start with C* for Hailo
    •  Technology startup which has to be big to work (network effect)
    •  Experienced and well-funded founders with immediate global ambition
    •  C* 2.x with CQL and C*-experienced team

  62. Microservices

  63. Thanks

  64. Image credits
    https://www.flickr.com/photos/wrwetzel/7302103558/in/photostream/
    https://www.flickr.com/photos/w4nd3rl0st/12491208355
    https://www.flickr.com/photos/cdevers/5702488800
    https://www.flickr.com/photos/rh2ox/9990016123
    https://www.flickr.com/photos/davidnunn/11933033385
    https://www.flickr.com/photos/brettmorrison/4780958663
    https://www.flickr.com/photos/nathaninsandiego/2732195668