Building a global startup with Cassandra

Dave Gardner
September 12, 2014


Slides from my talk at the Cassandra Summit 2014 about our journey at Hailo from a monolithic application and DB to a distributed application and DB. Lots of information about our Go and Java microservice architecture and the process and challenges of migration.


  1. Building a global startup
    David Gardner, Chief Architect at Hailo





  6. CASSANDRASUMMIT2014 Factors driving C* adoption: Resilience; Geographical replication; Cover future

  7. C* adoption timeline
    Events: London Launch, North America Launch, Atlanta Launch, Ireland Migration
    Dates: November 2011, September 2012, March 2014, August
    Storage phases: MySQL primary storage, Introduction of C*, C* only cities launched
  8. Starting a startup


  10. From MySQL to Cassandra
    Storage phases: MySQL primary storage, Introduction of C*, C* only cities launched
  11. H1 (architecture diagram, eu-west-1)
    Components: Load Balancer, PHP Cust API, PHP Driver API, Java Hailo Engine, MySQL, Redis
  12. H1.5 (architecture diagram, eu-west-1 and us-east-1)
    Components: ELB, PHP Cust API, PHP Driver API, Java Hailo Engine, MySQL, C*, PHP Cust service, PHP Credits service, Java Pay service
  13. H2 (architecture diagram, us-east-1 and eu-west-1)
    Components in each region: ELB, Go "Thin" API, RabbitMQ Message Bus (federated clusters per AZ), Go Service, Java Service, C*
  15. Testing at Hailo
    •  15,000 jobs/hour
    •  3,000 req/sec
    •  20,000 drivers on shift
  16. Economy and Premium clusters, each in us-east-1 and eu-west-1
    •  C* 2.0.8, i2.xlarge, 8GB heap, 800GB storage (1 SSD ephemeral)
    •  C* 2.0.8, m1.4xlarge, 8GB heap, 1.7TB storage (4x400GB ephemeral)
  17. Premium
    •  Read heavy (10:1)
    •  Avoid issues caused by compactions
    Economy
    •  Write heavy (10:1)
    •  Secondary use cases

  19. Entity storage: Basic CRUD for passengers, drivers, jobs
    Time series: Logs of actions during jobs, logins
    Search: For management portals
    Analytics: Counting events
  20. 1/ Entity storage


  22. H1 #php #mysql * parameter sanitisation snipped
    $sql = "INSERT INTO `customers` (`email`, `password`, `created`, `status`)
            VALUES ('$e', '$p', '$t', '$s')";
  23. H1 Functionality #php #mysql
    •  IDs generated by MySQL auto-increment
    •  Index constraints (phone, email) checked by unique MySQL indexes automatically
    •  ACID consistency, familiar
  24. H2 ID Generation using Snowflake derivative #golang #cassandra
    Example ID: 77669839702851584
    41 bits: Timestamp, millisecond precision, bespoke epoch
    10 bits: Configured machine ID
    12 bits: Sequence number
  25. H2 Index checking using flawed ZK/C* combo #golang #cassandra
    lock, err := lockUser(user)
    if err != nil {
        return errors.ISE(..)
    }
    // Defer the unlock only once the lock is known to be held.
    defer lock.Unlock()
  26. Ideal solution: C* CAS with CQL
    INSERT INTO users (email, name)
    VALUES ('jdoe@abc.com', 'Jane Doe')
    IF NOT EXISTS
  27. H1 #php #mysql * parameter sanitisation snipped
    $sql = "UPDATE `customers` SET `email`='$e' WHERE id = '1234'";
  28. H2 CQL or Thrift, the same rule applies #cassandra
    •  Update individual columns that equate to user actions
    •  Example: set auto tip should mutate one column
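
One way to see why the rule matters: C* resolves conflicting writes per column, keeping the value with the newest timestamp. The in-memory model below is only an illustration of that behaviour, not a client for C*; the `row` and `cell` types and the column names are invented for the example.

```go
package main

// cell is one column value plus its write timestamp.
type cell struct {
	value string
	ts    int64
}

// row models a single entity as a set of independently timestamped columns.
type row map[string]cell

// apply merges a mutation into the row, keeping the newest cell per column.
// Ties go to the later call, which is a simplification of the real rule.
func (r row) apply(mutation map[string]string, ts int64) {
	for col, v := range mutation {
		if cur, ok := r[col]; !ok || ts >= cur.ts {
			r[col] = cell{value: v, ts: ts}
		}
	}
}
```

If a client that read the entity at time 1 writes the whole entity back at time 3 just to change the auto tip, it also rewrites its stale copy of `email` and silently undoes an email change made at time 2. Writing only the `autoTip` column leaves the other columns untouched.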
  29. 2/ Time series


  31. H1 #php #mysql * parameter sanitisation snipped
    $sql = "SELECT * FROM `quotes` WHERE `customer`='$c'
            ORDER BY `timestamp` DESC LIMIT $o,$n";
  32. H2 iter := ts.ReversedIterator(start, end, lastId, customerId)

  33. H2 #cassandra
    for iter.Next() {
        job := &Job{}
        if err := iter.Item().Unmarshal(job); err != nil {
            return nil, "", err
        }
        jobs = append(jobs, job)
        if len(jobs) >= count {
            break
        }
    }
    return jobs, iter.Last(), nil
  34. H2 Time series library #cassandra
    •  Buckets by configurable row time range
    •  Supports an index for sparse datasets to track which time ranges exist (can iterate all)
  35. Ideal solution: CQL
    CREATE TABLE temperature (
        weatherstation_id text,
        event_time timestamp,
        temperature text,
        PRIMARY KEY (weatherstation_id, event_time)
    );
  36. Ideal solution: CQL
    •  Better data model (discrete columns)
    •  Lacks row time partitioning
    •  Ideally would still wrap in a nice library
  37. 3/ Search


  39. Functionality
    •  Search to power admin portals for company management
    •  Ideally not on the critical path for getting people Hailos to where they want to be
  40. H1 #php #mysql * parameter sanitisation snipped
    $sql = "SELECT * FROM customers WHERE (
            firstname LIKE '$keywords%' OR
            lastname LIKE '$keywords%' OR
            email LIKE '$keywords%' OR
            phone LIKE '%$keywords%')"
  41. H2 #elasticsearch (architecture diagram, us-east-1 and eu-west-1)
    Components in each region: Customer Search Service, Search Service, Event Federation Service, Regional Elastic Search
  42. Implementation
    •  Primary data store is C*
    •  Microservice listens to changes and asynchronously indexes documents
    •  Documents federated between regions to independent Elastic Search clusters
  43. 4/ Analytics


  45. H1 #php #mysql #truestory #omg
    SELECT COUNT(*) as numJobs,
      SUM(jobs.fare + jobs.tip) as totalFare,
      AVG(jobs.cleared - jobs.created) AS avgJobTimeInSeconds,
      SUM(jobs.cleared - jobs.created) AS timePOBInSeconds,
      MAX(jobs.distance) AS distanceLongestJobInMetres,
      (SELECT pickup_sector FROM (
        SELECT COUNT(*) as numPickups, pickup_sector
        FROM jobs
        WHERE jobs.cleared >= $from AND jobs.cleared < $to
          AND jobs.driver=$driverId
          AND pickup_sector IS NOT NULL AND pickup_sector != ''
        GROUP BY pickup_sector
        ORDER BY numPickups DESC LIMIT 1
      ) AS cpp) as mostFrequentPickupSector
    FROM jobs
    WHERE jobs.cleared >= $from AND jobs.cleared < $to
      AND jobs.driver = $driverId
  46. H2 #nsq #redshift #cassandra (pipeline diagram)
    Generation: Service, Service, Service
    Event firehose (NSQ)
    Processing: Redshift loader service, S3 archive service, HOB activity service
  47. Raw events
    { "eventType": "point", "timestamp": "123456789", "driver": "LON1234", "lat": "lng": }
    In memory batched rollup
    for {
        get event from firehose
        add to map[time.Time]map[string]*HyperLogLog
        if batch full || max flush time {
            flush batch to C* and complete in NSQ
        }
    }
    C* storage
    ROW: metric name + time component
    COL: time component + random batch UID
    VAL: binary HLL data
  48. Counting distinct things with C*
    •  HyperLogLog data structure bounds memory and gives probabilistic answers
    •  Idempotent flush to C* plus global replication
    •  Can make lots of columns
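
The two properties named on this slide, bounded memory and safe replay of a flushed batch, can be illustrated with a small HyperLogLog in Go. This is a sketch under assumptions, not the structure Hailo stored: the register count, the hash (FNV-1a followed by the MurmurHash3 finaliser) and the type and method names are all choices made for the example.

```go
package main

import (
	"hash/fnv"
	"math"
	"math/bits"
)

const (
	precision = 10             // assumed: 2^10 registers, roughly 3% standard error
	registers = 1 << precision // memory is fixed at 1024 bytes per sketch
)

// HLL is a minimal HyperLogLog: one small register per hash prefix.
type HLL struct {
	reg [registers]uint8
}

// mix is the MurmurHash3 64-bit finaliser, used to spread the FNV output bits.
func mix(x uint64) uint64 {
	x ^= x >> 33
	x *= 0xff51afd7ed558ccd
	x ^= x >> 33
	x *= 0xc4ceb9fe1a85ec53
	x ^= x >> 33
	return x
}

// Add records one item. The top bits pick a register; the register keeps the
// longest run of leading zeros seen in the remaining bits, plus one.
func (h *HLL) Add(item string) {
	f := fnv.New64a()
	f.Write([]byte(item))
	x := mix(f.Sum64())
	idx := x >> (64 - precision)
	rest := x<<precision | 1<<(precision-1) // guard bit caps the rank
	rank := uint8(bits.LeadingZeros64(rest)) + 1
	if rank > h.reg[idx] {
		h.reg[idx] = rank
	}
}

// Merge folds another sketch in by taking the per-register maximum, so
// merging the same batch twice gives the same result as merging it once.
func (h *HLL) Merge(other *HLL) {
	for i, r := range other.reg {
		if r > h.reg[i] {
			h.reg[i] = r
		}
	}
}

// Estimate returns the approximate number of distinct items added.
func (h *HLL) Estimate() float64 {
	sum, zeros := 0.0, 0
	for _, r := range h.reg {
		sum += math.Ldexp(1, -int(r))
		if r == 0 {
			zeros++
		}
	}
	m := float64(registers)
	est := 0.7213 / (1 + 1.079/m) * m * m / sum
	if est <= 2.5*m && zeros > 0 {
		est = m * math.Log(m/float64(zeros)) // small-range correction
	}
	return est
}
```

Since a merge is a per-register maximum, a reader can combine every batch column in a row in any order, and a batch that is written twice after a retry does not inflate the count.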
  49. Ideal solution: split out analytics
    •  Turn things that happen into "events"
    •  Put events into NSQ or Kafka
    •  Use events in different ways to suit different use cases

  51. Operational and organisational challenges

  52. Zero downtime operational changes
    Timeline, 2012 to 2014: first cluster; phpcassa; v1.0.9 eu-west; stats cluster; Astyanax; v1.1 us-east, eu-west; expand into ap-northeast-1; split into premium/economy clusters; v2.0 us-east, eu-west; deprecate stats cluster; adopt Go and gossie client; expand into us-east-1
  53. Meet the experts
    Since the C* Summit 2013 we have added dedicated C* support
    Julien Campan, Cassandra Analyst
    Chris Hoolihan, Infrastructure Architect
  54. Welp! My client has been abandoned

  55. Huzzah! @mattstump to the rescue!

  56. Welp! 16.6GB maximum row size


  58. Monolithic DB, Monolithic app, Monolithic team
    Distributed DB, Distributed app, Distributed team
  59. Wrapping up


  61. Today, I would start with C* for Hailo
    •  Technology startup which has to be big to work (network effect)
    •  Experienced and well-funded founders with immediate global ambition
    •  C* 2.x with CQL and C*-experienced team
  62. Microservices

  63. Thanks

