
Cost of 100% processing and crashstorage options for Socorro

Selena Deckelmann

March 26, 2014

  1. 100% processing Selena Deckelmann and Lars Lohn Web Engineering Workweek

  2. Current 3 year cost: $1.8 million (at 10% processing)

  3. The support classifier Why we want to do 100% processing

  4. http://uncommonrose.com/presentations/pytn14/resources/socorro.dollar.svg

  5. Type            Initial Cost + DC   2014 Maintenance renewal
     Hbase           $1,117k             $80k
     Non-hbase       $257k               $22k
     Elastic Search  $43k                $3k
     RabbitMQ        $4k                 $1k
     NFS (symbols)   $12k                n/a
     Zeus            $10k                n/a
  6. Estimated costs for 100% processing

  7. Type              Initial Cost + DC   Add’l systems
     Hbase             $1,117k
     Non-hbase         $257k               $138k
     Elastic Search    $43k                $11k
     RabbitMQ          $4k
     NFS (symbols)     $12k
     Zeus              $10k
     Object Store      $315k               $10k
     Add’l processors  ???
  8. Estimated 3 year cost: $900k (at 100% processing, plus more…)

  9. $900k < $1.8 million

  10. Big changes
    • 17 more systems in stage
    • Postgres: +10 systems to store all raw and processed JSON (5 pairs) (SSDs?)
    • Ceph (or something) instead of HBase
    • Likely get rid of the Netapp...
  11. Who helped collect this data

  12. System type      Who
      Hbase            BIDW/Annie
      Non-hbase        jakem, cshields
      Elastic Search   adrian/solarce/bugzilla
      RabbitMQ         solarce
      NFS (symbols)    lerxst
      Zeus             jakem
      Add’l systems    lonnen, lars, me, adrian
  13. Next steps
    • Test processing throughput (Lars)
    • Implement Ceph/S3 crashstorage class
    • Test Ceph (Inktank meeting Friday!)
    • Plan for symbols (see Ted later this morning)
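A Ceph/S3 crashstorage class would follow Socorro's save/get pattern, keying objects by crash_id. A minimal sketch, assuming an S3-style key/value store; the `InMemoryObjectStore` stand-in, class name, and key prefixes here are illustrative assumptions, not Socorro's actual API:

```python
import json


class InMemoryObjectStore:
    """Stand-in for an S3/radosgw bucket (illustrative only)."""

    def __init__(self):
        self._objects = {}

    def put(self, key, value):
        self._objects[key] = value

    def get(self, key):
        return self._objects[key]


class S3StyleCrashStorage:
    """Sketch of a crashstorage class for an S3-like object store.

    Method names mirror the save/get style used elsewhere in Socorro;
    the 'raw_crash/<id>' key layout is an assumption for illustration.
    """

    def __init__(self, store):
        self.store = store

    def save_raw_crash(self, raw_crash, dumps, crash_id):
        # Metadata as JSON, each binary dump as its own object.
        self.store.put('raw_crash/%s' % crash_id, json.dumps(raw_crash))
        for name, dump in dumps.items():
            self.store.put('dump/%s/%s' % (name, crash_id), dump)

    def get_raw_crash(self, crash_id):
        return json.loads(self.store.get('raw_crash/%s' % crash_id))

    def save_processed(self, processed_crash):
        crash_id = processed_crash['uuid']
        self.store.put('processed_crash/%s' % crash_id,
                       json.dumps(processed_crash))

    def get_processed(self, crash_id):
        return json.loads(self.store.get('processed_crash/%s' % crash_id))
```

Because the store interface is just `put`/`get`, the same class could sit in front of a real radosgw/S3 client when one is available.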
  14. Crash Storage Options Selena Deckelmann Web Engineering Workweek Q1 2014

  15. Most of this information collected in this etherpad: https://etherpad.mozilla.org/socorro-hbase-alternatives

  16. Assumptions
    • Durability: No loss of user-submitted data (crashes)
    • Size: Need a distributed storage mechanism for ~60TB of crash dumps (current footprint 50TB unreplicated, ~150TB replicated x3)
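The sizing figures follow from simple replication arithmetic; a quick check, assuming the 50TB footprint and triple replication stated on the slide:

```python
# Sizing check for the storage assumption (numbers from the slide).
unreplicated_tb = 50       # current footprint, unreplicated
replication_factor = 3     # HBase-style triple replication
replicated_tb = unreplicated_tb * replication_factor
print(replicated_tb)       # 150, matching the ~150TB replicated figure

# The ~60TB target is an assumption that leaves growth headroom
# above the current 50TB unreplicated footprint.
```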
  17. Purpose of data we store
    • raw_dump: reprocessing and putting into a debugger
    • raw_crash: metadata display
    • processed_crash: MapReduce and reporting
  18. Do we need to store raw crashes/processed JSON in HBase? If HBase is to continue as our primary crash storage, yes, we need all three of raw_crash, raw_dump and processed_crash in there. It is required that we save raw_crash and processed_crash in there if we are to continue to support Map/Reduce jobs on our data.
  19. Assumptions
    • Performance: Need to retrieve single, arbitrary crashes in a timely fashion for the web front-end and processors
  20. Assumptions
    • Performance: Need to store single crashes in a timely fashion for crashmovers. The only time requirement is that priority jobs must be saved, retrieved and processed within 60 seconds. Since any crash could potentially be a priority job, we must be able to store from the mover within seconds.
  21. 100% processing note! When we move to 100% processing, priority jobs may no longer be needed.
  22. Assumptions
    • HBase is a CP (consistent, partition tolerant) system. Wasn’t initially an explicit requirement, but now important architecturally for our processors and front-end, which assume consistency.
  23. Theory
    • To replace HDFS/HBase/Hadoop, we'll likely need a combination of a few new systems.
    • If we use an AP or AC system, we'll need another layer to ensure consistency.
  24. Options
    • Distributed filesystems: GlusterFS, AFS
    • Object storage: S3, Ceph, Nimbus, WOS
    • HBase alternative with MR: Riak, Cassandra
    • Alternative architecture: fast queue + stream processing system
  25. GlusterFS
    • Supported by Red Hat; lacks an API, just looks like a filesystem
    • Probably too bare-bones for our needs
    • We’ve already been down the NFS road...
  26. Ceph
    • CP system
    • Architecture: http://ceph.com/docs/master/architecture/
    • Object gateway docs: http://ceph.com/docs/master/radosgw/
    • Python API example: http://ceph.com/docs/master/radosgw/s3/python/
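Behind the radosgw S3 interface, crashes would live under flat object keys, so a key layout has to be chosen up front. A hedged sketch of one possibility; the date-in-crash_id convention and the prefix scheme are assumptions for illustration, not part of the gateway's API:

```python
def crash_key(crash_id, kind='raw_crash'):
    """Build an S3-style object key that groups crashes by date.

    Assumes crash ids whose last six digits encode a YYMMDD
    submission date (an assumption here; verify against the actual
    Socorro id scheme before relying on it).
    """
    yy, mm, dd = crash_id[-6:-4], crash_id[-4:-2], crash_id[-2:]
    return '%s/20%s/%s/%s/%s' % (kind, yy, mm, dd, crash_id)


key = crash_key('0bba929f-8721-460c-dead-a43c20140326')
print(key)  # raw_crash/2014/03/26/0bba929f-8721-460c-dead-a43c20140326
```

A date-based prefix like this makes it cheap to list or expire a day's crashes without a scan, which matters once HBase's row-key ordering is gone.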
  27. Ceph Pros
    • Provides an S3-like interface
    • Uses a Paxos algorithm for reliability
    • Have good personal relationship with Sage, main dev
  28. Ceph Cons
    • No prior ops experience (though Moz Ops deployed a test cluster!)
    • Need to test performance (but not likely to be a dealbreaker)
    • Need a map-reduce story (maybe)
  29. Riak + RiakCS
    • AP system
    • Cons: cost; reliability layer not open source; still needs a consistency layer despite being very expensive
  30. Cassandra
    • AP system
    • https://wiki.apache.org/cassandra/ArchitectureOverview

  31. Cassandra Pros
    • Very simple API
    • Performance on reporting side
    • Designed with operations in mind
  32. Cassandra Cons
    • Potential loss of data on write due to network partition/node loss: http://aphyr.com/posts/294-call-me-maybe-cassandra
    • Not designed for large object storage
    • Best for a starter streaming reporting system
  33. Larger re-architecture
    • Pursue Kafka + a streaming system (like LinkedIn/Twitter)
    • Requires more research, more dev involvement
    • Peter prototyped a Kafka consumer
    • Point is faster mean-time-to-reports, not immediate access to data
  34. Next steps
    • Performance test Ceph
    • Performance test Cassandra, implement reports (TCBS?)
    • Report back, evaluate whether more research into streaming is warranted