Content Addressable Storages for Fun and Profit

A lightning talk session from HBaseCon 2012 – May 22, 2012 | San Francisco, CA

This session is a case study of how we used our existing HBase cluster as a content addressable store (CAS) for BLOBs. We will discuss how we wrote a CAS implementation using HBase as the backend, Scala and Finagle for the application, and caching reverse proxies (Varnish, in our case) for serving BLOBs at scale. The talk will discuss why content addressable storage is the right pattern for many web use cases, how to put the possibly underutilized resources of an existing HBase cluster to better use, and the operational gotchas of storing and serving BLOBs from HBase at scale.
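
To make the pattern concrete, here is a minimal sketch of content addressing in Python. It is not the implementation from the talk: an in-memory dict stands in for the HBase table, and the `ContentStore` class with its `put` and `get` methods are hypothetical names chosen for illustration. MD5 is used because the slide transcript lists it as a 16-byte key.

```python
import hashlib


class ContentStore:
    """Toy content-addressable store; a dict stands in for the HBase table."""

    def __init__(self):
        self._rows = {}  # row key (16-byte MD5 digest) -> BLOB bytes

    def put(self, blob: bytes) -> bytes:
        # The key is derived from the content itself, so identical BLOBs
        # map to the same row and are stored only once.
        key = hashlib.md5(blob).digest()
        self._rows.setdefault(key, blob)
        return key

    def get(self, key: bytes) -> bytes:
        # Raises KeyError if no BLOB with this digest has been stored.
        return self._rows[key]

    def __len__(self):
        return len(self._rows)
```

Because a key can only ever refer to one sequence of bytes, a response for that key never goes stale, which is what makes long-lived caching in a reverse proxy safe.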

Berk D. Demir

May 25, 2012

Transcript

  1. Row key: MD5, 16 bytes (SHA-1: 20 bytes)
    m: Metadata, 9 bytes
    d: BLOB, many bytes

    One table to rule them all.

    MAX_FILESIZE => 20G, VERSION => 1, BLOCKCACHE => true, BLOOMFILTER => ROW

    Pre-split into 512 regions at table creation time.
  2. Headers!

    Cache-Control: max-age=<1 year>
    Last-Modified: <cell timestamp>
    Content-MD5: <row key: base64>

    Content-Disposition: attachment; filename=su.xpi
  3. Like the design of this slide deck? Direct your positive feedback to Coda Hale (@coda).
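
The pre-splitting from slide 1 and the headers from slide 2 can be sketched in Python as follows. This is an illustration under stated assumptions, not the code from the talk. The function names `split_keys` and `response_headers` are hypothetical. The split points assume the 512 regions divide the 128-bit MD5 key space evenly; the slides do not say how the boundaries were chosen. The cell timestamp is assumed to be milliseconds since the Unix epoch, and "1 year" is taken to mean 365 days.

```python
import base64
import hashlib
from email.utils import formatdate

ONE_YEAR_SECONDS = 365 * 24 * 60 * 60  # assumption: "1 year" means 365 days


def split_keys(regions: int = 512, key_bytes: int = 16) -> list:
    """Return regions - 1 evenly spaced boundary keys over the key space."""
    key_space = 1 << (8 * key_bytes)
    return [(i * key_space // regions).to_bytes(key_bytes, "big")
            for i in range(1, regions)]


def response_headers(blob: bytes, cell_timestamp_ms: int, filename: str) -> dict:
    """Build the headers listed on slide 2 for a stored BLOB."""
    # The row key is the raw MD5 digest; Content-MD5 carries it base64-encoded.
    row_key = hashlib.md5(blob).digest()
    return {
        "Cache-Control": f"max-age={ONE_YEAR_SECONDS}",
        # Assumed millisecond timestamp, converted to an HTTP date in GMT.
        "Last-Modified": formatdate(cell_timestamp_ms / 1000, usegmt=True),
        "Content-MD5": base64.b64encode(row_key).decode("ascii"),
        "Content-Disposition": f"attachment; filename={filename}",
    }
```

Since MD5 digests are spread roughly uniformly over the key space, evenly spaced boundaries should give each region a similar share of rows, which is one plausible reason to pre-split at table creation time.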