Overview of Riak, the open source distributed database, for retail and eCommerce platforms and services. Covers use cases including shopping carts, product catalogs, and mobile apps; data modeling and querying; architecture and operations.
model • with some extras: search, MapReduce, 2i, links, pre- and post-commit hooks, pluggable backends, HTTP and binary interfaces • Written in Erlang with C/C++ • Open source under the Apache 2 license
Must be highly available • High latency is perceived as unavailability • Withstands node failure, network partition, datacenter failure • Many of the same architectural principles that power Amazon’s shopping cart
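A minimal sketch of why a cart can stay writable through a partition: if two replicas accept writes independently, the application can reconcile the divergent copies afterwards. The cart layout (SKU mapped to quantity) and the `merge_carts` helper are assumptions made for illustration, not Riak's built-in behavior.

```python
from typing import Dict, Iterable


def merge_carts(versions: Iterable[Dict[str, int]]) -> Dict[str, int]:
    """Merge divergent cart copies by taking the union of their items.

    For an item present in several copies, keep the largest quantity.
    An added item is never lost, but an item removed on only one side
    of a partition can reappear after the merge.
    """
    merged: Dict[str, int] = {}
    for cart in versions:
        for sku, qty in cart.items():
            merged[sku] = max(merged.get(sku, 0), qty)
    return merged


# Two copies of the same cart, written on different sides of a partition.
side_a = {"SKU-1001": 1, "SKU-2002": 2}
side_b = {"SKU-1001": 3, "SKU-3003": 1}
merged_cart = merge_carts([side_a, side_b])
print(merged_cart)
```

The union rule favors availability: no customer action is rejected during the failure, at the cost of occasionally resurfacing a deleted item.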
to tens of thousands or more inventory items • Content agnostic: images, video, text, JSON/XML/HTML documents • Add and serve product data even under failure conditions • Scale out without sharding
data as a platform to internal and external clients, developers and partners/affiliates • Flexible, schemaless design • RESTful HTTP API, protocol buffers and many client libraries • Throughput and capacity scale linearly with growth
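As a rough illustration of the RESTful HTTP API, the sketch below builds a request that would store a JSON product document under a bucket and key. The node address, the bucket name `products`, the key `SKU-1001`, and the helper names are hypothetical; the request is only constructed, not sent, so no running cluster is needed.

```python
import json
from urllib.parse import quote
from urllib.request import Request

# Hypothetical node address; 8098 is Riak's default HTTP port.
RIAK_URL = "http://127.0.0.1:8098"


def object_url(bucket: str, key: str) -> str:
    """Return the HTTP URL of one object, percent-encoding bucket and key."""
    return f"{RIAK_URL}/buckets/{quote(bucket, safe='')}/keys/{quote(key, safe='')}"


def build_put(bucket: str, key: str, doc: dict) -> Request:
    """Build (but do not send) a PUT request that stores doc as JSON."""
    body = json.dumps(doc).encode("utf-8")
    return Request(
        object_url(bucket, key),
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


req = build_put("products", "SKU-1001", {"name": "Trail Shoe", "price": 89.0})
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it to a live node; a GET on the same URL would read the document back.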
powers top consumer mobile apps including Bump and Voxer • Fast, small object storage • Designed for concurrency to meet mobile client request patterns
SKU or ID | JSON, XML or Text, HTML doc
Product Advertising | Campaign ID | Ad Content
User Profile | Login, Email, UUID | User attributes (often a JSON doc)
Image or Video Content | Content Name, ID or Integer | Image or video file format
Session Information | User/Session ID | Session Data
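The key and value pairings above can be sketched with an in-memory stand-in for a key/value store. The bucket names, sample keys, and the `put`/`get` helpers are assumptions for illustration only; a real deployment would go through a Riak client library or the HTTP API.

```python
import json
from typing import Dict, Tuple

# In-memory stand-in for a key/value store: (bucket, key) -> raw bytes.
store: Dict[Tuple[str, str], bytes] = {}


def put(bucket: str, key: str, value: bytes) -> None:
    """Store an opaque value; the store does not inspect its content."""
    store[(bucket, key)] = value


def get(bucket: str, key: str) -> bytes:
    """Fetch a value by its bucket and key."""
    return store[(bucket, key)]


# Product keyed by SKU, value is a JSON document.
put("products", "SKU-1001", json.dumps({"name": "Trail Shoe"}).encode("utf-8"))
# User profile keyed by login, value is a JSON document of attributes.
put("users", "jdoe", json.dumps({"email": "jdoe@example.com"}).encode("utf-8"))
# Session keyed by session ID, value is arbitrary session data.
put("sessions", "sess-42", b"cart=SKU-1001;step=checkout")

product = json.loads(get("products", "SKU-1001"))
print(product["name"])
```

Because every lookup is by key, the main modeling decision is choosing a key the application already knows at read time, such as the SKU, login, or session ID.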
extracting links • Full-Text Search: Searching product info or descriptions • Secondary Indexes (2i): Tagging products with categories, promotion identifiers, etc.
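To make the secondary index (2i) use case concrete, the sketch below builds the `x-riak-index-*` headers that tag a product when it is written over HTTP, and the URL of an exact-match index query. The index names `category` and `promo_id`, the bucket `products`, the node address, and the helper names are hypothetical.

```python
from typing import Dict, Union

# Hypothetical node address; 8098 is Riak's default HTTP port.
RIAK_URL = "http://127.0.0.1:8098"


def index_headers(tags: Dict[str, Union[str, int]]) -> Dict[str, str]:
    """Turn tags into 2i headers: integers use _int, everything else _bin."""
    headers: Dict[str, str] = {}
    for name, value in tags.items():
        is_int = isinstance(value, int) and not isinstance(value, bool)
        suffix = "_int" if is_int else "_bin"
        headers[f"x-riak-index-{name}{suffix}"] = str(value)
    return headers


def index_query_url(bucket: str, index: str, value: Union[str, int]) -> str:
    """Return the URL of an exact-match query on one secondary index."""
    return f"{RIAK_URL}/buckets/{bucket}/index/{index}/{value}"


tag_headers = index_headers({"category": "shoes", "promo_id": 42})
query_url = index_query_url("products", "category_bin", "shoes")
print(tag_headers)
print(query_url)
```

The headers would accompany the PUT that stores the product; a GET on the query URL would return the keys of all objects tagged with that category.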
to require >1 physical machine (preferably >5) • When availability is more important than consistency (think “critical data” not “big data”) • When your data can be modeled as keys and values; don’t be afraid to denormalize
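Denormalizing for a key/value model usually means copying the fields a read needs into the stored value, so that one key lookup returns everything. The sketch below is one possible shape, with a hypothetical product table, order layout, and `build_order` helper.

```python
from typing import Dict, List, Tuple

# Hypothetical product data that would normally live under its own keys.
products: Dict[str, dict] = {
    "SKU-1001": {"name": "Trail Shoe", "price": 89.0},
    "SKU-2002": {"name": "Wool Sock", "price": 12.5},
}


def build_order(order_id: str, customer: str, lines: List[Tuple[str, int]]) -> dict:
    """Build an order document that embeds a snapshot of each product.

    Copying name and unit price into the order means that reading the
    order needs a single key fetch and no join against the catalog.
    """
    items = []
    for sku, qty in lines:
        product = products[sku]
        items.append(
            {"sku": sku, "name": product["name"],
             "unit_price": product["price"], "qty": qty}
        )
    total = sum(item["unit_price"] * item["qty"] for item in items)
    return {"order_id": order_id, "customer": customer,
            "items": items, "total": total}


order = build_order("ORD-1", "jdoe", [("SKU-1001", 2), ("SKU-2002", 4)])
print(order["total"])
```

The trade-off is duplicated data: a later price change in the catalog does not alter existing orders, which is often the desired behavior for order history.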
Highly available, event-based shopping experience • “Riak is one of those things that just works and doesn’t need our attention on a day-to-day basis, saving both time and money.”
photos, other objects • Picked Riak for operational ease of use • “It does what it’s supposed to do; nodes can go down but Riak will still work. It’s great to be able to deal with node failures the next day instead of at 3am.”
at low latency anywhere in the world • Failover to other sites in the event of data center failure • Full sync and real-time sync, can be configured unidirectionally or bidirectionally