Ceph Overview
• A distributed storage system
• Object, block, and file storage in one unified system
• Designed for performance, reliability, and scalability
Ceph Motivating Principles
• Everything must scale horizontally
• No single point of failure (SPOF)
• Commodity (off-the-shelf) hardware
• Self-managing (whenever possible)
• Client/cluster instead of client/server
• Avoid ad-hoc high availability
• Open source (LGPL)
Ceph Architectural Features
• Smart storage daemons
  - Centralized coordination of dumb devices does not scale
  - Peer-to-peer, emergent behavior
• Flexible object placement
  - “Smart” hash-based placement (CRUSH)
  - Awareness of the hardware infrastructure and failure domains
• No metadata server or proxy needed to locate objects
• Strong consistency (CP instead of AP)
MONs
• Track and monitor the health of the cluster
• Maintain a master copy of the cluster map
• Provide consensus for distributed decision making
• MONs DO NOT serve data to clients
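To make the last bullet concrete: a client can ask the MONs about cluster state, but object data never flows through them. A minimal sketch, assuming the python3-rados bindings and a cluster config at /etc/ceph/ceph.conf (both assumptions):

    import json
    import rados

    # Connect using a local cluster config; the path is illustrative.
    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()

    # Ask the MONs for their quorum state. This is pure metadata;
    # the monitors never sit in the object data path.
    ret, outbuf, errs = cluster.mon_command(
        json.dumps({'prefix': 'quorum_status', 'format': 'json'}), b'')
    status = json.loads(outbuf)
    print('quorum leader:', status['quorum_leader_name'])
    print('monitors in quorum:', status['quorum_names'])

    cluster.shutdown()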
OSDs
• Object Storage Daemon
• Stores the actual data as objects on physical disks
• Serves stored data to clients
• Includes a built-in replication mechanism
• A minimum of 3 OSDs is recommended for data replication
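The data path itself goes straight from the client to the OSDs. A minimal sketch through librados, assuming the python3-rados bindings; the pool name 'demo' is hypothetical:

    import rados

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()

    # Open an I/O context on a pool and talk to the OSDs directly;
    # no MON, metadata server, or proxy sits in the read/write path.
    ioctx = cluster.open_ioctx('demo')
    ioctx.write_full('greeting', b'hello ceph')
    print(ioctx.read('greeting'))  # b'hello ceph'

    ioctx.close()
    cluster.shutdown()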
Placement Groups
• Help balance data across the OSDs
• One PG typically spans several OSDs
• One OSD typically serves many PGs
• Roughly 150 PGs per OSD is recommended
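The object-to-PG step is just hashing. The sketch below uses CRC32 modulo pg_num to make the idea visible; Ceph's real mapping hashes the object name (rjenkins) and masks against the pool's pg_num, so the function and numbers here are purely illustrative:

    import zlib

    def object_to_pg(object_name: str, pg_num: int) -> int:
        # Simplified stand-in for Ceph's hash-and-mask mapping.
        return zlib.crc32(object_name.encode()) % pg_num

    for name in ('img_0001', 'img_0002', 'log.2024-01-01'):
        print(name, '-> pg', object_to_pg(name, pg_num=128))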
CRUSH Map
• Controlled Replication Under Scalable Hashing
• MONs maintain the CRUSH map
• The topology of any environment can be modeled (row, rack, host, datacenter, ...)
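The PG-to-OSD step is where the topology matters. The toy sketch below captures the flavor of CRUSH: deterministic, hash-ranked placement that keeps replicas in different failure domains. It is not the real algorithm (CRUSH uses weighted straw2 buckets), and the three-host topology is hypothetical:

    import hashlib

    # Hypothetical topology: three hosts, two OSDs each.
    TOPOLOGY = {
        'host-a': ['osd.0', 'osd.1'],
        'host-b': ['osd.2', 'osd.3'],
        'host-c': ['osd.4', 'osd.5'],
    }

    def _score(pg: int, item: str) -> int:
        # Deterministic pseudo-random rank for a (PG, item) pair.
        digest = hashlib.sha256(f'{pg}:{item}'.encode()).digest()
        return int.from_bytes(digest[:8], 'big')

    def place(pg: int, replicas: int) -> list:
        # Pick one host per replica (distinct failure domains),
        # then the best-ranked OSD inside each chosen host.
        hosts = sorted(TOPOLOGY, key=lambda h: _score(pg, h))[:replicas]
        return [max(TOPOLOGY[h], key=lambda o: _score(pg, o)) for h in hosts]

    print(place(pg=42, replicas=3))

Because every client runs the same computation over the same map, there is nothing to look up and no placement server to fail.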
Ceph Filesystem Features
• POSIX-compliant semantics
• Separates metadata from data
• Dynamic rebalancing
• Subdirectory snapshots
• Configurable striping
• Kernel driver support
• FUSE support
• Deployable over NFS/CIFS
• Usable with Hadoop (as an HDFS replacement)
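Besides the kernel and FUSE mounts, there is also a library route into CephFS. A minimal sketch, assuming the python3-cephfs (libcephfs) bindings and a default filesystem; paths and names are illustrative, and the exact API should be checked against your Ceph release:

    import cephfs

    fs = cephfs.LibCephFS(conffile='/etc/ceph/ceph.conf')
    fs.mount()

    # Ordinary POSIX-style operations against the cluster.
    fs.mkdir('/demo', 0o755)
    fd = fs.open('/demo/hello.txt', 'w', 0o644)
    fs.write(fd, b'hello cephfs\n', 0)
    fs.close(fd)

    fs.unmount()
    fs.shutdown()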