The Ceph Distributed Storage System

Slides of my introduction to the Ceph Distributed Storage System for the SAGE@GUUG meetup in Hamburg, Germany - http://guug.de/lokal/hamburg/index.html


Lenz Grimmer

May 24, 2018


  2. Ceph Overview
    A distributed storage system
    Object, block, and file storage in one unified system
    Designed for performance, reliability and scalability
  3. Ceph Motivating Principles
    Everything must scale horizontally
    No single point of failure (SPOF)
    Commodity (off-the-shelf) hardware
    Self-manage (whenever possible)
    Client/cluster instead of client/server
    Avoid ad-hoc high-availability
    Open source (LGPL)
  4. Ceph Architectural Features
    Smart storage daemons
      • Centralized coordination of dumb devices does not scale
      • Peer to peer, emergent behavior
    Flexible object placement
      • “Smart” hash-based placement (CRUSH)
      • Awareness of hardware infrastructure, failure domains
    No metadata server or proxy for finding objects
    Strong consistency (CP instead of AP)
  5. Ceph Components

  6. MONs
    Track and monitor the health of the cluster
    Maintain a master copy of the cluster map
    Consensus for distributed decision making
    MONs DO NOT serve data to clients
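The consensus point can be made concrete with a little arithmetic. Ceph monitors agree on the cluster map using Paxos, which only makes progress while a strict majority of the monitors can reach each other. The sketch below computes that majority; the function names are illustrative and not part of any Ceph API.

```python
def quorum_size(num_mons: int) -> int:
    """Smallest strict majority of num_mons monitors."""
    if num_mons < 1:
        raise ValueError("need at least one monitor")
    return num_mons // 2 + 1


def tolerated_failures(num_mons: int) -> int:
    """Number of monitors that can fail while a majority remains."""
    return num_mons - quorum_size(num_mons)


if __name__ == "__main__":
    # Print monitor count, required quorum, and tolerated failures.
    for n in (1, 3, 4, 5):
        print(n, quorum_size(n), tolerated_failures(n))
```

Four monitors tolerate no more failures than three, which is one reason odd monitor counts are the usual choice.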
  7. OSDs
    Object Storage Daemon
    Store the actual data as objects on physical disks
    Serve stored data to clients
    Replication mechanism included
    Minimum of 3 OSDs recommended for data replication
  8. Ceph Pools
    Logical container for storage objects
    Replicated or erasure-coded
    Dedicated CRUSH rules
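The practical difference between the two pool types is mostly raw-space overhead. A replicated pool stores a full copy per replica, while an erasure-coded pool with k data chunks and m coding chunks stores (k + m) / k raw bytes per usable byte. The following is a minimal sketch of that arithmetic; the function names are hypothetical and the numbers in the usage note are only examples.

```python
def replicated_overhead(size: int) -> float:
    """Raw bytes stored per usable byte for a pool keeping `size` full copies."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return float(size)


def erasure_overhead(k: int, m: int) -> float:
    """Raw bytes stored per usable byte with k data chunks and m coding chunks."""
    if k < 1 or m < 0:
        raise ValueError("need k >= 1 and m >= 0")
    return (k + m) / k


def usable_capacity(raw_bytes: float, overhead: float) -> float:
    """Usable capacity left from raw_bytes at the given overhead factor."""
    return raw_bytes / overhead


if __name__ == "__main__":
    print(replicated_overhead(3), erasure_overhead(4, 2))
```

With three replicas, 300 TB of raw disk yields about 100 TB usable; with a 4+2 erasure-coding profile the same disks yield about 200 TB, at the cost of more expensive writes and recovery.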
  9. Placement Groups
    Helper to balance the data across the OSDs
    One PG typically spans several OSDs
    One OSD typically serves many PGs
    Recommended ~150 PGs per OSD
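A common rule of thumb turns the per-OSD target into a PG count for a pool: multiply the number of OSDs by the target, divide by the replica count, and round up to a power of two. The sketch below assumes that rule and takes the ~150 figure from the slide as its default; the function names are illustrative, and real clusters with several pools need to split the budget between them.

```python
import math


def next_power_of_two(n: int) -> int:
    """Smallest power of two that is >= n."""
    if n < 1:
        raise ValueError("n must be positive")
    return 1 << (n - 1).bit_length()


def suggested_pg_count(num_osds: int, replicas: int, target_per_osd: int = 150) -> int:
    """Rule-of-thumb PG count for a single pool spanning all OSDs."""
    if num_osds < 1 or replicas < 1:
        raise ValueError("need at least one OSD and one replica")
    raw = num_osds * target_per_osd / replicas
    return next_power_of_two(max(1, math.ceil(raw)))


if __name__ == "__main__":
    print(suggested_pg_count(num_osds=10, replicas=3))
```

For 10 OSDs and 3 replicas this gives 10 × 150 / 3 = 500, rounded up to 512 PGs.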
  10. CRUSH Map
    Controlled Replication Under Scalable Hashing
    MONs maintain the CRUSH map
    Topology of any environment can be modeled (row, rack, host, dc...)
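The idea of computing placement from a hash and a topology map, rather than looking it up in a central table, can be illustrated with a toy model. The sketch below is not the CRUSH algorithm: it uses SHA-256 and simple highest-score selection, ignores weights and rules, and the topology, names, and functions are all hypothetical. It only shows that any client holding the same map derives the same OSDs, with each replica in a different failure domain.

```python
import hashlib

# Hypothetical topology: rack name -> OSD ids located in that rack.
TOPOLOGY = {
    "rack-a": [0, 1, 2],
    "rack-b": [3, 4, 5],
    "rack-c": [6, 7, 8],
}


def _score(*parts) -> int:
    """Deterministic pseudo-random score derived from the given parts."""
    data = "/".join(str(p) for p in parts).encode()
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def object_to_pg(name: str, pg_num: int) -> int:
    """Map an object name to a placement group by hashing."""
    return _score("obj", name) % pg_num


def pg_to_osds(pg: int, topology: dict, replicas: int) -> list:
    """Pick one OSD from each of `replicas` distinct racks for a PG."""
    if replicas > len(topology):
        raise ValueError("not enough failure domains for the replica count")
    # Rank racks by score for this PG and keep the top `replicas`.
    racks = sorted(topology, key=lambda r: _score("rack", pg, r), reverse=True)
    # Within each chosen rack, take the OSD with the highest score.
    return [max(topology[r], key=lambda o: _score("osd", pg, o))
            for r in racks[:replicas]]


if __name__ == "__main__":
    pg = object_to_pg("my-object", pg_num=128)
    print(pg, pg_to_osds(pg, TOPOLOGY, replicas=3))
```

Because the result depends only on the object name and the map, no lookup service sits in the data path, which matches the "no metadata server or proxy for finding objects" point made earlier.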
  11. Basic Ceph Cluster (MONs and OSDs only)

  12. Write Data

  13. Read Data

  14. Ceph Object Store Features
    RESTful Interface
    S3- and Swift-compliant APIs
    S3-style subdomains
    Unified S3/Swift namespace
    User management
    Usage tracking
    Striped objects
    Cloud solution integration
    Multi-site deployment
    Multi-site replication
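"S3-style subdomains" refers to addressing a bucket through the host name instead of the URL path. The sketch below only builds the two URL shapes for comparison; the endpoint `rgw.example.com`, the bucket, and the helper names are made up for illustration, and a real request would also need authentication.

```python
from urllib.parse import quote


def path_style_url(endpoint: str, bucket: str, key: str) -> str:
    """Bucket appears as the first path component."""
    return f"https://{endpoint}/{bucket}/{quote(key)}"


def subdomain_style_url(endpoint: str, bucket: str, key: str) -> str:
    """Bucket appears as a subdomain of the endpoint host."""
    return f"https://{bucket}.{endpoint}/{quote(key)}"


if __name__ == "__main__":
    # Hypothetical endpoint and bucket names.
    print(path_style_url("rgw.example.com", "photos", "2018/meetup.jpg"))
    print(subdomain_style_url("rgw.example.com", "photos", "2018/meetup.jpg"))
```

Subdomain-style access generally requires DNS to resolve every bucket name under the gateway's domain, for example through a wildcard record.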
  15. Ceph Block Device Features
    Thin-provisioned
    Images up to 16 exabytes
    Configurable striping
    In-memory caching
    Snapshots
    Copy-on-write cloning
    Kernel driver support
    KVM/libvirt support
    Back-end for cloud solutions
    Incremental backup
    Disaster recovery (multisite asynchronous replication)
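Striping decides which backing object holds a given byte of a block device image. The sketch below follows the stripe unit / stripe count / object size scheme described in the Ceph architecture documentation, as far as I understand it; the function and parameter names are my own, and the sizes in the example are illustrative assumptions rather than guaranteed defaults.

```python
MIB = 1 << 20


def locate(offset: int, object_size: int, stripe_unit: int, stripe_count: int):
    """Return (object_number, offset_in_object) for a byte offset in an image."""
    if offset < 0 or stripe_count < 1 or stripe_unit < 1:
        raise ValueError("invalid offset or striping parameters")
    if object_size % stripe_unit != 0:
        raise ValueError("object_size must be a multiple of stripe_unit")
    stripes_per_object = object_size // stripe_unit
    block_no = offset // stripe_unit          # index of the stripe unit
    stripe_no = block_no // stripe_count      # which stripe (row) it is in
    stripe_pos = block_no % stripe_count      # column within that stripe
    object_set = stripe_no // stripes_per_object
    object_no = object_set * stripe_count + stripe_pos
    offset_in_object = ((stripe_no % stripes_per_object) * stripe_unit
                        + offset % stripe_unit)
    return object_no, offset_in_object


if __name__ == "__main__":
    # No striping across objects: each 4 MiB of the image maps to one object.
    print(locate(5 * MIB, object_size=4 * MIB, stripe_unit=4 * MIB, stripe_count=1))
    # 1 MiB stripe units spread round-robin across 4 objects.
    print(locate(5 * MIB, object_size=4 * MIB, stripe_unit=1 * MIB, stripe_count=4))
```

Because the image is addressed object by object, objects covering regions that were never written do not need to exist, which is what makes thin provisioning possible.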
  16. Ceph Filesystem Features
    POSIX-compliant semantics
    Separates metadata from data
    Dynamic rebalancing
    Subdirectory snapshots
    Configurable striping
    Kernel driver support
    FUSE support
    NFS/CIFS deployable
    Use with Hadoop (replace HDFS)
