1. Resilient and Fast Persistent Container Storage Leveraging Linux’s Storage Functionalities. Philipp Reisner, CEO, LINBIT
2. 2 LINBIT - the company behind it • Developer of DRBD • 100% founder owned • Offices in Europe and US • Team of 30 highly experienced Linux experts • Partner in Japan
3. 3 Linux Storage Gems LVM, RAID, SSD cache tiers, deduplication, targets & initiators
4. 4 LINBIT software-defined storage • LINBIT HA: NFS / CIFS / iSCSI, KVM / VMware / Xen, databases, file servers, web servers, Nagios XI, messaging (MQ) • Container-native: OpenShift, Kubernetes, Docker • Cloud-native: OpenNebula, OpenStack, Proxmox VE
5. 5 Linux's LVM • based on device mapper • original objects: PVs, VGs, LVs, snapshots • LVs can scatter over PVs in multiple segments • thin LVs: thin pools are themselves LVs, thin LVs live inside thin pools, and multiple snapshots became efficient!
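A minimal sketch of that object hierarchy, assuming two example disks /dev/sdb and /dev/sdc and illustrative names (vg0, thinpool0, data0):

    # physical volumes and a volume group
    pvcreate /dev/sdb /dev/sdc
    vgcreate vg0 /dev/sdb /dev/sdc

    # a classic (thick) LV plus a snapshot of it
    lvcreate --size 50G --name data0 vg0
    lvcreate --snapshot --size 5G --name data0_snap vg0/data0

    # a thin pool (itself an LV) and a thin LV living inside it
    lvcreate --type thin-pool --size 200G --name thinpool0 vg0
    lvcreate --thin --virtualsize 500G --name thin0 vg0/thinpool0

    # thin snapshots need no preallocated space, so keeping many of them is cheap
    lvcreate --snapshot --name thin0_snap vg0/thin0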
6. 6 Linux already provides several storage gems: LVM • RAID • SSD cache tiers • de-duplication • targets & initiators
7. 7 Linux's RAID • original MD code • mdadm command • RAID levels: 0, 1, 4, 5, 6, 10 • now available in LVM as well • device mapper interface for MD code • do not call it ‘dmraid’; that is software for hardware fake-RAID • lvcreate --type raid6 --size 100G VG_name
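Both routes side by side, as a hedged sketch with hypothetical member disks /dev/sdb through /dev/sdf and the vg0 volume group from the LVM sketch above:

    # classic MD RAID-6 over five disks via mdadm
    mdadm --create /dev/md0 --level=6 --raid-devices=5 /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf

    # roughly equivalent RAID-6 LV inside LVM (the device mapper front-end to the MD code);
    # --stripes 3 plus two parity devices means five PVs are used
    lvcreate --type raid6 --stripes 3 --size 100G --name r6 vg0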
8. 8 SSD cache for HDD • dm-cache • device mapper module • accessible via LVM tools • bcache • generic Linux block device • slightly ahead in the performance game
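A hedged sketch of attaching an SSD as a dm-cache layer through the LVM tools, assuming the SSD is /dev/nvme0n1, the slow LV is vg0/data0 from above, and the pool name is illustrative:

    # bring the SSD into the same volume group
    vgextend vg0 /dev/nvme0n1

    # create a cache pool on the SSD and attach it to the slow LV
    lvcreate --type cache-pool --size 100G --name cpool0 vg0 /dev/nvme0n1
    lvconvert --type cache --cachepool vg0/cpool0 vg0/data0

    # detach again later; dirty blocks are flushed back to the HDD
    lvconvert --uncache vg0/data0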
9. 9 Linux's DeDupe • Virtual Data Optimizer (VDO), since RHEL 7.5 • Red Hat acquired Permabit and is GPLing VDO • Linux upstreaming is in preparation • in-line data deduplication • kernel part is a device mapper module • indexing service runs in user space • async or synchronous writeback • recommended to be used below LVM
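A short sketch of a typical VDO setup on RHEL 7.5+, assuming a backing disk /dev/sdb and the illustrative name vdo0; the flags follow the vdo management tool and may differ between versions:

    # create a VDO volume; the logical size may exceed the physical size,
    # since deduplication and compression reclaim space
    vdo create --name=vdo0 --device=/dev/sdb --vdoLogicalSize=10T

    # per the recommendation above, layer LVM on top of the VDO device
    pvcreate /dev/mapper/vdo0
    vgcreate vg_dedup /dev/mapper/vdo0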
11. 11 ZFS on Linux • Ubuntu ecosystem only • has its own: • logical volume manager (zvols) • thin provisioning • RAID (RAIDz) • caching for SSDs (ZIL, SLOG) • and a file system!
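The same concepts in ZFS terms, as a brief sketch with illustrative disk, pool and dataset names:

    # a RAIDz pool with a separate SSD log device (SLOG) and an SSD read cache
    zpool create tank raidz /dev/sdb /dev/sdc /dev/sdd /dev/sde log /dev/nvme0n1p1 cache /dev/nvme0n1p2

    # a thin-provisioned (sparse) zvol, i.e. a block device carved out of the pool
    zfs create -s -V 100G tank/vol0

    # a regular file system dataset plus a snapshot of it
    zfs create tank/data
    zfs snapshot tank/data@before-upgrade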
14. 14 Targets & initiators • Open-iSCSI initiator • ietd, STGT, SCST: mostly historical • LIO: iSCSI, iSER, SRP, FC, FCoE; SCSI pass-through, block IO, file IO, user-specific IO backstores • NVMe-oF target & initiator
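A hedged sketch of exporting a block device through the LIO target with targetcli and attaching it from an Open-iSCSI initiator; the IQN, the portal address 192.0.2.10 and the backing LV are illustrative, and ACL setup is omitted:

    # target side: a block backstore from an LV, published as an iSCSI LUN
    targetcli /backstores/block create name=lun0 dev=/dev/vg0/data0
    targetcli /iscsi create iqn.2019-05.com.example:storage.target0
    targetcli /iscsi/iqn.2019-05.com.example:storage.target0/tpg1/luns create /backstores/block/lun0

    # initiator side: discover the target and log in with Open-iSCSI
    iscsiadm --mode discovery --type sendtargets --portal 192.0.2.10
    iscsiadm --mode node --targetname iqn.2019-05.com.example:storage.target0 --portal 192.0.2.10 --login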
15. 15 DRBD • in the mainline Linux kernel • 1000s of nodes • up to 32 synchronous or async replicas per volume • automatic partial resync after connection outage • multiple resources per node possible (1000s) • diskless nodes: intentional diskless (no change-tracking bitmap), and disks can fail • reliable: a node knows the version of the data it exposes • checksum-based verify & resync • split-brain detection & resolution policies • fencing • quorum • dual Primary for live migration of VMs only!
16. 16 DRBD – up to 32 replicas • each may be synchronous or async
17. 17 DRBD – Diskless nodes • intentional diskless (no change tracking bitmap) • disks can fail
18. 18 DRBD - more about • a node knows the version of the data it exposes • automatic partial resync after connection outage • checksum-based verify & resync • split-brain detection & resolution policies • fencing • quorum • multiple resources per node possible (1000s) • dual Primary for live migration of VMs only!
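A minimal DRBD 9 resource file illustrating these points, assuming two storage nodes alpha and bravo, an intentionally diskless tie-breaker node charlie for quorum, backing LVs named after the earlier sketches, and example addresses; a sketch, not a drop-in configuration:

    # /etc/drbd.d/r0.res (sketch)
    resource r0 {
        device      /dev/drbd0;
        meta-disk   internal;

        options {
            quorum majority;          # refuse IO when a majority of nodes is not reachable
            on-no-quorum io-error;
        }

        on alpha {
            node-id   0;
            disk      /dev/vg0/r0;
            address   192.0.2.11:7789;
        }
        on bravo {
            node-id   1;
            disk      /dev/vg0/r0;
            address   192.0.2.12:7789;
        }
        on charlie {
            node-id   2;
            disk      none;           # intentionally diskless tie-breaker
            address   192.0.2.13:7789;
        }

        connection-mesh {
            hosts alpha bravo charlie;
        }
    }

With the file on all three nodes, something like drbdadm create-md r0 on the disk-bearing nodes, drbdadm up r0 everywhere, and drbdadm primary --force r0 on one node for the initial sync brings the resource up; LINSTOR (next slides) generates such configurations automatically.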
19. 19 LINSTOR • controls LVM / ZFS: snapshots, thin provisioning, multiple VGs (for caching SSDs, different pools) • controls DRBD
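A hedged sketch of the LINSTOR CLI driving the pieces above; node, storage-pool and resource names are illustrative, and exact sub-command spellings may differ between LINSTOR releases:

    # register a node and point LINSTOR at an LVM thin pool to carve volumes from
    linstor node create alpha 192.0.2.11
    linstor storage-pool create lvmthin alpha pool_ssd vg0/thinpool0

    # define a resource with one 20 GiB volume and let LINSTOR auto-place it on two nodes,
    # creating the backing LVs and the DRBD configuration on the chosen nodes
    linstor resource-definition create demo
    linstor volume-definition create demo 20G
    linstor resource create demo --auto-place 2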
20. 20 LINSTOR – feature complete • snapshot support • multiple sites with DRBD Proxy • Swordfish API • access via NVMe-oF • scheduler support: OpenShift, Kubernetes, OpenStack, OpenNebula, Proxmox VE (see the sketch below) • cloud platform support: AWS, Google Cloud, IBM Cloud, Azure
LINSTOR road map • north-bound drivers • autoplace policies as LINSTOR objects (Q2 2019) • management of PMEM / NVDIMM storage (Q2 2019) • DRBD: erasure coding (RAID 5 support) (Q4 2019)
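As an illustration of the Kubernetes scheduler support, a hypothetical StorageClass for the LINSTOR CSI driver could look roughly like this; the provisioner string linstor.csi.linbit.com and the parameter names autoPlace and storagePool are assumptions, so the driver's own documentation is authoritative:

    # linstor-sc.yaml (hypothetical manifest)
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: linstor-replicated
    provisioner: linstor.csi.linbit.com   # assumed CSI driver name
    parameters:
      autoPlace: "2"                      # assumed: replicate each volume across two nodes
      storagePool: "pool_ssd"             # assumed: the LINSTOR storage pool from the CLI sketch

Applied with kubectl apply -f linstor-sc.yaml, a PersistentVolumeClaim that names storageClassName: linstor-replicated would then be served by LINSTOR-provisioned, DRBD-replicated volumes.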