
Kalista - IT Press Tour 41 Jan 2022


The IT Press Tour

January 26, 2022

Transcript

  1. KALISTA IO: Overview for IT Press Tour, 01.26.2022. Get ready for a data storage revolution.
  2. Introduction: Albert Chen, founder of KALISTA IO. Previously MSFT, WDC, and various startups. Created one of the first HM-SMR storage system solutions. [email protected] | https://linkedin.com/in/alberthchen
  3. New storage devices are not compatible with current applications and OS: • Shingled magnetic recording • Energy-assisted magnetic recording • Multi-actuator • Variable capacity • Large sector
  4. Hard disk capacity keeps increasing, but performance does not, and maintaining consistent performance is hard: • Declining IO density (IOPS/GB, see the illustration below) • Contention • Long tail latency • Total cost of ownership
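      To make the declining IO density concrete, here is a back-of-the-envelope illustration (the drive sizes and the ~80 random IOPS figure are illustrative assumptions, not numbers from this deck): seek and rotation limit a hard disk to roughly the same random IOPS regardless of capacity, so IOPS/GB falls as capacity grows.

          # Illustrative only: random IOPS stays roughly flat as capacity grows,
          # so IO density (IOPS/GB) drops with every capacity bump.
          awk 'BEGIN {
            iops = 80    # assumed random IOPS for a 7200 rpm HDD
            printf "4TB drive:  %.4f IOPS/GB\n", iops / 4000
            printf "20TB drive: %.4f IOPS/GB\n", iops / 20000
          }'

      The same ~80 IOPS spread over 5x the capacity yields 5x less IO per gigabyte, which is why larger drives feel slower under random workloads.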
  5. “Hard disk is the worst form of storage device, except for all the others.”[1] (Winston Leonard Spencer-Churchill)
  6. Phalanx Storage System: enable applications to use next-gen storage devices without modification, and form device-friendly commands that enable consistent, predictable performance at every scale.
  7. • Support SMR natively • Minimize seeks and contention (increase IO density) • Evenly distribute wear across available capacity (prevent hotspots and reduce tail latency)
  8. • Evenly distribute workloads across available devices (system-wide wear leveling; see the sketch below) • Support variable-capacity devices
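      The deck does not describe Phalanx's placement algorithm. As a generic illustration of capacity-aware placement, a placer can simply direct the next write to the device with the most free space. The /mnt/disk* mount points below are hypothetical stand-ins for the managed drives, and GNU coreutils df is assumed.

          # Greedy capacity-aware placement sketch: list each candidate mount
          # with its free bytes, sort descending, and pick the emptiest one.
          df -B1 --output=target,avail /mnt/disk* | tail -n +2 | sort -k2 -rn | head -1

      Choosing targets by remaining capacity rather than by fixed round-robin naturally accommodates a mixed fleet of variable-capacity drives and spreads wear across all of them.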
  9. Current SMR compatibility solutions have dependencies and limitations. Phalanx is kernel agnostic and requires no application changes.
  10. One line to SMR:

      $ docker run --privileged -v /mnt:/mnt:rshared -v /md:/md:shared phalanx -d /dev/sdc -bm
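      Reading the one-liner: --privileged gives the container access to host block devices, and the rshared/shared volume flags set mount propagation so that a filesystem mounted inside the container also appears on the host. The trailing -d /dev/sdc -bm arguments are passed to the phalanx image itself and presumably select the target drive; they are reproduced here exactly as shown on the slide. Assuming /mnt is where the filesystem surfaces, a quick host-side check might be:

          # Confirm the host sees the filesystem the container mounted
          # (mount point taken from the slide's -v /mnt:/mnt:rshared flag).
          findmnt /mnt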
  11. Easy to deploy. Simple to operate. Runs everywhere. Deploy and operate Phalanx using existing orchestration and provisioning frameworks such as Kubernetes® and vSphere®. Phalanx is designed to fit within your current workflow and environment.
  12. No storage silos. Unify and simplify: Phalanx supports both conventional and zoned devices, so you don't have to worry about mixing and matching them.
  13. Phalanx enables HM-SMR devices to work with applications and environments beyond cold archival storage.
  14. MongoDB (mongo-perf)[4], op/s delta (64 threads, no multi-db):
      Insert.IntVector                        -3%
      Insert.UniqueIndex                      -1%
      Update.UniqueIndex                      -1%
      Update.MatchedElementWithinArray       +10%
      Queries.SortByArrayOfNestedDocuments    +6%
      Queries.SortByArrayOfScalars          -0.1%
  15. Minio (S3-bench)[5], threads 64, object size 1 MB, runtime 1800 s:
      Write average IOPS   +60%
      Read average IOPS    +35%
  16. Software compilation:
      Linux (version 5.12.5)[6] (C, GCC 7.5.0, make -j 16)    +45 s
      NGINX (version 1.17.8)[6] (C, GCC 7.5.0, make -j 16)    +3.7 s
      Hadoop (version 3.3.0)[7] (Java, OpenJDK 1.8.0_222)     +4 m
  17. • 16x more IOPS with fio random write[8] • 19% faster throughput with Hadoop TestDFSIO read[9] • 58% higher IOPS with Ceph Rados write bench[10] • 10x better performance consistency with Ceph Rados write bench[10]
  18. Doing more with less: using fewer drives means fewer component failures and lower maintenance costs.
  19. Intelligent storage: • Self-optimizing data placement & IO prioritization • Proactive management of device health & performance • Data services that automatically index and tag stored data
  20. Notes and references
      1. It is a little known fact that PM Churchill said this to Alan Turing after learning the Turing machine uses such antiquated technology as “infinite tape” for storage.
      2. Testing conducted by Kalista IO in July 2020 using the XFS file system with Linux kernel 5.4.0-42-generic, an Intel® Core™ i7-4771 CPU 3.50GHz with 16GiB DDR3 Synchronous 2400 MHz RAM, and a Western Digital Ultrastar DC HC530 CMR drive connected through a SATA 3.2, 6.0 Gb/s interface. The write bench created a single 1GB file and executed 600,000 write commands, each overwriting the first 64KB region of the file, to capture latency values.
      3. Testing conducted by Kalista IO in July 2020 using preproduction Olympus (Phalanx) software with Linux kernel 5.4.0-42-generic, an Intel® Core™ i7-4771 CPU 3.50GHz with 16GiB DDR3 Synchronous 2400 MHz RAM, and Western Digital Ultrastar DC HC620 host managed SMR drives connected through a SATA 3.2, 6.0 Gb/s interface. The write bench created a single 1GB file and executed 600,000 write commands, each overwriting the first 64KB region of the file, to capture latency values.
  21. Notes and references
      4. Testing conducted by Kalista IO in May 2021 using production Phalanx software with Linux kernel 5.4.0-42-generic, an Intel Core i7-4771 CPU 3.50GHz with 16GiB DDR3 Synchronous 2400 MHz RAM, and HM-SMR/CMR drives connected through a SATA 3.2, 6.0 Gb/s interface. Tested with MongoDB version 4.4.6, Python version 2.7.17, and mongo-perf commit 19221af7d8d9600d3e088d14ebb91e500a6c61dc on the master branch.
      5. Testing conducted by Kalista IO in May 2021 using production Phalanx software with Linux kernel 5.4.0-73-generic, an Intel Core i7-4771 CPU 3.50GHz with 16GiB DDR3 Synchronous 2400 MHz RAM, and HM-SMR/CMR drives connected through a SATA 3.2, 6.0 Gb/s interface. Tested with Minio RELEASE.2021-04-18T19-26-29Z and S3-benchmark version 3.0.
      6. Testing conducted by Kalista IO in May 2021 using production Phalanx software with Linux kernel 5.4.0-96-generic, an Intel Core i7-4771 CPU 3.50GHz with 16GiB DDR3 Synchronous 2400 MHz RAM, and HM-SMR/CMR drives connected through a SATA 3.2, 6.0 Gb/s interface.
  22. Notes and references
      7. Testing conducted by Kalista IO in May 2021 using production Phalanx software with Linux kernel 5.0.0-25-generic, an Intel Core i7-4771 CPU 3.50GHz with 16GiB DDR3 Synchronous 2400 MHz RAM, and HM-SMR/CMR drives connected through a SATA 3.2, 6.0 Gb/s interface.
      8. Testing conducted by Kalista IO in August 2019 using preproduction Phalanx software with Linux kernel 4.18.0-25-generic, an Intel Core i7-4771 CPU 3.50GHz with 16GiB DDR3 Synchronous 2400 MHz RAM, and Western Digital Ultrastar DC HC620 host managed SMR and Ultrastar DC HC530 CMR drives connected through a SATA 3.2, 6.0 Gb/s interface. Tested with Flexible I/O tester (fio) version 3.14-11-g308a. The random write bench ran for 1800 seconds with a 4KB block and 200GB file size, and 64 concurrent threads each with a queue depth of 1. Executed 3 times to capture average and standard deviation IOPS values.
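      The fio parameters listed in note 8 map onto a command along these lines. This is a reconstruction from the stated numbers, not Kalista's published job file, and the --directory target is a placeholder.

          # Note-8 workload reconstructed: 4KB random writes, 200GB file size,
          # 64 threads at queue depth 1, time-based for 1800 seconds.
          fio --name=randwrite --rw=randwrite --bs=4k --size=200g \
              --numjobs=64 --iodepth=1 --runtime=1800 --time_based \
              --directory=/mnt --group_reporting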
  23. Notes and references
      9. Testing conducted by Kalista IO in August 2019 using preproduction Phalanx software with Linux kernel 5.0.0-25-generic, an Intel® Core™ i7-4771 CPU 3.50GHz with 16GiB DDR3 Synchronous 2400 MHz RAM, and Western Digital Ultrastar DC HC620 host managed SMR and Ultrastar DC HC530 CMR drives connected through a SATA 3.2, 6.0 Gb/s interface. Tested with Apache Hadoop version 3.2.0 in single-node pseudo-distributed mode with a single block replica, and TestDFSIO version 1.8 on OpenJDK version 1.8.0_222. The TestDFSIO read benchmark ran with 32 files, 16GB each, for a 512GB dataset. Executed 3 times to capture average and standard deviation throughput values.
  24. Notes and references
      10. Testing conducted by Kalista IO in August 2019 using preproduction Phalanx software with Linux kernel 5.0.0-25-generic, an Intel Core i7-4771 CPU 3.50GHz with 16GiB DDR3 Synchronous 2400 MHz RAM, and Western Digital Ultrastar DC HC620 host managed SMR and Ultrastar DC HC530 CMR drives connected through a SATA 3.2, 6.0 Gb/s interface. Tested with Ceph version 13.2.6 Mimic in single-node mode with a single object replica. The Rados write bench ran with a 4MB object and block (op) size and 16 concurrent operations for 1800 seconds to capture average and standard deviation IOPS values.
  25. Additional information
      1. KALISTA IO and Western Digital joint solution brief: Distributed Storage System with Host-Managed SMR HDDs. https://www.kalista.io/resources/joint-solution-briefs/KalistaIO-WDC-Joint-Solution-Brief.pdf
      2. Addressing Shingled Magnetic Recording drives with Linear Tape File System. https://www.snia.org/sites/default/files/files2/files2/SDC2013/presentations/Hardware/AlbertChenMalina_Addressing_Shingled_Magnetic_Recording.pdf
      3. Host Managed SMR. https://www.snia.org/sites/default/files/SDC15_presentations/smr/AlbertChen_JimMalina_Host_Managed_SMR_revision5.pdf
      4. libzbc. https://github.com/hgst/libzbc
  26. Additional information
      5. Linux SCSI Generic (sg) driver. http://sg.danny.cz/sg/index.html
      6. dm-zoned. https://www.kernel.org/doc/html/latest/admin-guide/device-mapper/dm-zoned.html
      7. Flash-Friendly File System (F2FS). https://www.kernel.org/doc/Documentation/filesystems/f2fs.txt
      8. Zoned storage. https://zonedstorage.io
      9. Linux kernel changes. https://kernelnewbies.org/LinuxVersions
  27. Additional information
      10. Another Layer of Indirection. https://www.linkedin.com/pulse/another-layer-indirection-albert-chen/
      11. Phalanx Flexible I/O tester (fio) benchmarks. https://www.kalista.io/resources/performance/phalanx-fio-benchmarks.pdf
      12. Phalanx Hadoop TestDFSIO benchmarks. https://www.kalista.io/resources/performance/phalanx-hadoop-benchmarks.pdf
      13. Phalanx Ceph OSD and Rados benchmarks. https://www.kalista.io/resources/performance/phalanx-ceph-benchmarks.pdf
  28. Image attributions
      1. Icons from Font Awesome. License available at https://fontawesome.com/license. No modifications made.
  29. Trademarks: Ceph is a trademark or registered trademark of Red Hat, Inc. or its subsidiaries in the United States and other countries. Apache®, Apache Hadoop, Hadoop®, and the yellow elephant logo are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. Linux® is the registered trademark of Linus Torvalds in the U.S. and other countries. Intel and Intel Core are trademarks of Intel Corporation or its subsidiaries. Oracle, Java, and MySQL are registered trademarks of Oracle and/or its affiliates. Docker and the Docker logo are trademarks or registered trademarks of Docker, Inc. in the United States and/or other countries. Docker, Inc. and other parties may also have trademark rights in other terms used herein. MongoDB® is a registered trademark of MongoDB, Inc. VMware, ESX, ESXi, vSphere, vCenter, and vCloud are trademarks or registered trademarks of VMware, Inc. in the United States, other countries, or both. Kubernetes® is a registered trademark of the Linux Foundation in the United States and other countries, and is used pursuant to a license from the Linux Foundation. All other marks are the property of their respective owners.