Lightbits Labs - IT Press Tour #44 June 2022

The IT Press Tour

June 08, 2022

Transcript

  1. Agenda

    Part I: Company Overview, Market Perspectives, Go-to-Market and Key Partnerships, Use Cases. Kam Eshghi, CSO, Lightbits
    Part II: Intel’s Pensive Lake. Gary McCulley, Director, Data Platform Group, Data Storage Technology Business, Intel
    Part III: Technology Deep Dive, Product Demonstration. Abel Gordon, Chief System Architect, Lightbits
    Lightbits Labs Proprietary and Confidential
  2. Lightbits In a Nutshell

    • Founded in 2016 and invented the NVMe/TCP standard
    • Software-defined disaggregated storage for private, edge, and public clouds
    • Strong customer traction w/ proven product-market fit
    Leadership Team
    Make high-performance disaggregated storage simple, agile, and cost-efficient for any cloud.
  3. Enterprise Features, at NVMe Speeds, on Any Server

    Diagram labels: Application Servers, Storage Cluster, Standard TCP/IP Network
    • Flexible: Scale storage independently
    • Lower TCO: Increased flash endurance, data reduction, no hypervisor on storage nodes
    • High performance: High IOPS, consistent low latency
    • Multi-tenant: Single storage cluster can service multiple environments
    • Easy: Existing TCP/IP network, run on commodity servers
    • High availability: SSD-level eRAID, volume replication
  4. Lightbits Cloud Data Platform

    • Reduce TCO w/ disaggregation
    • Deploy on any cloud: private, edge & public
    • Extend flash endurance
    • Increase ROI w/ high performance
    • Scale up or out, dynamically
    • VMware, Kubernetes or OpenStack
    • Enterprise data services and availability
    • Standard infrastructure
    • Software-defined, API-driven architecture
    Key figures:
    • up to 75 million IOPS
    • up to 20x more endurance compared to DAS, SDS, and SAN
    • as low as 160us latency
    • up to 80% lower TCO compared to DAS, SDS, and SAN
    with Intelligent Flash Management
  5. Direct Attached Storage (DAS)

    Diagram labels: Network; Local attached NVMe (x5); Application-based replication and recovery over the network
    DAS Challenges
    • Stranded capacity and IOPS, low flash utilization
    • Applications locked to servers and storage
    • Data recovery is expensive
      ◦ Extended service degradation
      ◦ Severe network impact
    Chart labels: Capacity, Performance; 20%, 80%, 50%, 50%; Wasted, Utilized
  6. Disaggregated Storage using Lightbits Cloud Data Platform

    Lightbits Benefits
    • High storage utilization, both capacity and performance
    • Scale compute and storage independently
    • Fully multi-tenant storage, shared across multiple environments
    • Consistent high performance and low latency, better endurance
    • Intelligent Flash Management, TLC & QLC optimized
    • Low TCO: https://www.lightbitslabs.com/tco-calculator
    Diagram labels: Hypervisor/OS with NVMe/TCP (x3); Standard TCP/IP Network; Lightbits Disaggregated Storage; Storage-based replication and recovery
  7. Independently Scale Storage

    Scale up or out, or both
    Per Storage Server
    • Start partially populated, add additional NVMe drives at any time
    • Add 1 or many at once with no disruption in service
    Per Lightbits Cluster
    • Start with at least 3 storage servers
    • Add additional storage servers to the cluster at any time, online
    • Cluster dynamically rebalances
    Diagram labels: NVMe/TCP Target, Lightbits Cluster
  8. Local flash performance with rich data services

    Data Services
    • Logical volumes w/ online resize
    • Thin provisioning
    • Inline compression
    • Redirect-on-write snapshots and clones
    • Online SSD capacity expansion
    • High performance, consistent low latency
    • Quality of Service (QoS)
    • Encryption (coming)
  9. Clustered Scale Out Architecture

    Cluster Size and Configuration
    • Up to 16 storage servers per cluster
    • Up to 64K volumes per cluster
    • Up to 64K clients per cluster
    • Online automatic node add/remove
    • Automatic volume placement
    • Per-volume ACLs and IP-ACLs
    • Dynamic data rebalancing
    High Availability and Data Protection
    • NVMe multipathing (ANA)
    • Standard NIC bonding support
    • Per-volume replication policies
    • Managed volume backup to S3 (coming)
    • Cloud monitoring service (coming)
    • DELTA log recovery (partial rebuild)
    • SSD failure protection with ElasticRAID
    • Highly available discovery service
    • Highly available API service
  10. Lightbits Technology Differentiation

    Unique architecture delivers high-performance storage that is simple, agile, and efficient
  11. Lightbits for Private Clouds

    “With Lightbits, we get similar performance to DAS, a more resilient environment and we reduce our server footprint by 33%.” - Storage Design Director at large U.S. tech company
    Use Cases
    • High performance multi-tenant block storage
    • Bare metal, VMware, OpenStack and Kubernetes
    Top Business Challenges
    • Reducing IT infrastructure costs
    • Delivering reliable service to multiple business units
    • Aging IT infrastructure, shortage of resources
    • Simplify provisioning of resources
    Lightbits Benefits
    • Reduce TCO by 80%, increase endurance by 20X
    • Secure and reliable multi-tenant data storage with QoS
    • Use existing infrastructure, no need to provision new
    • API-driven, automated software-defined storage
  12. Lightbits for Financial Services

    “We have chosen NVMe/TCP for our cloud… Lightbits is the only vendor who was able to get a storage solution to work in our environment, integrating with OpenStack and Kubernetes.” - Top global investment management firm
    Use Cases
    • Trading Applications
    • On-premises Banking Cloud
    • Datacenter Modernization
    • Financial Risk Management
    Top Business Challenges
    • Rapidly changing consumer banking services
    • Globalization of competition and services
    • Reducing IT budgets, aging IT infrastructure
    Lightbits Benefits
    • API-driven automation enables agility
    • Consistent high performance with scalable architecture
    • Up to 80% lower TCO, 20X endurance improvement
  13. Lightbits for Cloud Service Providers

    “Lightbits reduced our setup time from days to minutes.”
    Use Cases
    • High performance storage for virtual environments
    • Fast and resilient Kubernetes and Tanzu storage
    • Fast provisioning system repository
    Top Business Challenges
    • New revenue streams to increase market share
    • Margin pressure in a highly competitive environment
    • Time to market
    • Shortage of resources
    Lightbits Benefits
    • Consistent high performance and low latency
    • Reduce TCO by up to 80%, increase endurance by 20X
    • API-driven, automated software-defined storage
    • Use existing infrastructure, no need to provision new
  14. Financial Services Cloud Service Provider

    Pain Points:
    • Local flash storage is fast but ephemeral
    • Need Kubernetes integration with multi-tenant isolation
    Lightbits Solution:
    • Resilient storage at local flash performance
    • Independent scaling
    • Secure multi-tenant support
    Diagram labels: Network, Compute, Storage
  15. A Performance and Cost-Efficient Storage Platform

    Intel® Ethernet 800 Series with ADQ Technology
    ▪ High performance for low-latency NVMe/TCP
    Intel® Xeon® Scalable Processors
    ▪ High performance
    ▪ Efficient storage-software
    ▪ VMD: Enterprise class SSD Hot-plug and LED
    Intel® Optane™ Technology
    ▪ Fast NV write buffer and metadata
    ▪ No battery, supercaps or servicing
    ▪ Large memory capacities lower TCO
    Lightbits Labs and Intel Proprietary. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Lightbits Labs, the Lightbits Labs logo and other Lightbits marks are trademarks of Lightbits Labs. Other names and brands may be claimed as the property of others. © Intel Corporation and Lightbits Labs.
  16. High Performance and Acceleration on Intel® Xeon® Scalable Processors

    Lightbits with Intel® Xeon® Scalable Processors
    ▪ Efficient utilization of more cores and higher frequency
    ▪ Higher platform performance
    ▪ Optimal UPI access for increased CPU I/O throughput
    ▪ Enhanced memory performance, capacity and bandwidth
    ▪ More, faster I/O across PCIe lanes
    ▪ Intel® Volume Management Device (VMD)
    ▪ Application Device Queues (ADQ)
  17. Efficiently Use Intel® Optane™ Pmem for the Most Demanding Workloads

    Efficient utilization of Intel® Optane™ Pmem with Lightbits
    ▪ High-capacity and affordable memory to host metadata structures: enables support of large memory for high-capacity storage platforms at lower cost
    ▪ Persistent write buffer partition: low latency, no battery, supercaps or servicing needed
    ▪ Close to memory performance
  18. Improved Predictability, Latency, and Throughput with Intel® Ethernet 800 Series with Application Device Queues (ADQ)

    Lightbits with Intel® Ethernet 800 Series with ADQ: Efficient Storage and Network utilization
    https://www.lightbitslabs.com/ty-wp-scalable-low-latency-nvme-tcp-storage/
    ▪ Application-specific steering and queuing for increased predictability, reduced latency, and improved throughput
    ▪ Filters traffic to dedicated set of queues
    ▪ Provides “express lanes” for traffic to get to and from applications
    With ADQ: Application traffic to a dedicated set of queues
    Without ADQ: Application traffic intermixed with other traffic types
    Intel® Ethernet Network Adapter E810-CQDA2
  19. Lightbits with Intel® Technologies Delivers “Hyperscale Storage For All”

    CSPs Benefits:
    • Scalability, Flexibility: Independent scaling of compute and storage that is easy to deploy over standard TCP/IP networks
    • Efficiency: Improved resource utilization, higher flash endurance, and data reduction at wire speed to reduce TCO
    • Performance: Scale out disaggregated storage that performs like local flash
    Edge Benefits:
    • High Availability: Better protection against SSD failure or storage node failure
    • Efficiency: Flexible hardware configurations (software-defined) with rich data services for data reduction
    • Performance: Low latency disaggregated storage for real-time analytics and performance-sensitive modern database apps
    Private Cloud Benefits:
    • Efficiency: Multi-tenancy, high utilization of lower-cost QLC SSDs (together with Optane) reduces TCO
    • Performance: 'NVMe as a service' delivers logical volumes with high performance and consistent low latency to compute nodes
    • Resilience: Applications unaffected by SSD failures or storage node failures
  20. Lightbits for I/O Intensive Applications

    Public, Private and Edge Cloud Providers
    Databases: SQL, NoSQL, NewSQL, In-Memory
    Analytics: Traditional, Financial, Log Processing
    PaaS, IaaS, SaaS
    Deployment Options: Edge Cloud, Public Cloud, Private Cloud
  21. Lightbits High Level System Architecture

    Layers:
    • Frontend: NVMe/TCP Clustered Target
    • Backend: Intelligent Flash Management & Data Services
    • Scalable Cluster Services
    Component labels: Configuration Management API Endpoints Monitoring, Logging and Alerts Services TCP/IP NVMe HA Service Data Consistency Persistent Write Buffer Resizable Logical Volumes Thin Provisioning Data Reduction (Compression) SSD Hot-swap (add / remove) SSD optimized I/O Scheduler Endurance Optimizer Flash Error Detect/Fix/Rebuild Erasure Coding Data Rebuild Snapshots and Clones QoS Encryption Scalable Cluster Management Placement Engine Health Monitor Data/Load Balancer NVMe Discovery Service Upgrade Manager
  22. Intelligent Flash Management Increases Performance and Endurance

    Intelligent Flash Management with Lightbits
    Diagram labels: Hypervisor/OS with NVMe/TCP (x3); Standard TCP/IP Network; Multiple application streams (sequential, random, VMs, K8s and Linux); Intelligent Flash Management; NVMe Devices
    Key figures:
    • up to 75 million IOPS
    • up to 20x more endurance compared to DAS, SDS, and SAN
    • as low as 160us latency
    • up to 80% lower TCO compared to DAS, SDS, and SAN
  23. High Availability

    Diagram labels: Cluster; Domain A, Domain B, Domain C; NVMe/TCP Target; Local NVMe
    • Failure domain aware
    • Allows for failover between racks/domains
    • Intelligent replication spans failure domains
    • “Blast Radius” (failure domain) definable:
      ◦ Rack
      ◦ Row
      ◦ Power zone
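The idea that replication spans failure domains can be illustrated with a small sketch. This is not Lightbits' placement engine; the function name, the server names, and the "least-used server per domain" policy are all hypothetical, chosen only to show replicas never sharing a rack, row, or power zone.

```python
from collections import defaultdict


def place_replicas(servers, replica_count):
    """Pick `replica_count` servers, each from a different failure domain.

    `servers` maps a server name to (failure_domain, used_capacity_fraction).
    Hypothetical policy: take the least-used server in every domain, then
    keep the domains whose best server is least used.
    """
    by_domain = defaultdict(list)
    for name, (domain, used) in servers.items():
        by_domain[domain].append((used, name))
    if replica_count > len(by_domain):
        raise ValueError("not enough failure domains for the requested replica count")
    # One candidate per domain, so two replicas can never share a domain.
    best_per_domain = sorted(min(candidates) for candidates in by_domain.values())
    return [name for _, name in best_per_domain[:replica_count]]


# Hypothetical six-server cluster spread over three failure domains.
servers = {
    "server-1": ("domain-a", 0.40),
    "server-2": ("domain-a", 0.10),
    "server-3": ("domain-b", 0.30),
    "server-4": ("domain-b", 0.55),
    "server-5": ("domain-c", 0.20),
    "server-6": ("domain-c", 0.25),
}
chosen = place_replicas(servers, 3)
print(chosen)
```

With three replicas and three domains, losing any single domain leaves two copies of the volume reachable, which is the property the slide's "blast radius" wording describes.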
  24. Management & Monitoring

    • REST API and CLI interfaces
    • Multi-tenancy: RBAC, quotas
    • Environments:
      ◦ Kubernetes (CSI, operator)
      ◦ vSphere (vCenter Plugin)
      ◦ OpenStack (Cinder)
    • Prometheus/Grafana monitoring
      ◦ Cluster, node, capacity, performance metrics
    • Rolling cluster upgrades
  25. Lightbits Provisioning for vSphere

    VMFS Model
    Diagram labels: vCenter; Lightbits VCP; ESXi hosts; Cluster 1, Cluster 2, Cluster 3; VMFS Datastores; VMs; NVMe/TCP Targets; Data-path; Control-plane
    vCenter Plugin is deployed as a virtual appliance (VM)
  26. Lightbits Performance for VMware Storage

    Setup: 8 ESXi servers (8 VMs per ESXi server), 3 storage servers, standard TCP/IP network
    Chart: Total IOPs from all clients vs. block size, read/write (Source: ESG White Paper)
    Block Size/Read Write Mix   Total IOPs from all ESXi clients
    4k-0r:100w                  838,055
    4k-70r:30w                  1,946,372
    4k-100r:0w                  3,263,559
    16k-0r:100w                 335,719
    16k-70r:30w                 851,087
    16k-100r:0w                 1,276,339
    64k-0r:100w                 92,086
    64k-70r:30w                 260,576
    64k-100r:0w                 342,948
    Performance can scale even further with additional ESXi servers
    • Single Lightbits storage server can deliver 4M IOPS
    • Dynamically scale out storage cluster to scale performance & capacity
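To put the cluster-wide totals in per-client terms, the short sketch below divides them by the 8 ESXi servers and 8 VMs per server stated on the slide. The pairing of each value with a block size/mix label assumes the labels and values appear in the same order on the chart; the variable and function names are illustrative.

```python
# Total IOPS from all ESXi clients, keyed by block size and read/write mix.
# Assumption: values are paired with labels in the order both appear on the slide.
TOTAL_IOPS = {
    "4k-0r:100w": 838_055,
    "4k-70r:30w": 1_946_372,
    "4k-100r:0w": 3_263_559,
    "16k-0r:100w": 335_719,
    "16k-70r:30w": 851_087,
    "16k-100r:0w": 1_276_339,
    "64k-0r:100w": 92_086,
    "64k-70r:30w": 260_576,
    "64k-100r:0w": 342_948,
}

ESXI_SERVERS = 8    # from the slide
VMS_PER_SERVER = 8  # from the slide


def per_vm_iops(total_iops: float) -> float:
    """Average IOPS per VM, assuming load is spread evenly over all VMs."""
    return total_iops / (ESXI_SERVERS * VMS_PER_SERVER)


for mix, total in TOTAL_IOPS.items():
    print(f"{mix:12s} total {total:>9,d}  "
          f"per ESXi server {total / ESXI_SERVERS:>9,.0f}  "
          f"per VM {per_vm_iops(total):>7,.0f}")
```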
  27. Lightbits Performance

    Setup: 24 Linux clients (2x25 GbE per client, 10 NVMe/TCP volumes per client), 3 storage servers (2x100 GbE per Lightbits server), standard TCP/IP network
    • A 3-server Lightbits cluster can deliver 14M 100% read IOPS
    • Dynamically scale out storage cluster to scale performance & capacity
    Random I/O Workload     Block Size   IOPs   BW
    Read Only               4KB          14M    53GB/s
                            128KB        535K   65GB/s
    70% Read / 30% Write    4KB          9.9M   38GB/s
                            128KB        583K   71GB/s
    50% Read / 50% Write    4KB          8.4M   32GB/s
                            128KB        407K   49GB/s
    Write Only              4KB          6M     23GB/s
                            128KB        220K   27GB/s
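The IOPS and bandwidth figures on this slide are tied together by bandwidth = IOPS x block size. The sketch below recomputes bandwidth from the reported IOPS as a sanity check. It assumes "KB" means KiB (1024 bytes) and "GB/s" means GiB/s, which the slide does not state but which matches the published numbers to within rounding.

```python
KIB = 1024
GIB = 2 ** 30


def bandwidth_gib_per_s(iops: float, block_size_kib: int) -> float:
    """Bandwidth implied by an IOPS figure at a given block size."""
    return iops * block_size_kib * KIB / GIB


# (workload, block size in KiB, IOPS, bandwidth reported on the slide)
ROWS = [
    ("Read Only", 4, 14e6, 53),
    ("Read Only", 128, 535e3, 65),
    ("70% Read / 30% Write", 4, 9.9e6, 38),
    ("70% Read / 30% Write", 128, 583e3, 71),
    ("50% Read / 50% Write", 4, 8.4e6, 32),
    ("50% Read / 50% Write", 128, 407e3, 49),
    ("Write Only", 4, 6e6, 23),
    ("Write Only", 128, 220e3, 27),
]

for workload, block_kib, iops, reported in ROWS:
    implied = bandwidth_gib_per_s(iops, block_kib)
    print(f"{workload:22s} {block_kib:4d}KB  implied {implied:5.1f}  reported {reported}")
```

Every implied value lands within 1 GiB/s of the reported one, so the small-block rows are IOPS-bound while the 128KB rows show the bandwidth ceiling of the setup.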
  28. Consistent and Low Latency

    Setup: 24 Linux clients (2x25 GbE per client, 10 NVMe/TCP volumes per client), 3 storage servers (2x100 GbE per Lightbits server), standard TCP/IP network
    4KB Random I/O Workload   IOPS          R/W     Latency Avg (usec)   Latency 99th (usec)
    100% Read                 3.8 Million   read    158.76               292.86
    70% Read / 30% Write      3.7 Million   read    185.33               378.88
                                            write   109.00               264.19
    50% Read / 50% Write      3.7 Million   read    200.58               428.03
                                            write   118.89               284.67
    100% Write                3.8 Million   write   151.54               370.69
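Little's law (average I/Os in flight = throughput x average latency) gives a rough sense of how much concurrency sits behind these latency figures. The sketch below applies it to the slide's numbers. It assumes the IOPS column is the combined read and write rate and that mixed workloads split exactly as labelled; the per-volume figure simply divides by the 24 x 10 volumes in the setup.

```python
def outstanding_ios(total_iops: float, parts) -> float:
    """Little's law: mean I/Os in flight = throughput x mean latency.

    `parts` is a list of (fraction_of_ios, avg_latency_usec) pairs whose
    fractions sum to 1, e.g. a 70% read / 30% write mix.
    """
    mean_latency_s = sum(fraction * usec for fraction, usec in parts) * 1e-6
    return total_iops * mean_latency_s


VOLUMES = 24 * 10  # 24 clients x 10 NVMe/TCP volumes per client

# workload -> (IOPS, [(fraction, average latency in usec), ...]) from the slide
WORKLOADS = {
    "100% Read": (3.8e6, [(1.0, 158.76)]),
    "70% Read / 30% Write": (3.7e6, [(0.7, 185.33), (0.3, 109.00)]),
    "50% Read / 50% Write": (3.7e6, [(0.5, 200.58), (0.5, 118.89)]),
    "100% Write": (3.8e6, [(1.0, 151.54)]),
}

for name, (iops, parts) in WORKLOADS.items():
    in_flight = outstanding_ios(iops, parts)
    print(f"{name:22s} ~{in_flight:6.1f} I/Os in flight, "
          f"~{in_flight / VOLUMES:.2f} per volume")
```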
  29. Management Operations Do Not Affect I/O Performance

    • Comparing data path performance with and without management operations
    • Tens of management operations per second
    • 10 node Lightbits cluster
    • 200 clients, 8,000 volumes under I/O, 4,000 volumes under management operations
    Chart: Data-path performance with/without management operations; y-axis: Normalized Performance; x-axis: Workload; series: No management operations, 35 management operations per second
  30. Management Operations Do Not Affect I/O Performance

    • Lightbits cluster - 10 nodes
    • 200 clients, 8,000 volumes used to run IOs and 4,000 volumes for constant control plane operations
    Operation   Duration (secs)
    attach      2.77
    create      2.44
    delete      1.76
    detach      2.11
    Chart: Data-path performance with/without control plane operations (4000 control plane volumes); y-axis: BW [GB/s]; x-axis: Block Size/Read Write Mix (70r30w 256k, 50w50r 256k, 30r70w 256k, 70r30w 8k, 50w50r 8k, 30r70w 8k); series: BW, BW with control; data labels: 747, 744, 744, 698, 593, 470, 406, 520, 586, 437, 454, 369, 355
  31. Lightbits vs. Ceph

    • Lightbits performs 2x-16x better than Ceph
    • 3-server Lightbits cluster
    • 12-node Kubernetes cluster, 8 containers running vdbench per node (total of 96)
    • Exactly the same hardware for Ceph and Lightbits
    Source: EG WhitePaper
    Chart: Lightbits vs. Ceph w/ TLC Media, Throughput (in MB/s), higher is better; x-axis: Block Size/Read Write Mix (4k - 100% Read, 4k - 0% Read, 8k - 80% Read, 16k - 70% Read, 32k - 50% Read); series: Ceph TLC, Lightbits TLC
    Chart: Lightbits vs. Ceph w/ QLC Media, Throughput (in MB/s), higher is better; x-axis: Block Size/Read Write Mix (4k - 100% Read, 4k - 0% Read, 8k - 80% Read, 16k - 70% Read, 32k - 50% Read); series: Ceph QLC, Lightbits QLC