Igneous - IT Press Tour December 2019

The IT Press Tour

December 13, 2019

Transcript

  1. Christian Smith ([email protected], @csislive), VP of Product, Solutions, Content Marketing, Customer Success

    Live in Seattle: …perpetually caffeinated …never wear a tie
    Career: …animator …software developer …systems engineer …product manager
    Machine generated data solutions: …since 2002 …M&E, EDA, IOT …Life Science, Fintech …HDFS, HPC, Cloud
  2. Agenda

    • Quick Recap from Last time • Current Market Status • Why is file hard? • Igneous Service Offerings – DataProtect – DataDiscover • GTM & Pricing • Igneous Team • Demo when you get tired of hearing me!!
  3. Unstructured Data: 70EB of file on NAS

    Dell EMC Isilon • NetApp • Qumulo • Pure Flashblade • Other NAS (Stornext, GPFS, Lustre, WekaIO, VastData, Quobyte, Windows Server)
    Geospatial data • Design and manufacturing data • Media data • Biotech and pharmaceutical research data • Energy exploration models • Algorithmic trading (HPC)
  4. Life Sciences Data Generation

    | Device | Research area / market | Weekly raw data volume per device | # in the field | 15 yr publication growth |
    |---|---|---|---|---|
    | NGS | Genomics, diagnostics, drug design, many other research areas | 5GB-10TB | 15,000+ | |
    | CryoEM | Drug design, structural biology research | 10TB | 150+ | |
    | Mass Spec | Forensics, food science, drug efficacy, protein research, imaging | 100GB-5TB | 20,000+ | |
    | Lattice Light Sheet | Disease research, time based imaging | 100TB | 10+ | ~60 publications since 2014 |
    | Protein Structure Determination (Xray) | Drug design, structural biology research | 10GB | 150+ | |
  5. Critical inflection points with file: <300TB

    Capacity axis: 300TB • 1PB • 10PB • 100PB
    Use Case: Home Directories • Number of NAS systems: 1 • File Count: 1M to 100M files • IT Staff: Full stack IT
    Environment • Backup Landscape • File Management • Vendor Replication
  6. Critical inflection points with file: <1PB

    Capacity axis: 300TB • 1PB • 10PB • 100PB
    Use Case: Home Directories/Projects • Number of NAS systems: 1-3 • File Count: 100M to 1B files • IT Staff: Full stack IT
    Environment • Backup Landscape • File Management • Vendor Replication
  7. Critical inflection points with file: <10PB

    Capacity axis: 300TB • 1PB • 10PB • 100PB
    Use Case: Machine Generated Data • Number of NAS systems: 2-15 • File Count: 1B to 40B files • IT Staff: Storage IT
    Environment • Backup Landscape • File Management • Vendor Replication
  8. Critical inflection points with file: <100PB

    Capacity axis: 300TB • 1PB • 10PB • 100PB
    Use Case: Machine Generated Data • Number of NAS systems: 10-100 • File Count: 20B to 1T files • IT Staff: NAS IT Teams
    Environment • Backup Landscape • File Management • Vendor Replication
  9. Architected for Scale

    Igneous, a data management architecture that uniformly scales in performance and capacity, from wherever data lives to wherever data needs to live. CONFIDENTIAL - DO NOT DISTRIBUTE
  10. Igneous Core

    Support all file protocols, NAS platforms & clouds • 400K files/sec per task • Compression & traffic shaping • Trillion object index specific to ‘file’ queries
  11. Legacy NAS backup is failing at scale

    High Operational Costs • Missing SLAs • Datacenters are full & Cloud isn’t feasible
  12. Igneous: Modern NAS backup solution

    SaaS • Scalable • Cloud Native • Flexible • Efficient • Anywhere data lives / Anywhere it needs to be
  13. Any Scale with One Management Portal

    Any Location: Multi-NAS • Multi-site • Multi-cloud
    Any Scale: 100TB, millions of files • 100PB, trillions of files
    Limitless capacity with a single instance • Effortless scale-out strategy without siloes • Any scale or geographic sprawl • Managed with a single portal
  14. Performance to hit SLAs

    AdaptiveSCAN™ + IntelliMOVE™ + InfiniteINDEX™: • Scans minimize IOPS • Moves data at line rate • Off-box index for trillions of files
    Latency Aware + Dynamic Throttling: • Continuously monitors latency on primary • Automatically throttles based on latency • Eliminates impacts to other workloads
    [Chart: Igneous operations vs. user operations, axis 1 to 24]
    Meet your backup SLAs for all your file data. No more backup windows.
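The slide above describes throttling driven by latency measured on the primary NAS but gives no algorithm. The following is only a minimal sketch of one common way to do this, additive-increase/multiplicative-decrease (AIMD); it is not Igneous's implementation. The function name `next_rate`, the target latency, and the step and backoff values are all hypothetical.

```python
# Hypothetical AIMD throttle: back off quickly when the primary NAS
# looks busy, ramp up slowly when it has headroom. All parameter
# values are illustrative assumptions, not taken from the deck.

def next_rate(current_rate, observed_latency_ms, target_latency_ms,
              min_rate=1.0, max_rate=1000.0, step=10.0, backoff=0.5):
    """Return the next scan/move rate (operations per second)."""
    if observed_latency_ms > target_latency_ms:
        # Primary is slower than the target: cut the rate multiplicatively.
        return max(min_rate, current_rate * backoff)
    # Primary has headroom: increase the rate additively.
    return min(max_rate, current_rate + step)


def simulate(latencies_ms, start_rate=100.0, target_latency_ms=10.0):
    """Apply next_rate over a sequence of latency samples."""
    rate = start_rate
    history = []
    for latency in latencies_ms:
        rate = next_rate(rate, latency, target_latency_ms)
        history.append(rate)
    return history


if __name__ == "__main__":
    # Quiet period, a latency spike from user workloads, then recovery.
    print(simulate([4, 5, 25, 30, 6, 5]))
```

The asymmetric response is the design point: a backup job yields to user workloads within one or two samples, then reclaims bandwidth gradually once latency falls back under the target.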
  15. Simplified, efficient management

    • API NAS Integration • Automated namespace management • Policy-driven • Search-to-restore: Operationally efficient, Automated, Simple, Cost-effective
    • Configure systems manually • Mount each export • Babysit every backup • Browse catalogs to restore: Time consuming, Manual, Complex, Expensive
  16. Cloud: What’s changed

    Trend #1: Cost of Storage. [Chart: cheapest storage option among AWS, Azure, Google, and on-prem, 2006 to 2018, axis $0 to $350]
    Trend #2: Cost of Bandwidth. 10GbE Direct Connects, Cloud Connect price per month:
    AWS Direct Connect: $1,620/month
    Azure ExpressRoute: $3,400/month
    Google Interconnect: $1,700/month
  17. Legacy vendors are cloud washing

    Disk-to-Disk Method: Every file is a cloud object • Transaction costs are expensive • Expiration is cheap. Example: Backup 1PB of 1MB files, $50,000 in PUT cost
    Tape Backup Method: Encapsulate files in a tar format • Transaction costs are cheap • Expiration is expensive. Example: Tape format, expire 5% files, $62,900 to expire data
    Vendor Replication • Online Cloud Tier
  18. Most efficient cloud usage for backup

    Efficient Storage: Data is compressed inline • Better efficiency of data storage • Data is split into chunks • Files are compacted into chunks • Reduces the number of PUTs
    Intelligent Expiration: As data expires: • Instantly marked deleted (file nonrecoverable) • The chunk is compacted only when the cost of expiration is less than the storage cost
    Parallel Ingest: Multi-thread ingest of data • Keeps networks busy
    1PB of data / 1MB file size: PUT costs = $537 • Expire 5% of data = $315
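The PUT-cost figures on the two slides above ($50,000 when every 1MB file is its own object, $537 when files are packed into chunks) can be approximated with simple arithmetic. The sketch below is not from the deck: the request price of $0.05 per 1,000 PUTs, the 100 MiB chunk size, and the use of binary units are assumptions chosen because they land close to the quoted numbers.

```python
# Back-of-the-envelope PUT cost for uploading 1 PiB to object storage.
# ASSUMPTIONS (not stated in the deck): request price, chunk size, and
# binary units (PiB/MiB) are illustrative guesses.

PIB = 2**50
MIB = 2**20
ASSUMED_PRICE_PER_1000_PUTS = 0.05  # USD, hypothetical


def put_cost(total_bytes, object_bytes,
             price_per_1000=ASSUMED_PRICE_PER_1000_PUTS):
    """Cost in USD of uploading total_bytes as objects of object_bytes."""
    puts = -(-total_bytes // object_bytes)  # ceiling division
    return puts * price_per_1000 / 1000


# One cloud object per 1 MiB file.
per_file_cost = put_cost(PIB, 1 * MIB)

# Files packed into (assumed) 100 MiB chunks before upload.
chunked_cost = put_cost(PIB, 100 * MIB)

if __name__ == "__main__":
    print(f"one object per file: ${per_file_cost:,.0f}")
    print(f"100 MiB chunks:      ${chunked_cost:,.0f}")
```

Under these assumptions the per-file case comes to roughly $53,700 and the chunked case to roughly $537, which suggests the savings come almost entirely from issuing about 100 times fewer PUT requests.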
  19. Customer Story – Altius Institute

    Goal: • Shrink datacenter footprint • Hit SLAs for protecting data • Protect research data
    Environment: • NetApp FAS • 800TB of primary NAS • 692M files • IT – one contract worker
    Result: • Footprint of data in cloud: 1.9PB physical / 2.85PB logical • Restore rate: < 1% per year • Azure Archive Blob Cost: $23,320/year • 30 days on-premises cache to mitigate fast retrieval
  20. Customer Story – Allen Institute

    Goal: • Repurpose secondary Isilon • Minimize IT impact • Deploy robust backup solution
    Environment: • EMC/Dell Isilon • 8PB of primary NAS • 2.3B files • IT – one IT person
    Result: • Streamlined backup (Hit SLAs) • 11PB on-premises protection • Started archiving old projects • Expanded 2 times: grew on-premises capacity, expanded to GCP • Cloud is new expansion model – 4 PB to start
  21. Global Stats from DataDiscover (~10 installs)

    Systems: 74 • Exports: 15,026 • Capacity: 40,845 TB • Files: 57,664 Million
    Data by age: 0-1 Month: 11.25% • 1-6 Months: 17.68% • 6-12 Months: 12.00% • > 1 Year: 59.06% (24.1PB)
  22. Why archiving hasn’t worked

    How do you break this cycle? Visibility • Process not Project • Maximize ROI
  23. Visibility into your data

    Surgical, up-to-date identification of your archivable data • Fact-based decisions for the people who care
    50%+: DataDiscover customers find that more than half their data hasn’t been used in over a year
  24. One Click Archive, Anywhere

    Single solution for all NAS data: • Any protocol • Any NAS system • Any Cloud
    One-click archive: • Notification upon archive completion • No babysitting needed • API integration available
  25. Frictionless data retrieval

    Search and recover archives instantaneously: • Search-to-restore at scale • Retrieve with a single-click from anywhere • Retrieve only what you need • Memorized original archive location
  26. Comparison

    Igneous DataDiscover (NOT A PROJECT): No Configuration • Single global data view • Fast Time to Results (minutes to hours) • No Software to manage • Small stateless VM to deploy • Scan everything • Fast deployment • Built for scale • Automatically discovers new shares
    Others (IT PROJECT): Multi-VM Deployment • NAS needs to be configured • Shares need to be mounted • Schedule needs to be configured • Slow scanning speed • Hardware needs to be deployed • Software needs to be managed • Complex • Lacks scale
  27. Proof Points

    Igneous DataDiscover (NOT A PROJECT): No Configuration • Single global data view • Fast Time to Results (minutes to hours) • No Software to manage • Small stateless VM to deploy • Scan everything • Fast deployment • Built for scale • Automatically discovers new shares
    Customer Environment: Systems: Isilon & Pure Flashblade • Capacity: 7.9PiB • File Count: 19B files • Scan Time: 14 Hours (on active systems) • Avg Files/second: 378,942 files/sec
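The proof-point numbers above are mutually consistent, which a short calculation shows. This sketch uses only the figures quoted on the slide (19B files, 378,942 files/sec, 7.9PiB); the variable names are illustrative, and the file count is treated as exactly 19 billion even though the slide value is rounded.

```python
# Sanity check of the slide 27 scan figures, using the rounded values
# from the slide. Variable names are illustrative.

FILE_COUNT = 19_000_000_000      # "19B files" (rounded on the slide)
FILES_PER_SECOND = 378_942       # "Avg Files/second"
CAPACITY_BYTES = 7.9 * 2**50     # "7.9PiB"

scan_seconds = FILE_COUNT / FILES_PER_SECOND
scan_hours = scan_seconds / 3600

# Average file size implied by the capacity and file count.
avg_file_kib = CAPACITY_BYTES / FILE_COUNT / 1024

if __name__ == "__main__":
    print(f"scan time: {scan_hours:.1f} hours")
    print(f"average file size: {avg_file_kib:.0f} KiB")
```

The scan time works out to just under 14 hours, matching the slide. The implied average file size is under half a MiB, which is the small-file regime where per-file scan rate matters more than raw throughput.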
  28. Maximize ROI

    Example: • 1PB of additional primary capacity needed
    Proposed Expansion: • 1PB of new primary capacity • Additional backup capacity needed to protect new primary capacity. Yearly Cost: Primary Expansion $140,800 • Backup Expansion $86,016 • Total $226,816
    Archive-as-a-Service Alternative: • 1PB of old data discovered, compressed, and archived to AWS Glacier Deep Archive • No new primary or backup capacity required. Yearly Cost: DataDiscover + DataProtect $71,200 • AWS Cost $9,180 • Total $80,380
    Cost reduction: 65% • Annual Savings: $146,436
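The savings figures on this slide follow directly from the two yearly-cost totals. The sketch below reproduces that arithmetic using only the dollar amounts quoted on the slide; the dictionary and variable names are illustrative.

```python
# Reproduce the slide 28 ROI arithmetic from its quoted line items (USD/year).

expansion_costs = {
    "Primary Expansion": 140_800,
    "Backup Expansion": 86_016,
}
archive_costs = {
    "DataDiscover + DataProtect": 71_200,
    "AWS Cost": 9_180,
}

expansion_total = sum(expansion_costs.values())
archive_total = sum(archive_costs.values())

annual_savings = expansion_total - archive_total
cost_reduction = annual_savings / expansion_total

if __name__ == "__main__":
    print(f"expansion total: ${expansion_total:,}")
    print(f"archive total:   ${archive_total:,}")
    print(f"annual savings:  ${annual_savings:,}")
    print(f"cost reduction:  {cost_reduction:.0%}")
```

The totals and the $146,436 savings match the slide exactly; the reduction is about 64.6%, which the slide rounds to 65%.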
  29. What does as-a-Service mean

    DataDiscover: Released March 2019: • Filter or sort by age, size, file count • Scan everything • Browse to old datasets • Dynamically modify date ranges • Toggle mtime and atime • Trigger a rescan of a folder or share • Export data to CSV • One-click archive
    Delivered as-a-Service: Nov, 2019: • No Customer Involvement
  30. Customer Story – Quantum Spatial

    Goal: • Protect data off primary • Manage projects • Deploy robust backup solution • Leverage Cloud
    Environment: • EMC/Isilon • EMC/Unity • 6 sites • 4PB of file data • Constant data motion • Small IT Staff
    What is Geospatial: LIDAR • Digital orthoimagery • HD video & oblique imagery • Multi/hyperspectral imagery • Thermal infrared (TIR) imagery
    [Diagram labels: Site Offload & processing, Protect & Archive, Final Archive, Capture, Visibility]
    Results: • Visibility into managing data • Archive Tier / On-Prem / Cloud • Remote – direct to cloud • Reduce data center space • Deploy efficiently in smaller locations
  31. Customer Story – High Tech Company

    Goal: • Protect billions of files • Reduce operational costs • Enable new capabilities • Manage globally
    Environment: • All NAS • Multi-site • 70PB worldwide • >200B files
    Result: • Visibility through DataDiscover • Protection for dense workflows • Archive to cloud • Scale globally – as-a-Service simplicity
    Visibility • Archive
  32. Making file archive work at scale

    Visibility into your data: Use up-to-date insights into your entire environment to make fact-based decisions
    Process not Project: Archive and retrieve all your file data to your preferred storage with a single, simple solution
    Maximize ROI: Make existing budgets go twice as far by minimizing storage and management costs
    DataProtect + DataDiscover
  33. What’s Next?

    SaaS DataDiscover: Mar 2019 • SaaS DataProtect - AWS: September 2019 • SaaS DataProtect - Azure: October 2019 • SaaS DataProtect - GCP: November 2019 • SaaS DataProtect – S3 • SaaS DataProtect – NFS • Next ??
  34. SaaS Business, Enterprise Focus

    Enterprise Data Customers: • Expected LTV: $4M+ • Large in Size and/or large in Data
    Subscription Pricing: • Annual subscription based on data under management • Land & Expand
    Channel Fulfilled: • Launched in September 2018 • 100% channel
    Alliance Partners: • Extends cloud • Complement next gen data protection • Certified integration investments + API
    Marketplace Fulfilled: • AWS Marketplace • Azure Marketplace