Slide 1

Slide 1 text

Big Data, AI and the hybrid cloud challenges
Pascal de Wild, Account Technology Strategist
12 September 2023

Slide 2

Slide 2 text

What are Big Data Sets?
• Data Logging: sensor, transaction and a multitude of IT data, logging everything, everywhere
• Medical Imaging: growing and maintaining all data
• ….
© 2023 NetApp, Inc. All rights reserved.

Slide 3

Slide 3 text

Why is it important?

Slide 4

Slide 4 text

Why is it important?
• AI: Big data is the ultimate source for AI projects
Data fuels…

Slide 5

Slide 5 text

AI rewards are substantial…
• 7X: companies that succeed at AI will be fastest-growing (Forrester)
• 73% of execs need AI to succeed! (Deloitte)

Slide 6

Slide 6 text

…but success is elusive
• 76% of execs acknowledge that they struggle to scale AI successfully
• Only 18% of companies have implemented multiple enterprise AI projects
(Deloitte, PwC)

Slide 7

Slide 7 text

Unlocking AI’s power
• Model training
• Inference
• Data prep

Slide 8

Slide 8 text

The best data for AI is data that moves freely
Build a data fabric that meets you where you are and where you want to be
Edge / Core / Cloud
• Ingest: data collection, edge-level AI
• Data preparation (unified data lake): aggregation, normalization
• Training (training cluster, training sets): exploration, training
• Deployment (repository, test): deployment, model serving
• Analysis & tiering: cloud AI (GPU instances), data tiering

Slide 9

Slide 9 text

Longstanding Partnership
• Partners since 2018
• Hundreds of joint customers
• Complementary solution offerings
• Aligned worldwide
Ecosystem partners
• 35+ joint resellers, plus ISV, MSP, and SDP partners
• Joint solution development
• ONTAP AI Reference Architecture
• ONTAP AI integrated solution (Arrow, TD/S, Arrow UK)
• DGX Cloud
• LaunchPad (PoC environment)
• DGX SuperPOD
• Industry use-case-based solutions

Slide 10

Slide 10 text

Where is Big Data deployed?
• Completely on-premises
• Cloud only
• Hybrid
Enterprise customers – moving from on-premises towards hybrid
Startups – tend to be cloud only

Slide 13

Slide 13 text

Enterprise Data Swamp or Data Lake?
Photo by James Wheeler
Photo by Henning Roettger

Slide 14

Slide 14 text

A trip down memory lane: Active IQ v0
• An Oracle data warehouse until 2012
• Until it couldn’t scale any longer
• 100 TB of data per year
• 2017 – 100 TB per month
• Doubling every 12-18 months
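The growth figures on this slide can be sanity-checked with a little arithmetic. The sketch below is illustrative only: it assumes (our reading, not stated on the slide) that the 100 TB per year rate applies to 2012 and that growth between 2012 and 2017 was steadily exponential. The function name `doubling_time_months` is hypothetical.

```python
import math


def doubling_time_months(rate_start, rate_end, months_elapsed):
    """Months per doubling, assuming steady exponential growth between two rates."""
    doublings = math.log2(rate_end / rate_start)
    return months_elapsed / doublings


# Assumption: ~100 TB/year in 2012, expressed per month so both rates share a unit.
rate_2012 = 100 / 12          # TB per month
rate_2017 = 100.0             # TB per month (figure from the slide)
months = (2017 - 2012) * 12   # 60 months between the two data points

result = doubling_time_months(rate_2012, rate_2017, months)
print(f"Implied doubling time: {result:.1f} months")
```

Under these assumptions the ingest rate grew twelvefold in five years, which implies a doubling time of roughly 16.7 months, inside the 12-18 month range quoted on the slide.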

Slide 15

Slide 15 text

Active IQ: Version 1 Architecture
[Architecture diagram labels: Data Processing Pipeline; Stream Analytics; Users; Hadoop Data Lake; Cassandra; Oracle Db; Web Tier, App Server; Telemetry Data; Data Lake; DAS; Analytics & Reporting; E-series/HDFS; HDFS]

Slide 16

Slide 16 text

Active IQ: Version 2 - On-premises architecture
[Architecture diagram labels: Data Processing Pipeline; Stream Analytics; Users; Hadoop Data Lake; Cassandra; Oracle Db; Web Tier, App Server; Telemetry Data; Data Lake; Analytics & Reporting; ONTAP 9 AFF; StorageGRID; ONTAP 9 Disk; NFS Connector; S3 Tiering; DR; Archive; DR copy of data]

Slide 17

Slide 17 text

Challenges with on-premises architecture
• Inflexible
  • Difficult to change database schemas
  • Fixed ratio of compute to storage
• Not well suited for machine learning
  • Need expensive compute in bursts
• Costly
  • Software license and support fees
  • Severe underutilization of hardware
• IT support
  • Months to upgrade systems
  • Months to deploy new applications

Slide 18

Slide 18 text

Active IQ v3: On-prem and in the hybrid cloud
Unleashing the power of data with Data Fabric and cloud analytics
[Architecture diagram labels: NetApp In-Place Analytics Module; NFS; On Premises; SnapMirror®; Spark Cluster; Kafka; NetApp Telemetry; IoT Messages; Ingest (Edge); Performance; Development (Cloud); Secure Replica; Capacity; Replication; Disaster Copy; Analytics (Core); Data Lake; HDInsight; Databricks/EMR; Express Route; Direct Connect]

Slide 19

Slide 19 text

Data Platform v3: Serverless Analytics
Serverless analytics in any public and private cloud
[Architecture diagram labels: Thin clones; Analytics (Core); Data Lake; Local Snapshots; GPU and CPU; Kafka; NetApp Telemetry; IoT Messages; Ingest (Edge); Development and Analytics (Cloud); SnapMirror® Replication; Disaster Copy]

Slide 20

Slide 20 text

The NetApp Active IQ Ecosystem
Predictive analytics driven by a very large, diverse dataset
• >300,000 endpoints
• >200 billion data points processed daily
• 400 TB telemetry data processed per month
• >5 PB hot data lake
• >>15 PB total storage, growing rapidly

Slide 21

Slide 21 text

There are many roadblocks on the journey to AI maturity
It all comes down to data
Maturity levels (Gartner AI Maturity Curve):
• Awareness: discussions only
• Active: POCs and pilots
• Operational: a few production systems
• Systemic: pervasive AI in digital initiatives
• Transformational: AI in business DNA
Scenario: Start and scale in the cloud
Challenges:
• Fragmented, siloed data
• Data visibility
Scenario: Optimize for cost and performance
Challenges:
• Data security
• Data volume, variety, velocity
• Vendor lock-in

Slide 22

Slide 22 text

Let’s see how we help companies free data

Slide 24

Slide 24 text

NetApp Value-Added Capabilities for AI Model Training
Commodity Storage Conversation
1. IT buyer interprets data science needs
2. Competitors respond with price & performance bid
3. TYPICAL OUTCOME: Commoditized conversation about speeds and feeds
The Data Science Value Conversation
2. Competitors respond with price & performance bid … NetApp responds with value
3. BETTER OUTCOME: NetApp capabilities enhance data science productivity
NEEDS / CAPABILITIES
+ Enhance team productivity
+ Fast model to production
+ Instant cloning
+ Hybrid data pipelines
+ Simplified versioning
+ Model traceability
[Diagram labels: File/Object; Performance; Protect; RE-TRAINING; AUDIT; ARCHIVE; CLONE; VERSIONING; DATA MOVEMENT; DEEP AI EXPERTISE]

Slide 25

Slide 25 text

NetApp DataOps Toolkit
Simplify access to NetApp solutions from Data Science environments
Key Functions:
• (Thin) cloning of datasets
• Deleting datasets
• Creating versions
• Pull content from S3
• Push files to S3
[Jupyter Notebook screenshot]
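To make the idea of thin cloning and versioning concrete, here is a minimal in-memory sketch. It is a toy model, not the NetApp DataOps Toolkit API: the `Dataset` class and its `thin_clone`, `create_version`, and `write` methods are hypothetical names introduced only for illustration. The point it shows is that a thin clone copies a small table of references rather than the data itself, and only diverges where it is written to.

```python
from types import MappingProxyType


class Dataset:
    """Toy dataset volume: maps file names to immutable data blocks (hypothetical model)."""

    def __init__(self, name, blocks=None):
        self.name = name
        # Copy only the name -> block table; the bytes objects themselves are shared.
        self.blocks = dict(blocks or {})
        self.versions = {}

    def thin_clone(self, new_name):
        # The clone starts out referencing exactly the same blocks as its parent.
        return Dataset(new_name, self.blocks)

    def create_version(self, label):
        # A version is a read-only view of a frozen copy of the current block table.
        self.versions[label] = MappingProxyType(dict(self.blocks))
        return self.versions[label]

    def write(self, filename, data):
        # Writing replaces one table entry; all other entries stay shared.
        self.blocks[filename] = data


base = Dataset("images_v1", {"train.bin": b"\x00" * 1024, "labels.csv": b"id,label\n"})
v1 = base.create_version("v1")

clone = base.thin_clone("images_experiment")
clone.write("labels.csv", b"id,label\n1,cat\n")

# The untouched file is still the very same object in both datasets.
shares_train_block = clone.blocks["train.bin"] is base.blocks["train.bin"]
print("train.bin shared between base and clone:", shares_train_block)
print("base labels unchanged:", base.blocks["labels.csv"] == v1["labels.csv"])
```

In this model a data scientist can experiment on the clone while the base dataset and its recorded version stay untouched, which is the workflow the key functions above are meant to support.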

Slide 27

Slide 27 text

ONTAP Reaches 171 GiB/s with GPUDirect Storage
The latest release has dramatically increased the performance of NFS over RDMA and GPUDirect Storage (GDS), and customers like you can now get more than 171 GiB/s from an ONTAP storage cluster to a single NVIDIA DGX A100 compute node. So you can achieve the highest levels of performance for machine learning and deep learning (ML/DL) workloads, using data center–standard protocols and technologies to deliver the simplest deployment and operational experience. If your existing ONTAP systems have the appropriate network adapters, you can add this level of performance with a free upgrade simply by updating to ONTAP 9.12.1 or later.
Source: ONTAP Reaches 171GiB/s with GPUDirect Storage | NetApp Blog
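Because the headline figure is quoted in binary units (GiB/s), a short conversion helps when comparing it with decimal GB/s or network line rates. The sketch below is plain arithmetic; the 10 TiB dataset size is a hypothetical example, and it assumes the headline rate is sustained for the whole read.

```python
GIB = 2**30   # bytes in one gibibyte
TIB = 2**40   # bytes in one tebibyte
GB = 10**9    # bytes in one (decimal) gigabyte

throughput_gib_s = 171                       # headline figure from the slide
throughput_bytes_s = throughput_gib_s * GIB

throughput_gb_s = throughput_bytes_s / GB          # decimal gigabytes per second
throughput_gbit_s = throughput_bytes_s * 8 / GB    # decimal gigabits per second

dataset_tib = 10                             # hypothetical training dataset size
seconds_to_read = dataset_tib * TIB / throughput_bytes_s

print(f"{throughput_gib_s} GiB/s = {throughput_gb_s:.1f} GB/s = {throughput_gbit_s:.0f} Gbit/s")
print(f"One full pass over {dataset_tib} TiB: {seconds_to_read:.1f} s")
```

At that rate, 171 GiB/s corresponds to roughly 183.6 GB/s, and a single pass over a 10 TiB dataset would take about one minute.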

Slide 28

Slide 28 text

Imagine if you could…
• Tame data complexity
• Deploy outcome-based solutions faster and more profitably
• Accelerate and de-risk delivery
• Deliver rapid business value today
• Ensure sustainable returns for tomorrow
...and get high-value practitioners off the bench and into data users faster.

Slide 30

Slide 30 text

Q & A
Stand 129