Slide 1

Slide 1 text

OBJECT STORAGE FOR AI @minio

Slide 2

Slide 2 text

ABOUT MINIO
■ Founded in Nov 2014 by AB Periasamy, Garima Kapoor and Harshavardhana
■ Previously founded Gluster File System
■ Headquartered in Palo Alto CA
■ Series A $23.35M
BRIAN STEVENS, MARTEN MICKOS, ANDREW FELDMAN, LANHAM NAPIER, BEN GOLUB, MARK LESLIE, JEFF ROTHSCHILD, STEVE SINGH, GARIMA KAPOOR, MOIZ KOHARI

Slide 3

Slide 3 text

MINIO DEPLOYMENTS 2017
■ DOCKER PULLS: 29.2M
■ CONTRIBUTORS: 241
■ SLACK MEMBERS: 1844

Slide 4

Slide 4 text

MINIO DEPLOYMENTS 2018
■ DOCKER PULLS: 166M
■ CONTRIBUTORS: 308
■ SLACK MEMBERS: 3695

Slide 5

Slide 5 text

OBJECT STORAGE USE CASES
■ MACHINE LEARNING & DEEP LEARNING
■ APPLICATION DATA
■ BACKUP AND RESTORE
■ BIG DATA & ANALYTICS
■ DISASTER RECOVERY
■ ARCHIVE

Slide 6

Slide 6 text

EVOLUTION TIMELINE
[Timeline diagram]
■ Stages: Multi-Cloud Storage, Private Cloud Storage, Object Storage for AI
■ Years: 2015, 2017, 2019

Slide 7

Slide 7 text

COMPETITIVE LANDSCAPE
[Diagram comparing three architectures; labels listed in slide order]
■ TRADITIONAL SCALE-UP STORAGE: Data Warehouse; Compute, Compute, Compute; Network; Storage, Storage
■ MAPREDUCE & HDFS: Big Data; Compute, Compute, Compute; Storage, Storage, Storage
■ SPARK/TENSORFLOW & MINIO OBJECT STORAGE: Network; Data Infrastructure; VM or Container (x3); MINIO (x3)

Slide 8

Slide 8 text

MODERN DATA INFRASTRUCTURE
[Diagram; labels only]
■ STATELESS
■ Interfaces: S3, S3 SELECT, SQL
■ Data Processing, Applications, Machine Learning
■ Streaming Data, Events, Logs, Sensor Data, Social, Transaction
■ Data Infrastructure

Slide 9

Slide 9 text

S3 SELECT
[Diagram: Applications reading data, Before and After S3 SELECT]
■ Up to 400% faster
■ Up to 80% cheaper
*SOURCE: AMAZON AWS

    spark
      .read
      .format("s3selectCSV") // "s3selectJson" for JSON
      .schema(...) // optional, but recommended
      .options(...) // optional
      .load("s3://path/to/my/datafiles")
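The idea behind S3 SELECT is that filtering and projection run next to the data, so only the matching subset crosses the network. The following is a toy model of that idea in plain Python, not the S3 Select API and not MinIO code; the sample object, the `select` helper, and its predicate argument are all hypothetical.

```python
import csv
import io

# Hypothetical sample object; in practice this would live in object storage.
CSV_OBJECT = "id,city,temp\n1,Oslo,3\n2,Cairo,31\n3,Lima,19\n4,Dubai,38\n"


def full_download(obj):
    """'Before': the whole object is sent to the application."""
    return obj


def select(obj, columns, predicate):
    """'After': filter rows and project columns on the storage side,
    returning only the matching subset as CSV text."""
    reader = csv.DictReader(io.StringIO(obj))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        if predicate(row):
            writer.writerow([row[c] for c in columns])
    return out.getvalue()


before = full_download(CSV_OBJECT)
after = select(CSV_OBJECT, ["city"], lambda r: int(r["temp"]) > 30)

print(len(before), "bytes sent without pushdown")
print(len(after), "bytes sent with pushdown")
print(after)
```

The saving depends entirely on how selective the query is; the percentages quoted on the slide are the vendor's figures and are not reproduced by this sketch.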

Slide 10

Slide 10 text

PERFORMANCE
■ All-Flash QLC Optimized
  ○ Skylake CPU
  ○ High Capacity NVMe Drives
  ○ Mellanox CX5 dual 100-GbE
  ○ Aggregate multiple servers into Terabit bandwidth
■ Performance-critical algorithms accelerated with AVX2/AVX512
  ○ Erasure Coding
  ○ Hashing for bit-rot detection

Slide 11

Slide 11 text

MINIO Performance
Samsung@The Heart of Everything
[Diagram; labels only]
■ 4 x NKV-Server + NKV-Fabric Manager: 2 x CPU, 24 x 4 TB KVSSD, 3 x 100 GbE
■ 8 x MINIO Server: 2 x CPU, 1 x 100 GbE
■ NVMeoF Fabric, NKV-Fabric
■ Applications, S3 API, MINIO Server, NKV-SDK, KV
■ Easy to Use & Easy to Manage

Slide 12

Slide 12 text

PARTNERSHIPS

Slide 13

Slide 13 text

DEMO

Slide 14

Slide 14 text

SUBNET
Subscription network for production infrastructure
■ Minio object storage suite: Full-featured Minio Object Storage suite (server, gateway, client, SDKs) with access to complete source code under the Apache v2 license. The open source license gives customers the freedom to innovate and avoid vendor lock-in.
■ Security updates and bug fixes: Avoid costly incidents through timely security updates and bug fixes, individually tracked and prioritized for your environment.
■ Private continuous integration and continuous delivery: Operate your cloud storage like a public cloud environment, with a personalized CI/CD schedule. Extend our test environment with application-specific tests and catch bugs earlier.
■ New features: As part of the Minio subscription, customers receive significant releases and new features at no extra charge.
■ Remote monitoring and diagnostics: Minio One securely monitors your server logs for issues and proactively generates trouble tickets for engineering support review. Optionally, receive alerts via SMS and email.
■ Real-time engineering support: Chat directly with level 3/4 engineering support at any time (24 x 7 x 365) and receive the fastest resolution to your support incidents.

Slide 15

Slide 15 text

CUSTOMER - CTO
SCOTT MCCLELLAN
■ VP/CTO of Business Transformation Office of CTO
■ VP/Chief Technologist and VP of Engineering - Cloud Services

Slide 16

Slide 16 text

THANK YOU @minio https://github.com/minio/minio https://slack.minio.io https://minio.io

Slide 17

Slide 17 text

EXPONENTIAL DATA GROWTH
90% of the world’s data was created in the last two years.
[Chart; time axis starts at 2010]
■ 16ZB 2017
■ 163ZB 2025
■ Series: BIG DATA (Hadoop HDFS), UNSTRUCTURED DATA (Object Storage), STRUCTURED DATA (HCI)
*SOURCE: IBM & IDC, April 2017

Slide 18

Slide 18 text

WHAT IS MINIO?
Minio is a high performance distributed object storage server, designed for large-scale data infrastructure.
■ MINIO SERVER
■ MINIO CLIENT
■ MINIO SDK

Slide 19

Slide 19 text

MINIO FEATURES
■ S3 COMPATIBLE: Amazon S3 API is the de facto standard for object storage. Minio implements Amazon S3 v2/v4 API.
■ PLUGGABLE MULTI-CLOUD BACK-END: Minio has a pluggable back-end that integrates with external storage backends such as NAS, Google Cloud Storage, and Azure Blob Storage.
■ LAMBDA COMPUTE: Minio server triggers Lambda functions through its event notification service. OCR and audit compliance are good examples of lambda computing.
■ ERASURE CODE & BITROT PROTECTION: You may lose up to half the number of drives and still recover. Data protection code is accelerated using SIMD instructions on x64 and ARM CPUs.
■ FEDERATION: Federation allows combining an unlimited number of Minio instances to form a single global namespace.
■ CACHE: The disk-cache feature allows content to be closer to the tenants. This enables objects to be delivered with high performance and reduced bandwidth cost.
■ SINGLE SIGN-ON: Minio Server integrates with Identity Providers such as WSO2, Keycloak, Okta, and Ping Identity to allow applications or users to authenticate and use Object Storage.
■ ENCRYPTION & TAMPER-PROOF: Minio provides confidentiality, integrity and authenticity assurances for encrypted data with negligible performance overhead.
■ WORM: Minio supports WORM (Write-Once-Read-Many) for long-term data retention as mandated by regulations or compliance rules.
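The "lose up to half the number of drives" claim follows from how erasure coding with data and parity shards works in general: with k data shards and m parity shards, any k of the k + m shards are enough to rebuild the object. The sketch below only does that arithmetic. The function name and the 16-drive example are hypothetical, and it says nothing about Minio's quorum rules for reads and writes.

```python
def erasure_profile(data, parity):
    """Summarize a data/parity split for an erasure-coded set of drives."""
    if data < 1 or parity < 0:
        raise ValueError("need at least one data shard and no negative parity")
    total = data + parity
    return {
        "drives": total,
        # Any `data` shards out of `total` suffice, so `parity` may be lost.
        "tolerated_failures": parity,
        # Fraction of raw capacity that holds user data.
        "usable_fraction": data / total,
    }


# Hypothetical 16-drive set split evenly between data and parity.
profile = erasure_profile(data=8, parity=8)
print(profile)
```

An even split gives the "half the drives" tolerance at the cost of half the raw capacity; fewer parity shards trade tolerance for usable space.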

Slide 20

Slide 20 text

MINIO ARCHITECTURE
[Diagram: SERVER 1, SERVER 2, SERVER 3 ... SERVER 32 behind a 100 GbE SWITCH, serving OBJECT 1 and OBJECT 2 over the S3 API]

    minio server http://host{1...32}/export{1...32}
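The `{1...32}` ranges in the command above describe 32 hosts with 32 exports each. As a rough illustration of what that notation enumerates, here is a small Python sketch that expands such a pattern. The `expand` helper is hypothetical and is not Minio's parser; how the server orders or groups the resulting endpoints is not covered.

```python
import itertools
import re

# Matches ranges written as {start...end}, as in the slide's command line.
_RANGE = re.compile(r"\{(\d+)\.\.\.(\d+)\}")


def expand(pattern):
    """Expand every {a...b} range in `pattern` into concrete endpoints."""
    ranges = [(int(a), int(b)) for a, b in _RANGE.findall(pattern)]
    template = _RANGE.sub("{}", pattern)
    choices = [range(lo, hi + 1) for lo, hi in ranges]
    return [template.format(*combo) for combo in itertools.product(*choices)]


endpoints = expand("http://host{1...32}/export{1...32}")
print(len(endpoints))   # 1024 (32 hosts x 32 exports)
print(endpoints[0])     # http://host1/export1
print(endpoints[-1])    # http://host32/export32
```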

Slide 21

Slide 21 text

ERASURE CODE INTERNALS
export-xl/
  Disk1: MyBucket / MyObject / part.1, xl.json
  Disk2: MyBucket / MyObject / part.1, xl.json
  Disk3: MyBucket / MyObject / part.1, xl.json
  Disk4: MyBucket / MyObject / part.1, xl.json

xl.json:

    {
      "version": "1.0.1",
      "format": "xl",
      "stat": {
        "size": 2286,
        "modTime": "2017-12-02T00:24:20.975968336Z"
      },
      "erasure": {
        "algorithm": "klauspost/reedsolomon/vandermonde",
        "data": 2,
        "parity": 2,
        "blockSize": 10485760,
        "index": 2,
        "distribution": [ 2, 3, 4, 1 ],
        "checksum": [
          {
            "name": "part.1",
            "algorithm": "blake2b",
            "hash": "c24fa0451fd85a3a482c...b672b7f"
          }
        ]
      },
      "minio": {
        "release": "DEVELOPMENT.GOGET"
      },
      "meta": {
        "content-type": "application/octet-stream",
        "etag": "c1d217c52d44c9eab00e81496b2b91b6"
      },
      "parts": [
        {
          "number": 1,
          "name": "part.1",
          "etag": "",
          "size": 2286
        }
      ]
    }
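The `checksum` entry in xl.json records a blake2b hash per part, which is what makes bit-rot detectable: a part whose current hash no longer matches the recorded one is treated as damaged. A minimal sketch of that check using Python's standard `hashlib` follows. The digest size and any keying Minio uses are not stated on the slide, so the default 64-byte blake2b digest here is an assumption, and the helper names and sample bytes are hypothetical.

```python
import hashlib


def part_checksum(data):
    """Hex blake2b digest of a part (default 64-byte digest, an assumption)."""
    return hashlib.blake2b(data).hexdigest()


def verify_part(data, expected_hex):
    """Return True if the part still matches the checksum recorded for it."""
    return part_checksum(data) == expected_hex


original = b"example shard contents"
recorded = part_checksum(original)          # stored alongside the part
corrupted = b"example shard c0ntents"       # one byte silently changed

print(verify_part(original, recorded))      # True
print(verify_part(corrupted, recorded))     # False
```

When a check like this fails on one disk, the erasure-coded shards on the remaining disks are what allow the damaged part to be rebuilt.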

Slide 22

Slide 22 text

SCALABILITY
[Diagram: DATA CENTER 1, DATA CENTER 2, DATA CENTER 3; ETCD; DNS; regions US-EAST, US-WEST, EU-1, EU-2]
■ Heterogeneous scalability
■ Planet-scale namespace with federation
■ Limit your failure domain to 32 servers maximum

Slide 23

Slide 23 text

CONTINUOUS DISASTER RECOVERY
[Diagram: PRIMARY SITE connected over the network to SECONDARY SITE]

    mc mirror --watch --recursive dc1/ dc2/

Slide 24

Slide 24 text

MINIO LAMBDA FUNCTIONS
[Diagram; labels only]
■ OBJECTS, MINIO
■ ELASTICSEARCH, POSTGRESQL, REDIS, AMQP, NATS IO, WEBHOOKS, MQTT
■ LAMBDA FUNCTION: BATCH PROCESS, AUDIT LOGS, THUMBNAIL, OCR, COMPRESS
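A lambda function in this picture is whatever code consumes the event notification, for example behind a webhook. The sketch below parses an S3-style event payload and picks a function by file suffix. The `Records` / `s3.bucket.name` / `s3.object.key` layout follows the Amazon S3 event message structure; the sample payload, the routing table, and the handler names are hypothetical, and the exact `eventName` string a given server emits may differ.

```python
import json
from urllib.parse import unquote_plus

# Hypothetical routing table: file suffix -> name of the function to run.
HANDLERS = {".png": "thumbnail", ".pdf": "ocr", ".log": "compress"}


def parse_event(payload):
    """Return (event name, bucket, key) for every record in the payload."""
    events = []
    for record in json.loads(payload).get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3-style events.
        key = unquote_plus(s3["object"]["key"])
        events.append((record["eventName"], s3["bucket"]["name"], key))
    return events


def route(key):
    """Pick a function name by suffix, falling back to audit logging."""
    for suffix, handler in HANDLERS.items():
        if key.endswith(suffix):
            return handler
    return "audit-log"


# Hypothetical payload, trimmed to the fields used above.
SAMPLE = json.dumps({
    "Records": [{
        "eventName": "s3:ObjectCreated:Put",
        "s3": {
            "bucket": {"name": "mybucket"},
            "object": {"key": "scans/page+1.png", "size": 2286},
        },
    }]
})

for name, bucket, key in parse_event(SAMPLE):
    print(name, bucket, key, "->", route(key))
```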

Slide 25

Slide 25 text

IDENTITY & ACCESS MANAGEMENT
[Diagram: APPLICATION, IDENTITY PROVIDER (IdP), MINIO]
1 Client grant (OpenID Connect)
2 Token
3 Token (STS)
4 Temporary credentials
5 Get object (S3)
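Step 3 of the flow, where the application hands the IdP token to the STS endpoint, can be sketched as building a query-string request. The slide does not name the STS call, so this assumes the AWS STS `AssumeRoleWithWebIdentity` parameter names (`Action`, `Version`, `WebIdentityToken`, `DurationSeconds`); the endpoint URL, the token value, and the helper name are hypothetical, and no request is actually sent.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def sts_request_url(endpoint, token, duration_seconds=3600):
    """Build an STS request URL that exchanges an IdP token for
    temporary credentials (parameter names assumed from AWS STS)."""
    query = urlencode({
        "Action": "AssumeRoleWithWebIdentity",
        "Version": "2011-06-15",
        "WebIdentityToken": token,
        "DurationSeconds": str(duration_seconds),
    })
    return f"{endpoint.rstrip('/')}/?{query}"


# Hypothetical endpoint and token, for illustration only.
url = sts_request_url("http://minio.example.com:9000", "eyJ.hypothetical.token")
print(url)
print(parse_qs(urlsplit(url).query)["Action"])
```

The response to such a request would carry the temporary credentials of step 4, which the application then uses to sign the S3 call in step 5.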

Slide 26

Slide 26 text

MINIO Encryption
[Diagram: My Bucket / My Object on MINIO, with KMS; labels only]
■ SSE-S3 Metadata: Random IV, Sealed Object Key, KMS Key ID, Sealed KMS Data Key, Algorithm Name
■ SSE-C Metadata: Random IV, Algorithm Name, Sealed Object Key
■ Data

Slide 27

Slide 27 text

MINIO Encryption - SSE-C
[Diagram: S3 Client to MINIO (MyBucket) over HTTPS; labels only]
■ HTTP Headers: X-Amz-...-Customer-Algorithm, X-Amz-...-Customer-Key, X-Amz-...-Customer-Key-MD5
■ HTTP Body: My Object
■ SSE Key: Sent by client
■ Object Key: Generated randomly
■ IV: Generated randomly
■ Sealed Object Key; Object Name; Bucket Name
■ Metadata: IV, Algorithm Name, Sealed Object Key
■ Object Data
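On the client side, SSE-C comes down to sending three headers with each request. The sketch below builds them the way the Amazon S3 API documents them: the algorithm is `AES256`, the key header carries the base64 of a 256-bit key, and the MD5 header carries the base64 of the key's MD5 digest. The full header names are taken from the S3 documentation rather than from the abbreviated names on the slide, and the helper name is hypothetical.

```python
import base64
import hashlib
import os


def sse_c_headers(key):
    """Build the three SSE-C request headers for a 256-bit client key."""
    if len(key) != 32:
        raise ValueError("SSE-C expects a 32-byte (256-bit) key")
    key_md5 = hashlib.md5(key).digest()  # integrity check on the key itself
    return {
        "X-Amz-Server-Side-Encryption-Customer-Algorithm": "AES256",
        "X-Amz-Server-Side-Encryption-Customer-Key":
            base64.b64encode(key).decode("ascii"),
        "X-Amz-Server-Side-Encryption-Customer-Key-MD5":
            base64.b64encode(key_md5).decode("ascii"),
    }


client_key = os.urandom(32)  # the "SSE Key" the client keeps and resends
headers = sse_c_headers(client_key)
for name in headers:
    print(name)
```

Because the key travels in a header, these requests only make sense over HTTPS, as the diagram shows; the same headers must accompany later reads of the object.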

Slide 28

Slide 28 text

MINIO Encryption - SSE-S3
[Diagram: S3 Client to MINIO (My Bucket) over HTTPS, with KMS; labels only]
■ HTTP Headers: X-Amz-Server-Side-Encryption: AES256
■ HTTP Body: My Object
■ Generated randomly (two labels); Object Key; KMS Data Key; IV
■ KMS: Master Key 1, KMS Key ID, KMS Sealed Key
■ Sealed Object Key; Object Name; Bucket Name
■ Metadata: IV, Algorithm Name, Sealed Object Key
■ Object Data

Slide 29

Slide 29 text

DEVELOPER TRACTION
GitHub Stars trendline for Open-source Object Storage
■ Docker: 51,350
■ Kubernetes: 44,674
■ Minio: 13,803
■ Apache/Kafka: 10,354
■ Apache/Mesos: 3,962
*as of Dec 3rd, 2018

Slide 30

Slide 30 text

MINIO ON NAS (gateway mode)
[Diagram; labels only: APPLICATIONS, S3, NFS, NFS, NAS file server]

Slide 31

Slide 31 text

APPLICATIONS USING MINIO
■ REAL-TIME ANALYTICS
■ EXTRACT TRANSFORM LOAD
■ PATTERN RECOGNITION
■ PREDICTIVE ANALYTICS
■ LOG ANALYSIS
■ CLICKSTREAM ANALYSIS

Slide 32

Slide 32 text

MINIO REFERENCE HARDWARE
■ Dell R740xd
■ Cisco UCS C240
■ Lenovo SR650
■ HPE ProLiant DL380