Slide 1

Heads will Fly! Thoughts on Storage Systems
Tom Lyon
For Brocade Communications
10/2/2003

Slide 2

Backup
And hope you never have to restore!

Slide 3

The Backup Paradox
- Old filesystems: random updates, sequential reads
- New filesystems: sequential updates (log), random reads
- Walking the filesystem to back up causes extreme overhead
- Is snapshot/rollback enough?

Slide 4

Backup
- Most users perform backups ‘religiously’
- It’s a miracle if they can restore things!
- Restores can be a major source of unavailability
- Allocity – recovery for MS Exchange

Slide 5

Capacity

Slide 6

Storage vs. Network
- Individual drives approach 500 Mbps
- A 10 Gbps “non-blocking” fabric is only 20 drives
- PCI Express, 16 lanes wide = 32 Gbps
- We need 40 to 100 Gbps Ethernet (or FC) soon!
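
A quick back-of-the-envelope check of the drive-count arithmetic above, as a minimal Python sketch (the 500 Mbps per-drive rate is the slide’s figure, and the 40/100 Gbps line rates are the ones the slide asks for, not products that existed in 2003):

# How many ~500 Mbps drives does it take to fill a given link?
drive_mbps = 500                      # per-drive streaming rate claimed above
links = [("10 GbE / 10G FC", 10), ("40 GbE", 40), ("100 GbE", 100)]
for name, gbps in links:
    drives = gbps * 1000 / drive_mbps
    print(f"{name}: ~{drives:.0f} drives saturate the link")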

Slide 7

Network vs. Storage

Network     Megabits/Second   Terabytes/Day   Petabytes/Year
T1                      1.5            0.02             0.01
Ethernet               10              0.11             0.04
T3                     45              0.49             0.18
FastEther             100              1.08             0.39
GigEther             1000             10.80             3.94
10Gig               10000            108.00            39.42
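
A short Python sketch reproducing the conversions in the table above (assuming decimal units, 1 TB = 10^12 bytes, and a 365-day year, which matches the figures shown):

# Convert a link rate in megabits/second to terabytes/day and petabytes/year.
links = {"T1": 1.5, "Ethernet": 10, "T3": 45,
         "FastEther": 100, "GigEther": 1000, "10Gig": 10000}
for name, mbps in links.items():
    tb_per_day = mbps * 1e6 / 8 * 86400 / 1e12    # bits/s -> bytes/s -> TB/day
    pb_per_year = tb_per_day * 365 / 1000         # TB/day -> PB/year
    print(f"{name:10s} {mbps:8.1f} Mbps {tb_per_day:7.2f} TB/day {pb_per_year:6.2f} PB/year")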

Slide 8

Capture Applications
- Gathering workload information
- Network sanity analysis
- Wire-tapping / “Legal Intercept”
- Intrusion detection / analysis
- Forensic evidence gathering
- Capture, timestamp, encrypt, …
- Video capture for security cameras

Slide 9

Switched I/O

Slide 10

New I/O
- PCI Express – serialized PCI
- PCI Express/Advanced Switching
- Peer-to-peer host & devices
- Infiniband

Slide 11

Peer-to-Peer I/O
- Most busses allow peer-to-peer I/O device transfers
  - Multibus, VME, PCI, …
- Switched I/O allows peer-to-peer host transfers
  - PCI Express, Infiniband, some CompactPCI
- The management framework is usually insufficient
  - I2O was an attempted solution for PCI
  - Too complex; it failed
- P2P I/O never happens; P2P host traffic uses IP

Slide 12

RDMA
- RDMA is peer-to-peer host & memory access
- The datapath can be optimized
- But control is horribly complex
- DMA never happened between hosts on a bus, so why would it happen between hosts on a network?
- DMA between device & host works when the device is dedicated to the host

Slide 13

Virtual Everything
(This slide does not exist)

Slide 14

Virtual Storage
- Virtual tapes on drives
- Virtual drives with RAID
- Virtual volumes – expandable, replicated, cached in memory
- Virtual filesystems, virtual filers
- Virtual files – /proc/…
- Virtual databases – ODBC proxy/cache/replication

Slide 15

Virtualizing Interfaces
- [host] --iface1-- [drive]
- [host] --iface1-- [new logic] --iface2-- [drive]
- If iface1 = iface2, then proxy
- Else, translator
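
The proxy-vs-translator rule above is simple enough to state as code; a minimal Python sketch (the function name and example interface pairings are illustrative, drawn from the proxy and front-end/back-end examples on the later slides rather than stated here):

# Interposed logic that speaks the same interface on both sides is a proxy;
# logic that speaks different interfaces on the host and drive sides is a translator.
def classify(host_side_iface, drive_side_iface):
    return "proxy" if host_side_iface == drive_side_iface else "translator"

print(classify("FC", "FC"))      # proxy, e.g. a switch or a cache
print(classify("SATA", "FC"))    # translator, e.g. a SATA front end with an FC back end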

Slide 16

Proxies
- Switches – simple transport
- Caches
- LUN mapping
- Many others

Slide 17

HW vs SW Interfaces
- ATA and SATA HBAs are software compatible back to the dark ages – HW interfaces
- SCSI, FC, and iSCSI all require vendor-unique OS support – SW interfaces
- Deployment of a HW interface is quick & painless; SW is slow and complex

Slide 18

Low Level Interfaces

Interface   Type   Value-Add
ATA         HW     low
SATA        HW     low
SCSI        SW     med
iSCSI       SW     hi
FC          SW     hi

Slide 19

Translations

From \ To   ATA   SATA   SCSI   iSCSI   FC
ATA          R     ?      x      x      x
SATA         R     ?      x      !      !
SCSI         R     R      R      x      pl
iSCSI        R     R      R      ?      sr
FC           R     R      R      ?      R

Slide 20

HyperSCSI
- From SAN -> Internet (iSCSI) is a giant leap
- SAN -> LAN (HyperSCSI) is a smaller step
- Non-IP (not politically correct)
- Avoids TCP overhead, IP admin & security problems
http://nst.dsi.a-star.edu.sg/mcsa/hyperscsi/
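
To make “non-IP, no TCP” concrete, here is a minimal Linux-only Python sketch of sending a payload directly in an Ethernet frame (illustrative only: the EtherType is an IEEE local-experimental value used as a stand-in, and the payload is a placeholder, not the actual HyperSCSI frame format; raw sockets require root):

# Send a payload straight over Ethernet, with no IP or TCP in the path.
import socket, struct

ETHERTYPE = 0x88B5                       # stand-in (IEEE local experimental), not HyperSCSI's
iface = "eth0"                           # assumed interface name

s = socket.socket(socket.AF_PACKET, socket.SOCK_RAW)
s.bind((iface, 0))
src_mac = s.getsockname()[4]             # our MAC address, from the bound interface
dst_mac = bytes.fromhex("ffffffffffff")  # broadcast, for illustration only
payload = b"SCSI command would go here"  # placeholder, not a real HyperSCSI PDU
s.send(dst_mac + src_mac + struct.pack("!H", ETHERTYPE) + payload)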

Slide 21

What I Want
- Convert a SATA front end (highly compatible) into FC/iSCSI back ends (high value add)
- An ASIC for SATA -> HyperSCSI aggregation
- Build a blade-server system to replace the typical rack of single-disk servers
- A rich back end for HyperSCSI -> FC or iSCSI

Slide 22

High Level Interfaces
- NFS & CIFS
- HTTP & WebDAV
- NDMP