A Survey on Efficient Utilization of Emerging Persistent Memory

Winter Semester 2014, Department of Information and Communication Engineering, seminar (rinko) slides

Makoto Shimazu

November 07, 2014

Transcript

  1. Outline

    Background ▪ Trend of Hardware Development
    Emerging Persistent Memory
    Three usages of Persistent Memory ▪ File System ▪ Heap area ▪ Database
    Summary
  3. Development of Computers

    “Fast” is the top goal of computation. How to achieve it:
    ▪ CPU: faster clock / more cores
    ▪ Memory: faster interconnect / more capacity
    ▪ Storage: higher bandwidth / shorter read/write latency
  4. CPU Improvements: Many Core

    ▪ Xeon Phi: 60C/240T, x86 compatible, 1.0GHz, 1TFlops
    ▪ PEZY-SC: 1024C, 733MHz, 1.5TFlops
    ▪ TILE-Gx: 72C, 1.2GHz
    fig left) http://www.intel.co.jp/content/www/jp/ja/processors/xeon/xeon-phi-detail.html center) http://www.tilera.com/products/processors/TILE-Gx_Family right) http://www.pezy.co.jp/news/PEZY_PR_20140905.pdf
  5. Memory Improvements

    Bandwidth: HBM (High Bandwidth Memory) ▪ up to 256GB/s (DDR3: 12.8GB/s, GDDR5: 88GB/s)
    Capacity: 2.5D/3D stacking using TSV (Through Silicon Via) ▪ 128GB/card in 2015
    fig) http://www.eetimes.com/document.asp?doc_id=1279432
  6. Storage Improvements: NVM (Non-Volatile Memory)

    SSD (flash memory) is one kind of NVM
    OpenNVM: http://opennvm.github.io ▪ Flash-aware Linux swap as a transparent extension of DRAM
    Non-volatile storage is still slow!
    fig) http://www.hlnand.com/site/ID/applications (figure labels: Volatile, Durable)
  8. Next Generation NVM: PM (Persistent Memory)

    There are many candidate technologies:
    ▪ Phase Change Memory ▪ Resistive RAM ▪ Spin Transfer Torque RAM ▪ Memristor
    (figure labels: Volatile, Durable!!!, Persistent Memory)
  9. Various Types of PM

                    NAND Flash        PCM              ReRAM            STT-RAM
    Property        Capacitor         Phase            Resistor         Magnet
    Capacity/Chip   ~128Gbit          128Mbit          2Mbit            64Mbit
    Latency         10μs              100ns            10ns             7ns
    Rewrite cycles  10^4 to 10^5      10^6             10^11 to 10^12   10^15
    Manufacturer    Toshiba, SanDisk  Samsung, Micron  Panasonic        Everspin, Toshiba

    Callouts: “Same as DRAM!”, “Much more reliable than flash!”, “No need for wear leveling”
  10. Byte Addressability: Difference between PM and Disk

    ▪ Disk: read/write API, accessed per sector (4 KB or 512 bytes)
    ▪ PM: load/store instructions, accessed per byte (1 byte)
    (figure: load/store to DRAM, read/write to SSD/HDD, load/store to PM; labels: Non-volatile, Data, Cache) fig of HDD/SSD) http://storage-system.fujitsu.com/jp/lib-f/tech/beginner/ssd/
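
The per-sector versus per-byte contrast can be made concrete with a toy model. This is only a sketch under stated assumptions: `BlockDevice`, `PersistentMemory` and the `bytes_transferred` counter are hypothetical names invented here, and a 512-byte sector is assumed. It counts how many bytes move when a single byte is updated.

```python
SECTOR_SIZE = 512  # bytes; assumed here (the slide also mentions 4 KB sectors)


class BlockDevice:
    """Toy disk: data only moves in whole sectors through read/write calls."""

    def __init__(self, num_sectors):
        self.data = bytearray(num_sectors * SECTOR_SIZE)
        self.bytes_transferred = 0

    def read_sector(self, index):
        self.bytes_transferred += SECTOR_SIZE
        start = index * SECTOR_SIZE
        return bytearray(self.data[start:start + SECTOR_SIZE])

    def write_sector(self, index, sector):
        assert len(sector) == SECTOR_SIZE
        self.bytes_transferred += SECTOR_SIZE
        start = index * SECTOR_SIZE
        self.data[start:start + SECTOR_SIZE] = sector


def update_byte_on_disk(device, offset, value):
    """Read-modify-write: one byte changes, but a full sector is read and written back."""
    index, within = divmod(offset, SECTOR_SIZE)
    sector = device.read_sector(index)
    sector[within] = value
    device.write_sector(index, sector)


class PersistentMemory:
    """Toy PM: a flat byte array updated with single-byte stores."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.bytes_transferred = 0

    def store(self, offset, value):
        self.data[offset] = value
        self.bytes_transferred += 1


disk = BlockDevice(num_sectors=4)
update_byte_on_disk(disk, 700, 0x41)

pm = PersistentMemory(4 * SECTOR_SIZE)
pm.store(700, 0x41)
```

Both devices end up with the same byte at offset 700, but the toy disk moved two sectors' worth of data while the toy PM moved one byte.
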
  11. Design Spaces on PM

    Durability ▪ The volatile/non-volatile (V-NV) gap moves up to sit between the cache and PM
    High random access performance ▪ The read/write API is not suitable for PM ▪ The same holds for other data structures designed around disks
  13. PM-Aware File System

    Paper: J. Condit, et al., Better I/O Through Byte-Addressable, Persistent Memory, SOSP’09
    Short summary:
    ▪ Revisit shadow paging: short-circuit shadow paging instead of WAL (Write Ahead Logging)
    ▪ Introduce two important hardware modifications: atomic 8-byte writes and the epoch barrier
  14. PM-Aware File System: WAL (Write Ahead Logging)

    ▪ Logging is needed per operation
    ▪ Unsuitable for byte addressability
    (figure: hello.txt; Snapshot: “Hello World!”; Log: 1: WRITE “RINKO”, 2: WRITE “NOW!!!”; after CRASH!: “Hello World! RINKO NXXXX”; recovered: “Hello World! RINKO NOW!!!”)
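
The hello.txt example can be replayed in a few lines. This is a minimal sketch, not the mechanism of any particular file system: the `(offset, data)` record format and the function names are assumptions made for illustration. Recovery starts from the snapshot and reapplies every logged write, so the torn in-place state left by the crash is never trusted.

```python
def apply_record(state, record):
    """Apply one log record (offset, data) to a bytearray holding the file."""
    offset, data = record
    end = offset + len(data)
    if len(state) < end:
        state.extend(b" " * (end - len(state)))  # grow the file if the write appends
    state[offset:end] = data


def recover(snapshot, log):
    """Rebuild the file from the last snapshot plus the write-ahead log."""
    state = bytearray(snapshot)
    for record in log:
        apply_record(state, record)
    return bytes(state)


snapshot = b"Hello World!"

# Write-ahead rule: each operation is appended to the log before the file is touched.
log = [
    (12, b" RINKO"),   # 1: WRITE "RINKO"
    (18, b" NOW!!!"),  # 2: WRITE "NOW!!!"
]

# State of the file after a crash in the middle of the second in-place write.
torn = b"Hello World! RINKO NXXXX"

recovered = recover(snapshot, log)
```

Every operation costs one extra log write, which is the per-operation overhead the slide points at.
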
  15. PM-Aware File System: Shadow Paging

    A safe and consistent method to modify data
    Three steps: 1: Copy 2: Modify 3: Refer (switch the reference to the new copy)
    Recursive copy!!! (the copy propagates up to the root)
  16. PM-Aware File System: Short-Circuit Shadow Paging

    Introduce atomic 8-byte writes
    (figure labels: ≦ 8 Bytes, File size, Shadow Paging, Atomic 8-byte write)
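
A rough way to picture the short-circuit: the new data block is built off to the side, and the update is committed by overwriting one pointer-sized slot in the parent, which is assumed to be an atomic 8-byte write. Plain shadow paging would instead copy the parent and every ancestor up to the root. The sketch below is a toy model; `PersistentStore`, `update_block` and the `crash_before_commit` flag are hypothetical names used only for illustration.

```python
class PersistentStore:
    """Toy PM: block id -> contents (bytes for data, list of block ids for an index)."""

    def __init__(self):
        self.blocks = {}
        self.next_id = 0

    def alloc(self, contents):
        block_id = self.next_id
        self.next_id += 1
        self.blocks[block_id] = contents
        return block_id


def update_block(store, index_id, slot, new_data, crash_before_commit=False):
    # 1: Copy and 2: Modify. The new block is written next to the old one.
    new_leaf = store.alloc(new_data)
    if crash_before_commit:
        return False  # the new block is unreachable; the old file is intact
    # 3: Refer. A single pointer-sized store (assumed atomic) commits the update.
    store.blocks[index_id][slot] = new_leaf
    return True


def read_file(store, index_id):
    return b"".join(store.blocks[leaf] for leaf in store.blocks[index_id])


store = PersistentStore()
index = store.alloc([store.alloc(b"Hello "), store.alloc(b"World!")])

update_block(store, index, 1, b"RINKO!", crash_before_commit=True)
after_crash = read_file(store, index)

update_block(store, index, 1, b"RINKO!")
after_commit = read_file(store, index)
```

A reader sees either the old block or the new one, never a mix, because the only in-place change is the single pointer store.
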
  17. PM-Aware File System: Cache Eviction

    Unpredictable timing of cache eviction causes inconsistency
    (figure: 1: Append, 2: Atomic Write and 3: Append sit in the cache and are written down to PM)
  18. PM-Aware File System: Epoch Barrier

    Problem: inconsistency caused by the timing of cache eviction
    Cache modifications: introduce the epoch barrier ▪ Software issues the barrier explicitly
    Hardware features (for 8 in-flight epochs) ▪ 1 persistence bit + 3-bit epoch pointer on each cache line ▪ Additional tables to keep information about the epochs
  19. PM-Aware File System: Cache Eviction w/ EB (Epoch Barrier)

    (figure: 1: Append (Epoch 1), ebarrier, 2: Atomic Write (Epoch 2), ebarrier, 3: Append (Epoch 3); the epoch tables track the cache lines, which are written down to PM in epoch order)
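
The ordering rule can be simulated in software. This is a sketch under simplifying assumptions, not the hardware design: `EpochCache` is a hypothetical class, each address is written in only one epoch, and eviction is requested explicitly. Evicting a line first writes back every dirty line from an earlier epoch.

```python
class EpochCache:
    """Toy write-back cache that tags each dirty line with the epoch that wrote it."""

    def __init__(self):
        self.current_epoch = 1
        self.lines = {}       # address -> (epoch, value) for dirty lines
        self.pm_writes = []   # (address, value) in the order they reach PM

    def store(self, address, value):
        # Assumption: an address is only written within a single epoch.
        self.lines[address] = (self.current_epoch, value)

    def ebarrier(self):
        """Software-issued barrier: later stores belong to a new epoch."""
        self.current_epoch += 1

    def evict(self, address):
        epoch, _ = self.lines[address]
        # All dirty lines from earlier epochs must reach PM first, oldest first.
        older = sorted((e, a) for a, (e, _) in self.lines.items() if e < epoch)
        for _, older_address in older:
            self._write_back(older_address)
        self._write_back(address)

    def _write_back(self, address):
        _, value = self.lines.pop(address)
        self.pm_writes.append((address, value))


cache = EpochCache()
cache.store("data[0:5]", "append 1")   # 1: Append (epoch 1)
cache.ebarrier()
cache.store("size", 5)                 # 2: Atomic write (epoch 2)
cache.ebarrier()
cache.store("data[5:9]", "append 3")   # 3: Append (epoch 3)

cache.evict("size")  # the cache decides to evict the epoch-2 line first
```

Even though the epoch-2 line is evicted first, the epoch-1 append reaches PM before it, and the epoch-3 append stays in the cache.
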
  20. PM-Aware File System: Evaluation

    Micro benchmarks ▪ BPFS vs. NTFS (RAM/Disk)
    Epoch barrier ▪ SESC simulation ▪ 1.5x – 2.9x speedup over write-through caching
  22. Heap on PM

    Paper: J. Coburn, et al., NV-Heaps: Making Persistent Objects Fast and Safe with Next-Generation, Non-Volatile Memories, ASPLOS’11
    Short summary:
    ▪ Use PM as object-based storage, relying on the two hardware modifications
    ▪ Raise three issues that come with PM: more significant memory leaks, restriction of NV-to-V pointers, transactions on the heap area
  23. Heap on PM: Strength of Persistent Heap

    Motivation ▪ Serialization needs additional computation...
    However... modifying a small part of the data is expensive ▪ Disk must be accessed per sector (4 KB or 512 bytes) ▪ Slow seek speed (disk) / slow wear leveling (SSD)
    This is the domain of byte-addressable storage: Persistent Memory!
  24. Heap on PM: Keys of Design

    ▪ Memory leaks: reference counting with logging
    ▪ Pointers: NV-to-V pointers must not be persisted
    ▪ Transactions: the idea of STM (Software Transactional Memory)
  25. Heap on PM: Evaluation

    Environment ▪ Linux RAM disk with extra delay
    Comparison with Berkeley DB (BDB) and Memcached ▪ Fork from BDB
    Results ▪ NV-Heaps performs as well as native Memcached
  27. Distributed Logging with PM

    Paper: T. Wang and R. Johnson, Scalable Logging through Emerging Non-Volatile Memory, VLDB’14
    Short summary:
    ▪ Distributed logging that exploits the advantages of PM
    ▪ No reliance on atomic 8-byte writes or the epoch barrier
    ▪ Two software techniques instead: Global Sequence Number and Passive Group Commit
  28. Distributed Logging with PM: Centralized Logging

    Centralized logging does not suit the massively parallel paradigm
    (figure label: Log File)
  29. Distributed Logging with PM: Distributed Logging

    Remember the importance of byte addressability
    How is the order of log records determined?
  30. Distributed Logging with PM: LSN (Log Sequence Number)

    (figure: the hello.txt example with its snapshot and the log records 1: WRITE “RINKO” and 2: WRITE “NOW!!!”, now spread over more than one log)
    Share the counter??
  31. Distributed Logging with PM: Global Sequence Number

    A GSN must be greater than the GSN of the previous write operation on the same page
    (figure: Tx1, Tx2 and Tx3 write to pages P1 and P2 and are assigned GSNs, starting at t = 0)
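
One simple rule that satisfies this constraint is sketched below. It is an assumption for illustration and is not claimed to be the exact algorithm of the paper: each transaction and each page carries its own counter, and a write takes the larger of the two plus one. The `Page`, `Tx` and `write` names are hypothetical.

```python
class Page:
    def __init__(self, name):
        self.name = name
        self.gsn = 0  # GSN of the latest write to this page


class Tx:
    def __init__(self, name):
        self.name = name
        self.gsn = 0  # GSN of the latest write by this transaction


def write(tx, page):
    """Assign a GSN greater than the previous write on the same page.

    No shared global counter is needed: only the transaction's and the
    page's own counters are consulted.
    """
    gsn = max(tx.gsn, page.gsn) + 1
    tx.gsn = gsn
    page.gsn = gsn
    return gsn


p1, p2 = Page("P1"), Page("P2")
tx1, tx2, tx3 = Tx("Tx1"), Tx("Tx2"), Tx("Tx3")

history = [
    write(tx1, p1),
    write(tx2, p2),
    write(tx2, p1),
    write(tx1, p2),
    write(tx3, p1),
]
```

Writes to different pages may receive the same number, but on any single page the numbers increase strictly, which is what recovery needs to order that page's log records.
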
  32. Distributed Logging with PM: Passive Group Commit

    Ensure old log records have been evicted from the cache
    Leverage existing hardware support
    Three caching strategies: ▪ Write-Through ▪ Write-Back ▪ Write-Combining
  33. Distributed Logging with PM: Write-Combining

    Write back a small block at once ▪ Adjacent bytes are combined into one write ▪ Intel Core series processors have 8 WC buffers per core
  34. Distributed Logging with PM: How to Keep Consistency

    WC buffer is not durable ▪ Logging must keep in step on each processor
    (figure: animation of the Passive Group Commit daemon; transactions Tx1 to Tx3 commit with their dgsn values, each log tracks its latest dgsn, and the steps are 1: Commit, 2: Commit, FULL!, 3: Eviction)
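
The bookkeeping behind this can be sketched as follows. The sketch rests on assumptions: `dgsn` is read here as the highest GSN each log has made durable, and the daemon is reduced to a single function call. A transaction is only reported as committed once every log is durable up to the transaction's commit GSN, that is, once its GSN is at or below the minimum dgsn over all logs.

```python
def durable_horizon(log_dgsns):
    """Smallest durable GSN across all logs: everything at or below it is safe."""
    return min(log_dgsns.values())


def passive_group_commit(pending, log_dgsns):
    """One pass of a (hypothetical) commit daemon.

    pending:   list of (transaction name, commit GSN) waiting to be acknowledged
    log_dgsns: dict of log name -> highest GSN known to be durable in that log
    """
    horizon = durable_horizon(log_dgsns)
    committed = [tx for tx, gsn in pending if gsn <= horizon]
    still_pending = [(tx, gsn) for tx, gsn in pending if gsn > horizon]
    return committed, still_pending


pending = [("Tx2", 1), ("Tx1", 4), ("Tx3", 10)]

# First pass: one log is durable up to 6, the other up to 9.
first, pending = passive_group_commit(pending, {"log0": 6, "log1": 9})

# Later pass: both logs have flushed further.
second, pending = passive_group_commit(pending, {"log0": 12, "log1": 13})
```

The transactions themselves never force a flush; they wait passively until the horizon moves past their GSN.
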
  36. Summary

    Three different approaches to PM ▪ File system ▪ Heap area ▪ Database logging
    Important features of PM ▪ Volatility of the cache ▪ High random access performance
    There is a large design space and many viewpoints worth revisiting
  37. Heap on PM: Pointers

    Three types of pointers ▪ NV-to-NV ▪ weak NV-to-NV ▪ V-to-NV
    V-to-NV pointers require another reference counter in the NV area
  38. Heap on PM: Transactions

    Borrow the idea of Software Transactional Memory
    Logging ▪ Non-volatile write log on PM ▪ Volatile read log on DRAM
    Steps to write ▪ Copy the object to the write log ▪ Modify the original ▪ Check that the data is valid ▪ Make the data permanent
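
The four write steps map onto a small undo-log sketch. It is a toy model, not the NV-Heaps implementation: a Python dict stands in for the heap on PM, and `Transaction`, `write` and `finish` are hypothetical names.

```python
import copy


class Transaction:
    def __init__(self, heap):
        self.heap = heap      # dict: object name -> object, standing in for the NV heap
        self.write_log = {}   # would live on PM: name -> copy of the original object

    def write(self, name, new_value):
        # Step 1: copy the object to the write log (only the first time it is touched).
        if name not in self.write_log:
            self.write_log[name] = copy.deepcopy(self.heap[name])
        # Step 2: modify the original in place.
        self.heap[name] = new_value

    def finish(self, is_valid):
        # Step 3: check that the data is valid.
        if not is_valid:
            # Roll back: restore every logged original.
            for name, original in self.write_log.items():
                self.heap[name] = original
        # Step 4: on success the in-place data is permanent; the log is discarded either way.
        self.write_log.clear()
        return is_valid


heap = {"counter": {"value": 1}}

failed = Transaction(heap)
failed.write("counter", {"value": 2})
failed.finish(is_valid=False)
after_abort = copy.deepcopy(heap)

ok = Transaction(heap)
ok.write("counter", {"value": 3})
ok.finish(is_valid=True)
```

Because the originals sit in a log that would survive a crash, recovery could apply the same rollback to any transaction that never reached step 4.
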
  39. Heap on PM: Usage

    void remove(int k) {
      NVHeap *nv = NVHOpen("foo.nvheap");
      NVList::VPtr a = nv->GetRoot<NVList::NVPtr>();
      AtomicBegin {
        while (a->get_next() != NULL) {
          if (a->get_next()->get_value() == k)
            a->set_next(a->get_next()->get_next());  // unlink the matching node
          else
            a = a->get_next();  // advance only when nothing was unlinked
        }
      } AtomicEnd;
    }
  40. Distributed Logging with PM: Logging

    ARIES (Algorithms for Recovery and Isolation Exploiting Semantics)
    ▪ WAL (Write Ahead Logging)
    ▪ Three phases of recovery: Analysis, Redo, Undo
    ▪ Undo removes the changes made by transactions that did not complete
  41. Distributed Logging with PM: Analysis Phase

    Dirty Page Table (pageID: recLSN): P5: 10, P3: 20, P1: 50
    Transaction Table (transID: lastLSN): T1: 60, T3: 50
  42. Distributed Logging with PM: Redo Phase

    Dirty Page Table (pageID: recLSN): P5: 10, P3: 20, P1: 50
    Transaction Table (transID: lastLSN): T1: 60, T3: 50
  43. Distributed Logging with PM: Undo Phase

    Dirty Page Table (pageID: recLSN): P5: 10, P3: 20, P1: 50
    Transaction Table (transID: lastLSN): T1: 60, T3: 50
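
The two tables produced by the analysis phase tell the later phases where to begin. A minimal sketch using the values from the slides follows; it assumes that both transactions in the table are still uncommitted, and the function names are made up for illustration. In ARIES, redo starts from the smallest recLSN in the dirty page table, and undo walks backwards starting from the largest lastLSN.

```python
dirty_page_table = {"P5": 10, "P3": 20, "P1": 50}   # pageID -> recLSN
transaction_table = {"T1": 60, "T3": 50}            # transID -> lastLSN


def redo_start_lsn(dpt):
    """Redo scans the log forward from the oldest change that may not be on disk."""
    return min(dpt.values())


def undo_order(tt):
    """Undo processes unfinished transactions from the newest log record backwards.

    Assumption: every transaction in the table is unfinished (a "loser").
    """
    return sorted(tt.items(), key=lambda item: item[1], reverse=True)


redo_from = redo_start_lsn(dirty_page_table)
undo_list = undo_order(transaction_table)
```
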
  44. Distributed Logging with PM: Page-level Partitioning

    ▪ Frequent remote accesses
    ▪ Simple redo phase / complex undo phase
    (figure: pages P1 to P6 are partitioned across logs; Tx 1 writes P1 and P4, Tx 2 writes P3 and P6)
  45. Distributed Logging with PM: Transaction-level Partitioning

    ▪ Simple memory path
    ▪ Complex redo phase / simple undo phase
    (figure: transactions Tx1 to Tx6 are partitioned across logs; Tx 1 writes P1 and P4, Tx 2 writes P3 and P6)
  46. Distributed Logging with PM: Logical Clock

    Uniqueness of the LSN (Log Sequence Number) is the main problem
    ▪ Each process increments its own clock
    ▪ When a message is received, the clock is set to max(timestamp, my t) + 1
    ▪ The actual order of all tasks cannot be known
    ▪ A later task is guaranteed to get the greater number
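
The update rule above is the classic logical clock and fits in a few lines. The `LamportClock` class below is a sketch with hypothetical method names; only the `max(timestamp, my t) + 1` rule comes from the slide.

```python
class LamportClock:
    def __init__(self):
        self.time = 0

    def tick(self):
        """A local event: the process counts up its own clock."""
        self.time += 1
        return self.time

    def send(self):
        """Sending is an event too; the returned timestamp travels with the message."""
        return self.tick()

    def receive(self, timestamp):
        """On receipt, jump past both the sender's timestamp and the local clock."""
        self.time = max(timestamp, self.time) + 1
        return self.time


a, b, c = LamportClock(), LamportClock(), LamportClock()

a.tick()                   # a: 1
a.tick()                   # a: 2
stamp = a.send()           # a: 3, message carries 3
received = b.receive(stamp)  # b: max(3, 0) + 1 = 4
later = b.tick()           # b: 5

unrelated = c.tick()       # c: 1, same number as a's first event
```

The receive on `b` is numbered after the send on `a`, so causally later tasks get greater numbers. The event on `c` shares a number with an event on `a`, which shows that equal or smaller numbers say nothing about the real order of unrelated tasks.
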
  47. Distributed Logging with PM: Global Sequence Number

    ▪ A GSN must be greater than the GSN of the previous write operation on the same page
    ▪ GSN is a distributed LSN
    ▪ Each page holds the GSN of its latest operation
  48. Distributed Logging with PM: How to Keep Consistency

    WC buffer is not durable ▪ The system must know which commits are actually durable