Survey of Transactional Memory

ytakano
May 31, 2016


Transcript

  1. Transactional Memory Yuuki Takano (NICT) ytakanoster@gmail.com

  2. Why Transactional Memory
    • locks are difficult to manage
      • deadlock
      • starvation
      • priority inversion
      • lock convoy
    • transactional memory mitigates these problems
  3. Deadlock
    Diagram (timeline): Thread 1 and Thread 2 each hold one of Lock A and Lock B. One thread tries to acquire A and fails; the other tries to acquire B and fails.
  4. Starvation
    Diagram (timeline): one high-priority thread acquires A and another high-priority thread acquires B (Lock A, Lock B), while a low-priority thread needs both A and B. Labels on the low-priority thread: try to acquire A and fail; Lock A; try to acquire B and fail; Lock A; Release A.
  5. Priority Inversion
    Diagram (timeline): a low-priority thread is holding a lock; a high-priority thread tries to acquire it and fails.
  6. Lock Convoy
    Diagram: a scheduler and threads Thread1, Thread2, Thread3, …, ThreadN. Labeled steps: 1. contention; 2. event; 3. contention (spin lock); 4. reschedule; 4. acquire (Thread2).
    High overhead when there are many threads.
  7. Complexity of Multithread Programming
    Diagram:
    • ideal world: algorithm, data structure; simple source code (what we actually want)
    • real world: algorithm, data structure, parallelism, parallel algorithm, parallel data structure; complicated source code (buggy, difficult to maintain)
  8. Lock and Transactional Memory
    • Lock
      • executes a critical section exclusively
      • only one thread enters the critical section at a time
    • Transactional Memory
      • executes a critical section speculatively
      • multiple threads enter the same critical section simultaneously
      • conflicts are detected both while executing the critical section and at the end of the critical section
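  9. Spin Lock by Atomic Operation
    • CAS (compare-and-swap)
      • the compare and the swap are performed atomically
    • test-and-set, compare-and-add, etc.
    • a spin lock is achieved by using CAS

        volatile int locked; // volatile so the inner loop re-reads it

        void lock_spin(void) {
            // atomically set locked to 1; loop if it was already set
            while (__sync_lock_test_and_set(&locked, 1)) {
                while (locked)
                    ; // busy-wait while locked is set
            }
        }

        void unlock_spin(void) {
            __sync_lock_release(&locked);
        }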
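The slide's code spins on test-and-set. A CAS-based variant could be sketched as follows, assuming C11 `<stdatomic.h>` is available; the names `cas_spinlock`, `cas_try_lock`, `cas_lock`, and `cas_unlock` are hypothetical and do not come from the deck.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type: 0 = free, 1 = held. */
typedef struct { atomic_int held; } cas_spinlock;

/* One CAS attempt: succeeds only if the lock was free (0 -> 1). */
static bool cas_try_lock(cas_spinlock *l) {
    int expected = 0;
    return atomic_compare_exchange_strong_explicit(
        &l->held, &expected, 1,
        memory_order_acquire, memory_order_relaxed);
}

static void cas_lock(cas_spinlock *l) {
    while (!cas_try_lock(l)) {
        /* Wait on a plain load so waiters do not issue CAS continuously. */
        while (atomic_load_explicit(&l->held, memory_order_relaxed))
            ; /* busy-wait */
    }
}

static void cas_unlock(cas_spinlock *l) {
    assert(atomic_load_explicit(&l->held, memory_order_relaxed) == 1);
    atomic_store_explicit(&l->held, 0, memory_order_release);
}
```

A caller would declare `cas_spinlock l = { 0 };` and bracket the critical section with `cas_lock(&l)` and `cas_unlock(&l)`.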
  10. Syntax of Transactional Memory (atomic, retry, orElse)

        atomic {
            // transaction
            if (q.size() == 0) {
                // rollback and retry
                // the transaction is restarted when
                // the read-set is updated
                retry;
            }
            … // do something
        } orElse {
            // detect rollback and retry
        }
  11. Software Transactional Memory

  12. Software Transactional Memory
    • TL2
      • Dave Dice, Ori Shalev, and Nir Shavit, "Transactional Locking II", 20th International Conference on Distributed Computing (DISC 2006)
    • LSA
      • Torvald Riegel, Pascal Felber, and Christof Fetzer, "A Lazy Snapshot Algorithm with Eager Validation", 20th International Conference on Distributed Computing (DISC 2006)
    • LogTM
      • Kevin E. Moore, Jayaram Bobba, Michelle J. Moravan, Mark D. Hill, David A. Wood, "LogTM: Log-based Transactional Memory", HPCA 2006
    • DEUCE
      • Guy Korland, Nir Shavit, and Pascal Felber, "Noninvasive Java Concurrency with Deuce STM", MultiProg 2010
    • etc.
  13. Summary of TL2
    • prepare a variable called the global clock
    • associate memory regions with version numbers
    • update version numbers when writing
    • detect conflicts when reading and writing by comparing the global clock with the memory version number
    • retry the transaction when a conflict is detected
    • otherwise commit
  14. TL2 Variables
    Diagram:
    • Global Memory: the global version clock, and variables, each with a version number and a write lock
    • Thread-Local Memory (per thread): read version number, write version number, read-set, write-set
  15. TL2 Algorithm
    transaction { load var1; load var2; … store var3; }
    Load the global version clock and store it in a thread-local read version number.
  16. TL2 Algorithm
    transaction { load var1; load var2; … store var3; }
    Run through a speculative execution.
  17. TL2 Algorithm
    transaction { load var1; load var2; … store var3; }
    Log read addresses to the read-set.
  18. TL2 Algorithm
    transaction { load var1; load var2; … store var3; }
    Log write addresses and values to the write-set.
  19. TL2 Algorithm
    Diagram: the entries of the thread's read-set and write-set are pointers to variables in global memory; the value of a variable is stored and loaded.
    Note that if a variable being read already appears in the write-set, refer to the value in the write-set to avoid a read-after-write hazard.
  20. TL2 Algorithm
    transaction { load var1; load var2; … store var3; }
    Check that variables have not been modified when loading: make sure that their version numbers are not greater than the read version number. If modified, abort the transaction.
  21. TL2 Algorithm
    transaction { load var1; load var2; … store var3; }
    Check that the write locks are free. If locked, abort the transaction.
  22. TL2 Algorithm
    transaction { load var1; load var2; … store var3; }
    Acquire the write locks using a bounded spin lock. If a write lock cannot be acquired (it is locked), abort the transaction.
  23. TL2 Algorithm
    transaction { load var1; load var2; … store var3; }
    Increment the global version clock (CAS operation) and store it to the write version number.
  24. TL2 Algorithm
    transaction { load var1; load var2; … store var3; }
    Check that variables have not been modified since loading: make sure that their version numbers are not greater than the read version number. If modified, abort the transaction.
  25. TL2 Algorithm
    transaction { load var1; load var2; … store var3; }
    Check that the write locks are free. If locked, abort the transaction.
  26. TL2 Algorithm
    transaction { load var1; load var2; … store var3; }
    rv: read version number, wv: write version number. In the special case where read version number + 1 = write version number, it is not necessary to validate the read-set.

  27. TL2 Algorithm
    transaction { load var1; load var2; … store var3; }
    Commit the values of the write-set.

  28. TL2 Algorithm
    transaction { load var1; load var2; … store var3; }
    Update the version numbers with the write version number.
  29. TL2 Algorithm
    transaction { load var1; load var2; … store var3; }
    Release the write locks.
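The TL2 steps walked through on the slides above can be put together in a minimal sketch. It is not the reference TL2 implementation. The names `tvar`, `txn`, `tx_begin`, `tx_load`, `tx_store`, `tx_commit`, and the bound `TL2_MAX` are hypothetical. The write lock is tried once instead of with a bounded spin, and the clock is advanced with an atomic fetch-and-add rather than an explicit CAS loop.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define TL2_MAX 8 /* hypothetical bound on read-/write-set size */

typedef struct {            /* one transactional variable in global memory */
    atomic_long  value;
    atomic_ulong version;   /* version number stamped by the last commit */
    atomic_bool  locked;    /* write lock */
} tvar;

typedef struct {            /* thread-local transaction state */
    unsigned long rv;       /* read version number */
    tvar *rset[TL2_MAX]; size_t nr;                      /* read-set */
    tvar *wvar[TL2_MAX]; long wval[TL2_MAX]; size_t nw;  /* write-set */
} txn;

static atomic_ulong global_clock; /* global version clock, starts at 0 */

static void tx_begin(txn *t) {
    t->rv = atomic_load(&global_clock);
    t->nr = t->nw = 0;
}

static bool in_wset(const txn *t, const tvar *v) {
    for (size_t i = 0; i < t->nw; i++)
        if (t->wvar[i] == v) return true;
    return false;
}

/* Speculative load; returns false when the transaction must abort. */
static bool tx_load(txn *t, tvar *v, long *out) {
    for (size_t i = 0; i < t->nw; i++)   /* read-after-write: use write-set */
        if (t->wvar[i] == v) { *out = t->wval[i]; return true; }
    unsigned long before = atomic_load(&v->version);
    if (atomic_load(&v->locked)) return false;
    long val = atomic_load(&v->value);
    if (atomic_load(&v->locked) || atomic_load(&v->version) != before ||
        before > t->rv || t->nr == TL2_MAX)
        return false;                    /* modified, locked, or set full */
    t->rset[t->nr++] = v;
    *out = val;
    return true;
}

/* Speculative store: only logged in the write-set until commit. */
static bool tx_store(txn *t, tvar *v, long val) {
    for (size_t i = 0; i < t->nw; i++)
        if (t->wvar[i] == v) { t->wval[i] = val; return true; }
    if (t->nw == TL2_MAX) return false;
    t->wvar[t->nw] = v;
    t->wval[t->nw++] = val;
    return true;
}

static void unlock_first(txn *t, size_t n) {
    for (size_t i = 0; i < n; i++) atomic_store(&t->wvar[i]->locked, false);
}

static bool tx_commit(txn *t) {
    assert(t->nr <= TL2_MAX && t->nw <= TL2_MAX);
    if (t->nw == 0) return true;          /* read-only: loads already validated */
    for (size_t i = 0; i < t->nw; i++) {  /* acquire write locks (one try each) */
        bool expected = false;
        if (!atomic_compare_exchange_strong(&t->wvar[i]->locked, &expected, true)) {
            unlock_first(t, i);
            return false;
        }
    }
    unsigned long wv = atomic_fetch_add(&global_clock, 1) + 1; /* write version */
    if (wv != t->rv + 1) {                /* rv + 1 == wv: skip validation */
        for (size_t i = 0; i < t->nr; i++) {
            tvar *v = t->rset[i];
            if (atomic_load(&v->version) > t->rv ||
                (atomic_load(&v->locked) && !in_wset(t, v))) {
                unlock_first(t, t->nw);
                return false;
            }
        }
    }
    for (size_t i = 0; i < t->nw; i++) {  /* write back and stamp with wv */
        atomic_store(&t->wvar[i]->value, t->wval[i]);
        atomic_store(&t->wvar[i]->version, wv);
    }
    unlock_first(t, t->nw);               /* release the write locks */
    return true;
}
```

A caller is expected to wrap `tx_begin` … `tx_commit` in a loop and start over whenever any of the calls returns false. That loop plays the role of the retry on conflict.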
  30. Hardware Transactional Memory

  31. Hardware Transactional Memory
    • use the CPU cache to detect conflicts
    • modify the cache coherence algorithm to achieve transactional memory

  32. Cache Coherence
    • MESI protocol
    • There are 4 states
      • Modified, Exclusive, Shared, Invalid
  33. MESI: Modified State
    Diagram: main memory; CPU0 with cache 0; CPU1 with cache 1.
    The cache line is dirty and must be written back; it is not shared with another CPU.
  34. MESI: Exclusive State
    Diagram: main memory; CPU0 with cache 0; CPU1 with cache 1.
    The cache line is not modified and not shared with another CPU.
  35. MESI: Shared State
    Diagram: main memory; CPU0 with cache 0; CPU1 with cache 1.
    The cache line is not modified and is shared with another CPU.
  36. MESI: Invalid State
    Diagram: main memory; CPU0 with cache 0; CPU1 with cache 1.
    The cache line holds no meaningful data.
  37. MESI: Exclusive Load
    Diagram: main memory; CPU0 with cache 0; CPU1 with cache 1.
    1. request an exclusive load
    2. write back if modified
    3. change the state to invalid
    4. load the line in the exclusive state
  38. MESI: Shared Load
    Diagram: main memory; CPU0 with cache 0; CPU1 with cache 1.
    1. request a shared load
    2. write back if modified
    3. change the state to shared
    4. load the line in the shared state
  39. MESI: Eviction
    Diagram: main memory; CPU0 with cache 0; CPU1 with cache 1.
    1. write back if modified
    2. discard
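The three MESI operations above (exclusive load, shared load, eviction) can be written as a small transition function for the cache that currently holds the line. This is a sketch of only what the slides describe, not a complete MESI model; the type and function names (`mesi_state`, `mesi_event`, `mesi_holder_step`, `mesi_requester_state`) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { MESI_INVALID, MESI_SHARED, MESI_EXCLUSIVE, MESI_MODIFIED } mesi_state;

/* Events seen by the cache that currently holds the line. */
typedef enum { REMOTE_EXCLUSIVE_LOAD, REMOTE_SHARED_LOAD, LOCAL_EVICTION } mesi_event;

typedef struct {
    mesi_state next;   /* state of the holder's line afterwards */
    bool write_back;   /* dirty data must go to main memory first */
} mesi_result;

static mesi_result mesi_holder_step(mesi_state cur, mesi_event ev) {
    mesi_result r;
    r.write_back = (cur == MESI_MODIFIED);      /* write back if modified */
    if (ev == REMOTE_SHARED_LOAD && cur != MESI_INVALID)
        r.next = MESI_SHARED;                   /* both caches now share the line */
    else
        r.next = MESI_INVALID;                  /* invalidated or discarded */
    return r;
}

/* State in which the requesting cache installs the line. */
static mesi_state mesi_requester_state(mesi_event ev) {
    assert(ev != LOCAL_EVICTION);               /* an eviction loads nothing */
    return ev == REMOTE_EXCLUSIVE_LOAD ? MESI_EXCLUSIVE : MESI_SHARED;
}
```

For example, a Modified line that sees a remote shared load is written back and ends up Shared in both caches.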
  40. Transactional Cache Coherence
    Diagram: main memory; CPU0 with cache 0; CPU1 with cache 1.
    Prepare a transactional bit in each cache line.
    0: not in transaction
    1: in transaction
  41. Transactional Cache Coherence
    Diagram: main memory; CPU0 with cache 0; CPU1 with cache 1. The transactional bit is 1 and the line is in the shared or exclusive state.
    Abort the transaction if the MESI protocol invalidates the transactional entry.
  42. Transactional Cache Coherence
    Diagram: main memory; CPU0 with cache 0; CPU1 with cache 1. The transactional bit is 1 and the line is in the modified state.
    Discard the modified value and abort the transaction if the MESI protocol invalidates or evicts the transactional entry.
  43. Transactional Cache Coherence
    Diagram: main memory; CPU0 with cache 0; CPU1 with cache 1. The transactional bit is 1 and the line is evicted.
    Abort the transaction if the MESI protocol evicts a transactional entry, because the cache coherence protocol cannot detect conflicts on an evicted line.
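The abort rules on the last four slides can be condensed into one decision function, called when a cache line is invalidated or evicted. This is a simplified model under the slides' description, not a description of any particular CPU; `cache_line`, `tx_verdict`, and `tx_on_line_loss` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { LINE_INVALID, LINE_SHARED, LINE_EXCLUSIVE, LINE_MODIFIED } line_state;

typedef struct {
    line_state state;
    bool tx;            /* transactional bit: true = used inside a transaction */
} cache_line;

typedef struct {
    bool abort_tx;      /* the running transaction must abort */
    bool discard_data;  /* speculative data must not reach main memory */
    bool write_back;    /* ordinary dirty data is written back as usual */
} tx_verdict;

/* Called when the coherence protocol invalidates or evicts the line. */
static tx_verdict tx_on_line_loss(cache_line *line) {
    assert(line != NULL);
    tx_verdict v = { false, false, false };
    bool dirty = (line->state == LINE_MODIFIED);
    if (line->tx) {
        v.abort_tx = true;         /* conflicts can no longer be tracked */
        v.discard_data = dirty;    /* drop the speculative value */
    } else {
        v.write_back = dirty;      /* plain MESI behaviour */
    }
    line->state = LINE_INVALID;
    line->tx = false;
    return v;
}
```

With this model, a non-transactional Modified line is simply written back, while a transactional Modified line is dropped and forces an abort.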
  44. Problems

  45. Problem
    • infinite loop in a transaction
      • checking variable versions inside loops would reduce performance significantly
    • requirement of closed memory management
      • code outside a transaction can refer to and update variables used in a transaction, in languages like C and C++
      • the compiler or the runtime environment has to take care of this
  46. Problem

        atomic {
            …
            launchMissile();
            …
        }

    Missiles may be launched many times. I/O in a transaction must cause an abort.
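Why the missile may be launched many times can be illustrated with a toy retry loop. This is only a simulation: there is no real transactional memory here. `run_atomic`, `body_with_io`, and the rule that the first two attempts conflict are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>

static int missiles_launched;       /* stands in for an irreversible I/O effect */

static void launch_missile(void) { missiles_launched++; }

/* Hypothetical transaction body: returns true when it was able to commit. */
typedef bool (*tx_body)(int attempt);

static bool body_with_io(int attempt) {
    launch_missile();               /* I/O performed inside the transaction */
    return attempt >= 2;            /* pretend the first two attempts conflict */
}

/* Naive retry loop: re-run the body until it commits; returns attempts used. */
static int run_atomic(tx_body body) {
    assert(body);
    int attempt = 0;
    while (!body(attempt))
        attempt++;                  /* memory would be rolled back, I/O is not */
    return attempt + 1;
}
```

Running `run_atomic(body_with_io)` takes three attempts, and the counter ends at three launches rather than one. This is the situation the slide warns about.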
  47. Problem
    • livelock

  48. Implementation

  49. Software Transactional Memory (STM) in Haskell
    • Haskell provides STM through its concurrency module
    • the STM monad is provided to achieve STM
    • example implementation
      • https://gist.github.com/ytakano/…
  50. Hardware Transactional Memory (HTM): Intel TSX
    • HTM is available from Haswell
    • Intel TSX HLE
      • xacquire and xrelease instructions
    • Intel TSX RTM
      • xbegin and xend instructions
  51. Intel TSX RTM

        xbegin ABORT
        . . .
        xend
        ABORT:
            // fallback if aborted

    A transaction sometimes aborts, so the code must be able to go to a fallback path (such as a spin lock).
  52. Lock by using tsx-tools (https://github.com/andikleen/tsx-tools)
    Lock by using Intel TSX RTM:

        #include <immintrin.h> // _xbegin, _xabort, _xend, _mm_pause

        // RTM_MAX_RETRY (retry budget) is assumed to be defined elsewhere
        volatile int lock = 0;

        void rtm_lock(void) {
            for (int i = 0; i < RTM_MAX_RETRY; i++) {
                unsigned status = _xbegin();
                if (status == _XBEGIN_STARTED) {
                    if (!lock)
                        return; // successfully started
                    _xabort(0xff); // lock is held: abort explicitly
                }
                if ((status & _XABORT_EXPLICIT) && _XABORT_CODE(status) == 0xff &&
                    !(status & _XABORT_NESTED)) {
                    while (lock)
                        _mm_pause(); // busy-wait
                } else if (!(status & _XABORT_RETRY)) {
                    break;
                }
            }
            while (__sync_lock_test_and_set(&lock, 1)) { // fallback to spin-lock
                while (lock)
                    _mm_pause(); // busy-wait
            }
        }
  53. Unlock by using tsx-tools (https://github.com/andikleen/tsx-tools)
    Unlock by using Intel TSX RTM:

        void rtm_unlock(void) {
            if (lock) {
                __sync_lock_release(&lock); // the fallback spin lock was taken
            } else {
                _xend(); // commit the transaction
            }
        }
  54. Performance of Intel TSX
    • Intel says that code using coarse-grained locks can compete with code using fine-grained locks
    • easy to write code that scales with the number of cores

  55. Applying Intel® TSX: An example of secondary benefits of Intel® TSX
    Figure (from Intel Developer Forum): two plots of scaling versus threads, "Application with Coarse Grain Lock" and "Application re-written with Finer Grain Locks". Series: Coarse Grain Lock; Coarse Grain Lock + Intel® TSX; Fine Grain Locks; Fine Grain Locks + Intel® TSX.
    Fine Grain Behavior at Coarse Grain Effort
  56. Enabling Simpler Algorithms: Intel® TSX Can Enable Simpler Scalable Algorithms
    Lock-Free Algorithm
    • Don't use critical section locks
    • Developer manages concurrency
    • Very difficult to get correct & optimize
      – Constrain data structure selection
      – Highly contended atomic operations
    Lock-Based + Intel® TSX
    • Use critical section locks for ease
    • Let hardware extract concurrency
    • Enables algorithm simplification
      – Flexible data structure selection
      – Equivalent data structure lock-free algorithm very hard to verify
    Figure (Real World Example, from Intel Developer Forum): two plots of Ops/sec versus Threads, comparing a state-of-the-art lock-free algorithm with a TSX lock-based algorithm.
  57. EOF