
Distributed Processing in Python

chie8842

July 14, 2019

Transcript

  1. Chie Hayashida (@chie8842)
    • Software Engineer at Cookpad (from Japan)
    • Develops and contributes to some OSS, writes articles, and gives presentations
    • Won first prize at re:Invent GameDay
    • Won …th prize (Top …) at Petfinder (Kaggle competition) in a …-week challenge
  2. Disclaimer
    • What I will talk about
      • Basic architecture of parallel/distributed processing (computing) libraries in Python
    • What I will not talk about
      • How to install and set up each library
      • Detailed API usage of the libraries introduced
  3. "Parallel processing" and "Distributed processing"
    • Parallel processing: [Diagram: several processors attached to one shared memory]
    • Distributed processing: [Diagram: several nodes, each with its own processor(s) and memory, connected over a network (NW)]
    • In parallel processing, the processors share memory. In distributed processing, on the other hand, each processor has its own memory.
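The difference between shared memory and separate memory can be illustrated on a single machine. The following is only a rough sketch using the standard library, not something from the talk: a thread stands in for a processor that shares memory, and a child Python process stands in for a node with its own memory. The function names and the JSON-over-stdin/stdout exchange are assumptions made for illustration; a real distributed system would send messages over a network instead.

```python
import json
import subprocess
import sys
import threading

# Code run by the child process: it only sees the data it is sent,
# and must send its result back explicitly.
CHILD_CODE = """
import json, sys
numbers = json.loads(sys.stdin.read())
numbers.append(4)
print(json.dumps(numbers))
"""


def shared_memory_demo() -> list:
    """A thread mutates the very same list object as its parent."""
    numbers = [1, 2, 3]
    worker = threading.Thread(target=numbers.append, args=(4,))
    worker.start()
    worker.join()
    return numbers  # the parent sees the thread's change directly


def separate_memory_demo() -> tuple:
    """A child process works on its own copy; results come back as a message."""
    numbers = [1, 2, 3]
    proc = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        input=json.dumps(numbers),  # data is serialized and sent to the child
        capture_output=True,
        text=True,
        check=True,
    )
    received = json.loads(proc.stdout)
    return numbers, received  # the parent's list is untouched


if __name__ == "__main__":
    print("shared memory:", shared_memory_demo())
    local, received = separate_memory_demo()
    print("own memory: local =", local, "received =", received)
```

The second function shows why distributed processing needs serialization and communication: nothing the child does is visible to the parent until it is sent back.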
  4. In what situations is threading effective?
    • Tasks that are I/O bound, such as:
      • Reading/writing files
      • DB connections
      • Downloading data from the network (NW)
    • [Diagram: one process with three threads, each alternating between "run" and "await", with an "Acquire Lock" label]
  5. In what situations is threading effective?
    • Tasks that are I/O bound, such as:
      • Reading/writing files
      • DB connections
      • Downloading data from the network (NW)
    • [Diagram: one process with three threads, each alternating between "run" and "await", with "Acquire Lock" and "IO" labels and a legend distinguishing "use CPU" from "do not use CPU"]
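To make the I/O-bound case concrete, here is a minimal sketch that compares running a few waiting tasks one after another with running them in a thread pool. It is not taken from the talk: `fake_io_task` is a hypothetical stand-in that uses `time.sleep` to imitate waiting on a file, database, or network, and the task count and delay are arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io_task(task_id: int, delay: float = 0.1) -> int:
    """Hypothetical stand-in for an I/O-bound call (file, DB, download)."""
    time.sleep(delay)  # while waiting, the thread does not use the CPU
    return task_id


def run_sequential(n_tasks: int) -> float:
    """Run the tasks one after another and return the elapsed seconds."""
    start = time.perf_counter()
    for i in range(n_tasks):
        fake_io_task(i)
    return time.perf_counter() - start


def run_threaded(n_tasks: int) -> float:
    """Run the tasks in a thread pool and return the elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_tasks) as pool:
        # map() preserves input order in its results
        results = list(pool.map(fake_io_task, range(n_tasks)))
    assert results == list(range(n_tasks))
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"sequential: {run_sequential(4):.2f}s")
    print(f"threaded:   {run_threaded(4):.2f}s")
```

Because the waits overlap, the threaded run should take roughly the time of one task rather than the sum of all of them. A CPU-bound task would not show the same gain with threads.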
  6. Celery Architecture
    • [Diagram: Client → broker (RabbitMQ, Redis, SQS, etc.) holding several queues → workers (consumers) → backend for results (Redis, etc.)]
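The client/broker/worker/backend flow in the diagram can be imitated in a single process. The sketch below deliberately does not use Celery itself: a `queue.Queue` plays the broker's queue, threads play the workers (consumers), and a plain dict plays the result backend. All names here (`worker`, `run_demo`, `add`) are hypothetical and chosen only to show the pattern.

```python
import queue
import threading

_STOP = object()  # sentinel telling a worker to shut down


def add(x: int, y: int) -> int:
    """Example task that the workers will execute."""
    return x + y


def worker(task_queue: queue.Queue, backend: dict, lock: threading.Lock) -> None:
    """Consumer: take tasks from the queue and store results in the backend."""
    while True:
        item = task_queue.get()
        if item is _STOP:
            break
        task_id, func, args = item
        value = func(*args)
        with lock:
            backend[task_id] = value  # result is stored, not returned to caller


def run_demo(n_workers: int = 3, n_tasks: int = 5) -> dict:
    task_queue: queue.Queue = queue.Queue()  # stands in for the broker's queue
    backend: dict = {}                       # stands in for the result backend
    lock = threading.Lock()

    threads = [
        threading.Thread(target=worker, args=(task_queue, backend, lock))
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()

    # Client side: enqueue tasks and move on without waiting for each one.
    for task_id in range(n_tasks):
        task_queue.put((task_id, add, (task_id, 10)))

    # One stop sentinel per worker, queued after all the tasks.
    for _ in threads:
        task_queue.put(_STOP)
    for t in threads:
        t.join()
    return backend


if __name__ == "__main__":
    print(run_demo())
```

In a real deployment the queue and the result store are separate services, so the client and the workers can run on different machines.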
  7. Dask
    • Distributed processing framework built in Python
    • Provides NumPy-, pandas-, and list-like parallel objects (dask.array, dask.dataframe, dask.bag) and APIs
    • The scheduler is customizable to get good performance both for parallel processing on a local machine and for distributed processing on a cluster
    • Has a good web interface for real-time job monitoring
  8. PySpark
    • Built on the JVM and has a Python interface
    • Can scale out further than Celery and Dask
    • Originally built to run on a Hadoop cluster
    • Fast and cost-efficient processing
    • Has a good web interface for real-time job monitoring
    • Easy to write complex programs with rich operators
    • Multifunctional
      • Stream API
      • Machine learning API
    • Available as a managed service on cloud platforms
  9. Conclusion
    • We can use machine resources efficiently and speed up our programs with parallel/distributed processing
    • There are several parallel processing libraries available in Python
    • I recommend:
      • joblib for general-purpose parallelism on one machine
      • Dask for parallelizing pandas DataFrame programs
      • PySpark for huge data that cannot be handled on one machine
  10. MPI
    • MPI is a low-level API. With proper integration it can compute faster than other Python libraries, but it is difficult to use MPI to create production-level software
    • MPI doesn't have rich partition-tolerance functions like PySpark or other libraries
    • PyTorch and ChainerMN use it internally for distributed deep learning