8-bit Quantization of Transformer Model


Scatter Lab Inc.

April 29, 2020

Transcript

 1. 8-Bit Quantization of Transformer Model
    Jeong Ukjae, Machine Learning Software Engineer, Pingpong

 2. Contents
    Efficient 8-Bit Quantization of Transformer Neural Machine Language Translation Model
    Abstract
    Introduction
    Method
    Result
 3. Efficient 8-Bit Quantization of Transformer Neural Machine Language Translation Model

 4. Abstract
    • ICML, Intel
    • Gains a [?]x speed (throughput) improvement while losing only about [?] in BLEU score accuracy
    • Optimized for Intel CPUs
    • TensorFlow
    • FP32 → INT8
 5. Introduction
    • The latest Intel CPUs include vectorized neural network instructions (VNNI)
    • They run a Fused Multiply-and-Add (FMA) operation over [?] values of [?] bits each in [?] cycle(s)
    • Main Contribution
    • Achieves FP32 → INT8 quantization with a drop of less than [?] in SOTA BLEU score
    • Performance Optimization
    • MatMul
    • Quantized MatMul Graph Optimization
    • Input Pipeline Optimization
    • Parallel Execution
 6. Model Description
    • Key idea: quantize only the components that still perform well enough when precision is sacrificed by going from FP32 to INT8
    • Softmax and Layer Normalization have many values that are hard to represent in INT8, so a large accuracy loss is expected
    • Computations such as Mean, Variance, and Exp are hard to express in INT8
 7. Naive Quantization
    • Uses the linear quantization scheme that has been widely used so far
    • Max and Min have to be computed, which requires a linear scan
    • Quantization overhead: O(N)
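The min/max scheme on this slide can be sketched in a few lines. This is a minimal illustration rather than the paper's implementation; the function names and the unsigned 8-bit target range are assumptions made for the example.

```python
def linear_quantize(values, num_bits=8):
    """Map floats onto unsigned integers in [0, 2**num_bits - 1] using min/max."""
    lo, hi = min(values), max(values)          # the O(N) scan over the tensor
    levels = (1 << num_bits) - 1               # 255 for 8 bits
    scale = (hi - lo) / levels if hi > lo else 1.0
    quantized = [round((v - lo) / scale) for v in values]
    return quantized, scale, lo


def linear_dequantize(quantized, scale, lo):
    """Approximate inverse of linear_quantize."""
    return [q * scale + lo for q in quantized]


values = [-1.0, 0.0, 0.5, 3.0]
quantized, scale, lo = linear_quantize(values)
restored = linear_dequantize(quantized, scale, lo)
```

Because `lo` and `hi` depend on the data, the scan has to be repeated for every tensor at inference time, which is the O(N) overhead the slide refers to.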
 8. Naive Quantization

 9. Naive Quantization
    Migacz, S. 8-bit inference with TensorRT. URL http://on-demand.gputechconf.com/gtc/2017/presentation/s7310-8-bit-inference-with-tensorrt.pdf

 10. KL-Divergence for optimal threshold
    • Quantization ultimately comes down to how well the mapping is done
    • FP32 tensor distribution ~ INT8 tensor distribution
    • Iterate to find the optimal Min and Max for the INT8 conversion
    • Optimality is judged by KL divergence
    • Randomly sample about [?] of the [?] sentences in the validation dataset
    • Two thresholds, Min and Max, have to be chosen; the paper splits the ways of doing so into roughly three
    • Symmetric, Conjugate, or each computed separately
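The threshold search can be sketched as follows for the symmetric case, following the general histogram procedure from the cited TensorRT talk. The bin count, the number of levels, the epsilon used for empty bins, and all function names are assumptions of this sketch, not values taken from the paper.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(P || Q) for two unnormalized histograms of equal length."""
    sp, sq = sum(p), sum(q)
    if sp == 0 or sq == 0:
        return math.inf
    kl = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            # eps stands in for bins where Q is empty but P is not
            kl += (pi / sp) * math.log((pi / sp) / max(qi / sq, eps))
    return kl


def find_threshold(samples, num_bins=512, num_levels=128):
    """Pick a clipping threshold T for the range [-T, T] that minimizes KL."""
    max_abs = max(abs(x) for x in samples)
    width = max_abs / num_bins
    hist = [0] * num_bins
    for x in samples:
        hist[min(int(abs(x) / width), num_bins - 1)] += 1

    best_kl, best_i = math.inf, num_bins
    for i in range(num_levels, num_bins + 1):
        # Reference distribution: first i bins, clipped outliers in the last bin.
        p = hist[:i]
        p[-1] += sum(hist[i:])
        # Candidate: merge the i bins into num_levels levels, then expand back.
        q = [0.0] * i
        for level in range(num_levels):
            start = level * i // num_levels
            stop = (level + 1) * i // num_levels
            chunk = hist[start:stop]
            nonzero = sum(1 for c in chunk if c > 0)
            if nonzero:
                avg = sum(chunk) / nonzero
                for j in range(start, stop):
                    if hist[j] > 0:
                        q[j] = avg
        kl = kl_divergence(p, q)
        if kl < best_kl:
            best_kl, best_i = kl, i
    return best_i * width
```

The search runs once, offline, on calibration samples. At inference time the resulting threshold is a fixed constant, so no min/max scan is needed.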
 11. KL-Divergence for optimal threshold

 12. KL-Divergence for optimal threshold
    • Naive quantization is listed as NA because the model could not emit the STOP token
    • In terms of speed, a zero offset is marginally faster
    • So the Symmetric mode is used

 13. KL-Divergence for optimal threshold
    Migacz, S. 8-bit inference with TensorRT. URL http://on-demand.gputechconf.com/gtc/2017/presentation/s7310-8-bit-inference-with-tensorrt.pdf

 14. Performance Optimization
    • Use VNNI instructions
    • Reduce the number of operations
    • Reorder operations
    • Optimize with MKL
    • Parallelize batching/execution
 15. Performance Optimization: Quantized MatMuls
    • AVX: an FMA operation over [?] bits completes in [?] cycle(s)
    • FP32: [?] values; INT8: [?] values
    • INT8 VNNI is further optimized starting with Cascade Lake CPUs
    • INT8 MatMul using VNNI is [?]x faster than FP32 MatMul using AVX
    • INT8 MatMul using VNNI is [?]x faster than INT8 MatMul using AVX
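To make the VNNI idea concrete, here is a plain-Python emulation of what a single 32-bit lane of the VNNI dot-product instruction (VPDPBUSD) computes: four unsigned-8-bit by signed-8-bit products summed into a 32-bit accumulator. The helper names are made up for this sketch, and the 32-bit wrap-around of real hardware is ignored. Assuming 512-bit registers, one instruction covers 16 such lanes, that is 64 INT8 values.

```python
def vpdpbusd_lane(acc, a_u8, b_s8):
    """Emulate one 32-bit lane: acc + sum of four (unsigned 8-bit * signed 8-bit) products."""
    assert len(a_u8) == len(b_s8) == 4
    assert all(0 <= a <= 255 for a in a_u8)
    assert all(-128 <= b <= 127 for b in b_s8)
    return acc + sum(a * b for a, b in zip(a_u8, b_s8))


def dot_u8s8(a, b):
    """Dot product built from 4-wide lane steps; len(a) must be a multiple of 4."""
    acc = 0
    for i in range(0, len(a), 4):
        acc = vpdpbusd_lane(acc, a[i:i + 4], b[i:i + 4])
    return acc


result = dot_u8s8([1, 2, 3, 4, 5, 6, 7, 8], [1, -1, 1, -1, 2, 2, 2, 2])
```

The multiply, the widening, and the accumulation happen in one step here, which is the fusion the slide credits for the INT8 speed-up.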
 16. Performance Optimization: Quantized MatMuls

 17. Performance Optimization: Quantized MatMuls
    • However, TensorFlow uses an open-source library called GEMMLOWP for integer MatMul
    • GEMMLOWP does not use INT8 VNNI, and it also needs a conversion step for INT8 operations
    • So the quantization step was written directly with MKL BLAS functions
    • Non-zero offsets were also removed
 18. Performance Optimization: Quantized MatMuls
    • google/gemmlowp
    • Confirmed that tensorflow/tensorflow still uses it
    • tensorflow/core/kernels/quantized_matmul_ops.cc
    • tensorflow/lite/kernels/cpu_backend_gemm_gemmlowp.h
 19. tensorflow/core/kernels/quantized_matmul_ops.cc

 20. tensorflow/core/kernels/quantized_matmul_ops.cc

 21. tensorflow/core/kernels/dequantize_op.cc

 22. tensorflow/core/kernels/dequantize_op.cc

 23. Performance Optimization: Quantized MatMuls
    • The gemm_s8u8s32 computation when the offset is non-zero
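Why a non-zero offset costs extra work can be seen by expanding the product of offset-corrected operands: the raw integer product needs row-sum, column-sum, and constant correction terms. The sketch below checks that identity with plain Python integers. It does not call or reproduce the MKL routine, and the function names are assumptions of the example.

```python
def matmul_int(a, b):
    """Plain integer matrix product (Python ints stand in for INT32 accumulators)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def matmul_with_offsets(a_q, b_q, za, zb):
    """Compute (a_q - za) @ (b_q - zb) from the raw product plus correction terms."""
    k = len(b_q)                                  # shared inner dimension
    raw = matmul_int(a_q, b_q)
    row_sums = [sum(row) for row in a_q]
    col_sums = [sum(col) for col in zip(*b_q)]
    return [[raw[i][j] - zb * row_sums[i] - za * col_sums[j] + k * za * zb
             for j in range(len(col_sums))]
            for i in range(len(row_sums))]


a_q = [[1, 2], [3, 4]]
b_q = [[5, 6], [7, 8]]
za, zb = 1, 2
direct = matmul_int([[x - za for x in row] for row in a_q],
                    [[y - zb for y in row] for row in b_q])
via_corrections = matmul_with_offsets(a_q, b_q, za, zb)
```

With `za = zb = 0` all three correction terms vanish and only the raw product remains, which is consistent with the deck's choice of the symmetric mode.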

 24. Performance Optimization: GatherND
    • GatherND is dominated by memory I/O, so shrinking the data to INT8 does not speed up the computation itself
    • However, INT8 data is 4 times smaller than FP32 ([?]x in practice), so an I/O improvement can be expected
    • In the generation loop, GatherND can run directly on the previous step's result without dequantizing it, which gave a [?]x performance improvement
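A small sketch of the I/O argument: gathering rows from an INT8 table moves a quarter of the bytes that the same gather moves on FP32 data, and gathering before dequantizing gives the same values as dequantizing first. The flat row-major layout, the `gather_rows` helper, and the scale value are assumptions of this example; it is a row gather, not TensorFlow's GatherND op.

```python
from array import array


def gather_rows(table, row_len, indices):
    """Copy the selected rows of a flat row-major table into a new array."""
    out = array(table.typecode)
    for i in indices:
        out.extend(table[i * row_len:(i + 1) * row_len])
    return out


row_len = 4
scale = 0.5                                   # hypothetical dequantization scale
table_i8 = array("b", [2, -4, 6, 8, 10, 12, -14, 16, 1, 3, 5, 7])
table_f32 = array("f", [v * scale for v in table_i8])

picked_i8 = gather_rows(table_i8, row_len, [2, 0])
picked_f32 = gather_rows(table_f32, row_len, [2, 0])
bytes_i8 = len(picked_i8) * picked_i8.itemsize     # bytes copied for INT8
bytes_f32 = len(picked_f32) * picked_f32.itemsize  # bytes copied for FP32
```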
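 25. Performance Optimization: Input Pipeline
    • Pad tokens are a very large overhead, so this is an attempt to get rid of them
    • I first thought of Torch's PackedSequence, but that does not seem to be what the paper does
    • The paper sorts sentences across batches so that as few pad tokens as possible are added
    • This gives a performance improvement of about [?]
    • Implementing something like Torch's PackedSequence here would remove the need to sort and would probably be much faster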
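The effect of sorting on padding is easy to check with a toy example. The helpers below are illustrative only; the sentence lengths and the batch size are made up.

```python
def count_padding(sentences, batch_size):
    """Pad tokens needed when each batch is padded to its longest sentence."""
    pads = 0
    for start in range(0, len(sentences), batch_size):
        batch = sentences[start:start + batch_size]
        longest = max(len(s) for s in batch)
        pads += sum(longest - len(s) for s in batch)
    return pads


def sort_by_length(sentences):
    """Return sentences sorted by token count, plus the original index of each."""
    order = sorted(range(len(sentences)), key=lambda i: len(sentences[i]))
    return [sentences[i] for i in order], order


sentences = [["tok"] * n for n in (3, 10, 2, 9, 4, 11)]
sorted_sentences, order = sort_by_length(sentences)
pads_before = count_padding(sentences, batch_size=2)
pads_after = count_padding(sorted_sentences, batch_size=2)
```

Here the unsorted batches need 21 pad tokens and the sorted ones need 7. The `order` list lets the translations be put back in their original order afterwards.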
 26. Performance Optimization: Graph Optimization
    • As described earlier, finding the thresholds with KL divergence removes the time spent computing min and max; those become Const operations
    • The Requantize and RequantizationRange operations were removed from the graph, and the computation was changed to convert INT32 directly to FLOAT (previously INT32 → INT8 → FLOAT)
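A minimal sketch of the direct INT32-to-float step, assuming symmetric quantization with precomputed thresholds. The function names and the [-127, 127] range are choices made for the example, not taken from the paper.

```python
def quantize_symmetric(values, threshold):
    """Quantize to signed 8-bit with a fixed, precomputed threshold (zero offset)."""
    scale = 127.0 / threshold
    return [max(-127, min(127, round(v * scale))) for v in values], scale


def dot_dequantized(a, b, threshold_a, threshold_b):
    """INT8 dot product accumulated in an integer, converted to float in one step."""
    a_q, scale_a = quantize_symmetric(a, threshold_a)
    b_q, scale_b = quantize_symmetric(b, threshold_b)
    acc = sum(x * y for x, y in zip(a_q, b_q))     # plays the role of the INT32 accumulator
    return acc / (scale_a * scale_b)               # INT32 -> float, no INT8 detour


approx = dot_dequantized([0.5, -1.0, 2.0], [1.0, 0.25, -0.5], 2.0, 1.0)
exact = 0.5 * 1.0 + (-1.0) * 0.25 + 2.0 * (-0.5)
```

Because both thresholds are constants fixed at calibration time, the combined scale is also a constant, and no range has to be recomputed between the matmul and the conversion to float.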
 27. Performance Optimization: Graph Optimization

 28. Performance Optimization: Parallel Batching
    • Execution time depends on the sentence lengths within a batch
    • Longer sentences were observed to use the CPU much more efficiently
    • With serial execution, the CPU was used much less efficiently
    • Running the batches in parallel gives a [?]x performance improvement
    • Implementation
    • Write a parent TensorFlow session that manages a FIFO queue
    • Fork child processes affinitized (NUMA) to specific CPU cores and local memory
    • Each child process takes work from the queue and computes asynchronously
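The parent/child layout can be sketched with Python's `multiprocessing` module. This is a stand-in under several assumptions: `translate_batch` is a hypothetical placeholder for running the quantized model, a `multiprocessing` queue replaces the TensorFlow FIFO queue, the `fork` start method limits it to POSIX systems, and only core affinity is set (binding memory to a NUMA node would need a tool such as numactl, which is not shown).

```python
import multiprocessing as mp
import os


def translate_batch(batch):
    """Hypothetical stand-in for running the model on one batch."""
    return [s.upper() for s in batch]


def worker(cpu_ids, tasks, results):
    """Child process: pin to the given cores, then consume batches until told to stop."""
    if hasattr(os, "sched_setaffinity"):          # available on Linux only
        try:
            os.sched_setaffinity(0, cpu_ids)
        except OSError:
            pass                                  # affinity is best effort here
    while True:
        item = tasks.get()
        if item is None:                          # sentinel: no more work
            break
        index, batch = item
        results.put((index, translate_batch(batch)))


def run_parallel(batches, n_workers=2):
    """Parent: fill a FIFO queue and collect results from forked children."""
    ctx = mp.get_context("fork")
    tasks, results = ctx.Queue(), ctx.Queue()
    if hasattr(os, "sched_getaffinity"):
        available = sorted(os.sched_getaffinity(0))
    else:
        available = [0]
    procs = []
    for w in range(n_workers):
        cpus = {available[w % len(available)]}
        proc = ctx.Process(target=worker, args=(cpus, tasks, results))
        proc.start()
        procs.append(proc)
    for item in enumerate(batches):
        tasks.put(item)
    for _ in procs:
        tasks.put(None)
    collected = dict(results.get(timeout=60) for _ in batches)
    for proc in procs:
        proc.join()
    return [collected[i] for i in range(len(batches))]
```

Results arrive in completion order, so each batch carries its index and the parent restores the original order at the end.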
 29. Performance Optimization: Parallel Batching

 30. Performance Optimization: Conclusion

 31. Performance Optimization: Conclusion

 32. Performance Optimization: Conclusion

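 33. Thank you ✌
    If you have further questions or are curious about anything, feel free to reach me at the contacts below.
    Jeong Ukjae, Machine Learning Software Engineer, Pingpong
    Email: ukjae@scatterlab.co.kr
    Facebook: @jeongukjae
    Linkedin: @jeongukjae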