FARIS: Fast and Memory-efficient URL Filter by Domain Specific Machine

ytakano
September 28, 2016

Transcript

  1. FARIS: Fast and Memory-efficient URL Filter by Domain Specific Machine
    IEEE ICITCS
    Yuuki Takano and Ryosuke Miura
    National Institute of Information and Communications Technology, Japan

  2. Background
    - URL filtering is adopted for several purposes: parental control, intrusion detection, and protecting web browsing.
    - AdBlock Plus is one of the most famous URL filters.
    - It blocks unpleasant ad sites.
    - It also blocks web tracking sites, which threaten the privacy of web users.
  3. Background
    - The number of filter rules that AdBlock Plus uses has increased.
    - URL filtering therefore requires much memory (several hundred [MB]) and CPU resources.
    - This is not suitable for mobile devices, which have limited resources.
  4. Contributions
    - We propose a domain-specific pseudo machine (a byte code interpreter) called FARIS for AdBlock's rules.
    - It consumes only … of the memory compared with the conventional implementation.
    - It is … times faster than the conventional implementation.
  5. AdBlock Plus's Syntax and Regular Expressions
    - AdBlock Plus (and other applications like uBlock) uses regular expressions internally.

    TABLE I: FILTER SYNTAX OF ADBLOCK PLUS AND ITS REGULAR EXPRESSIONS
      AdBlock's syntax                  regular expression
      *                                 /.*/
      | at the beginning of a line      /^/
      | at the end of a line            /$/
      || at the beginning of a line     /[\w\-]+\/+/
      ^                                 /[\x00-\x24\x26-\x2C\x2F\x3A-\x40\x5B-\x5E\x60\x7B-\x7F]|$/
  6. FARIS
    - FARIS: FAst uniform Resource Identifier Specific filter
    - It has instructions, registers, and a frame stack.
    - Diagram: Filter Rules → FARIS byte codes → FARIS pseudo machine; URL → FARIS pseudo machine → result (match or not)
  7. FARIS: Registers and Frame Stack
    - input: http://www.google.com
    - byte code: a byte sequence
    - SP: string pointer
    - PC: program counter
    - Frame Stack: saved pairs of SP and PC
  8. FARIS: Machine Instructions
    opcode, operand: operation
    - char c: if SP is pointing to c, increment PC and SP. Otherwise, if the frame stack is empty, abort matching; else pop PC and SP from the frame stack.
    - skip_to c: increment SP until it points to c. If c was not found, abort matching; otherwise increment PC and push PC and SP to the frame stack.
    - skip_scheme: if SP is pointing to a URL scheme, increment SP until it is no longer pointing to the URL scheme; otherwise, abort matching.
    - match: finish matching successfully.
  9. FARIS: Compilation Rules
    input → instruction
    - *c → skip_to c
    - *^ → skip_to separator
    - c → char c
    - ^ → char separator
    - || at the beginning of a line → char head, skip_scheme
    - | at the beginning of a line → char head
    - | at the end of a line → char tail
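  10. FARIS: Example
    - Rule: ||www*ad^php
    - Byte code: char head, skip_scheme, char w, char w, char w, skip_to a, char d, char separator, char p, char h, char p, match
    - Input http://example.com: matching aborts ("abort here" on the slide)
    - Input http://www.example.com/ad…php: match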
  11. Other Optimization
    - Use hash tables for multiple rules.
    - Multithreading support for multicore CPUs.

  12. Implementation
    - Implemented in C++.
    - Distributed on https://github.com/starbed/ (farisvm).
    - BSD license.

  13. Evaluation: Dataset
    - Captured real traffic of WIDE camp 2015 spring.
    - Randomly chose URLs from the HTTP traffic.

    TABLE V (left edge clipped): …ANTS OF WIDE CAMP 2015 SPRING
      Japanese student   Japanese non-student   non-Japanese non-student   total
      26                 67                     9                          102
      5                  1                      0                          6
      31                 68                     9                          108

    Fig. 3. Histogram of URL Length (1,000 Samples)
  14. Evaluation: Environment
    - CPU: Core i7 I7-4870HQ 2.5 [GHz], 4-core, HT, turbo boost 3.7 [GHz]
    - OS: MacOS 10.10.2
    - C++ compiler: llvm clang++ 6.0
  15. Evaluation: Memory Usage
    - Figure: memory usage [MB] (axis 0 to 700) of hashed FARIS, FARIS, RE2, and Irregexp (V8) for the filter sets all, typical, and easylist.
    - FARIS consumes only … of the memory in comparison with the conventional implementation (FARIS vs V8). A fragment of the paper text visible beside the figure reads "only 12.5%".
  16. Evaluation: Matching Throughput
    - Figure: #URLs per second (axis ticks 1, 100, 10000) of hashed FARIS, FARIS, RE2, and Irregexp (V8) for the filter sets all, typical, and easylist.
    - FARIS is about … times faster than the conventional implementation (FARIS vs V8).
    - Clipped table fragment (row and column labels lost):
      1,511    352      24,471      84      1,102
      2,856    1,645    72,424      652     2,548
      4,320    3,645    119,289     1,066   4,068
      1,657    1,221    47,436      438     1,596
      77,139   44,725   1,529,108   6,671   59,131
  17. Conclusion
    - We propose FARIS for URL filtering.
    - It can deal with AdBlock Plus's filters.
    - It is … times faster and consumes only … of the memory in comparison with the conventional implementation.
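
The machine instructions (slide 8), the compilation rules (slide 9), and the example rule `||www*ad^php` (slide 10) can be sketched together in Python. This is a minimal sketch under stated assumptions, not the authors' C++ implementation. The names `compile_rule`, `run`, and `consumed` are hypothetical. `skip_to` is assumed to leave SP just past the character it found and to save its own PC, so that a later failure resumes the search at the next occurrence. `head` and `tail` are assumed to consume no input, and the scheme pattern is an assumed shape for a prefix such as `http://`.

```python
import re

HEAD, TAIL, SEP = "head", "tail", "separator"  # special operands named on the slides
# Separator class copied from the '^' row of Table I (end of input handled in consumed()).
SEP_RE = re.compile(r"[\x00-\x24\x26-\x2C\x2F\x3A-\x40\x5B-\x5E\x60\x7B-\x7F]")
# Assumption: a URL scheme prefix looks like "http://".
SCHEME_RE = re.compile(r"[\w\-]+:/+")


def compile_rule(rule):
    """Translate one filter rule into a list of (opcode, operand) pairs."""
    code = []
    if rule.startswith("||"):
        code += [("char", HEAD), ("skip_scheme", None)]
        rule = rule[2:]
    elif rule.startswith("|"):
        code.append(("char", HEAD))
        rule = rule[1:]
    anchored_end = rule.endswith("|")
    if anchored_end:
        rule = rule[:-1]
    i = 0
    while i < len(rule):
        op = "char"
        if rule[i] == "*":
            # The slide only lists '*c' and '*^'; other uses are rejected here.
            if i + 1 == len(rule) or rule[i + 1] == "*":
                raise ValueError("'*' must be followed by a character or '^'")
            op, i = "skip_to", i + 1
        code.append((op, SEP if rule[i] == "^" else rule[i]))
        i += 1
    if anchored_end:
        code.append(("char", TAIL))
    code.append(("match", None))
    return code


def consumed(url, sp, operand):
    """Return how many characters the operand consumes at url[sp], or None."""
    if operand == HEAD:
        return 0 if sp == 0 else None
    if operand == TAIL:
        return 0 if sp == len(url) else None
    if sp == len(url):
        return 0 if operand == SEP else None  # '^' also matches end of input
    if operand == SEP:
        return 1 if SEP_RE.match(url[sp]) else None
    return 1 if url[sp] == operand else None


def run(code, url):
    """Interpret the byte code against url; True means the rule matched."""
    pc, sp, frames = 0, 0, []
    while True:
        op, arg = code[pc]
        if op == "match":
            return True
        if op == "char":
            n = consumed(url, sp, arg)
            if n is not None:
                pc, sp = pc + 1, sp + n
            elif frames:
                pc, sp = frames.pop()          # backtrack to the last skip_to
            else:
                return False                   # empty frame stack: abort
        elif op == "skip_to":
            while sp <= len(url) and consumed(url, sp, arg) is None:
                sp += 1
            if sp > len(url):
                return False                   # operand not found: abort
            n = consumed(url, sp, arg)
            sp += n                            # assumption: step past the hit
            if n:
                frames.append((pc, sp))        # retry point for a later failure
            pc += 1
        elif op == "skip_scheme":
            m = SCHEME_RE.match(url, sp)
            if m is None:
                return False
            pc, sp = pc + 1, m.end()
```

Under these assumptions, `run(compile_rule("||www*ad^php"), "http://example.com/")` aborts at the first `char w`, while an illustrative URL such as `http://www.example.com/ad/php` reaches `match` after backtracking once out of the `a` in `example`.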
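
For comparison, the conventional regular-expression approach summarized in Table I (slide 5) can be sketched as a direct translation of a filter rule into one Python regular expression. This is only an illustration of the mapping, not the code of AdBlock Plus or uBlock. The function name `filter_to_regex` is hypothetical, and the `||` row is approximated by an assumed scheme pattern (`[\w\-]+:/+` anchored at the start), which differs slightly from the expression shown in the table.

```python
import re

# '^' row of Table I: a separator character or the end of the input.
SEPARATOR = r"(?:[\x00-\x24\x26-\x2C\x2F\x3A-\x40\x5B-\x5E\x60\x7B-\x7F]|$)"
# Assumption: '||' is approximated by a scheme prefix such as "http://".
SCHEME = r"[\w\-]+:/+"


def filter_to_regex(rule):
    """Build a compiled regular expression from one AdBlock-style rule."""
    prefix = suffix = ""
    if rule.startswith("||"):
        prefix, rule = "^" + SCHEME, rule[2:]
    elif rule.startswith("|"):
        prefix, rule = "^", rule[1:]
    if rule.endswith("|"):
        suffix, rule = "$", rule[:-1]
    body = "".join(
        ".*" if ch == "*" else SEPARATOR if ch == "^" else re.escape(ch)
        for ch in rule
    )
    return re.compile(prefix + body + suffix)
```

A rule such as `||www*ad^php` becomes a single pattern that the regex engine must evaluate for every URL, which is the per-rule cost that a specialized byte code interpreter aims to reduce.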