FARIS: Fast and Memory-efficient URL Filter by Domain Specific Machine

ytakano
September 28, 2016

Transcript

  1. FARIS: Fast and Memory-efficient URL Filter
    by Domain Specific Machine
    IEEE ICITCS

    Yuuki Takano and Ryosuke Miura
    National Institute of Information and Communications
    Technology, Japan

  2. Background

    URL filtering is adopted for several purposes
      parental control
      intrusion detection
      protecting web browsing
    AdBlock Plus is one of the most famous URL filters
      it blocks unpleasant ad sites
      it also blocks web tracking sites, which threaten the privacy of web users

  3. Background

    the number of filter rules that AdBlock Plus
    uses has increased
    URL filtering requires much memory (several hundred [MB])
    and CPU resources
    not suitable for mobile devices, which have
    limited resources

  4. Contributions
    propose a domain specific pseudo machine (byte
    code interpreter)
    called FARIS for AdBlock's rules
    it consumes only […]% of memory compared
    with the conventional implementation
    it is […] times faster than the conventional
    implementation

  5. AdBlock Plus's Syntax and Regular Expressions
    AdBlock Plus (and other applications like
    uBlock)
    use regular expressions internally

    TABLE I
    FILTER SYNTAX OF ADBLOCK PLUS AND ITS REGULAR EXPRESSIONS

    AdBlock's syntax                 regular expression
    *                                /.*/
    | at the beginning of a line     /^/
    | at the end of a line           /$/
    || at the beginning of a line    /[\w\-]+\/+/
    ^                                /[\x00-\x24\x26-\x2C\x2F\x3A-\x40\x5B-\x5E\x60\x7B-\x7F]|$/
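
The mapping in Table I can be read as a small translation procedure. The following is a minimal sketch in Python, not the code of AdBlock Plus or of the authors. The function name `filter_to_regex` is hypothetical, and the `||` prefix is an assumption that differs from the table as transcribed: it is anchored at the start of the URL and includes the `:` of the scheme so that a prefix such as `http://` is consumed.

```python
import re

# Separator class for "^", as transcribed in Table I.
SEPARATOR = r"(?:[\x00-\x24\x26-\x2C\x2F\x3A-\x40\x5B-\x5E\x60\x7B-\x7F]|$)"


def filter_to_regex(rule: str) -> str:
    """Translate an AdBlock Plus-style filter into a regular expression."""
    prefix, suffix = "", ""
    if rule.startswith("||"):
        # Assumption: anchored, and ":" added so that "http://" is consumed.
        prefix, rule = r"^[\w\-]+:/+", rule[2:]
    elif rule.startswith("|"):
        prefix, rule = "^", rule[1:]
    if rule.endswith("|"):
        suffix, rule = "$", rule[:-1]

    body = []
    for ch in rule:
        if ch == "*":
            body.append(".*")           # wildcard
        elif ch == "^":
            body.append(SEPARATOR)      # separator character or end of input
        else:
            body.append(re.escape(ch))  # literal character
    return prefix + "".join(body) + suffix


if __name__ == "__main__":
    pattern = filter_to_regex("||www*ad^php")
    # The test URL is an assumption; "?" is one of the separator characters.
    print(bool(re.search(pattern, "http://www.example.com/ad?php")))
```

Every rule becomes a general-purpose regular expression under this scheme, which is the cost the deck attributes to the conventional implementation.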

  6. FARIS
    FAst uniform Resource Identifier Specific filter
    it has instructions, registers, and a frame stack

    Filter Rules -> FARIS -> byte codes
    byte codes, URL -> FARIS pseudo machine -> result (match or not)

  7. FARIS
    Registers and Frame Stack

    input: http://www.google.com
    byte code: 0x[…] 0x[…] 0x[…] 0x[…] … 0x[…]
    SP (string pointer)
    PC (program counter)

    Frame Stack
      SP  PC
      SP  PC

  8. FARIS
    Machine Instructions

    opcode       operand  operation
    char         c        if SP is pointing to c, increment PC and SP; otherwise, if
                          the frame stack is empty, then abort matching, else pop PC
                          and SP from the frame stack
    skip_to      c        increment SP until it points to c; if c was not found, abort
                          matching; otherwise increment PC and push PC and
                          SP to the frame stack
    skip_scheme           if SP is pointing to URL scheme, increment SP until it is not
                          pointing to URL scheme; otherwise, abort matching
    match                 finish matching successfully
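
The four instructions above can be sketched as a small interpreter. This is a minimal Python sketch under stated assumptions, not the authors' C++ implementation. The `<head>`, `<tail>` and `<separator>` encodings, the `(opcode, operand)` tuple format and the `run` function are hypothetical. The slide does not spell out what a saved frame contains, so the sketch assumes that a frame restarts the `skip_to` just past its previous hit. It also assumes that `skip_to c` consumes `c`, which is what the deck's example program for `||www*ad^php` suggests.

```python
import re

# Hypothetical encodings for the pseudo-characters named on the slides.
HEAD, TAIL, SEPARATOR = "<head>", "<tail>", "<separator>"

# Separator class as transcribed in Table I.
_SEP = re.compile(r"[\x00-\x24\x26-\x2C\x2F\x3A-\x40\x5B-\x5E\x60\x7B-\x7F]")
# Assumed shape of a URL scheme prefix such as "http://".
_SCHEME = re.compile(r"[\w\-]+:/+")


def symbol_matches(operand, symbol):
    """Return True if the input symbol satisfies the instruction operand."""
    if operand == SEPARATOR:
        # Assumption: the end of the input also counts as a separator.
        return symbol == TAIL or _SEP.fullmatch(symbol) is not None
    return operand == symbol


def run(program, url):
    """Interpret a list of (opcode, operand) pairs against a URL."""
    text = [HEAD] + list(url) + [TAIL]
    pc, sp = 0, 0
    frames = []  # frame stack of saved (PC, SP) pairs
    while True:
        opcode, operand = program[pc]
        if opcode == "match":
            return True
        if opcode == "char":
            if sp < len(text) and symbol_matches(operand, text[sp]):
                pc, sp = pc + 1, sp + 1
            elif frames:
                pc, sp = frames.pop()  # backtrack to the last skip_to
            else:
                return False
        elif opcode == "skip_to":
            while sp < len(text) and not symbol_matches(operand, text[sp]):
                sp += 1
            if sp == len(text):
                return False
            # Assumption: the saved frame re-runs this skip_to past the hit.
            frames.append((pc, sp + 1))
            pc, sp = pc + 1, sp + 1
        elif opcode == "skip_scheme":
            found = _SCHEME.match(url, max(sp - 1, 0))  # text is offset by HEAD
            if found is None:
                return False
            pc, sp = pc + 1, found.end() + 1
        else:
            raise ValueError(f"unknown opcode: {opcode}")


# The deck's example program for the rule ||www*ad^php.
EXAMPLE = [("char", HEAD), ("skip_scheme", None),
           ("char", "w"), ("char", "w"), ("char", "w"),
           ("skip_to", "a"), ("char", "d"), ("char", SEPARATOR),
           ("char", "p"), ("char", "h"), ("char", "p"), ("match", None)]

if __name__ == "__main__":
    # The test URL is an assumption; "?" is one of the separator characters.
    print(run(EXAMPLE, "http://www.example.com/ad?php"))
```

Backtracking is only needed after a `skip_to`, so a rule without wildcards runs as a straight character comparison with an empty frame stack.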

  9. FARIS
    Compilation Rules

    input                            instruction
    *c                               skip_to c
    *^                               skip_to separator
    c                                char c
    ^                                char separator
    || at the beginning of a line    char head
                                     skip_scheme
    | at the beginning of a line     char head
    | at the end of a line           char tail
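
The compilation rules above can be sketched as a single pass over the filter string. This is a minimal Python sketch, not the authors' compiler. The `compile_rule` function, the tuple format and the `<head>`, `<tail>` and `<separator>` encodings are hypothetical. Two behaviours are assumptions the slide does not state: a `*` with nothing after it emits no instruction, and a `*` directly before a trailing `|` becomes `skip_to tail`. A final `match` is appended, as in the deck's example.

```python
# Hypothetical encodings for the pseudo-characters named on the slides.
HEAD, TAIL, SEPARATOR = "<head>", "<tail>", "<separator>"


def compile_rule(rule):
    """Compile an AdBlock Plus-style filter into (opcode, operand) pairs."""
    program = []

    # Anchors at the beginning of the rule.
    if rule.startswith("||"):
        program += [("char", HEAD), ("skip_scheme", None)]
        rule = rule[2:]
    elif rule.startswith("|"):
        program.append(("char", HEAD))
        rule = rule[1:]

    # Anchor at the end of the rule.
    anchored_end = rule.endswith("|")
    if anchored_end:
        rule = rule[:-1]

    pending_star = False  # True when the previous character was "*"
    for ch in rule:
        if ch == "*":
            pending_star = True
            continue
        operand = SEPARATOR if ch == "^" else ch
        program.append(("skip_to" if pending_star else "char", operand))
        pending_star = False

    if anchored_end:
        # Assumption: "*|" compiles to skip_to tail.
        program.append(("skip_to" if pending_star else "char", TAIL))

    program.append(("match", None))
    return program


if __name__ == "__main__":
    for instruction in compile_rule("||www*ad^php"):
        print(instruction)
```

Each input character yields at most one instruction, so the compiled program is never much longer than the rule text.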

  10. FARIS
    Example

    ||www*ad^php
      char head
      skip_scheme
      char w
      char w
      char w
      skip_to a
      char d
      char separator
      char p
      char h
      char p
      match

    http://example.com -> abort
    http://www.example.com/ad[…]php -> match

  11. Other Optimization
    use hash tables for multiple rules
    multithreading support for multicore CPUs

  12. Implementation
    implemented in C++
    distributed on https://github.com/starbed/farisvm
    BSD license

  13. Evaluation

    DATASET
    captured real traffic of WIDE camp 2015 spring
    randomly chose […] URLs from the HTTP traffic

    TABLE V
    PARTICIPANTS OF WIDE CAMP 2015 SPRING

             Japanese    Japanese       non-Japanese
             student     non-student    non-student     total
    […]      26          67             9               102
    […]      5           1              0               6
    total    31          68             9               108

    Fig. 3. Histogram of URL Length (1,000 Samples)

  14. Evaluation

    Environment
    CPU: Core i7 I7-4870HQ, 2.5 [GHz], 4-core, HT, turbo boost 3.7 [GHz]
    OS: MacOS 10.10.2
    C++ compiler: llvm clang++ 6.0

  15. Evaluation

    Memory Usage

    [Figure: bar chart of memory usage [MB] (axis 0 to 700) for the filter sets all, typical and easylist, each annotated with its number of filter files; series: hashed FARIS, FARIS, RE2, Irregexp (V8)]
    [Paper excerpt visible beside the figure: "only 12.5%"]

    consumes only […]% of memory in comparison with
    the conventional implementation (FARIS vs V8)

  16. Evaluation

    Matching Throughput

    1,511 352 24,471 84 1,102
    2,856 1,645 72,424 652 2,548
    4,320 3,645 119,289 1,066 4,068
    1,657 1,221 47,436 438 1,596
    77,139 44,725 1,529,108 6,671 59,131

    [Figure: bar chart of #URLs per second (axis ticks 1, 100, 10000) for the filter sets all, typical and easylist, each annotated with its number of filter files; series: hashed FARIS, FARIS, RE2, Irregexp (V8)]

    about […] times faster than
    the conventional implementation (FARIS vs V8)

  17. Conclusion
    propose FARIS for URL filtering
    it can deal with AdBlock Plus's filters
    it is […] times faster and consumes only
    […]% of memory in comparison with the
    conventional implementation