A Study on Speeding Up Full-Text Search Tools with a Lightweight Index Mechanism /wsa6_sifter


2020.04.26 Web System Architecture Research Group (WSA研) #6
https://websystemarchitecture.hatenablog.jp/entry/2019/12/11/165624


monochromegane

April 26, 2020

Transcript

  1. Yusuke Miyake / Pepabo R&D Institute, GMO Pepabo, Inc.
    2020.04.26 Web System Architecture Research Group (WSA研) #6
    A Study on Speeding Up Full-Text Search Tools with a Lightweight Index Mechanism
  2. Principal engineer Yusuke MIYAKE @monochromegane
    Pepabo R&D Institute, GMO Pepabo, Inc.
    https://blog.monochromegane.com

  3. Contents
    1. Introduction
    2. Challenges in speeding up full-text search tools
    3. Speeding up full-text search tools with a lightweight index mechanism
    4. Evaluation
    5. Summary
  4. 1. Introduction



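  7. The "question" of this research
    • Is it possible to build a lightweight, fast, and general-purpose index mechanism that can be used even from the command line?
    • This report takes full-text search tools as the command-line use case and examines a concrete realization of such an index mechanism.
    • It also confirms that combining a full-text search tool with this index mechanism makes the tool more useful.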
  8. 2. Challenges in speeding up full-text search tools

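  9. Full-text search tools
    • Command-line tools that search one or more text files for a given string
      • grep, ag, pt, etc.
      • Used to search the source code under a project directory
    • They differentiate themselves through a variety of options
      • Colored output, showing lines before and after a match, honoring gitignore, character-encoding support, and so on
    • The main differentiating factor is "search speed"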
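  10. Speeding up full-text search tools
    • A recursive full-text search consists of three elements: "find", "grep", and "print"
    • Each element offers its own opportunities for speedup [1]
      • find: fewer stat system calls by using readdirent, parallelization
      • grep: searching fixed-length chunks instead of line by line and restoring the lines afterwards, SIMD, efficient algorithms, parallelization (the OS file cache also helps a great deal)
      • print: buffering, so that output does not become a bottleneck for the preceding stages
    • Use computing resources efficiently and to the fullest, including the degree of parallelism [2]
    [1] Yusuke Miyake, Optimization for Number of goroutines Using Feedback Control, GopherCon, Marriott Marquis San Diego Marina, California, July 2019.
    [2] Yusuke Miyake, the_platinum_searcher, https://github.com/monochromegane/the_platinum_searcher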

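  12. Challenges in speeding up full-text search tools
    • Speedups come from an accumulation of steady optimization work, not from a breakthrough algorithm
    • Performance gains are tending to level off
      • Because the source code being searched can change at any time, the full text of every file has to be searched on every run
    • A speedup can be expected if the full text is searched only in files that might contain the pattern
      → Consider an index-based approach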
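  13. Indexes for command-line tools (ctags)
    • Generates an index of the "objects" (functions, structs, and so on) of a programming language (in fact I have hardly used it; needs a survey)
    • Treats each object as a "tag" and associates it with the name of the file that defines it
    • The tag file format is essentially an inverted index
    • It is a text format that includes tag names and file names, so its size grows easily
      • Loading the tag file starts to take time
    • Source code search can also target things other than objects
      • For example, searching by comments or error messages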
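  14. Speeding up full-text search engines (inverted index)
    • A dictionary made of terms and, for each term, the list of IDs of the documents in which it appears
    • Occurrence frequency and occurrence positions can be tracked
    • Also good at intersection queries
    • The index grows in proportion to the number of terms and the number of documents
      • However, there appear to be many compression techniques [3] (needs a survey)
    [3] Christopher D. Manning, Prabhakar Raghavan, Hinrich Schutze (translated by Kazuo Iwano, Toshiaki Kurokawa, Seiji Hamada, Akiko Murakami), 情報検索の基礎 (Introduction to Information Retrieval), Kyoritsu Shuppan, 2012.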
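The inverted index described on this slide can be sketched in a few lines of Go. This is only an illustration under simplifying assumptions: the type `InvertedIndex` and the methods `Add` and `SearchAll` are hypothetical names, terms are split on whitespace, and nothing is compressed.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// InvertedIndex maps each term to the set of document IDs that contain it.
// (Hypothetical type, for illustration only.)
type InvertedIndex map[string]map[int]struct{}

// Add splits text on whitespace and records docID for every term.
func (idx InvertedIndex) Add(docID int, text string) {
	for _, term := range strings.Fields(text) {
		if idx[term] == nil {
			idx[term] = make(map[int]struct{})
		}
		idx[term][docID] = struct{}{}
	}
}

// SearchAll returns the sorted IDs of the documents that contain every
// given term, i.e. the intersection of the terms' posting lists.
func (idx InvertedIndex) SearchAll(terms ...string) []int {
	if len(terms) == 0 {
		return nil
	}
	var result []int
	for id := range idx[terms[0]] {
		inAll := true
		for _, t := range terms[1:] {
			if _, ok := idx[t][id]; !ok {
				inAll = false
				break
			}
		}
		if inAll {
			result = append(result, id)
		}
	}
	sort.Ints(result)
	return result
}

func main() {
	idx := InvertedIndex{}
	idx.Add(1, "bloom filter index")
	idx.Add(2, "inverted index")
	idx.Add(3, "bloom filter")
	fmt.Println(idx.SearchAll("bloom", "index")) // [1]
}
```

The map gains one entry per distinct term and one posting per (term, document) pair, which is the size growth the slide points out.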
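  15. Speeding up full-text search engines (Bloom filter)
    • A probabilistic data structure for asking whether a given element is contained in a set
    • The size of the filter does not depend on the number of terms
    • Both adding an element and querying for an element take constant time
    • However, queries can return false positives
    • Create one Bloom filter per document and search this collection of filters for the documents that contain a term
    • A greatly improved version of this approach was used in the Bing search engine (BitFunnel) [4][5]
    [4] Behind Bing Search: The BitFunnel Algorithm, https://developer.hatenastaff.com/entry/…
    [5] Bob Goodwin, Michael Hopcroft, Dan Luu, Alex Clemmer, Mihaela Curmei, Sameh Elnikety, and Yuxiong He. BitFunnel: Revisiting Signatures for Search. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '17). Association for Computing Machinery, New York, NY, USA.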
  16. Bloom filter
    • A single Bloom filter consists of an array of m bits
    Bloom filter (m = 10): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
  17. Bloom filter (adding an element)
    • A single Bloom filter consists of an array of m bits
    • An element is converted into a set of array index positions obtained from k hash functions
    Hash function (k = 2): H1(element1) = 0, H2(element1) = 9
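  18. Bloom filter (adding an element)
    • A single Bloom filter consists of an array of m bits
    • An element is converted into a set of array index positions obtained from k hash functions
    • The set is represented as an array in which the union of the index positions of all elements is set to 1
    Hash function (k = 2): H1(element1) = 0, H2(element1) = 9
    Bloom filter (m = 10): 1, 0, 0, 0, 0, 0, 0, 0, 0, 1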
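  19. Bloom filter (adding an element)
    • A single Bloom filter consists of an array of m bits
    • An element is converted into a set of array index positions obtained from k hash functions
    • The set is represented as an array in which the union of the index positions of all elements is set to 1
    Hash function (k = 2): H1(element2) = 1, H2(element2) = 9
    Bloom filter (m = 10): 1, 1, 0, 0, 0, 0, 0, 0, 0, 1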
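  20. Bloom filter (querying for an element)
    • For the queried element, use the index positions obtained from the k hash functions
    • If even one of those bits is 0, the element is "definitely" not contained
    Hash function (k = 2): H1(element3) = 1, H2(element3) = 8
    Bloom filter (m = 10): 1, 1, 0, 0, 0, 0, 0, 0, 0, 1 ❌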
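  21. Bloom filter (querying for an element)
    • For the queried element, use the index positions obtained from the k hash functions
    • If even one of those bits is 0, the element is "definitely" not contained
    • If all of those bits are 1, the element is "____" contained
    Hash function (k = 2): H1(element1) = 0, H2(element1) = 9
    Bloom filter (m = 10): 1, 1, 0, 0, 0, 0, 0, 0, 0, 1 ⭕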
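  22. Bloom filter (querying for an element)
    • For the queried element, use the index positions obtained from the k hash functions
    • If even one of those bits is 0, the element is "definitely" not contained
    • If all of those bits are 1, the element is "probably" contained
    Hash function (k = 2): H1(element4) = 0, H2(element4) = 1
    Bloom filter (m = 10): 1, 1, 0, 0, 0, 0, 0, 0, 0, 1 ❓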
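The add and query steps walked through on the slides above can be sketched in Go as follows. This is a minimal sketch, not the code used in the talk: the names `BloomFilter`, `Add`, and `MayContain` are hypothetical, and the k hash functions are assumed to be FNV-1a with a one-byte seed, so the bit positions will differ from the slide's H1/H2 values.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// BloomFilter is an m-bit array queried through k hash functions.
type BloomFilter struct {
	bits []bool
	k    int
}

// NewBloomFilter returns an empty filter with m bits and k hash functions.
func NewBloomFilter(m, k int) *BloomFilter {
	return &BloomFilter{bits: make([]bool, m), k: k}
}

// indexes returns the k bit positions for an element. Each "hash function"
// is FNV-1a over a one-byte seed followed by the element (an assumption
// made for brevity).
func (b *BloomFilter) indexes(element string) []int {
	idx := make([]int, b.k)
	for i := 0; i < b.k; i++ {
		h := fnv.New32a()
		h.Write([]byte{byte(i)})
		h.Write([]byte(element))
		idx[i] = int(h.Sum32() % uint32(len(b.bits)))
	}
	return idx
}

// Add sets the bit at every index position of the element.
func (b *BloomFilter) Add(element string) {
	for _, i := range b.indexes(element) {
		b.bits[i] = true
	}
}

// MayContain reports false if any bit is 0 (definitely absent) and
// true if all bits are 1 (probably present).
func (b *BloomFilter) MayContain(element string) bool {
	for _, i := range b.indexes(element) {
		if !b.bits[i] {
			return false
		}
	}
	return true
}

func main() {
	bf := NewBloomFilter(10, 2)
	bf.Add("element1")
	bf.Add("element2")
	fmt.Println(bf.MayContain("element1")) // true: an added element is never reported absent
	fmt.Println(bf.MayContain("element3")) // false means definitely absent; true would be a false positive
}
```

With only m = 10 bits, false positives are likely after a few insertions, which is why a full-text search pass is still needed to confirm each candidate.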
  23. 3. Speeding up full-text search tools with a lightweight index mechanism

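  24. Requirements for an index mechanism for full-text search tools
    • Lightweight
      • The smaller the index, the faster it loads (that is, the faster a query starts up)
    • Fast
      • At build time: if the index can be built quickly, it can follow changes to the search target without burden
      • At search time: if queries are fast, the wall-clock time of the whole full-text search is reduced
    • General-purpose
      • Usefulness improves when the index does not depend on a specific tool and can be combined with any of them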
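  25. Design of an index mechanism for full-text search tools
    • Lightweight / fast
      • Use a Bloom filter
      • Exploit the fact that its size does not depend on the number of terms and that queries take constant time
    • General-purpose
      • Provide a command that searches the index and returns the list of files that contain the given keyword
      • Any full-text search tool then takes that list as its arguments (Args) and performs the full-text search
      • False positives are filtered out by the full-text search tool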
  26. monochromegane/sifter
    • A lightweight index for full text search tools using bloom filter. [6]
    • A Go implementation of the proposed method (WIP)
    • "sifter" means a kitchen sieve, or a person who sorts things out
    [6] monochromegane/sifter, https://github.com/monochromegane/sifter
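  27. monochromegane/sifter
    • Generate one Bloom filter for each text file under the directory (number of files * m bits)
    • The search keywords are not known in advance, and the files may contain Japanese, which is hard to tokenize, so n-grams are used (up to 3-grams)
    $ sifter -m 5 -k 3 build
    func init() { → fun, unc, nc_, c_i, … → Hk → 1, 1, 0, 0, 0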
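The n-gram tokenization on this slide can be sketched as below. The function name `NGrams` is hypothetical, and splitting by rune (so that multi-byte text such as Japanese is cut on character boundaries) is an assumption; the slide does not say whether sifter splits by byte or by character.

```go
package main

import "fmt"

// NGrams returns every n-gram of s for n = 1..maxN, in order of n.
// The string is split by rune so that multi-byte characters stay whole
// (an assumption, not confirmed by the slides).
func NGrams(s string, maxN int) []string {
	runes := []rune(s)
	var grams []string
	for n := 1; n <= maxN; n++ {
		for i := 0; i+n <= len(runes); i++ {
			grams = append(grams, string(runes[i:i+n]))
		}
	}
	return grams
}

func main() {
	fmt.Println(NGrams("func", 3)) // [f u n c fu un nc fun unc]
}
```

Each resulting n-gram would then be passed through the k hash functions and added to the Bloom filter of the file it came from.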
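  28. monochromegane/sifter
    • At query time every Bloom filter would have to be read, so the filters are converted into bit-sliced signatures to reduce the amount of data read and speed up the query
    $ sifter -m 5 -k 3 build
    Bloom filters (one per file): [1, 1, 0, 0, 0], [1, 0, 1, 0, 0], [1, 0, 0, 1, 0]
    H1 = 4: each filter is ANDed with the mask 00001
    A query is issued against every Bloom filter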
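  29. monochromegane/sifter
    • At query time every Bloom filter would have to be read, so the filters are converted into bit-sliced signatures to reduce the amount of data read and speed up the query
    $ sifter -m 5 -k 3 build
    Bloom filters (one per file): [1, 1, 0, 0, 0], [1, 0, 1, 0, 0], [1, 0, 0, 1, 0]
    H1 = 4: each filter is ANDed with the mask 00001
    In practice, only the bit at the hashed index appears to be used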
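  30. monochromegane/sifter
    • At query time every Bloom filter would have to be read, so the filters are converted into bit-sliced signatures to reduce the amount of data read and speed up the query
    $ sifter -m 5 -k 3 build
    Bloom filters (|F| × m): [1, 1, 0, 0, 0], [1, 0, 1, 0, 0], [1, 0, 0, 1, 0]
    Bit slices (m × |F|): [1, 1, 1], [1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 0]
    H1 = 4
    Gather only the bits at each index (viewing the collection of Bloom filters as a matrix, this corresponds to its transpose)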
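  31. monochromegane/sifter
    • At query time every Bloom filter would have to be read, so the filters are converted into bit-sliced signatures to reduce the amount of data read and speed up the query
    $ sifter -m 5 -k 3 build
    Bloom filters (|F| × m): [1, 1, 0, 0, 0], [1, 0, 1, 0, 0], [1, 0, 0, 1, 0]
    Bit slices (m × |F|): [1, 1, 1], [1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 0]
    H1 = 4: only the slice at index 4 is ANDed with 111
    It is enough to query only the slice that gathers the bits at the relevant index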
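The conversion to bit-sliced signatures shown above is a matrix transpose, which can be sketched as follows. The function name `Transpose` is hypothetical, and one byte per bit is used purely for readability; a real index would presumably pack the bits.

```go
package main

import "fmt"

// Transpose turns per-file Bloom filters (|F| rows of m bits) into
// bit slices (m rows of |F| bits). Row j of the result holds bit j of
// every file's filter.
func Transpose(filters [][]uint8) [][]uint8 {
	if len(filters) == 0 {
		return nil
	}
	m := len(filters[0])
	slices := make([][]uint8, m)
	for j := 0; j < m; j++ {
		slices[j] = make([]uint8, len(filters))
		for i, f := range filters {
			slices[j][i] = f[j]
		}
	}
	return slices
}

func main() {
	// The three 5-bit filters from the slide.
	filters := [][]uint8{
		{1, 1, 0, 0, 0},
		{1, 0, 1, 0, 0},
		{1, 0, 0, 1, 0},
	}
	slices := Transpose(filters)
	fmt.Println(slices[0]) // [1 1 1]
	fmt.Println(slices[4]) // [0 0 0]
}
```

After the transpose, a query for hash index 4 touches only `slices[4]` instead of bit 4 of every filter.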
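  32. monochromegane/sifter
    • At query time the pattern string is split into 3-grams, and the query is made with the union of the index positions obtained from the hash functions for each 3-gram
    $ sifter -m 5 -k 3 find PATTERN
    PATTERN → PAT, ATT, TER, ERN → Hk → 1, 1, 0, 0, 0
    Bit slices: [1, 1, 1], [1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 0]; the slices at the selected indexes are ANDed with 111
    Result: ⭕ ❌ ❌
    The first file "probably" contains the pattern, so the file name corresponding to the first file is printed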
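The query step on this slide can be sketched as an AND across the selected bit slices. This is an illustrative sketch only: the function name `Query` is hypothetical, file positions are zero-based here, and bits are stored one per byte for clarity.

```go
package main

import "fmt"

// Query ANDs together the bit slices at the given index positions and
// returns the zero-based positions of the files whose bit is 1 in every
// selected slice. Those files "probably" contain the pattern.
func Query(slices [][]uint8, indexes []int) []int {
	if len(slices) == 0 {
		return nil
	}
	nFiles := len(slices[0])
	acc := make([]uint8, nFiles)
	for i := range acc {
		acc[i] = 1 // start from 111...1
	}
	for _, idx := range indexes {
		for f := 0; f < nFiles; f++ {
			acc[f] &= slices[idx][f]
		}
	}
	var hits []int
	for f, bit := range acc {
		if bit == 1 {
			hits = append(hits, f)
		}
	}
	return hits
}

func main() {
	// Bit slices from the slide (m = 5 slices, 3 files).
	slices := [][]uint8{
		{1, 1, 1},
		{1, 0, 0},
		{0, 1, 0},
		{0, 0, 1},
		{0, 0, 0},
	}
	// The pattern's 3-grams hash to the bit vector 1, 1, 0, 0, 0,
	// i.e. index positions 0 and 1.
	fmt.Println(Query(slices, []int{0, 1})) // [0]
}
```

Only the slices named in `indexes` are read, so the cost of a query depends on the number of distinct index positions rather than on m.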
  33. monochromegane/sifter
    • The full-text search tool searches only the candidates narrowed down by sifter
    • False positives are filtered out by the full-text search tool
    • False negatives do not occur, so no matches are missed
    $ pt PATTERN `sifter -m 5 -k 3 find PATTERN`
  34. 4. Evaluation

  35. Evaluation
    • General-purpose
      • Reduction in processing time when combined with a full-text search tool
    • Lightweight
      • Size of the index
    • Fast
      • Query time
      • Index build time
  36. Evaluation environment
    • CentOS Linux release 8.1.1911 (Core) on Vagrant
    • CPU: 4, Memory: 5,120MB
    • https://github.com/torvalds/linux (c578ddb)
      • Total number of files: 67,947
    • Bloom filter (k = 3, m = 10,000)
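  37. Evaluation: combination with full-text search tools
    • Search keyword: 'GPL-2.0-or-later' (8,168/67,947 = approx. 12%)
    [Table: execution time in seconds, without cache and with cache, for grep, grep + sifter, pt, and pt + sifter; the values are not preserved in this transcript]
    Narrowing down the search targets in advance with the proposed method gave a substantial overall improvement in search speed, even after accounting for sifter's own execution time. The slide also states how many candidates sifter returned and in how many seconds; those figures are not preserved in this transcript.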
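  38. Evaluation: combination with full-text search tools
    • Search keyword: 'GPL-2.0-or-later' (8,168/67,947 = approx. 12%)
    [Table: execution time in seconds, without cache and with cache, for grep, grep + sifter, pt, and pt + sifter; the values are not preserved in this transcript]
    • Search keyword: '#define BYT_RT5640_MAP(quirk)' (2/67,947 = approx. 0.003%)
    [Table: execution time in seconds, without cache and with cache, for grep, grep + sifter, pt, and pt + sifter; the values are not preserved in this transcript]
    When the advance narrowing by the proposed method is highly selective, an even more pronounced reduction in execution time was confirmed (the slide also gives sifter's candidate count and time, which are not preserved here).
    Note that the improvement seen for plain pt comes from an implementation technique: it does not examine a file in detail unless the pattern matches.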
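  39. Evaluation: index size
    • 67,947 bits = 8,494 bytes per bit slice; 8,494 bytes * 10,000 slices = 84.94MB
    • du -h linux: 1.2G
    • Even as a whole, the index is sufficiently small compared with the size of the repository
    • At query time, only k * 8,494 bytes need to be read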
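The size figures on this slide follow from simple arithmetic, reproduced here as a small Go sketch. The function name `IndexSize` is hypothetical, and each bit slice is assumed to be rounded up to whole bytes, which matches the 8,494-byte figure.

```go
package main

import "fmt"

// IndexSize returns, in bytes, the size of one bit slice (one bit per
// file, rounded up to whole bytes), the size of the whole index
// (m slices), and the amount read for k index positions.
func IndexSize(files, m, k int) (perSlice, total, perQuery int) {
	perSlice = (files + 7) / 8
	total = perSlice * m
	perQuery = k * perSlice
	return
}

func main() {
	perSlice, total, perQuery := IndexSize(67947, 10000, 3)
	fmt.Println(perSlice, total, perQuery) // 8494 84940000 25482
}
```

So the whole index is about 84.94 MB, while k = 3 index positions require reading roughly 25 KB.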
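  40. Evaluation: index build
    • Currently it takes about 20 minutes for the 1.2G repository, so improvement is needed…
    • The bottleneck is the hash function [7][8]: $g_i(x) = h_1(x) + i\,h_2(x) \bmod m$
      • For each character, the hash function is executed {1,2,3}-gram * k (3) times (= 0.01ms)
      • By this estimate, a 98 KB file takes about 1 s
    • To speed up index building, it will be necessary to consider caching hash results, applying a faster hash function [9], more efficient tokenization, and so on
    [7] Kirsch, Adam, and Michael Mitzenmacher. Less hashing, same performance: building a better bloom filter. European Symposium on Algorithms. Springer, Berlin, Heidelberg, 2006.
    [8] Implementing a Bloom Filter in Golang, https://cipepser.hatenablog.com/entry/…
    [9] MurmurHash, https://tanjent.livejournal.com/….html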
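The double-hashing formula on this slide derives all k index positions from two base hashes. A minimal Go sketch follows; the function name `Indexes` is hypothetical, and taking h1 and h2 from the low and high halves of a single 64-bit FNV-1a hash is an assumption made for illustration, not necessarily what sifter does.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Indexes returns k positions in [0, m) computed as
// g_i(x) = (h1(x) + i*h2(x)) mod m for i = 0..k-1.
// h1 and h2 are the low and high 32 bits of one FNV-1a 64-bit hash
// (an assumption for this sketch).
func Indexes(x string, k, m int) []int {
	h := fnv.New64a()
	h.Write([]byte(x))
	sum := h.Sum64()
	h1 := uint64(uint32(sum)) // low 32 bits
	h2 := sum >> 32           // high 32 bits
	out := make([]int, k)
	for i := 0; i < k; i++ {
		out[i] = int((h1 + uint64(i)*h2) % uint64(m))
	}
	return out
}

func main() {
	// One underlying hash computation yields all k = 3 positions.
	fmt.Println(Indexes("fun", 3, 10000))
}
```

Because only one underlying hash is computed per n-gram, this reduces hashing work compared with running k independent hash functions, although the hash call itself remains the cost the slide identifies.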
  41. 5. Summary

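  42. Summary
    • Proposed a lightweight, fast, and general-purpose index mechanism that can be used with full-text search tools
      • Adopting Bloom filters made the index lightweight and made queries fast
      • Providing a separate tool that returns only the candidates made the approach general-purpose
    • On the other hand, the execution time of the hash function is a bottleneck, and building the index for a large repository takes time, so further improvement is needed
    • In future work, I would like to extend this research to the field of Web systems by also considering integration with architectures in which the query itself incurs overhead [10]
    [10] Hiroshi Abe, Keiichi Shima, Daisuke Miyamoto, Yuji Sekiya, Tomohiro Ishihara, Kazuya Okada, Ryo Nakamura, Satoshi Matsuura, Yoichi Shinoda. Implementation and evaluation of a scale-out-capable, high-speed log search engine optimized for time-axis search. IPSJ Journal (情報処理学会論文誌).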