
[Paper introduction] VCC-Finder: Finding Potential Vulnerabilities in Open-Source Projects to Assist Code Audits

Henning Perl, Sergej Dechand, Matthew Smith, Daniel Arp, Fabian Yamaguchi, Konrad Rieck, Sascha Fahl, and Yasemin Acar. 2015. VCCFinder: Finding Potential Vulnerabilities in Open-Source Projects to Assist Code Audits. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security (CCS '15). ACM, New York, NY, USA, 426-437. DOI=http://dx.doi.org/10.1145/2810103.2813604

Kenta YAMAMOTO

May 30, 2016

Transcript

  1. Paper introduction. VCC-FINDER: FINDING POTENTIAL VULNERABILITIES IN OPEN-SOURCE PROJECTS TO ASSIST CODE AUDITS

     Paper: presented at the security conference ACM CCS 2015 (http://www.sigsac.org/ccs/CCS2015/).
     Henning Perl, Sergej Dechand, Matthew Smith, Daniel Arp, Fabian Yamaguchi, Konrad Rieck, Sascha Fahl, and Yasemin Acar. 2015. VCCFinder: Finding Potential Vulnerabilities in Open-Source Projects to Assist Code Audits. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security (CCS '15). ACM, New York, NY, USA, 426-437. DOI=http://dx.doi.org/10.1145/2810103.2813604
     Presenter: Kenta Yamamoto, Graduate School of Information Science (program for working professionals)
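  2. Contents

     - Overview: what a VCC is, what VCC-Finder is
     - Background: the rapid rise in security incidents, the shortage of reviewers, problems and limits of existing tools
     - Related work: static analysis, repository analysis, machine learning
     - Building the VCC database
     - Detecting VCCs with machine learning
     - Evaluation
     - Presenter's impressions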
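  3. Overview: VCC-FINDER and terminology

     VCC-Finder
     - A method for finding vulnerabilities in source code. By combining code-metric analysis with metadata from the repository, it is more precise (fewer false positives) than existing tools.
     Terminology
     - "VCC" (Vulnerability-contributing Commit): a commit that introduced the cause of a vulnerability.
     Main contributions of the paper
     - A large-scale mapping from the CVE database to GitHub commits, released as a database of 640 VCCs.
     - An SVM classifier trained on that database to flag commits suspected of introducing a vulnerability. At the same recall as Flawfinder it produces 99% fewer false-positive warnings.
     - A quantitative and qualitative analysis of the results.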
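  4. Background: the rapid rise in vulnerabilities and the problems with existing tools

     Entries registered in the CVE database
     - 2000: 1,000
     - 2010: 4,500
     - 2014: 8,000
     The best practices for reducing vulnerabilities are well known ("review code before it is released", "fix vulnerabilities before deployment"), but there are not enough people to do the code review.
     - OSS is no exception to the reviewer shortage. In theory "anyone can review", but in practice a small team of core developers does the code review.
     Many tools already detect vulnerabilities and bugs (for example a variable defined inside an if statement, an unreachable switch statement, or an invalid memory access). Flawfinder is one example.
     Problems with existing tools
     - They detect dangerous code, but with so many false positives that the reports become huge.
     - Teams want to scan the whole project as the software grows, but the resulting report is too large to read in practice.
     - On the dataset used in this study, Flawfinder raised 5,460 false-positive warnings against 53 true positives. Triaging that by hand is close to impossible, which calls the tool's usefulness into question.
     - The natural unit of a code change is one commit in the version control system, yet these tools cannot inspect only the code snippet in a commit. Reviewers have to analyse the whole project and then find the report entries that correspond to the commit.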
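  5. Related work

     Static analysis
     - Lightweight, fast tools. Flawfinder is the best known; others include Rats, Prefast, and Splint. Some tools need annotations in the code.
     - Commercial tools such as Coverity also exist. They analyse code against a configured rule set and take a long time to run.
     Repository analysis
     - One line of work searches a project's SCM (software configuration management) repository for keywords such as "fix" and "bug", extracts the changes that introduced the bugs, and trains an SVM on the features of those commits.
     Vulnerability discovery with machine learning and data mining
     - Detecting vulnerable software components from textual features of the source code.
     - Unsupervised learning on static taint and anomalies in C source code to find missing checks.
     - Data mining to discover neglected conditions and implicit conditional rules.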
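  6. 3. Building the VCC database

     Target repositories
     - 66 projects, 170,860 commits, 718 CVE vulnerabilities
     - Programming languages: C and C++
     The data extracted with the VCC discovery method is published for reviewers:
     https://www.dropbox.com/s/x1shbyw0nmd2x45/vcc-database.dump?dl=0
     Procedure
     - Find the commits that fixed a vulnerability.
     - Trace back from each fixing commit to the VCC.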
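  7. #1 Finding the commits that fix a vulnerability

     Existing code analysis, static or dynamic, has ignored who made a commit and how it was committed, even though plenty of metadata affects code quality.
     - e.g. whether the committer is a newcomer or an established contributor.
     No public large-scale mapping from CVEs to commits existed, so the authors built their own mapping for projects hosted on GitHub.
     Two data sources were used to link projects to CVEs:
     1. Links to commits listed in the CVE entries
     2. Commit messages that mention a CVE ID
     A random 10% sample was checked by hand and contained no wrong mappings.
     The result was 718 CVEs.
     - The mapping is not exhaustive, but it is enough labelled data to train a classifier.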
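  8. #2 Finding the VCC from the fixing commit

     With the fixing commits identified, the next step is to find the VCCs.
     - Because these are Git commits, the full project history can be traced with tools such as `git blame`, and those tools are used to find the VCC.
     The discovery method described on the next slide linked 640 VCCs to the 718 CVEs.
     - There are fewer VCCs than CVEs because a single commit can contribute to more than one CVE.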
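  9. #2 Steps of the VCC discovery method

     Step 1. Ignore changes to documentation
     - Release notes, changelogs, and so on.
     Step 2. `blame` the lines that the fixing commit deleted
     - Reason: if the fix had to change a line, that line was part of the vulnerability. A changed line appears in the diff as a deletion plus an addition.
     Step 3. `blame` the lines before and after each code block that the fixing commit inserted
     - Reason: a security fix sometimes consists only of added validation, for example just before a resource is accessed or just after a function call.
     Step 4. Mark the commit blamed for the most lines in the previous steps as the vulnerability-contributing commit (VCC)
     - If several commits are blamed for the same number of lines, all of them are marked as VCCs.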
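The four steps above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the `git blame` output for steps 2 and 3 has already been collected as (path, commit) pairs, and the documentation markers, function names, and commit hashes are hypothetical.

```python
from collections import Counter

# Hypothetical markers for documentation-only files (step 1); not from the paper.
DOC_SUFFIXES = (".md", ".txt", ".rst")
DOC_PREFIXES = ("changelog", "news", "release")


def is_documentation(path):
    """Return True for files treated as documentation and skipped."""
    name = path.rsplit("/", 1)[-1].lower()
    return name.endswith(DOC_SUFFIXES) or name.startswith(DOC_PREFIXES)


def pick_vccs(blamed_lines):
    """Pick the most-blamed commit(s) from (path, commit) pairs.

    Each pair stands for one line reported by `git blame` in steps 2 and 3.
    Ties return every tied commit, as in step 4.
    """
    counts = Counter(commit for path, commit in blamed_lines
                     if not is_documentation(path))
    if not counts:
        return []
    top = max(counts.values())
    return sorted(commit for commit, n in counts.items() if n == top)


# Made-up blame results for one fixing commit.
blamed = [
    ("src/parse.c", "a1b2c3"),
    ("src/parse.c", "a1b2c3"),
    ("src/io.c", "d4e5f6"),
    ("ChangeLog", "d4e5f6"),  # ignored: documentation
]
print(pick_vccs(blamed))  # ['a1b2c3']
```

Counting blamed lines per commit keeps the heuristic cheap, but it also suggests why very large commits can be blamed by mistake: they own many lines near any fix.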
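  10. Evaluating the VCC discovery method

     To assess the accuracy of the method, 15% of the VCCs (96) were randomly sampled and checked by hand; 3.1% (3) had been blamed incorrectly.
     - All three wrongly blamed commits were very large commits.
     - e.g. Update libtool to version 2.2.8. · vadz/libtiff@31040a3 https://github.com/vadz/libtiff/commit/31040a39
     - VCC-Finder focuses on suppressing over-detection, so a 3.1% error rate is considered acceptable.
     Against the 640 VCCs there were 169,502 "unclassified" commits.
     - No CVE has been reported for them so far, but they may still contain latent vulnerabilities, hence the label "unclassified".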
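  11. 3-2. Features of VCCs

     Per-project features
     - Programming language
     - Number of GitHub stars
     - Number of GitHub forks
     - Number of commits
     Per-commit features
     - Contribution rate: the share of the project's commits made by the author, i.e. the author's commit count / the project's total commit count
     - Number of blocks: the number of hunks (contiguous groups of changed lines) in one diff
     - Patch: the textual change in the commit, represented as a `bag of words`
     - Patch keyword count: the number of occurrences of C and C++ keywords per patch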
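Two of the per-commit features above are simple enough to compute directly. The sketch below is illustrative only: the function names and sample data are made up, and it assumes the diff is in unified format, where each hunk starts with an `@@ ` header line.

```python
from collections import Counter


def contribution_rates(commit_authors):
    """Map each author to (their commits) / (all commits in the project)."""
    total = len(commit_authors)
    return {a: n / total for a, n in Counter(commit_authors).items()}


def count_hunks(unified_diff):
    """Count hunks by counting '@@ ' header lines in a unified diff."""
    return sum(1 for line in unified_diff.splitlines()
               if line.startswith("@@ "))


# Made-up project history: one author name per commit.
authors = ["alice", "alice", "alice", "bob"]

# Made-up unified diff with two hunks.
diff = """--- a/x.c
+++ b/x.c
@@ -1,3 +1,4 @@
 int a;
+int b;
@@ -10,2 +11,2 @@
-return 0;
+return 1;
"""

print(contribution_rates(authors))
print(count_hunks(diff))
```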
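  12. 3-4. Statistical analysis of each feature

     A Mann-Whitney U test was used to find keywords that appear more often in VCCs than in unclassified commits. It is a non-parametric test, chosen because a normal distribution cannot be assumed and the outcome is a binary category.
     - The keywords most strongly associated with VCCs were extracted (Table 2 of the paper).
     - The significance level is p < 0.000357, i.e. 0.01/28 after Bonferroni correction (a conservative way to control the familywise error rate, which raises the chance of type II errors).
     - The effect size in the table is the probability that supports the hypothesis.
     - Example: in 70% of cases the keyword `if` tends to appear more in a VCC than in an unclassified commit.
     - VCCs were also found to contain more keywords overall than unclassified commits.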
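The corrected threshold and the "probability" reading of the effect size can be checked with a small worked example. This sketch assumes the effect size is the common-language effect size, U / (n1 * n2), which is one standard way to report a Mann-Whitney result as a probability; the slide does not spell out the exact definition, and the keyword counts below are invented.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests


def common_language_effect_size(xs, ys):
    """Probability that a random x exceeds a random y (ties count as 1/2).

    This equals the Mann-Whitney U statistic for xs divided by len(xs) * len(ys).
    """
    wins = 0.0
    for x in xs:
        for y in ys:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(xs) * len(ys))


# Threshold quoted on the slide: 0.01 spread over 28 tests.
print(bonferroni_threshold(0.01, 28))

# Invented counts of one keyword per commit in the two groups.
vcc_counts = [3, 5, 4]
unclassified_counts = [1, 4, 2]
print(common_language_effect_size(vcc_counts, unclassified_counts))
```

A value of 0.5 would mean the keyword is equally frequent in both groups; values above 0.5 mean it leans towards VCCs.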
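  13. 4. Detecting VCCs with machine learning

     The previous chapter analysed the features of VCCs, but writing detection rules by hand from those features is hard. Machine learning is used instead to analyse commits automatically and rank them so that reviewers can prioritise.
     Requirements for the classifier
     - Generality: the training data mixes numeric metrics from metadata with commit-message wording and code keywords, so the classifier must handle heterogeneous features.
     - Scalability: it must analyse large repositories with thousands of files and commits in a reasonable time.
     - Explainability: it must be able to explain, in terms a human can understand, why a commit was flagged.
     An approach built on concepts from machine learning and information retrieval
     - The various features are represented in a generalised bag-of-words model.
     - A linear support vector machine (SVM) produces results from many features while keeping the reason for each decision explainable.
     To combine numeric code metrics with Git and GitHub metadata features, the study represents all of them as a set of tokens S.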
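  14. 4-1. BAG-OF-WORDS model

     With the token set S, the text of a commit message can be handled in the same way as program keywords.
     - Concretely, commit messages and program code are split into tokens on spaces and newlines.
     - For generality and privacy, some tokens such as author names and email addresses are ignored.
     The mapping $\phi$ from a commit to a vector is defined as
     $\phi: X \to \mathbb{R}^{|S|}, \quad \phi: x \mapsto (b(x, s))_{s \in S}$
     - $X$ is the set of all commits; each commit $x \in X$ is embedded into the vector space.
     - The auxiliary function $b(x, s)$ returns 1 if token $s$ occurs in $x$ and 0 otherwise.
     - The slide illustrates this with a fictitious commit $x$.
     The resulting vector space has several thousand dimensions.
     - Most dimensions are 0, so the vectors can be stored in memory as sparse data structures.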
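The binary bag-of-words mapping described on this slide can be sketched as follows. It is a simplified illustration under stated assumptions: tokens are split on whitespace only, anything containing `@` is dropped as a stand-in for the privacy filter, and the sample commits and function names are made up. The sparse vector is stored as the sorted list of indices whose entry is 1.

```python
def tokenize(text):
    """Split on whitespace and drop email-like tokens (assumed privacy rule)."""
    return {tok for tok in text.split() if "@" not in tok}


def build_vocabulary(commits):
    """Assign one dimension (index) to every token seen in the commits."""
    tokens = sorted(set().union(*(tokenize(c) for c in commits)))
    return {tok: i for i, tok in enumerate(tokens)}


def phi(commit, vocab):
    """Sparse binary embedding: indices s where b(x, s) = 1."""
    return sorted(vocab[tok] for tok in tokenize(commit) if tok in vocab)


# Made-up commit texts (message and code mixed, as in the token set S).
commits = [
    "fix buffer overflow in parser dev@example.org",
    "goto out; if (len > sizeof(buf))",
]
vocab = build_vocabulary(commits)
print(len(vocab))
print(phi("fix goto unknown", vocab))
```

Storing only the non-zero indices is what keeps a vector space with thousands of dimensions cheap to hold in memory.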
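  15. 4-2. Classification and explainability

     Few learning methods can both handle large amounts of data and explain their decisions.
     - One of them is the linear Support Vector Machine (SVM).
     Linear SVM
     - This variant of the classic SVM does not use the kernel trick and works directly in the input space, so its run time scales linearly with the number of vectors and features.
     LibLinear
     - The VCC-Finder classifier uses LibLinear, an open-source tool that provides several optimisation algorithms for linear SVMs.
     - Each algorithm finds the hyperplane $\omega$ that separates the two given classes (VCCs and unclassified commits) with the largest margin.
     Because learning happens in the input space, the hyperplane vector $\omega$ can be used to explain the classifier's decisions.
     - The inner product of $\phi(x)$ and $\omega$ gives a score that reflects the distance from $\phi(x)$ to the hyperplane.
     - The score indicates how likely the commit is to contain a vulnerability.
     - $f(x) = \langle \phi(x), \omega \rangle = \sum_{s \in S} \omega_s \, b(x, s)$
     - The inner product is a sum over individual features, so it is easy to see which features contribute most to a decision (cf. Euclidean distance).
     Standard cross-validation on the training data was used to choose the free parameters of the linear SVM.
     - The best parameters for identifying VCCs were regularisation cost C = 1 and class weight W = 100.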
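Since the score is a plain sum of per-token weights, both the score and its explanation fit in a few lines. The weights below are invented for illustration and are not values learned in the paper; the sketch also omits any bias term and represents a commit simply as its list of tokens.

```python
def score(tokens, weights):
    """f(x) = sum over tokens s present in x of w_s (binary features)."""
    return sum(weights.get(tok, 0.0) for tok in set(tokens))


def explain(tokens, weights, top=3):
    """Return the tokens with the largest absolute contribution to f(x)."""
    contributions = {tok: weights[tok] for tok in set(tokens) if tok in weights}
    ranked = sorted(contributions.items(),
                    key=lambda item: abs(item[1]), reverse=True)
    return ranked[:top]


# Hypothetical hyperplane weights; positive values push towards "VCC".
weights = {"goto": 0.8, "sizeof": 0.5, "static": -0.3}
commit_tokens = ["goto", "sizeof", "static", "foo", "goto"]

print(score(commit_tokens, weights))
print(explain(commit_tokens, weights, top=2))
```

A kernel SVM would not allow this kind of per-feature breakdown, which is the trade-off the slide points to when it stresses working in the input space.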
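  16. 5. Evaluation

     The predictive accuracy of the SVM classifier was evaluated by training on older data (before 2011) and testing on newer data (2011-2014); see the table in the paper.
     True positives (TP): vulnerabilities that the classifier, having learned features from the training data, found in the test data
     - CVE-2012-2119, Linux Kernel: frequent code changes, a committer with few prior contributions, use of `socket`
     - CVE-2013-0862, FFmpeg: a committer with few prior contributions, one large change
     - CVE-2014-1438, Linux Kernel: heavy use of exceptions, a large diff, inline assembler, variables related to user input such as `__input` and `user`
     - CVE-2014-0148, Qemu: keywords tied to error-prone byte-level operations, such as "opaque", "*bs", and "bytes"
     False positives (FP), or unclassified commits: commits flagged as VCCs although no CVE covers them
     - The dangerous ones are not disclosed, but the classifier also flagged FFmpeg commit cca1a42653, which was fixed before release. Hints: raw byte manipulation, an inexperienced committer, one large change.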
  17. Reference: PRECISION-RECALL CURVE

     Precision (P), Recall (R), true positives (Tp), false positives (Fp), false negatives (Fn)
     P = Tp / (Tp + Fp)
     R = Tp / (Tp + Fn)
     [Figure: precision-recall curve] Ref. "Image Matching in Large Scale Indoor Environment", Carnegie Mellon University, http://www.cs.cmu.edu/~hebert/indexing.html
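As a worked example of these definitions, the sketch below plugs in the figures quoted elsewhere in this deck: 219 known vulnerabilities in the test period, 53 detected by both tools, and 36 false positives for VCC-Finder against 5,460 for Flawfinder. The constant and function names are chosen here for illustration.

```python
def precision(tp, fp):
    """Share of raised warnings that are real: Tp / (Tp + Fp)."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Share of real vulnerabilities that were found: Tp / (Tp + Fn)."""
    return tp / (tp + fn)


KNOWN_VULNERABILITIES = 219
TRUE_POSITIVES = 53
FALSE_NEGATIVES = KNOWN_VULNERABILITIES - TRUE_POSITIVES
FP_VCCFINDER = 36
FP_FLAWFINDER = 5460

shared_recall = recall(TRUE_POSITIVES, FALSE_NEGATIVES)
p_vccfinder = precision(TRUE_POSITIVES, FP_VCCFINDER)
p_flawfinder = precision(TRUE_POSITIVES, FP_FLAWFINDER)
fp_reduction = 1 - FP_VCCFINDER / FP_FLAWFINDER

print(f"recall (both tools):    {shared_recall:.3f}")
print(f"precision, VCC-Finder:  {p_vccfinder:.3f}")
print(f"precision, Flawfinder:  {p_flawfinder:.3f}")
print(f"false-positive cut:     {fp_reduction:.1%}")
```

The "99%" figure is therefore a reduction in the number of false positives at equal recall, not a 99-point gain in precision.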
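  18. Lessons from VCC-FINDER

     Be careful with error handling
     - According to what the classifier learned from VCCs, "goto considered harmful" still holds today. When the keyword `goto` appears together with labels such as `out` or `error`, the code tends to be more vulnerable. The SVM also rated error return values such as `-EINVAL` as potentially dangerous, which in C is a common idiom used together with goto. The danger of goto is less that it produces unreachable code and more that it is used so often in error contexts. `exception` and `error-handling` code tends to be riskier. Example: Apple's SSL/TLS bug https://www.imperialviolet.org/2014/02/22/applebug.html
     Variables related to memory management
     - Frequent use of `sizeof` and variables named `len` or `length` tend to produce VCCs. The same goes for `buf`, `net`, and `socket`.
     Support new contributors
     - Newcomers who account for 1% or less of a project's commits introduce vulnerabilities about five times as often (chi-squared test against a Poisson distribution: p < 0.0001).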
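  19. Conclusion

     - The paper proposes and evaluates VCC-Finder as a code audit method.
     - By feeding code-metric analysis and repository metadata together into machine learning, the method significantly outperforms the existing tool Flawfinder.
     - The authors built a large test database of 170,860 commits from C and C++ projects, which others can use to compare against this method.
     - Trained on data up to 2010 and tested on data from 2011 to 2014, it produced 99% fewer false positives than Flawfinder. It detected 53 of 219 known vulnerabilities with 36 false positives, against 5,460 for Flawfinder.
     - The VCC database is public to support future research in this area. It can serve as training data and as a benchmark against existing methods. The community currently lacks this kind of baseline data, and the authors want to encourage comparable research.
     - On future work: the method already performs well beyond Flawfinder, but the authors see it as only scratching the surface of what analysing such diverse features can offer. It should be possible to derive more general advice for minimising vulnerabilities during development.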
  20. APPENDIX: Repositories covered by the study

     The study is restricted to C and C++ projects
     - to keep commits comparable across the analysis
     - because of the importance of the software (Linux, Kerberos, OpenSSL, etc.)
     66 GitHub repositories
     Portspoof, GnuPG, Kerberos, PHP, MapServer, HHVM, Mozilla Gecko, Quagga, libav, Libreswan, Redland Raptor RDF syntax library, charybdis, Jabberd2, ClusterLabs pacemaker, bdwgc, pango, qemu, glibc, OpenVPN, torque, curl, jansson, PostgreSQL, corosync, tinc, FFmpeg, nedmalloc, mosh, trojita, inspircd, nspluginwrapper, cherokee webserver, openssl, libfep, quassel, polarssl, radvd, tntnet, Android Platform Bionic, uzbl, LibRaw, znc, nbd, Pidgin, V8, SpiderLabs ModSecurity, file, graphviz, Linux Kernel, libtiff, ZRTPCPP, taglib, suhosin, Phusion passenger, monkey, memcached, lxc, libguestfs, libarchive, Beanstalkd, Flac, libX11, Xen, libvirt, Wireshark, and Apache HTTPD
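  21. Presenter's impressions

     1. An interesting subject for machine learning
     - Learning from source code is generally hard. When writing code, developers deliberately avoid writing the same functionality twice, for example by factoring it into libraries, so code has less redundancy than other tasks and is harder to learn from (ref. https://twitter.com/neubig/status/712857703241089024). The paper still achieves useful accuracy in detecting VCCs: at the same recall as the existing tool Flawfinder it cuts false positives by 99%.
     - As a problem setting, "how good is this source code" is hard to quantify, but "does it contain a vulnerability or not" is a clear binary label.
     - Vulnerability data accumulates as CVEs, commits often mention a CVE ID, and CVE entries often link to the fixing commit. The paper makes good use of this to link code to vulnerabilities.
     - The paper explains why a linear SVM was chosen, but I would like to try other methods as well.
     2. Taking features from Git metadata shows good judgement. As the paper says, a commit is a sensible unit for developers. This idea probably would not occur to someone without development experience.
     4. The support for best practices is interesting ("small, fine-grained commits are preferable", "new committers are five times as likely as veterans to introduce vulnerabilities").
     5. Beyond further gains in accuracy and speed, I hope the surrounding tooling develops as well. From a developer's point of view there are tools such as Prophet that go as far as generating patches, so VCC-Finder looks like a base that could be extended to assist developers further. ref. http://people.csail.mit.edu/fanl/papers/prophet-popl16.pdf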