Slide 1

Slide 1 text

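Sentiment Analysis (Reputation Analysis)
 Introduction to Machine Learning Through Hands-On Python, Session 2 (of a four-part series)
 URL: https://shiroyagi.connpass.com/event/40028/
 Yohei Kikuta
 2016/09/30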

Slide 2

Slide 2 text

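About the instructor
 • Yohei Kikuta
 • Ph.D. (Science)
 • Currently doing data analysis work at a consulting firm
 • Areas of expertise
   - Theoretical aspects of machine learning
   - Recommendation algorithms
   - Image analysis (Deep Learning)
 • Contact
   Feel free to get in touch about anything.
   Email : diracdiego@gmail.com
   Facebook : https://www.facebook.com/yohei.kikuta.3
   LinkedIn : https://jp.linkedin.com/in/yohei-kikuta-983b29117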

Slide 3

Slide 3 text

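Contents
 • What is sentiment analysis?
   What sentiment analysis sets out to do
 • Basics of analysis using text data
   Basics of the preprocessing and feature engineering specific to text analysis
 • Analysis using movie review data
   Run an actual analysis and try to beat a published paper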

Slide 4

Slide 4 text

What is sentiment analysis?

Slide 5

Slide 5 text

Text is rich in information
 Source : https://www.amazon.co.jp/%E3%83%91%E3%82%BF%E3%83%BC%E3%83%B3%E8%AA%8D%E8%AD%98%E3%81%A8%E6%A9%9F%E6%A2%B0%E5%AD%A6%E7%BF%92-%E4%B8%8A-C-M-%E3%83%93%E3%82%B7%E3%83%A7%E3%83%83%E3%83%97/dp/4621061224/ref=sr_1_6?s=books&ie=UTF8&qid=1474539268&sr=1-6&keywords=%E6%A9%9F%E6%A2%B0%E5%AD%A6%E7%BF%92

Slide 6

Slide 6 text

Text is rich in information
 Source : http://review.kakaku.com/review/K0000899767/#tab

Slide 7

Slide 7 text

Text is rich in information
 Source : https://twitter.com/search?q=x1%20%E3%82%A4%E3%83%B3%E3%82%B9%E3%82%BF%E3%83%B3%E3%82%B9&src=typd

Slide 8

Slide 8 text

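The rise of CGM
 Consumer Generated Media (CGM) is now widely used
 • Concrete examples
   Amazon reviews, Twitter tweets, word-of-mouth posts on travel sites, and so on
 • Characteristics
   Users generate the data themselves
   Some services produce a very large volume of data
 • Why it is useful
   It captures the unfiltered voice of ordinary consumers about a given target
   The large volume of data makes it possible to gauge public opinion as a whole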

Slide 9

Slide 9 text

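What we want to do with sentiment analysis
 • Polarity classification
   Classify whether a text expresses a positive or a negative opinion
 • Automatic dictionary construction
   Register the words that appear in a collection of text data
 • User profiling
   For example, infer the gender of a text's author
 • Content summarization
   For example, extract topics and trends in public opinion from a collection of text data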

Slide 10

Slide 10 text

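What we want to do with sentiment analysis
 • Polarity classification
   Classify whether a text expresses a positive or a negative opinion
   Machine learning can be used for the classification
   Review A: "This product is very easy to use, so I recommend it!" → positive
   Review B: "This product performs poorly for its price." → negative
   Review C: "This product is not so bad that you lose out by buying it." → positive (or neutral)
 Application examples)
   Judge whether a product is well regarded by the public
   Collect the negative reviews of a product and identify points to improve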

Slide 11

Slide 11 text

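What we want to do with sentiment analysis
 • Polarity classification
   The classes are often made finer, for example a five-level rating
   The task can also be treated as a learning-to-rank or regression problem
   Review A: "This product is very easy to use, so I recommend it!" → ★★★★★
   Review B: "This product performs poorly for its price." → ★★
   Review C: "This product is not so bad that you lose out by buying it." → ★★★
 Application example)
   Combine with other user information to recommend products that a given user is likely to rate highly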

Slide 12

Slide 12 text

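What we want to do with sentiment analysis
 • Polarity classification
   A more advanced variant classifies polarity per aspect
   Machine learning can also be used to extract the aspects themselves
   Review: "The rooms at this hotel are very clean and nice. However, the food is of low quality and I would like to see it improved. The staff are courteous and make a good impression."
     Room quality : positive
     Food quality : negative
     Service quality : positive
 Application example)
   Extract per-aspect ratings from reviews that mix positive and negative remarks in order to understand users' opinions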

Slide 13

Slide 13 text

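What we want to do with sentiment analysis
 • Automatic dictionary construction
   Register the words that appear in a collection of text data
 Application example)
   From a large volume of text data, automatically determine the part of speech and the positive / negative polarity of each word that appears, register them, and build a DB in which spelling variants are also normalized
   Review A: "This product is very easy to use, so I recommend it!"
   Review B: "This product performs poorly for its price."
   Review C: "This product is not so bad that you lose out by buying it."
   Dictionary DB
     easy to use : positive
     broken : negative
     level : neutral
     ...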

Slide 14

Slide 14 text

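What we want to do with sentiment analysis
 • User profiling
   For example, infer the gender of a text's author
 Application example)
   Infer attributes such as gender from the reviews a user has posted, and use them to choose which products to recommend
   Review: "My friends like this product too, and we all use it."
   Review: "This product's shape is ~, the number is ~, and the performance is sufficient."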

Slide 15

Slide 15 text

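What we want to do with sentiment analysis
 • Content summarization
   For example, extract topics and trends in public opinion from a collection of text data
 Application example)
   From a large volume of text data, extract currently trending topics or predict the outcome of an election
   Text A: "When it comes to the election, the LDP is the solid choice after all."
   Text B: "In this situation there is no choice but to vote for the LDP."
   Text C: "~ people flocked to the street speech of the ○○ party."
 Source : http://japan.cnet.com/news/society/35034916/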

Slide 16

Slide 16 text

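Sentiment analysis and machine learning
 Sentiment analysis is a good fit for machine learning
 • Large volumes of data can be processed immediately and at high speed
 • With the rise of CGM, large amounts of text data exist in electronic form
 • High accuracy can be achieved even on problems that simple rule-based approaches cannot predict
 However, there are also many difficulties
 • Text data cannot be fed into a model as it is
   → Knowledge of feature engineering is required
 • Languages differ greatly from one another
   → Deeper analysis also requires an understanding of language models and the like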

Slide 17

Slide 17 text

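How sentiment analysis has developed
 The field has progressed from simple classification to more complex tasks
 • 1990s
   Polarity classification of adjectives (positive or negative)
 • 2000s
   First half: polarity classification of review texts (positive or negative)
   Second half: more complex tasks such as content summarization with topic models
 • 2010s
   Analysis of SNS such as Twitter
   Distributed representations of words, starting with Word2vec
   Analysis that combines text with other data such as images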

Slide 18

Slide 18 text

Basics of analysis using text data

Slide 19

Slide 19 text

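Text analysis is hard
 Computers work with numerical data
 → Text is a sequence of words
 → It has to be turned into numerical data that retains its semantic structure
 → But text data is hard to measure on scales such as magnitude or order
   Ex.) In general, "ideal" and "reality" have no greater/smaller relation and neither comes first
 → So what should we do!?
 → This course briefly introduces the typical analysis steps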

Slide 20

Slide 20 text

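Typical analysis steps
 • Prepare the text data
 • Morphological analysis
 • Dependency parsing
 • Feature engineering
 • Carry out the target analysis
   Classification or regression if supervised, clustering if unsupervised, and so on
 * These are only the typical steps; many variations exist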

Slide 21

Slide 21 text

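Typical analysis steps
 • Prepare the text data
   Prepare the text data to be analyzed

   Watch out for character encodings!
   - Multibyte text such as Japanese needs particular care
   - Python 2 and Python 3 handle strings differently (Python 2 has separate str and unicode types; Python 3 unifies text as Unicode)
     Unless there is a special reason not to, use Python 3

   Possible data sources include:
   - Collecting tweets with the Twitter API (https://dev.twitter.com/overview/documentation)
   - Aozora Bunko (http://www.aozora.gr.jp/)
   - English movie review data (http://www.cs.cornell.edu/people/pabo/movie-review-data/)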
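
The encoding caveat above can be sketched in a few lines of Python 3 using only the standard library. The sample string is an illustrative assumption; the point is that `str` holds Unicode text, `bytes` holds encoded data, and decoding with the wrong codec does not give the original text back.

```python
# Python 3: str is Unicode text, bytes is encoded data.
text = "評判分析"  # illustrative sample string (assumption), 4 characters

utf8_bytes = text.encode("utf-8")      # 3 bytes per character for this string
sjis_bytes = text.encode("shift_jis")  # 2 bytes per character for this string

n_chars = len(text)
n_utf8 = len(utf8_bytes)
n_sjis = len(sjis_bytes)

# Decoding with the same codec that was used for encoding restores the text.
round_trip = utf8_bytes.decode("utf-8")

# Decoding with the wrong codec does not; errors="replace" substitutes
# U+FFFD for undecodable bytes instead of raising UnicodeDecodeError.
wrong = sjis_bytes.decode("utf-8", errors="replace")

print(n_chars, n_utf8, n_sjis)
print(round_trip == text, wrong == text)
```

When reading files, passing the encoding explicitly, as in `open(path, encoding="utf-8")`, avoids depending on the platform default.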

Slide 22

Slide 22 text

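Typical analysis steps
 • Morphological analysis
   Break the text down into its smallest meaningful units

   Languages such as English, where words are separated by spaces, are easy
     Ex.) This is a pen. → This / is / a / pen / .
   Japanese is hard
   An analysis library is needed; MeCab (http://taku910.github.io/mecab/) is well known
     Ex.) すもももももももものうち → すもも / も / もも / も / もも / の / うち
          (a tongue twister: "plums and peaches are both kinds of peach")
   If possible, it is even better to use a dictionary to normalize spelling variants and to attach parts of speech
     Ex.) おこなう, 行う, 行なう → 行う (three spellings of the same verb, "to do")
     Ex.) 美しい → (美しい, adjective) ("beautiful")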
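
For a space-delimited language, the split shown above can be approximated with a regular expression. This is a minimal sketch using only the standard library; the function name `tokenize` is introduced here for illustration, and the approach does not work for Japanese, which needs a morphological analyzer such as MeCab.

```python
import re

def tokenize(sentence):
    """Split a sentence into word tokens and single punctuation marks."""
    # \w+ matches a run of letters/digits; [^\w\s] matches one punctuation mark.
    return re.findall(r"\w+|[^\w\s]", sentence)

tokens = tokenize("This is a pen.")
print(" / ".join(tokens))
```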

Slide 23

Slide 23 text

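Typical analysis steps
 • Dependency parsing
   Determine which morphemes modify which (the modifier / modified relations)
   This is very hard because of language-specific properties and ambiguity

     Ex.) I think that that that that boy used is wrong.
     Ex.) 黒い髪の美しい女性がいる。 ("There is a beautiful woman with black hair"; 黒い, "black", can attach to "hair" or to "woman")
   Either use a library, or simply skip dependency parsing
   For Japanese, CaboCha (https://taku910.github.io/cabocha/) is a well-known library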

Slide 24

Slide 24 text

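Typical analysis steps
 • Feature engineering
   The basic approach is one-hot encoding, which represents the presence or absence of a word with {0,1}
     Ex.) 私 → [1,0,0,0, …], は → [0,1,0,0, …], 人間 → [0,0,1,0, …]
   Using this, a document matrix can be built as follows

     Sentence                私  あなた  は  人間  だ  で  ない  。  …
     私は人間だ。             1   0      1   1    1   0   0    1   …
     あなたは人間でない。      0   1      1   1    0   1   1    1   …
     …

     (私は人間だ。 = "I am a human."; あなたは人間でない。 = "You are not a human.")

   This method is simple and easy to work with
   However, every pair of distinct words ends up with the same similarity, so the meaning is lost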
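
The document matrix can be sketched in plain Python as follows. The sentences are assumed to be tokenized already, the helper names are hypothetical, and the column order follows the document matrix rather than the one-hot example. The last lines check the drawback mentioned above: any two distinct one-hot vectors have the same dot product (zero), so no notion of word similarity survives.

```python
# Pre-tokenized sentences (tokenization is assumed to be done beforehand).
docs = [
    ["私", "は", "人間", "だ", "。"],
    ["あなた", "は", "人間", "で", "ない", "。"],
]

# Vocabulary in the column order of the document matrix.
vocab = ["私", "あなた", "は", "人間", "だ", "で", "ない", "。"]
index = {word: i for i, word in enumerate(vocab)}

def one_hot(word):
    """Vector with a single 1 at the position of the word."""
    vec = [0] * len(vocab)
    vec[index[word]] = 1
    return vec

def doc_vector(tokens):
    """1 if the word occurs in the document, otherwise 0."""
    present = set(tokens)
    return [1 if word in present else 0 for word in vocab]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

matrix = [doc_vector(tokens) for tokens in docs]
for row in matrix:
    print(row)

# Every pair of distinct words is equally (dis)similar.
sim_a = dot(one_hot("私"), one_hot("あなた"))
sim_b = dot(one_hot("私"), one_hot("。"))
print(sim_a, sim_b)
```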

Slide 25

Slide 25 text

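Typical analysis steps
 • Feature engineering: features used in this session's analysis
   Many kinds of features have been devised to reflect the meaning a text carries
   - N-gram
     Treat N adjacent words as a single unit. N = 1, 2 are the most common
       Ex.) bi-gram: 私は人間だ。→ (私, は), (は, 人間), (人間, だ), (だ, 。)
   - Bag of Words (BoW)
     Count how often each word appears and use the counts as features
       Ex.) 私は私を信じる。→ (私, 2), (は, 1), (を, 1), (信じる, 1), (。, 1)
   - tf-idf (term frequency and inverse document frequency)
     Weight each word's frequency by the proportion of documents in which the word appears
       Ex.) 私 ("I") occurs frequently but appears in many documents, so it gets a low score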
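
The N-gram and Bag of Words examples can be reproduced with a few lines of standard-library Python. This is a minimal sketch: the helper name `ngrams` is introduced here for illustration, and the sentences are assumed to be tokenized already.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return all runs of n adjacent tokens as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# bi-gram example: 私は人間だ。
bigrams = ngrams(["私", "は", "人間", "だ", "。"], 2)
print(bigrams)

# Bag of Words example: 私は私を信じる。
bow = Counter(["私", "は", "私", "を", "信じる", "。"])
print(dict(bow))
```

Counting n-gram tuples with `Counter(ngrams(tokens, 2))` gives a Bag of Words over bi-grams in the same way.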

Slide 26

Slide 26 text

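Typical analysis steps
 • Feature engineering: other approaches
   - Dimensionality reduction
     Compress the document matrix to a lower dimension, for example to extract topics
     Singular value decomposition, probabilistic latent semantic analysis, Latent Dirichlet Allocation, and so on
   - Distributions
     Word distributions, distributions of part-of-speech ratios, and so on
   - Distributed representations of words
     Learn representations in which vector arithmetic is meaningful, with Word2Vec as the best-known example
       Ex.) king - man ≒ queen - woman
   - etc…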

Slide 27

Slide 27 text

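Typical analysis steps
 • Carry out the target analysis
   Once the features have been built, the rest is the same as any other machine learning analysis

   Supervised → positive / negative classification, review score prediction, and so on
   Unsupervised → clustering, topic extraction, and so on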

Slide 28

Slide 28 text

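The analysis carried out in this session
 • Prepare the text data
   Use movie review data written in English
 • Morphological analysis
   Simple whitespace splitting, plus processing with a simple dictionary built for the purpose
 • Dependency parsing
 • Feature engineering
   Build a Bag of Words from uni-grams; also build some bi-grams and use tf-idf
 • Carry out the target analysis
   Classify whether a given review text is positive or negative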

Slide 29

Slide 29 text

Analysis using movie review data

Slide 30

Slide 30 text

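Let's actually run an analysis!
 • Problem setting
   Classify whether a given text expresses a positive or a negative opinion
   Reference paper: http://www.aclweb.org/anthology/W02-1011
 • Data
   Collected from https://www.cs.cornell.edu/people/pabo/movie-review-data/
   700+700 texts tagged as positive or negative
   Each file contains one review text
 • Goals
   Experience the basic flow of a text analysis
   Beat the accuracy of the reference paper!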

Slide 31

Slide 31 text

Preparing the notebook
 1. Clone https://github.com/yosukekatada/python_ml_study from GitHub
 2. Move into 20160930_second_meeting
 3. The sentiment analysis session uses the following:
    - data/
    - ML_2_2_normal.ipynb
    - ML_2_2_advanced.ipynb
 4. Open jupyter notebook (or ipython notebook)
 5. Start the analysis!

Slide 32

Slide 32 text

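Model : Support Vector Machine
 Reference : https://en.wikipedia.org/wiki/Support_vector_machine

 Project data with a complex decision boundary into a feature space where it is linearly separable
 Place the decision boundary as far from the data points as possible (margin maximization)
 The computation stays tractable even for projections into high-dimensional (even infinite-dimensional!) spaces (kernel trick)
 [Figure: original space → feature space]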
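
The kernel trick can be checked numerically on a small case. The sketch below is an illustration under assumptions that are not on the slide: it uses the degree-2 polynomial kernel k(x, y) = (x · y)^2 on 2-dimensional inputs, whose explicit feature map is phi(x) = (x1^2, sqrt(2) x1 x2, x2^2). Evaluating the kernel in the original space gives the same value as the inner product in the feature space, without ever constructing phi.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def phi(x):
    """Explicit feature map of the kernel (x . y)^2 for 2-dimensional x."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def poly_kernel(x, y):
    """Degree-2 polynomial kernel, computed in the original space."""
    return dot(x, y) ** 2

# Arbitrary sample points (assumption, for illustration only).
x, y = (1.0, 2.0), (3.0, -1.0)

explicit = dot(phi(x), phi(y))  # inner product in the feature space
implicit = poly_kernel(x, y)    # same quantity without building phi
print(explicit, implicit)
```

For kernels such as the Gaussian (RBF) kernel the corresponding feature space is infinite-dimensional, so only the implicit form can be computed.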

Slide 33

Slide 33 text

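Model : Naive Bayes classifier
 Reference : https://en.wikipedia.org/wiki/Naive_Bayes_classifier

 Build a classifier from an independence assumption on the variables together with Bayes' theorem

 $$P(C \mid X) = P(C \mid x_1, x_2, \ldots, x_n) = \frac{P(x_1, x_2, \ldots, x_n \mid C)\, P(C)}{P(X)}$$

 Here C is the class, either 1 (positive) or 0 (negative), and each x can be thought of as an entry of a Bag of Words built from unigrams
 The denominator does not depend on the class C, so it can be dropped; applying conditional independence then gives

 $$P(C \mid X) \propto P(C) \prod_{i=1}^{n} P(x_i \mid C)$$

 Assume a Gaussian or Bernoulli distribution and learn the parameters from data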
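
The proportionality above can be turned into a small classifier. The following is a minimal sketch of a Bernoulli Naive Bayes in plain Python, not the notebook's implementation: the function names, the Laplace smoothing parameter `alpha`, and the four toy training documents are all assumptions made for illustration. Scores are computed in log space to avoid underflow from multiplying many small probabilities.

```python
import math

def train_bernoulli_nb(docs, labels, vocab, alpha=1.0):
    """Estimate log P(C) and P(x_i = 1 | C) with Laplace smoothing."""
    model = {}
    for c in set(labels):
        class_docs = [set(d) for d, y in zip(docs, labels) if y == c]
        log_prior = math.log(len(class_docs) / len(docs))
        p_word = {
            w: (sum(w in d for d in class_docs) + alpha)
               / (len(class_docs) + 2 * alpha)
            for w in vocab
        }
        model[c] = (log_prior, p_word)
    return model

def predict(model, tokens, vocab):
    """Return the class maximizing log P(C) + sum_i log P(x_i | C)."""
    present = set(tokens)
    scores = {}
    for c, (log_prior, p_word) in model.items():
        score = log_prior
        for w in vocab:
            p = p_word[w]
            score += math.log(p if w in present else 1.0 - p)
        scores[c] = score
    return max(scores, key=scores.get)

# Toy training data (hypothetical): 1 = positive, 0 = negative.
train_docs = [["great", "fun"], ["great", "acting"],
              ["boring", "plot"], ["boring", "awful"]]
train_labels = [1, 1, 0, 0]
vocab = sorted({w for d in train_docs for w in d})

model = train_bernoulli_nb(train_docs, train_labels, vocab)
pred_pos = predict(model, ["great", "plot"], vocab)
pred_neg = predict(model, ["boring", "acting"], vocab)
print(pred_pos, pred_neg)
```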

Slide 34

Slide 34 text

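Model : Random Forest
 Reference : https://en.wikipedia.org/wiki/Random_forest

 Combine multiple decision trees and predict by majority vote
 Also explained in the materials from the first study session : http://www.slideshare.net/ssuserb5817c/python-66169435/1
 Each individual tree branches so as to split the data well
 The overall prediction is obtained by averaging the predictions of the individual trees
 Example split questions in the trees:
   Does the text contain the word "like"?
   Does the text contain the word "dud"?
   Does "?" appear 5 or more times?
   ...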
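
The aggregation step can be sketched as follows. This illustrates only how the trees' outputs are combined: the three "trees" are hand-written one-question rules that mirror the example questions, and which label each answer maps to is an assumption. In an actual random forest the trees are learned from bootstrap samples of the data with random feature subsets.

```python
from collections import Counter

# Hypothetical one-question trees; 1 = positive, 0 = negative.
def tree_contains_like(tokens):
    return 1 if "like" in tokens else 0

def tree_contains_dud(tokens):
    return 0 if "dud" in tokens else 1

def tree_many_question_marks(tokens):
    return 0 if tokens.count("?") >= 5 else 1

TREES = [tree_contains_like, tree_contains_dud, tree_many_question_marks]

def predict_majority(tokens):
    """Class that receives the most votes across the trees."""
    votes = Counter(tree(tokens) for tree in TREES)
    return votes.most_common(1)[0][0]

def predict_average(tokens):
    """Mean of the tree outputs, usable as a score for the positive class."""
    return sum(tree(tokens) for tree in TREES) / len(TREES)

review = ["i", "like", "this", "movie"]
print(predict_majority(review), predict_average(review))
```

With an odd number of trees and two classes the vote cannot tie; thresholding the average at 0.5 gives the same decision as the majority vote in that case.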

Slide 35

Slide 35 text

Feature : tf-idf
 Reference : https://en.wikipedia.org/wiki/Tf%E2%80%93idf

 A feature that weights each word based on the idea that words occurring frequently are important, but words appearing in many documents are not
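
One common way to turn this idea into numbers is tf-idf(t, d) = tf(t, d) × log(N / df(t)), where tf is the count of term t in document d, N is the number of documents, and df(t) is the number of documents containing t. The sketch below implements that variant in plain Python. Several other weighting and smoothing variants exist, and the toy documents are hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """tf = raw count in the document, idf = log(N / document frequency)."""
    n_docs = len(docs)
    df = Counter(word for doc in docs for word in set(doc))
    return [
        {word: count * math.log(n_docs / df[word])
         for word, count in Counter(doc).items()}
        for doc in docs
    ]

# Toy tokenized documents (assumption, for illustration only).
docs = [
    ["i", "like", "this", "movie"],
    ["i", "dislike", "this", "plot"],
    ["i", "like", "like", "the", "cast"],
]
weights = tf_idf(docs)

# "i" appears in every document, so its weight is zero everywhere.
print(weights[0]["i"], weights[1]["dislike"], weights[2]["like"])
```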

Slide 36

Slide 36 text

Summary

Slide 37

Slide 37 text

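Summary
 • What is sentiment analysis?
   Determine positive / negative polarity and extract topics from text data
 • Basics of analysis using text data
   Difficult, because the data is unstructured and languages differ from one another
   The basic flow is morphological analysis → dependency parsing → feature engineering → running the analysis
 • Analysis using movie review data
   Experienced the basic processing steps of text analysis
   Achieved accuracy above that of the published paper by refining the feature engineering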