Slide 1

Slide 1 text

Getting Started with Sparse Modeling in Python. May 19, 2018, PyCon mini Osaka @ Yahoo Japan Corporation, GFO office

Slide 2

Slide 2 text

Takashi Someda
• Director and CTO, Hacarus Inc.
• Python experience: … years
• Founded Machine Learning Meetup KANSAI
• https://mlm-kansai.connpass.com

Slide 3

Slide 3 text

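About Hacarus
• Life sciences and industrial sectors x AI
• Data analysis centered on sparse modeling
• Associate Professor Masayuki Ohzeki of Tohoku University serves as advisor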

Slide 4

Slide 4 text

Today's goals
• Get to know sparse modeling
• See how the algorithms are implemented in Python
• Become interested in the underlying ideas and characteristics

Slide 5

Slide 5 text

What is sparse modeling?

Slide 6

Slide 6 text

Sparse modeling
• A family of methods that model phenomena by exploiting the sparsity inherent in data
• Does not refer to any single algorithm
• Actively researched since around …

Slide 7

Slide 7 text

Demo
• Learn from camera images
• Estimate the background
• Detect moving objects

Slide 8

Slide 8 text

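Challenges when introducing machine learning
• We want to automate, but we remain accountable for explaining the results
• Collecting data takes a lot of time and money
• Hardware costs should be kept low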

Slide 9

Slide 9 text

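What sparse modeling promises
• Shows which input features are important
• Can make estimates even from a small amount of information
• Runs outside GPU environments as well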

Slide 10

Slide 10 text

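Sparse modeling in linear regression
• Assumptions
  • The output y is expressed as a linear combination of the input x plus observation noise ε
  • The input x is m-dimensional, and n values of y are observed
$y = w_1 x_1 + \cdots + w_m x_m + \varepsilon$
⇒ We want to find the w that explains y well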
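The setup on this slide can be sketched with synthetic data. This is a minimal illustration, not part of the talk: the sizes, the noise level, and the names `w_true` and `support` are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: n observations of y, m-dimensional input x
n_samples, n_features, n_nonzero = 50, 200, 5

# Each row of X is one observed input x
X = rng.standard_normal((n_samples, n_features))

# A sparse weight vector: only a few of the m weights are non-zero
w_true = np.zeros(n_features)
support = rng.choice(n_features, size=n_nonzero, replace=False)
w_true[support] = rng.standard_normal(n_nonzero)

# y = w_1 x_1 + ... + w_m x_m + noise
noise = 0.01 * rng.standard_normal(n_samples)
y = X @ w_true + noise
```

Here there are fewer observations than features, which is the situation the following slides address.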

Slide 11

Slide 11 text

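Sparse modeling in linear regression
• Problem to solve
  • Minimize the squared error between the observed y and the values computed from the estimated w
$\min_w \frac{1}{2} \|y - Xw\|^2$
⇒ What if the number of samples of y is smaller than the dimension of x?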
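A small numerical sketch of why that question matters: with fewer samples than dimensions, least squares alone does not pin down w. The example below uses `np.linalg.lstsq`, which returns the minimum-norm solution; the sizes and the 3-sparse `w_true` are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features = 20, 100  # fewer equations than unknowns

X = rng.standard_normal((n_samples, n_features))
w_true = np.zeros(n_features)
w_true[:3] = [1.0, -2.0, 0.5]    # the generating weights are sparse
y = X @ w_true

# Minimum-norm least-squares solution
w_ls, *_ = np.linalg.lstsq(X, y, rcond=None)

residual = np.linalg.norm(y - X @ w_ls)
n_nonzero = np.count_nonzero(np.abs(w_ls) > 1e-8)
print(f"residual = {residual:.2e}, non-zero weights = {n_nonzero}")
```

The fit is essentially exact, yet the recovered weights are dense rather than the sparse vector that generated the data, so an extra criterion is needed to choose among the many exact solutions.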

Slide 12

Slide 12 text

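Adding a sparsity constraint
• A system of simultaneous equations with fewer equations than unknowns
• Solve it by adding a sparsity constraint on x
• "Satisfy the conditions with as few x as possible" ⇒ "Set as many w as possible to 0"
• Done naively, this becomes a combinatorial optimization problem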

Slide 13

Slide 13 text

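L1-norm optimization
• Relax the constraint
• "Make the sum of the absolute values of w as small as possible"
• A global optimum is still obtained after the relaxation
• Can be solved numerically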

Slide 14

Slide 14 text

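Lasso
• Short for Least Absolute Shrinkage and Selection Operator
• An objective function with the L1 norm added as a regularization term
$\min_w \frac{1}{2} \|y - Xw\|^2 + \lambda \|w\|_1$
⇒ The regularization parameter λ adjusts how strongly the sparsity constraint acts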
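As a worked illustration of the objective, the sketch below evaluates it for two candidate weight vectors. The function name `lasso_objective` and the tiny data set are assumptions made for this example.

```python
import numpy as np

def lasso_objective(X, y, w, lam):
    """0.5 * ||y - Xw||^2 + lam * ||w||_1"""
    residual = y - X @ w
    return 0.5 * residual @ residual + lam * np.sum(np.abs(w))

# Tiny hypothetical problem: 3 observations, 2 features
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
y = np.array([1.0, 0.0, 1.0])

w_fit = np.array([1.0, 0.0])   # fits y exactly using one feature
w_zero = np.zeros(2)           # uses no features at all

obj_fit = lasso_objective(X, y, w_fit, lam=0.1)    # 0 error + 0.1 * 1
obj_zero = lasso_objective(X, y, w_zero, lam=0.1)  # 0.5 * 2 + 0
```

Increasing `lam` raises the price of every non-zero weight, which is how the parameter trades data fit against sparsity.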

Slide 15

Slide 15 text

Various algorithms
• Coordinate descent
• Least angle regression
• Iterative shrinkage-thresholding algorithm (ISTA)
• Alternating direction method of multipliers (ADMM)
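The talk goes on to show coordinate descent. For comparison, here is a minimal sketch of ISTA for the objective 0.5 * ||y - Xw||^2 + lam * ||w||_1. It is a textbook formulation, not code from the talk; the names `ista` and `soft_threshold` and the iteration count are assumptions.

```python
import numpy as np

def soft_threshold(v, thresh):
    # Shrink each entry toward zero by thresh; entries within [-thresh, thresh] become 0
    return np.sign(v) * np.maximum(np.abs(v) - thresh, 0.0)

def ista(X, y, lam, n_iter=500):
    # Step size 1/L, where L is the largest eigenvalue of X^T X
    L = np.linalg.norm(X, ord=2) ** 2
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y)                    # gradient of the squared-error term
        w = soft_threshold(w - grad / L, lam / L)   # gradient step, then shrinkage
    return w

# With X = identity the minimizer is simply soft_threshold(y, lam)
w_hat = ista(np.eye(3), np.array([3.0, -0.5, 1.0]), lam=1.0)
```

Each iteration is one gradient step on the squared error followed by soft thresholding, so the cost per iteration is dominated by two matrix-vector products.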

Slide 16

Slide 16 text

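Example: the coordinate descent algorithm
1. Initialize $w_j$ ($j = 1, \dots, m$)
2. Update $w_j = S\left(\frac{x_j^\top r^{(j)}}{n}, \lambda\right)$, where $r^{(j)} = y - \sum_{k \neq j} x_k w_k$ and $S$ is the soft-thresholding operator
3. Repeat until the convergence condition is met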

Slide 17

Slide 17 text

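Soft-thresholding operator
• Has the effect of pulling values toward zero
$S(x, \lambda) = \begin{cases} x - \lambda, & (x \geq \lambda) \\ 0, & (-\lambda < x < \lambda) \\ x + \lambda, & (x \leq -\lambda) \end{cases}$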

Slide 18

Slide 18 text

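Example: a coordinate descent implementation

    import numpy as np

    # Soft-thresholding operator
    def soft_threshold(X, thresh):
        return np.where(np.abs(X) <= thresh, 0, X - thresh * np.sign(X))

    # Coordinate descent; X (n_samples x n_features), y, alpha and n_iter are defined beforehand
    n_samples, n_features = X.shape
    w_cd = np.zeros(n_features)
    for _ in range(n_iter):
        for j in range(n_features):
            # Residual with the contribution of the j-th feature removed
            w_cd[j] = 0.0
            r_j = y - np.dot(X, w_cd)
            w_cd[j] = soft_threshold(np.dot(X[:, j], r_j) / n_samples, alpha)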
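To try the loop end to end, the sketch below wraps it in a function and runs it on synthetic data. The wrapper name `lasso_cd`, the data sizes, the noise level, and the choice `alpha=0.1` are assumptions for illustration. The columns of X are standardized because the update divides by `n_samples` only, which presumes each column satisfies x_j^T x_j / n_samples = 1.

```python
import numpy as np

def soft_threshold(x, thresh):
    return np.where(np.abs(x) <= thresh, 0.0, x - thresh * np.sign(x))

def lasso_cd(X, y, alpha, n_iter=100):
    """Coordinate descent for Lasso, following the loop shown on the slide."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    for _ in range(n_iter):
        for j in range(n_features):
            w[j] = 0.0
            r_j = y - X @ w  # residual without feature j
            w[j] = soft_threshold(X[:, j] @ r_j / n_samples, alpha)
    return w

rng = np.random.default_rng(0)
n_samples, n_features = 100, 30
X = rng.standard_normal((n_samples, n_features))
X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize columns

w_true = np.zeros(n_features)
w_true[[2, 11, 25]] = [3.0, -2.0, 1.5]    # hypothetical sparse ground truth
y = X @ w_true + 0.01 * rng.standard_normal(n_samples)

w_hat = lasso_cd(X, y, alpha=0.1)
support = np.flatnonzero(w_hat)
print("non-zero indices:", support)
```

The estimated weights are shrunk toward zero by roughly `alpha`, which is the expected bias of the L1 penalty rather than a convergence problem.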

Slide 19

Slide 19 text

Results (input feature dimension: …; non-zero elements: …; number of samples: …)

Slide 20

Slide 20 text

Other implementations
• scikit-learn
  • Coordinate descent and least angle regression
  • http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Lasso.html
  • http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LassoLars.html
• spm-image
  • Alternating direction method of multipliers (ADMM)
  • https://github.com/hacarus/spm-image/blob/development/spmimage/linear_model/admm.py
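A minimal usage sketch of the two scikit-learn estimators linked above. The synthetic data and `alpha=0.1` are assumptions for illustration. Note that scikit-learn scales the squared-error term by 1 / (2 * n_samples), so its `alpha` is not numerically the same as a λ applied to an unscaled squared error.

```python
import numpy as np
from sklearn.linear_model import Lasso, LassoLars

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 30))
w_true = np.zeros(30)
w_true[[2, 11, 25]] = [3.0, -2.0, 1.5]   # hypothetical sparse ground truth
y = X @ w_true + 0.01 * rng.standard_normal(100)

cd_model = Lasso(alpha=0.1).fit(X, y)        # coordinate descent
lars_model = LassoLars(alpha=0.1).fit(X, y)  # least angle regression

print("Lasso     non-zero:", np.flatnonzero(cd_model.coef_))
print("LassoLars non-zero:", np.flatnonzero(lars_model.coef_))
```

Both estimators expose the fitted weights as `coef_`, so the selected features can be read off in the same way for either solver.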

Slide 21

Slide 21 text

spm-image
• A library for sparse modeling
• Focused on algorithms used in image analysis
• Follows the scikit-learn interface
• https://github.com/hacarus/spm-image

Slide 22

Slide 22 text

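Application to image processing
• Basic idea
  • Cut patches out of the image
  • Represent each patch as a linear combination of dictionary bases of the same size
  • Learn the dictionary as well, so that it can represent the whole image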
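The three steps above can be sketched using scikit-learn alone. This is an illustration under assumptions: it uses scikit-learn's overlapping-patch helpers `extract_patches_2d` and `reconstruct_from_patches_2d` rather than spm-image's helpers, a random array stands in for a real image, and the sizes and `alpha` are arbitrary.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning
from sklearn.feature_extraction.image import (
    extract_patches_2d,
    reconstruct_from_patches_2d,
)

rng = np.random.default_rng(0)
img = rng.random((16, 16))   # stand-in for a grayscale image
patch_size = (4, 4)

# 1. Cut (overlapping) patches out of the image and flatten each one
patches = extract_patches_2d(img, patch_size)   # (169, 4, 4)
Y = patches.reshape(len(patches), -1)           # (169, 16)

# 2-3. Learn a dictionary and the sparse coefficients of every patch
model = MiniBatchDictionaryLearning(n_components=8, alpha=0.1, random_state=0)
coef = model.fit_transform(Y)                   # (169, 8) coefficients
dictionary = model.components_                  # (8, 16) bases, each the size of a patch

# Each patch is approximated by a linear combination of the bases
Y_hat = coef @ dictionary
reconstructed = reconstruct_from_patches_2d(
    Y_hat.reshape(-1, *patch_size), img.shape
)
```

A random image has no structure to exploit, so the reconstruction here is only a shape check; on natural images the learned bases typically resemble edges and textures.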

Slide 23

Slide 23 text

Application to image processing
Y: image, A: dictionary, X: coefficients

Slide 24

Slide 24 text

Reconstruction with a dictionary (… x … patches; reconstruction result with … bases)

Slide 25

Slide 25 text

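Example: dictionary learning and reconstruction

    import numpy as np
    from sklearn.decomposition import MiniBatchDictionaryLearning
    # Patch helpers, assumed here to come from spm-image
    from spmimage.feature_extraction.image import (
        extract_simple_patches_2d, reconstruct_from_simple_patches_2d)

    # Cut the image into patches
    patches = extract_simple_patches_2d(img, patch_size)

    # Flatten each patch and normalize
    patches = patches.reshape(patches.shape[0], -1).astype(np.float64)
    intercept = np.mean(patches, axis=0)
    patches -= intercept
    patches /= np.std(patches, axis=0)

    # Learn the dictionary
    model = MiniBatchDictionaryLearning(n_components=n_basis, alpha=1, n_iter=n_iter, n_jobs=1)
    model.fit(patches)

    # Compute the sparse codes and reconstruct the image from them
    code = model.transform(patches)
    reconstructed_patches = np.dot(code, model.components_)
    reconstructed_patches = reconstructed_patches.reshape(len(patches), *patch_size)
    reconstructed = reconstruct_from_simple_patches_2d(reconstructed_patches, img.shape)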

Slide 26

Slide 26 text

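Application to missing-data completion
Information is restored by dictionary learning that takes missing values into account.
Processing assumes that a degradation operator M acts on the image Y.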
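One way to read the degradation operator is as a 0/1 mask over pixels. The sketch below is an assumption-laden illustration, not the talk's implementation: it keeps the dictionary fixed, codes a single patch by ISTA using only the observed pixels, and fills the missing pixels from the reconstruction. The names `masked_sparse_code`, `D`, and `mask`, and all sizes, are hypothetical.

```python
import numpy as np

def soft_threshold(v, thresh):
    return np.sign(v) * np.maximum(np.abs(v) - thresh, 0.0)

def masked_sparse_code(y, mask, D, lam, n_iter=2000):
    """ISTA on 0.5 * ||mask * (y - D @ x)||^2 + lam * ||x||_1."""
    D_obs = D * mask[:, None]   # zero out the rows of missing pixels
    y_obs = y * mask
    L = np.linalg.norm(D_obs, ord=2) ** 2
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D_obs.T @ (D_obs @ x - y_obs)
        x = soft_threshold(x - grad / L, lam / L)
    return x

rng = np.random.default_rng(0)
D = rng.standard_normal((20, 5))            # hypothetical dictionary: 20 pixels, 5 bases
x_true = np.array([1.5, 0.0, -2.0, 0.0, 0.0])
y_true = D @ x_true                         # an uncorrupted patch

mask = np.ones(20)
mask[[3, 7, 8, 12, 17]] = 0.0               # 5 pixels are missing

x_hat = masked_sparse_code(y_true, mask, D, lam=1e-3)
y_filled = np.where(mask == 1.0, y_true, D @ x_hat)
```

Because the coefficients are estimated from observed pixels only, the missing pixels are predicted by the dictionary rather than copied from the corrupted input.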

Slide 27

Slide 27 text

Summary

Slide 28

Slide 28 text

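What is sparse modeling?
• Shows which input features are important
• Can get started even with a small amount of information
• Easy to try using existing implementations
  • scikit-learn and spm-image
• Today's material → https://git.io/vpxQ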