Slide 1

Slide 1 text

Individually Fair Gradient Boosting
2021/04/13 @ reading group, 楊明哲

Slide 2

Slide 2 text

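Paper information: a study focusing on individual fairness + GBDT
• Authors: Alexander Vargo, Fan Zhang, Mikhail Yurochkin, Yuekai Sun
• Affiliations: University of Michigan, ShanghaiTech University, MIT-IBM Watson AI Lab
• Source: ICLR 2021
• Why I chose this paper: surprisingly few works consider individual fairness together with decision trees. This appears to be the first.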

Slide 3

Slide 3 text

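Introduction: fairness has to be taken into account
• Machine learning (ML) is becoming widely used in decision making.
• A model must not evaluate a particular group (of people) unfairly.
• Amazon's résumé screening system was found to have discriminated.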

Slide 4

Slide 4 text

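Introduction: this talk focuses on individual fairness
• The ML community broadly considers two kinds of fairness.
• Individual fairness: similar individuals should receive the same evaluation.
• Group fairness: evaluations should not discriminate between groups.
• Group fairness is the one that is studied most often,
• because properly defining the similarity between individuals has been difficult.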

Slide 5

Slide 5 text

Introduction: the target is gradient boosted decision trees (GBDT)
• GBDT is becoming the mainstream choice for tabular data.
• Existing fairness-aware ML methods have not worked well for non-smooth models or non-parametric ML.

Slide 6

Slide 6 text

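Introduction: contributions
• Proposes a GBDT-based method that targets individual fairness.
• The fairness of each trained model can be certified.
• Shows experimentally that the method improves group fairness as well as individual fairness while maintaining accuracy.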

Slide 7

Slide 7 text

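Preliminaries: defining the notation
• Input: $\mathcal{X} \subseteq \mathbb{R}^d$, output: $\mathcal{Y} = \{0, 1\}$
• Sample space: $\mathcal{Z} = \mathcal{X} \times \{0, 1\}$
• Protected attributes: the so-called sensitive attributes
• Per-sample fair metric: $d_x$, which is smaller the more similar two samples are
• Goal: obtain a model $f : \mathcal{X} \to \{0, 1\}$ that is fair for each sample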

Slide 8

Slide 8 text

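What about existing methods? They did not work well for non-smooth models.
• Methods that achieve individual fairness through adversarial learning already exist.
• They assume that the learned model is smooth with respect to its input.
• The aim is to make adversarial learning possible for non-smooth models (such as decision trees) as well.
• To do so, the paper defines a restricted adversarial cost function.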

Slide 9

Slide 9 text

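Preliminaries: we want to learn a model that is fair for each sample
• Transport cost function: smaller the closer two individual samples are
$$c((x_1, y_1), (x_2, y_2)) \triangleq d_x^2(x_1, x_2) + \infty \cdot \mathbf{1}\{y_1 \neq y_2\}$$
• Optimal transport distance $W$ between probability distributions on $\mathcal{Z}$: measures how close two distributions are
$$W(P_1, P_2) \triangleq \inf_{\Pi \in C(P_1, P_2)} \int_{\mathcal{Z} \times \mathcal{Z}} c(z_1, z_2)\, d\Pi(z_1, z_2)$$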
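As a rough illustration of the transport cost $c$, the sketch below evaluates it for a pair of labelled samples. The squared Euclidean distance standing in for $d_x^2$ is only a placeholder assumption, and the function names are hypothetical rather than taken from the paper.

```python
import math


def placeholder_dist_sq(x1, x2):
    """Stand-in for d_x^2: plain squared Euclidean distance (an assumption)."""
    return sum((a - b) ** 2 for a, b in zip(x1, x2))


def transport_cost(z1, z2, dist_sq=placeholder_dist_sq):
    """c((x1, y1), (x2, y2)) = d_x^2(x1, x2) + inf * 1{y1 != y2}."""
    (x1, y1), (x2, y2) = z1, z2
    if y1 != y2:
        # Moving mass between samples with different labels is forbidden.
        return math.inf
    return dist_sq(x1, x2)
```

Because the cost is infinite whenever the labels differ, any transport plan with finite cost only moves mass between samples that share a label.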

Slide 10

Slide 10 text

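Preliminaries: defining the adversarial risk function
$$L_r(f) \triangleq \sup_{P : W(P, P^\star) \le \epsilon} \mathbb{E}_P[\ell(f(X), Y)]$$
• $P^\star$ is the data-generating distribution and $\epsilon > 0$ is a small tolerance parameter.
• Over the sample space, we look for a distribution that 1) is close to the data-generating distribution and 2) makes the loss of the ML model large.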

Slide 11

Slide 11 text

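Preliminaries: we want robustness and fairness over distributions
• The adversarial risk can find performance gaps of the model on similar samples.
• Searching for such gaps lets us view individual fairness as robustness with respect to the distribution.
• As it stands, however, this only works through the gradients of a smooth model.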

Slide 12

Slide 12 text

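Proposed method: adding a restriction to handle non-smooth models
• Augment the dataset:
$$\mathcal{D}_0 \triangleq \{(x_i, y_i), (x_i, 1 - y_i)\}_{i=1}^n$$
• Add a restriction to the optimal transport distance:
$$W_{\mathcal{D}}(P_1, P_2) \triangleq \inf_{\Pi \in C_0(P_1, P_2)} \int_{\mathcal{Z} \times \mathcal{Z}} c(z_1, z_2)\, d\Pi(z_1, z_2)$$
  The only difference from $W$ is that the couplings are restricted to the augmented dataset above.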
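The augmentation step can be sketched in a few lines. This is a minimal illustration that assumes binary labels in {0, 1}; the function name `augment_dataset` is hypothetical.

```python
def augment_dataset(xs, ys):
    """Build D_0 = {(x_i, y_i), (x_i, 1 - y_i)} for i = 1..n.

    Every feature vector appears twice: once with its observed label
    and once with the flipped label.
    """
    d0 = []
    for x, y in zip(xs, ys):
        d0.append((x, y))
        d0.append((x, 1 - y))
    return d0
```

The augmented set has exactly 2n points, which is what keeps the search over distributions finite-dimensional.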

Slide 13

Slide 13 text

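Proposed method: finally applicable to non-smooth models
• With the augmented dataset, the supremum is restricted to distributions supported on $\mathcal{D}_0$.
• This makes the problem solvable as a finite-dimensional linear program.
• The loss depends only on $\ell(f(x_i), y_i)$ and $\ell(f(x_i), 1 - y_i)$, so the method can also be applied to non-smooth models.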

Slide 14

Slide 14 text

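Proposed method: making it usable with gradient boosted trees
Gradient boosting requires $\partial L / \partial \hat{y}$. By Danskin's theorem, the gradient is
$$\frac{\partial L}{\partial \hat{y}_i} = \frac{\partial}{\partial f(x_i)} \left[ \sup_{P : W_{\mathcal{D}}(P, P_n) \le \epsilon} \mathbb{E}_P[\ell(f(X), Y)] \right] = \sum_{y \in \mathcal{Y}} \frac{\partial}{\partial f(x_i)} \left[ \ell(f(x_i), y) \right] P^\star(x_i, y)$$
where $P^\star$ here denotes the distribution that attains the supremum.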
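To make the gradient formula concrete, here is a small sketch under two assumptions: the loss is the logistic loss, and $f(x_i)$ is a raw score (logit), so that $\partial \ell / \partial f = \sigma(f) - y$. The worst-case distribution is taken as given, in the form of pairs $(P^\star(x_i, 0), P^\star(x_i, 1))$. All names are hypothetical.

```python
import math


def sigmoid(s):
    """Numerically stable logistic function."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)


def logistic_loss_grad(score, label):
    """d/ds [log(1 + exp(s)) - label * s] = sigmoid(s) - label."""
    return sigmoid(score) - label


def adversarial_gradient(scores, p_star):
    """Per-sample gradient: sum over y of dl/df(x_i) * P*(x_i, y).

    scores[i] is the raw score f(x_i); p_star[i] is the pair
    (P*(x_i, 0), P*(x_i, 1)), assumed to be computed elsewhere.
    """
    return [
        sum(logistic_loss_grad(s, y) * p_y for y, p_y in enumerate(p_i))
        for s, p_i in zip(scores, p_star)
    ]
```

Only the loss is differentiated with respect to the score; the model itself is never differentiated, which is why the formula remains usable for trees.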

Slide 15

Slide 15 text

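Proposed method: the functional gradient
• The gradient above never differentiates the model itself, so the functional gradient can be evaluated even for non-smooth models.
• All that remains is to find $P^\star$.
• The paper proposes a way to find $P^\star$ by linear programming.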

Slide 16

Slide 16 text

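Proposed method: finding $P^\star$ by linear programming
• For any distribution $P$ on $\mathcal{D}_0$, let $P_{i,k} = P(\{(x_i, k)\})$ for $k \in \{0, 1\}$. If $W_{\mathcal{D}}(P, P_n) \le \epsilon$, then $P$ can be represented by a matrix $\Pi$ such that:
1. $\Pi \in \Gamma$ with $\Gamma = \{\Pi \mid \Pi \in \mathbb{R}_+^{n \times n},\ \langle C, \Pi \rangle \le \epsilon,\ \Pi^T \cdot 1_n = \frac{1}{n} 1_n\}$
2. $\Pi \cdot y_1 = (P_{1,1}, \ldots, P_{n,1})$ and $\Pi \cdot y_0 = (P_{1,0}, \ldots, P_{n,0})$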

Slide 17

Slide 17 text

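Proposed method: one more definition
• Matrix $R_{i,j} = \ell(f(x_i), y_j)$: the loss incurred when sample $j$, with label $y_j$, is moved onto sample $i$.
• The matrix we are looking for is then
$$\Pi^\star \in \arg\max_{\Pi \in \Gamma} \langle R, \Pi \rangle$$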
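The pieces of this linear program can be sketched without a solver. The code below is a hedged illustration only: it builds $C$ and $R$, checks membership in $\Gamma$, evaluates the objective $\langle R, \Pi \rangle$, and reads $P$ off a given $\Pi$. It assumes $C_{i,j} = d_x^2(x_i, x_j)$, the logistic loss with raw scores, and that $y_1$ is the indicator vector of label 1; the names are hypothetical, and the maximization itself would be delegated to an LP solver of your choice.

```python
import math


def logistic_loss(score, label):
    """Stable log(1 + exp(score)) - label * score."""
    return max(score, 0.0) + math.log1p(math.exp(-abs(score))) - label * score


def build_matrices(xs, ys, scores, dist_sq):
    """C[i][j] = d_x^2(x_i, x_j) (assumed); R[i][j] = loss(f(x_i), y_j)."""
    n = len(xs)
    C = [[dist_sq(xs[i], xs[j]) for j in range(n)] for i in range(n)]
    R = [[logistic_loss(scores[i], ys[j]) for j in range(n)] for i in range(n)]
    return C, R


def inner(A, B):
    """Frobenius inner product <A, B>."""
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))


def is_feasible(Pi, C, eps, tol=1e-9):
    """Check Pi >= 0, <C, Pi> <= eps, and every column sum equal to 1/n."""
    n = len(Pi)
    nonneg = all(v >= -tol for row in Pi for v in row)
    cols_ok = all(
        abs(sum(Pi[i][j] for i in range(n)) - 1.0 / n) <= tol for j in range(n)
    )
    return nonneg and cols_ok and inner(C, Pi) <= eps + tol


def induced_distribution(Pi, ys):
    """Return (P_{i,0}, P_{i,1}) per row, i.e. (Pi @ y_0, Pi @ y_1)."""
    return [
        (
            sum(p for p, y in zip(row, ys) if y == 0),
            sum(p for p, y in zip(row, ys) if y == 1),
        )
        for row in Pi
    ]
```

The diagonal matrix with entries $1/n$ is always feasible because it moves no mass, and its objective value equals the ordinary empirical risk, so the maximum over $\Gamma$ can never fall below the empirical risk.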

Slide 18

Slide 18 text

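Proposed method, summary: now applicable to non-smooth functions
• In the end, it suffices to find $\Pi^\star$.
• Finding it places no assumptions on the function $f$, so the method can also be applied to non-smooth functions.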

Slide 19

Slide 19 text

Experiments
• Evaluated on three datasets (German Credit, Adult, COMPAS).
• The proposed method uses XGBoost as its decision tree algorithm.
• The loss function is the logistic loss.

Slide 20

Slide 20 text

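Experiments: the fair metric (for individual samples)
• Uses the metric of Yurochkin et al.:
$$d_x^2(x_1, x_2) = \langle x_1 - x_2, Q(x_1 - x_2) \rangle$$
• $Q$ is the projection matrix onto the orthogonal complement of the sensitive subspace.
• It is built on the idea that individuals whose information is identical apart from the protected sensitive attributes should be treated equally.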
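For intuition, the metric can be sketched for the simplest case, under the assumption that the sensitive subspace is spanned by a single direction `a`, so that $Q = I - a a^T / \|a\|^2$. How the sensitive subspace is actually estimated is not covered here, and the function name is hypothetical.

```python
def fair_distance_sq(x1, x2, a):
    """d_x^2(x1, x2) = <diff, Q diff> with Q = I - a a^T / ||a||^2.

    `a` spans the (assumed one-dimensional) sensitive subspace, so
    differences along `a` do not contribute to the distance.
    """
    diff = [p - q for p, q in zip(x1, x2)]
    norm_sq = sum(v * v for v in a)
    along_a = sum(d * v for d, v in zip(diff, a))
    # ||diff||^2 minus the squared length of its component along a
    return sum(d * d for d in diff) - along_a ** 2 / norm_sq
```

Two individuals that differ only along the sensitive direction are at distance zero, so an individually fair model has to treat them alike.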

Slide 21

Slide 21 text

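Experiments: baselines
• Among tree-based methods there is no competing fair method, so vanilla GBDT is used.
• The proposed method is also compared with methods based on data preprocessing:
• removing the protected attribute and projecting onto a subspace (Yurochkin et al., 2020);
• applying different weights to individuals to balance the data (Kamiran & Calders, 2011).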

Slide 22

Slide 22 text

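Experiments: evaluation (processed so that no method is favored over the existing ones)
• Counterfactual individuals are created by changing an attribute that is correlated with the protected attribute (e.g. husband or wife).
• → They are almost the same person, so they should receive the same evaluation.
• Gap in TPR and TNR between protected groups (GAPMax) → fairness metric of the model
• Gap in RMSE between protected groups (GAPRMSE) → predictive performance of the model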
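The two kinds of evaluation can be sketched as follows. This is an illustrative reading of the slide, not the paper's evaluation code: consistency is taken to be the fraction of samples whose prediction is unchanged by the counterfactual edit, and the gap is taken to be the larger of the absolute TPR and TNR differences between two groups coded 0 and 1. All names are hypothetical.

```python
def prediction_consistency(predict, xs, make_counterfactual):
    """Fraction of samples whose prediction survives the counterfactual edit."""
    same = sum(1 for x in xs if predict(x) == predict(make_counterfactual(x)))
    return same / len(xs)


def class_rate(preds, labels, target):
    """TPR when target == 1, TNR when target == 0."""
    hits = [p == target for p, y in zip(preds, labels) if y == target]
    return sum(hits) / len(hits)


def gap_max(preds, labels, groups):
    """Largest absolute TPR / TNR difference between groups 0 and 1."""

    def group_rate(group, target):
        idx = [i for i, g in enumerate(groups) if g == group]
        return class_rate(
            [preds[i] for i in idx], [labels[i] for i in idx], target
        )

    tpr_gap = abs(group_rate(0, 1) - group_rate(1, 1))
    tnr_gap = abs(group_rate(0, 0) - group_rate(1, 0))
    return max(tpr_gap, tnr_gap)
```

A consistency of 1.0 means every counterfactual pair received the same prediction. Note that `class_rate` assumes each group contains at least one sample of each class.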

Slide 23

Slide 23 text

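Experimental results ① German Credit
• Age is set as the sensitive attribute → in the US, basing credit decisions on age is unlawful.
• Preprocessing by projection did not improve individual fairness as much as the proposed method.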

Slide 24

Slide 24 text

Experimental results ② Adult
• The proposed method inherits the strong performance of GBDT while also yielding a fair model.

Slide 25

Slide 25 text

Experimental results ③ COMPAS
• The NN models were generally more accurate.
• In terms of fairness, however, the proposed method was better.

Slide 26

Slide 26 text

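Summary: a method for individual fairness + decision trees
• The obstacle to achieving individual fairness was that performance gaps of the ML model could not be searched for → this was overcome by restricting the search space to a finite one.
• The restricted adversarial loss defined here may also be applicable to other non-smooth methods (such as random forests).
• My impression is that tree-based models are more likely than NN models to achieve both accuracy and fairness.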

Slide 27

Slide 27 text

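Thoughts: I want the mathematical ability to follow the theory properly
• It is frustrating that I could not fully understand the theory the authors present.
• I am glad I got to read a paper that deals with individual fairness.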