
XGBoost: A Scalable Tree Boosting System_free

LiberalArts
October 20, 2019

This is the public version of a slide deck explaining the XGBoost paper.
A more detailed version is available for purchase at the link below.
https://note.mu/lib_arts/n/nefb511ba4fde



Transcript

  1. Copyright @ Liberal Arts Community. All Rights Reserved.

    XGBoost: A Scalable Tree Boosting System
    SOK@LiberalArtsCommunity
  2. Table of contents

    • Self-introduction
    • Paper overview
    • Paper details
      • Review of decision trees (CART)
      • Tree Boosting in a Nutshell
      • Split Finding Algorithms
      • System Design
    • References
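  3. Self-introduction

    twitter: @sokei14
    Completed the master's program at the Graduate School of Mathematical Sciences, The University of Tokyo, specializing in complex geometry. Afterwards worked as a quant on market risk management at a megabank. Currently develops AI loan-screening models at a startup. A machine learning engineer who dreams of transforming financial services with AI.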
  4. ABSTRACT

    • The paper introduces XGBoost, a scalable, end-to-end tree boosting algorithm.
    • The proposed techniques are:
      1. Sparsity-aware algorithm, weighted quantile sketch → explained in Section 3
      2. Cache-aware access, data compression and sharding → explained in Section 4
  5. Paper overview

    The paper consists of six sections.
    1. INTRODUCTION
    2. TREE BOOSTING IN A NUTSHELL
    3. SPLIT FINDING ALGORITHMS
    4. SYSTEM DESIGN
    5. RELATED WORKS
    6. END TO END EVALUATIONS
    Annotation on the slide: "This is the main part."
  6. Paper overview: 2. TREE BOOSTING IN A NUTSHELL

    Summarizes the core algorithm of XGBoost.
    • Explanation of the tree boosting algorithm
    • Approximation of the loss function by Taylor expansion
    • Shrinkage
    • Column Subsampling
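For reference, the Taylor-expansion step listed on this slide can be written as follows. The notation is taken from the XGBoost paper rather than from the slide itself: $f_t$ is the tree added at round $t$, $g_i$ and $h_i$ are the first and second derivatives of the loss $l$ with respect to the previous prediction $\hat{y}_i^{(t-1)}$, $T$ is the number of leaves, and $w$ is the vector of leaf weights.

```latex
\mathcal{L}^{(t)} \simeq \sum_{i=1}^{n} \Big[ l\big(y_i, \hat{y}_i^{(t-1)}\big)
  + g_i f_t(x_i) + \tfrac{1}{2} h_i f_t^2(x_i) \Big] + \Omega(f_t),
\qquad
\Omega(f) = \gamma T + \tfrac{1}{2} \lambda \lVert w \rVert^2

g_i = \partial_{\hat{y}^{(t-1)}} \, l\big(y_i, \hat{y}^{(t-1)}\big),
\qquad
h_i = \partial^2_{\hat{y}^{(t-1)}} \, l\big(y_i, \hat{y}^{(t-1)}\big)
```

Shrinkage then scales each newly added tree by a factor $\eta$ before it joins the ensemble, and column subsampling restricts each tree (or split) to a random subset of the features.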
  7. Paper overview: 3. SPLIT FINDING ALGORITHMS

    Describes the techniques XGBoost uses to search for split points.
    • Explanation of the basic Exact Greedy Algorithm
    • Searching over a reduced set of candidate split points (Approximate Algorithm)
    • Use of weighted quantiles
    • Faster computation of weighted quantiles (Weighted Quantile Sketch)
    • Use of a default direction for missing data (Sparsity-aware Split Finding)
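The ideas on this slide can be illustrated with a minimal sketch of exact greedy split finding for a single feature, including a default direction for missing values. This is a simplified illustration, not the library's implementation: the names `leaf_score` and `best_split` are hypothetical, the gain formula follows the paper's structure score $G^2/(H+\lambda)$, and missing values are represented as `None` by assumption. The approximate algorithm would evaluate the same gain only at a reduced set of candidate thresholds.

```python
from typing import List, Optional, Tuple


def leaf_score(g: float, h: float, lam: float) -> float:
    """Structure score G^2 / (H + lambda) for a leaf with gradient sums g and h."""
    return g * g / (h + lam)


def best_split(
    values: List[Optional[float]],
    grad: List[float],
    hess: List[float],
    lam: float = 1.0,
    gamma: float = 0.0,
) -> Tuple[float, Optional[float], Optional[str]]:
    """Return (gain, threshold, default_direction) for one feature.

    None marks a missing value. Rows with value < threshold go left;
    rows with a missing value follow default_direction.
    """
    g_total, h_total = sum(grad), sum(hess)
    parent = leaf_score(g_total, h_total, lam)

    # Only non-missing entries are sorted and scanned.
    present = sorted((v, i) for i, v in enumerate(values) if v is not None)
    g_miss = g_total - sum(grad[i] for _, i in present)
    h_miss = h_total - sum(hess[i] for _, i in present)

    best: Tuple[float, Optional[float], Optional[str]] = (0.0, None, None)
    g_left = h_left = 0.0
    for k in range(len(present) - 1):
        v, i = present[k]
        g_left += grad[i]
        h_left += hess[i]
        nxt = present[k + 1][0]
        if nxt == v:
            continue  # no threshold can separate equal values
        threshold = (v + nxt) / 2
        # Try sending the missing rows to the right, then to the left.
        for direction, gl, hl in (
            ("right", g_left, h_left),
            ("left", g_left + g_miss, h_left + h_miss),
        ):
            gr, hr = g_total - gl, h_total - hl
            gain = 0.5 * (
                leaf_score(gl, hl, lam) + leaf_score(gr, hr, lam) - parent
            ) - gamma
            if gain > best[0]:
                best = (gain, threshold, direction)
    return best
```

For example, `best_split([1.0, 2.0, None, 10.0, 11.0], [-1, -1, 1, 1, 1], [1, 1, 1, 1, 1])` picks the threshold 6.0 and sends the missing row to the right, where its gradient has the same sign as its neighbours.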
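  8. Paper overview: 4. SYSTEM DESIGN

    Describes the system-level techniques used in XGBoost.
    • Reducing the computational cost of sorting (Column Block for Parallel Learning)
    • Sparse matrix data compression with CSC
    • Splitting the data into blocks
    • Comparison of computational complexity
    • Prefetching gradient information (Cache-aware Access)
    • Optimizing the block size
    • Improving disk I/O throughput (block compression, block sharding)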
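The column block idea can be sketched in a few lines: each feature column is sorted once, stored together with row indices, and then reused by every split search instead of being re-sorted. This is a pure-Python illustration under stated assumptions, not the actual in-memory layout: `build_column_block` and `left_gradient_sums` are hypothetical names, and missing entries are represented as `None` and simply skipped, in the spirit of a sparse column format.

```python
from typing import List, Optional, Tuple

Row = List[Optional[float]]
Column = List[Tuple[float, int]]


def build_column_block(rows: List[Row]) -> List[Column]:
    """For each feature, store (value, row_index) pairs sorted by value.

    Missing entries (None) are not stored at all.
    """
    n_features = len(rows[0]) if rows else 0
    block: List[Column] = []
    for j in range(n_features):
        column = [(row[j], i) for i, row in enumerate(rows) if row[j] is not None]
        column.sort()  # paid once, reused by every later split search
        block.append(column)
    return block


def left_gradient_sums(column: Column, grad: List[float]) -> List[float]:
    """Running gradient sums in sorted feature order, using one linear scan."""
    sums: List[float] = []
    running = 0.0
    for _, i in column:
        running += grad[i]  # gradients are fetched by row index
        sums.append(running)
    return sums
```

The lookup `grad[i]` follows the sorted feature order rather than the row order, so it touches memory non-contiguously. That access pattern is the one the cache-aware prefetching item on this slide is concerned with.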
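  9. Paper details: before that, a review of decision trees

    ▪ What is a decision tree (CART)?
    An algorithm that solves classification and regression problems by combining comparisons between a feature and a threshold. A decision tree in which every split is binary, as in the figure on the slide, is called CART.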
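The definition above, a chain of feature-versus-threshold comparisons in a strictly binary tree, can be sketched as follows. The `Node` class, the `predict` function, and the example tree `TREE` are all hypothetical names and values chosen for illustration; the convention that "less than the threshold" goes left is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    feature: Optional[int] = None      # index of the feature to compare
    threshold: Optional[float] = None  # comparison threshold
    left: Optional["Node"] = None      # followed when x[feature] < threshold
    right: Optional["Node"] = None     # followed otherwise
    value: Optional[float] = None      # prediction, set only on leaves


def predict(node: Node, x: List[float]) -> float:
    """Walk from the root to a leaf, making one binary comparison per level."""
    while node.value is None:
        node = node.left if x[node.feature] < node.threshold else node.right
    return node.value


# Illustrative tree: split on feature 0 at 5.0, then on feature 1 at 2.0.
TREE = Node(
    feature=0,
    threshold=5.0,
    left=Node(value=1.0),
    right=Node(
        feature=1,
        threshold=2.0,
        left=Node(value=2.0),
        right=Node(value=3.0),
    ),
)
```

With this tree, `predict(TREE, [7.0, 1.0])` goes right at the root and left at the second comparison, ending at the leaf with value 2.0.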
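  10. Paper details: before that, a review of decision trees

    ▪ What is a decision tree (CART)?
    A tree consists of nodes and the links that connect them. Nodes are classified as follows according to their position in the tree.

    Name                Meaning
    Root node           The node at the top of the tree
    Leaf node (leaf)    A node at the bottom of the tree
    Internal node       Any node other than the root node and the leaf nodes

    [Figure: tree diagram labeling the root node, internal nodes, leaf nodes, and links]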
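The three node types can be derived directly from the links. The sketch below assumes a tree given as a list of (parent, child) pairs; the function name `classify_nodes` and the node labels in the usage note are hypothetical.

```python
from typing import Dict, List, Tuple


def classify_nodes(links: List[Tuple[str, str]]) -> Dict[str, str]:
    """Label each node of a tree as root, internal, or leaf from its links."""
    parents = {p for p, _ in links}
    children = {c for _, c in links}
    kinds: Dict[str, str] = {}
    for node in sorted(parents | children):
        if node not in children:
            kinds[node] = "root"      # has no incoming link
        elif node not in parents:
            kinds[node] = "leaf"      # has no outgoing link
        else:
            kinds[node] = "internal"  # has both
    return kinds
```

For the links `[("A", "B"), ("A", "C"), ("C", "D"), ("C", "E")]`, node A is the root, C is internal, and B, D, and E are leaves.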