KDD 2016 Study Group / Images Don’t Lie: Transferring Deep Visual Semantic Features to Large-Scale Multimodal Learning to Rank

tn1031
October 01, 2016

KDD 2016 Study Group, 2016/10/01

Transcript

  1. Images Don’t Lie: Transferring Deep Visual
    Semantic Features to Large-Scale Multimodal
    Learning to Rank
    @tn1031

    2016/10/01, KDD 2016 Study Group

  2. Paper being introduced
    Images Don’t Lie: Transferring Deep Visual Semantic Features to
    Large-Scale Multimodal Learning to Rank
    Corey Lynch, Kamelia Aryafar, Josh Attenberg
    http://www.kdd.org/kdd2016/papers/files/adp0804-lynchA.pdf
    • The authors work at Etsy (a marketplace for handmade goods)
    • Optimizes e-commerce search results with learning to rank
    • Uses multimodal features from the listing details for training

  3. Self-introduction
    • Takuma Nakamura / @tn1031
    • Data scientist
    • VASILY, Inc.
    • Service development using machine learning
    Examples of services using machine learning
    • Similar image search
      • Convolutional Auto-Encoder, Approximate Nearest Neighbor
    • Recommendation
      • Collaborative Filtering, Matrix Factorization
    • Slightly unusual item search
      • Conditional VAE-GAN

  4. Images Don’t Lie: Transferring Deep Visual
    Semantic Features to Large-Scale Multimodal
    Learning to Rank

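  5. Background
    Item descriptions contain noise, which degrades search quality
    • Items whose title or description contains “wedding dress” are returned as hits
    • In practice, many of these items have nothing to do with wedding dresses
    • Causes of the noise
      • Sellers write the text themselves
      • They add terms to gain traffic
    Figure: search results for “wedding dress”
    Looking at the images, it is obvious that these are not wedding dresses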

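  6. Proposed method
    Improve search accuracy by using both images and text
    Multimodal Listing Embedding
    • Vectorize the image and the text
      • The image is vectorized with a CNN
      • The text attached to the item is vectorized as a BoW
      • The image-derived vector and the text-derived vector are concatenated
    Learning To Rank
    • Formulated as a binary classification problem
    • A RankingSVM is applied per query to obtain the ranking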

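  7. Feature extraction: Multimodal Listing Embedding
    The image and the text are vectorized with different methods and concatenated
    Image
    • VGG19 trained on ImageNet
    • The model is shared across all queries
    • No fine-tuning
    • The 4096-dimensional output of the fc layer just before the final layer is used as the feature vector
    Text
    • BoW over several fields
      • Item ID, shop ID, title, tags
    • unigram, bigram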
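
    The concatenation described on this slide can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, the vocabulary is built from two toy titles, and a 4-dimensional list stands in for the 4096-dimensional VGG19 feature.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return the n-grams of a token list, each joined by a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def build_vocab(texts):
    """Map every unigram and bigram seen in the texts to a column index."""
    vocab = {}
    for text in texts:
        tokens = text.lower().split()
        for term in ngrams(tokens, 1) + ngrams(tokens, 2):
            vocab.setdefault(term, len(vocab))
    return vocab

def bow_vector(text, vocab):
    """Unigram and bigram counts over the vocabulary; unknown terms are ignored."""
    tokens = text.lower().split()
    counts = Counter(ngrams(tokens, 1) + ngrams(tokens, 2))
    vec = [0.0] * len(vocab)
    for term, count in counts.items():
        if term in vocab:
            vec[vocab[term]] = float(count)
    return vec

def multimodal_embedding(image_vec, text, vocab):
    """Concatenate the image feature vector with the text BoW vector."""
    return list(image_vec) + bow_vector(text, vocab)

titles = ["white wedding dress", "wedding ring box"]
vocab = build_vocab(titles)
image_vec = [0.1, 0.0, 0.7, 0.2]  # hypothetical stand-in for the fc-layer output
embedding = multimodal_embedding(image_vec, titles[0], vocab)
```

    In this toy setup the embedding has 4 image dimensions followed by 9 text dimensions. With real data the text part is very sparse, so a sparse representation would likely be preferable to a plain list.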

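  8. Training: Learning To Rank
    A function that scores relevance to a query is learned as a ranking problem
    pairwise preference approach
    For a query q, a relevant item d+, and an irrelevant item d-, learn a function f_q such that
    \[ f_q(d^+) > f_q(d^-) \]
    As a RankingSVM, solve the following minimization:
    \[ \min_{w} \sum_{i=1}^{m} \max\left(1 - y_i \langle x_i, w \rangle,\ 0\right) + \lambda_1 \lVert w \rVert_1 + \lambda_2 \lVert w \rVert_2 \]
    \[ (x_i, y_i) = \begin{cases} (d^+ - d^-,\ +1) & \text{a well-ordered pair} \\ (d^- - d^+,\ -1) & \text{a poorly ordered pair} \end{cases} \]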
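
    To make the objective concrete, the sketch below evaluates it for a given weight vector. It follows the formula as written on the slide (hinge loss plus L1 and L2 norm terms). The function names, the toy vectors, and the values of the regularization weights are assumptions, and no solver is shown because the slide does not name one.

```python
import math

def dot(a, b):
    """Inner product <a, b> of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

def objective(w, examples, lam1, lam2):
    """Hinge loss over (x_i, y_i) pairs plus L1 and L2 norm penalties on w."""
    hinge = sum(max(1.0 - y * dot(x, w), 0.0) for x, y in examples)
    l1 = sum(abs(v) for v in w)
    l2 = math.sqrt(sum(v * v for v in w))
    return hinge + lam1 * l1 + lam2 * l2

# Toy feature vectors for a relevant item d+ and an irrelevant item d-.
d_pos, d_neg = [1.0, 0.0], [0.0, 1.0]
diff = [p - n for p, n in zip(d_pos, d_neg)]
examples = [(diff, +1), ([-v for v in diff], -1)]  # well-ordered, poorly ordered

w_zero = [0.0, 0.0]
w_good = [1.0, -1.0]
loss_zero = objective(w_zero, examples, lam1=0.1, lam2=0.1)
loss_good = objective(w_good, examples, lam1=0.1, lam2=0.1)
```

    Here `w_good` scores d+ above d- with a margin, so its hinge term is zero and only the penalties remain, while `w_zero` pays a hinge loss of 1 for each of the two examples.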

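  9. Data collection
    Whether an item is relevant to a query is inferred from user behavior
    Obtaining pairs
    • From the search results, select one item that received a positive reaction and one that did not
    Feature extraction from items
    • Extract the feature vector of each item
    Obtaining positive/negative examples
    • pos - neg is a positive example; the reverse is a negative example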
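
    The three steps on this slide could look roughly like the sketch below. The session format (a list of `(item_id, clicked)` tuples), the use of clicks as the positive reaction, and all names are assumptions made for illustration. The actual log schema is not given in the slides.

```python
import random

def sample_preference_pair(session, rng):
    """Return (positive_item, negative_item) for one session, or None.

    `session` is a list of (item_id, clicked) tuples (assumed format).
    """
    positives = [item for item, clicked in session if clicked]
    negatives = [item for item, clicked in session if not clicked]
    if not positives or not negatives:
        return None  # no preference can be inferred from this session
    return rng.choice(positives), rng.choice(negatives)

def difference_examples(pair, features):
    """Turn one (pos, neg) pair into a positive and a negative example."""
    pos, neg = pair
    diff = [p - n for p, n in zip(features[pos], features[neg])]
    return [(diff, +1), ([-v for v in diff], -1)]

rng = random.Random(0)
session = [("dress_a", True), ("phone_case", False)]
features = {"dress_a": [1.0, 0.0], "phone_case": [0.0, 1.0]}  # toy vectors

pair = sample_preference_pair(session, rng)
examples = difference_examples(pair, features)
```

    Sessions without both a positive and a negative item yield no pair and would simply be skipped.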

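  10. Evaluation metric
    Search result quality is evaluated with nDCG
    Normalized Discounted Cumulative Gain (nDCG)
    • An evaluation metric for rankings
    • Takes values in [0, 1]; larger is better
    • Weighted so that items at higher ranks count more
    Consider the ranking from position 1 to position p
    \[ \mathrm{nDCG}_p = \frac{\mathrm{DCG}_p}{\mathrm{idealDCG}_p} \]
    \[ \mathrm{DCG}_p = \sum_{i=1}^{p} \frac{2^{rel_i} - 1}{\log_2(i + 1)} \]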
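
    As a worked example of the two formulas, the sketch below computes nDCG for a list of relevance labels given in ranked order. The function names are hypothetical, and returning 0 when the ideal DCG is 0 is a convention chosen here, not something stated on the slide.

```python
import math

def dcg_at_p(relevances, p):
    """DCG_p = sum over ranks i = 1..p of (2^rel_i - 1) / log2(i + 1)."""
    return sum((2 ** rel - 1) / math.log2(i + 1)
               for i, rel in enumerate(relevances[:p], start=1))

def ndcg_at_p(relevances, p):
    """DCG_p divided by the DCG_p of the ideal (descending) ordering."""
    ideal = dcg_at_p(sorted(relevances, reverse=True), p)
    return dcg_at_p(relevances, p) / ideal if ideal > 0 else 0.0

perfect = ndcg_at_p([1, 0, 0], p=3)  # relevant item ranked first
swapped = ndcg_at_p([0, 1], p=2)     # relevant item ranked second
```

    With a single relevant item, moving it from rank 1 to rank 2 lowers nDCG from 1 to 1 / log2(3), about 0.63, which shows how the discount favors items at the top of the ranking.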

  11. Dataset
    Collected from Etsy search logs
    Data collection
    • Collection period: 2 weeks
    • Data size:
      • 8.82 million training preference pairs
      • 1.9 million validation sessions
      • 1.9 million test sessions
    • Number of queries: 1394 total queries
    Measures taken during collection
    • FairPairs method
      • A method for reducing the effect of bias

  12. Results
    Achieved an average accuracy improvement of 1.7% over the text-only model
    Comparison by nDCG
    • nDCG improved for 51.4% of the 1394 queries
    • On average
      • text only: baseline
      • image only: 2.2% worse
      • multimodal: 1.7% better

  13. Results

  14. Results
    Improvement is seen for items whose text contains noise
    Left: text model
    Right: multimodal model

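  15. Summary
    Improve search accuracy by using both images and text
    Claims
    • Image features are effective for optimizing search results
      • They capture information that the text does not fully express
      • Search results actually improved
    Personal questions and remarks
    • Text features are extracted with BoW
      • It would be interesting to see what happens with distributed representations or a joint model
    • Care is needed under a commission-based business model
      • Optimizing search results does not necessarily optimize sales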

  16. EOF.
