Image Inspired Poetry generation

"Creative text generation, using image-inspired poetry as an example", presented at R-Ladies @ LINE office on July 29, 2019 by Johnson Wu. Event page: https://www.meetup.com/rladies-taipei/events/262838107/

LINE Developers Taiwan

July 29, 2019

Transcript

  2. Agenda

  3. (Bernardi et al., 2016)

    1. Patterson, G., Xu, C., Su, H., and Hays, J. The SUN attribute database: Beyond categories for deeper scene understanding. International Journal of Computer Vision 108(1–2):59–81. 2014
    2. Devlin, J., Cheng, H., Fang, H., Gupta, S., Deng, L., He, X., Zweig, G., and Mitchell, M. Language models for image captioning: The quirks and what works. arXiv preprint arXiv:1505.01809. 2015
    3. Socher, R., Karpathy, A., Le, Q. V., Manning, C. D., and Ng, A. Y. Grounded compositional semantics for finding and describing images with sentences. TACL 2:207–218. 2014
    4. Soto, A. J., Kiros, R., Keselj, V., and Milios, E. E. Machine learning meets visualization for extracting insights from text data. AI Matters 2(2):15–17. 2015
    5. Karpathy, A., and Fei-Fei, L. Deep visual-semantic alignments for generating image descriptions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 3128–3137. 2015
    6. Donahue, J., Anne Hendricks, L., Guadarrama, S., Rohrbach, M., Venugopalan, S., Saenko, K., and Darrell, T. Long-term recurrent convolutional networks for visual recognition and description. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2625–2634. 2015
    7. Schwarz, K., Berg, T. L., and Lensch, H. P. Autoillustrating poems and songs with style. In Asian Conference on Computer Vision, 87–103. 2016

  4. From Captions to Visual Concepts and Back (Fang et al., 2014)
    • 2014 MSCOCO rank #1

  5. Classic image caption generator (Vinyals et al., 2014)
    Also 2014 MSCOCO rank #1

  6. Attention added (Xu et al., 2015)
    • The attention weights are calculated from the decoder hidden state h and the encoder features.
    • The context vector z is calculated from the attention weights.
    • The output and the new hidden state are generated from the context vector z and the hidden state.
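The first two bullets of slide 6 can be sketched in a few lines. This is a minimal illustration, not the model from Xu et al.: the dot-product score function and the toy vectors are assumptions made here for brevity, and the decoder update that consumes z is left out.

```python
import math

def softmax(scores):
    # Subtract the max before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def soft_attention(features, h, score_fn):
    """features: one vector per image region; h: decoder hidden state."""
    # Attention weights: one non-negative weight per region, summing to 1.
    weights = softmax([score_fn(f, h) for f in features])
    # Context vector z: attention-weighted average of the region features.
    dim = len(features[0])
    z = [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]
    return weights, z

# Toy input (made up): 3 regions with 2-d features, dot-product scoring.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
h = [2.0, 0.0]
weights, z = soft_attention(features, h, dot)
```

Regions whose features align with h receive larger weights, so z is dominated by the parts of the image the decoder is currently "looking at".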

  7. Input higher-level image features (Chen et al., 2015)
    • Reconstruct the visual representation
    • Stronger connection between images and text

  10. 1950 Stochastische Texte
    (Liu et al. 2018)
    $P(w_1, \dots, w_t) = P(w_1)\,P(w_2 \mid w_1)\,P(w_3 \mid w_1, w_2) \cdots P(w_t \mid w_{1:t-1})$
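The chain-rule factorization on slide 10 can be checked with a toy example. The sketch below assumes a hypothetical lookup table of conditional probabilities (the words and numbers are made up); a real language model would estimate these from a corpus.

```python
import math

# Toy table of P(next word | full history); the values are made up.
COND = {
    (): {"spring": 0.4, "moon": 0.6},
    ("spring",): {"rain": 0.5, "wind": 0.5},
    ("spring", "rain"): {"falls": 0.8, "sings": 0.2},
}

def sequence_log_prob(words, cond=COND):
    """Sum log P(w_i | w_1 .. w_{i-1}) over the sequence (chain rule)."""
    log_p = 0.0
    for i, w in enumerate(words):
        history = tuple(words[:i])
        log_p += math.log(cond[history][w])
    return log_p

# P(spring) * P(rain | spring) * P(falls | spring, rain)
p = math.exp(sequence_log_prob(["spring", "rain", "falls"]))
```

Working in log space avoids underflow when the sequence is long; exponentiating at the end recovers the joint probability.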

  11.
    • Perplexity, BLEU score
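Slide 11 names perplexity as an evaluation metric. As a reminder of what that number measures, here is a minimal sketch using the standard definition (the exponential of the average negative log-probability per token); the input probabilities are made up.

```python
import math

def perplexity(token_probs):
    """token_probs: probability the model assigned to each reference token."""
    n = len(token_probs)
    avg_neg_log = -sum(math.log(p) for p in token_probs) / n
    return math.exp(avg_neg_log)

# A model that is always unsure between 4 equally likely tokens.
uniform = perplexity([0.25, 0.25, 0.25, 0.25])
# A model that is more confident about the reference tokens scores lower.
confident = perplexity([0.9, 0.8, 0.9, 0.7])
```

Lower is better: a perplexity of k roughly means the model is as uncertain as if it were choosing uniformly among k tokens at each step.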


  12.
    https://poem.msxiaobing.com/


  17. Case Study

  20.
    • (pretrained on Krizhevsky et al., 2012)
    • Input: image pixels; output: a noun or adjective label; two CNN models.
    • Generate $P(C \mid I) = f(W_c * I)$, where $W_c$ are the CNN's parameters, $I$ is the input image, and $*$ stands for the operations of convolution, pooling, and activation.
    • Train via BPTT
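To make the notation on slide 20 concrete, the sketch below runs a tiny forward pass: convolution, activation, pooling, then a softmax over concept labels. It is an illustration only; the 4x4 image, the two hand-written kernels, and the class weights are all assumptions, whereas the slide describes pretrained networks, one for nouns and one for adjectives.

```python
import math

def conv2d(image, kernel):
    """Valid 2-D cross-correlation of a grayscale image with one kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

def relu_global_max(feature_map):
    """ReLU activation followed by global max pooling, giving one scalar."""
    return max(max(0.0, v) for row in feature_map for v in row)

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def concept_probs(image, kernels, class_weights):
    """P(C | I): one pooled feature per kernel, then a linear layer and softmax."""
    pooled = [relu_global_max(conv2d(image, k)) for k in kernels]
    logits = [sum(w * x for w, x in zip(row, pooled)) for row in class_weights]
    return softmax(logits)

# Toy image: dark left half, bright right half.
image = [[0, 0, 1, 1]] * 4
kernels = [
    [[-1, 1], [-1, 1]],   # responds to a dark-to-bright vertical edge
    [[1, 1], [-1, -1]],   # responds to a bright-to-dark horizontal edge
]
class_weights = [[1.0, 0.0], [0.0, 1.0]]   # two hypothetical concept labels
probs = concept_probs(image, kernels, class_weights)
```

Only the first kernel fires on this image, so the first hypothetical label receives most of the probability mass.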

  23.
    • n-gram, e.g. 4-gram
    • (with beam search)
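Slide 23 mentions generating text from an n-gram model with beam search. The sketch below shows the general idea under loud assumptions: it uses a bigram table rather than the 4-gram mentioned on the slide, and the vocabulary and probabilities are made up rather than taken from the talk.

```python
import math

# Hypothetical bigram table P(next | previous); a real system would
# estimate this from a poetry corpus.
BIGRAM = {
    "<s>":   {"moon": 0.6, "wind": 0.4},
    "moon":  {"rises": 0.7, "falls": 0.3},
    "wind":  {"rises": 0.2, "falls": 0.8},
    "rises": {"</s>": 1.0},
    "falls": {"</s>": 1.0},
}

def beam_search(lm, beam_width=2, max_len=5):
    beams = [(0.0, ["<s>"])]      # (log-probability, tokens so far)
    finished = []
    for _ in range(max_len):
        # Extend every live hypothesis by every possible next word.
        candidates = []
        for score, tokens in beams:
            for word, p in lm[tokens[-1]].items():
                candidates.append((score + math.log(p), tokens + [word]))
        # Keep only the best `beam_width` extensions.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for score, tokens in candidates[:beam_width]:
            if tokens[-1] == "</s>":
                finished.append((score, tokens))
            else:
                beams.append((score, tokens))
        if not beams:
            break
    # Fall back to unfinished hypotheses if nothing reached </s>.
    return max(finished or beams, key=lambda c: c[0])

best_score, best_tokens = beam_search(BIGRAM)
```

With a beam width of 1 this reduces to greedy decoding; a wider beam trades compute for a better chance of finding a high-probability line.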

  28. CharRNN
    • RNN model
      CharRNN model

  32.
    • N-gram LM

  33.
    • N-gram

  34.
    • Binary classifier

  36.
    • General guideline
    • A/B test, Latin square
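Slide 36 names A/B tests and Latin squares. As a sketch of the second idea, assuming it is used to balance the order in which raters see outputs from different systems, a cyclic Latin square can be built as follows. The system names are hypothetical and do not come from the talk.

```python
def latin_square(items):
    """Cyclic Latin square: each item appears once per row and once per column."""
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]

# Each row is the presentation order for one rater (or group of raters).
systems = ["human", "n-gram", "charRNN"]
orders = latin_square(systems)
for rater, order in enumerate(orders, start=1):
    print(f"rater {rater}: {order}")
```

Because every system occupies every position exactly once, position effects (for example, raters being harsher on the first poem they read) are spread evenly across systems.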

  38.
    • transformer, RNN

  39. Two sentences to conclude

  40. We are hiring!
    • Data Scientist
    • Data Engineer
    • Data Analyst
    • NLP Engineer

  41. Thank you
    Q&A
    [email protected]