Image Inspired Poetry generation

"Creative Text Generation, with Writing Poetry from Images as an Example" (original title: 以看圖寫詩為例的創意文本生成), presented at R-Ladies @ LINE office on 2019-07-29 by Johnson Wu. Event page: https://www.meetup.com/rladies-taipei/events/262838107/

LINE Developers Taiwan

July 29, 2019

Transcript

  2. Agenda

  3. (Bernardi et al. 2016)
    References:
    1. Patterson, G., Xu, C., Su, H., and Hays, J. The SUN attribute database: Beyond categories for deeper scene understanding. International Journal of Computer Vision 108(1-2):59–81. 2014.
    2. Devlin, J., Cheng, H., Fang, H., Gupta, S., Deng, L., He, X., Zweig, G., and Mitchell, M. Language models for image captioning: The quirks and what works. arXiv preprint arXiv:1505.01809. 2015.
    3. Socher, R., Karpathy, A., Le, Q. V., Manning, C. D., and Ng, A. Y. Grounded compositional semantics for finding and describing images with sentences. TACL 2:207–218. 2014.
    4. Soto, A. J., Kiros, R., Keselj, V., and Milios, E. E. Machine learning meets visualization for extracting insights from text data. AI Matters 2(2):15–17. 2015.
    5. Karpathy, A., and Fei-Fei, L. Deep visual-semantic alignments for generating image descriptions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 3128–3137. 2015.
    6. Donahue, J., Anne Hendricks, L., Guadarrama, S., Rohrbach, M., Venugopalan, S., Saenko, K., and Darrell, T. Long-term recurrent convolutional networks for visual recognition and description. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2625–2634. 2015.
    7. Schwarz, K., Berg, T. L., and Lensch, H. P. Autoillustrating poems and songs with style. In Asian Conference on Computer Vision, 87–103. 2016.
  4. From Captions to Visual Concepts and Back (Fang et al., 2014)
    - 2014 MSCOCO rank #1
  5. Classic image caption generator (Vinyals et al., 2014)
    - Also 2014 MSCOCO rank #1
  6. Attention added (Xu et al., 2015)
    - The attention weights are calculated from the decoder hidden state h and the encoder features.
    - The context vector z is calculated from the attention weights.
    - The output and the new hidden state are generated from the context vector z and the hidden state h.
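
The three steps on the attention slide can be sketched in a few lines. This is a minimal illustration, not the model from Xu et al.: it assumes dot-product scoring (the paper uses a small learned network), and the names `attend`, `h`, and `features` are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(h, features):
    """Return (attention weights, context vector z).

    h        -- decoder hidden state (list of floats)
    features -- encoder features, one vector per image region
    Assumption: dot-product scoring stands in for the learned scorer.
    """
    scores = [sum(hi * fi for hi, fi in zip(h, f)) for f in features]
    weights = softmax(scores)
    dim = len(features[0])
    # z is the attention-weighted average of the encoder features.
    z = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return weights, z

h = [1.0, 0.0]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, z = attend(h, features)
```

In a full captioner, `z` and `h` would then be fed to the recurrent cell to produce the next word and the next hidden state.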
  7. Input higher-level image features (Chen et al., 2015)
    - Reconstruct the visual representation
    - Stronger connection between images and texts
  10. 1950 Stochastische Texte
    - (Liu et al. 2018)
    - $P(w_1, \dots, w_T) = \prod_{t=1}^{T} P(w_t \mid w_{t-1:1}) = P(w_1)\,P(w_2 \mid w_1)\,P(w_3 \mid w_1 w_2)\cdots$
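
The chain-rule factorization on this slide can be tried out on a toy corpus. The sketch below is an illustration only: it assumes a bigram (first-order Markov) approximation, so each word is conditioned on just the previous word instead of the full history, and the corpus and function names are made up for the example.

```python
from collections import Counter

def train_bigram(sentences):
    """Count unigrams and bigrams, with <s> and </s> as boundary tokens."""
    unigrams, bigrams = Counter(), Counter()
    for words in sentences:
        tokens = ["<s>"] + words + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def sequence_probability(words, unigrams, bigrams):
    """P(w_1..w_T) as a product of P(w_t | w_{t-1}) (maximum-likelihood counts)."""
    tokens = ["<s>"] + words + ["</s>"]
    prob = 1.0
    for prev, cur in zip(tokens, tokens[1:]):
        if unigrams[prev] == 0:
            return 0.0  # unseen context, no smoothing in this sketch
        prob *= bigrams[(prev, cur)] / unigrams[prev]
    return prob

corpus = [["the", "moon", "rises"], ["the", "moon", "sets"], ["the", "sun", "sets"]]
unigrams, bigrams = train_bigram(corpus)
p_seen = sequence_probability(["the", "moon", "sets"], unigrams, bigrams)
p_unseen = sequence_probability(["the", "sun", "rises"], unigrams, bigrams)
```

The second sentence gets probability zero because "sun rises" never occurs in the toy corpus; a practical model would add smoothing.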
  11. Perplexity, BLEU score
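
For reference, perplexity can be computed directly from the per-token probabilities a language model assigns to a text. This is a small sketch using the standard definition (exponential of the average negative log-probability); the function name and inputs are illustrative, and BLEU is not covered here.

```python
import math

def perplexity(token_probs):
    """Perplexity = exp(-(1/N) * sum(log p_i)) over N token probabilities."""
    n = len(token_probs)
    return math.exp(-sum(math.log(p) for p in token_probs) / n)

# A model that always spreads its probability evenly over 4 choices
# is, on average, "choosing among 4 words" at every step.
uniform = perplexity([0.25] * 8)
confident = perplexity([0.9, 0.8, 0.95])
```

Lower is better: a model that assigns higher probability to the observed tokens has lower perplexity.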
  12. https://poem.msxiaobing.com/
  17. Case Study
  20. (pretrained on Krizhevsky et al., 2012)
    - Input: image pixels; output: a noun or adjective label (two CNN models).
    - Generate $P(C \mid I) = f(W_c * I)$, where $W_c$ are the CNN's parameters, $I$ is the input image, and $*$ denotes the operations of convolution, pooling, and activation.
    - Train via BPTT …
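
To make $P(C \mid I) = f(W_c * I)$ concrete, here is a deliberately tiny sketch of the three operations named on the slide (convolution, activation, pooling) followed by a sigmoid. It is not the pretrained network used in the talk: it assumes a single hand-written 2x2 kernel, a single label, and hypothetical function names.

```python
import math

def conv2d(image, kernel):
    """Valid 2-D cross-correlation (what deep-learning libraries call convolution)."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(len(image) - kh + 1):
        row = []
        for c in range(len(image[0]) - kw + 1):
            row.append(sum(image[r + i][c + j] * kernel[i][j]
                           for i in range(kh) for j in range(kw)))
        out.append(row)
    return out

def relu(fmap):
    """Activation: clamp negative responses to zero."""
    return [[max(0.0, v) for v in row] for row in fmap]

def global_max_pool(fmap):
    """Pooling: keep the strongest response anywhere in the feature map."""
    return max(max(row) for row in fmap)

def label_probability(image, kernel, bias):
    """P(C | I) for one label C: sigmoid of the pooled feature plus a bias."""
    score = global_max_pool(relu(conv2d(image, kernel))) + bias
    return 1.0 / (1.0 + math.exp(-score))

edge_kernel = [[-1.0, 1.0], [-1.0, 1.0]]       # responds to dark-to-bright edges
edge_image = [[0.0, 0.0, 1.0, 1.0]] * 3
flat_image = [[0.0, 0.0, 0.0, 0.0]] * 3
p_edge = label_probability(edge_image, edge_kernel, bias=-1.0)
p_flat = label_probability(flat_image, edge_kernel, bias=-1.0)
```

A real classifier stacks many such layers and learns the kernels from data; the slide's two CNN models would each end in a layer producing one probability per noun or adjective label.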
  23. n-gram, e.g. 4-gram (with beam search)
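
Beam search over an n-gram table can be sketched as follows. This is a generic illustration, not the generator from the talk: the probability table is invented, the example uses bigram contexts (`n=2`) to stay small even though the slide mentions 4-grams, and `beam_search` is a hypothetical helper.

```python
import math

def beam_search(table, start, steps, beam_width, n=2):
    """Keep the beam_width most probable partial sequences at each step.

    table -- maps a context tuple of the last n-1 tokens to {next_token: prob}
    start -- list of initial tokens (at least n-1 of them)
    """
    beams = [(list(start), 0.0)]  # (tokens, log-probability)
    for _ in range(steps):
        candidates = []
        for tokens, score in beams:
            context = tuple(tokens[-(n - 1):])
            for nxt, p in table.get(context, {}).items():
                candidates.append((tokens + [nxt], score + math.log(p)))
        if not candidates:
            break  # no beam can be extended
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams

# Toy bigram table (made-up numbers): greedy search picks "a" first and
# gets stuck with weak continuations; a wider beam finds "b z".
table = {
    ("<s>",): {"a": 0.6, "b": 0.4},
    ("a",): {"x": 0.5, "y": 0.5},
    ("b",): {"z": 0.9, "w": 0.1},
}
greedy = beam_search(table, ["<s>"], steps=2, beam_width=1)
wide = beam_search(table, ["<s>"], steps=2, beam_width=2)
```

With `beam_width=1` the search is greedy and commits to "a"; with `beam_width=2` the best sequence is `<s> b z`, whose total probability (0.36) is higher than any continuation of "a" (0.30).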
  27. charRNN
  28. CharRNN
    - RNN model
    - CharRNN model
  32. N-gram LM

  33. N-gram

  34. Binary classifier; 78%
  36. General guideline
    - A/B test, Latin square
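
A Latin square is a common way to counterbalance presentation order in a human evaluation, so that every condition appears once in every position. The sketch below builds a simple cyclic Latin square; the condition names are placeholders, and the talk's actual evaluation design is not recoverable from the transcript.

```python
def latin_square(conditions):
    """Cyclic Latin square: row r is the condition list rotated left by r."""
    n = len(conditions)
    return [[conditions[(row + col) % n] for col in range(n)] for row in range(n)]

# Hypothetical conditions: each row is the viewing order for one rater group.
conditions = ["human", "model_a", "model_b"]
orders = latin_square(conditions)
for group, order in enumerate(orders, start=1):
    print(f"rater group {group}: {order}")
```

Each rater group sees all conditions, and across groups every condition occupies each position exactly once, which helps separate order effects from quality differences.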
  38. (transformer, RNN)
  39. Two sentences to conclude
  40. We are hiring!
    - Data Scientist
    - Data Engineer
    - Data Analyst
    - NLP Engineer
  41. Thank you. Q&A: johnson.wu@linecorp.com