Image Inspired Poetry generation

"Creative Text Generation, with Writing Poems from Images as an Example" (以看圖寫詩為例的創意文本生成), presented at R-Ladies @ LINE office on July 29, 2019 by Johnson Wu. Event page: https://www.meetup.com/rladies-taipei/events/262838107/

LINE Developers Taiwan

July 29, 2019

Transcript

  1. [Slide text not recoverable from the extraction] (Bernardi et al. 2016)

    References:
    1. Patterson, G., Xu, C., Su, H., and Hays, J. The SUN attribute database: Beyond categories for deeper scene understanding. International Journal of Computer Vision 108(1-2):59–81. 2014.
    2. Devlin, J., Cheng, H., Fang, H., Gupta, S., Deng, L., He, X., Zweig, G., and Mitchell, M. Language models for image captioning: The quirks and what works. arXiv preprint arXiv:1505.01809. 2015.
    3. Socher, R., Karpathy, A., Le, Q. V., Manning, C. D., and Ng, A. Y. Grounded compositional semantics for finding and describing images with sentences. TACL 2:207–218. 2014.
    4. Soto, A. J., Kiros, R., Keselj, V., and Milios, E. E. Machine learning meets visualization for extracting insights from text data. AI Matters 2(2):15–17. 2015.
    5. Karpathy, A., and Fei-Fei, L. Deep visual-semantic alignments for generating image descriptions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 3128–3137. 2015.
    6. Donahue, J., Anne Hendricks, L., Guadarrama, S., Rohrbach, M., Venugopalan, S., Saenko, K., and Darrell, T. Long-term recurrent convolutional networks for visual recognition and description. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2625–2634. 2015.
    7. Schwarz, K., Berg, T. L., and Lensch, H. P. Auto-illustrating poems and songs with style. In Asian Conference on Computer Vision, 87–103. 2016.
  2. Attention added (Xu et al., 2015)

    • The attention weights are calculated from the decoder hidden state h and the encoder features.
    • The context vector z is calculated from the attention weights.
    • The output and the new hidden state are generated from the context vector z and the hidden state.
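The three attention steps on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the model from the talk: the additive scoring function, the weight matrices `W_a`, `W_h`, `W_z`, `W_hh`, the vector `v`, the plain tanh recurrent update, and all dimensions are assumptions, and the weights are random rather than trained.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def soft_attention(features, h, W_a, W_h, v):
    """Return attention weights and context vector z.

    features: (L, D) encoder features, one row per image location
    h:        (H,)   decoder hidden state
    """
    # Step 1: one score per location from the features and the hidden state
    # (additive scoring is an assumption made for this sketch).
    scores = np.tanh(features @ W_a + h @ W_h) @ v   # shape (L,)
    alpha = softmax(scores)                          # weights sum to 1
    # Step 2: context vector z is the attention-weighted sum of the features.
    z = alpha @ features                             # shape (D,)
    return alpha, z

def decoder_step(z, h, W_z, W_hh):
    """Step 3: new hidden state from z and the previous hidden state
    (a plain tanh update standing in for the real recurrent cell)."""
    return np.tanh(z @ W_z + h @ W_hh)

rng = np.random.default_rng(0)
L, D, H, K = 6, 4, 5, 3                      # hypothetical sizes
features = rng.normal(size=(L, D))
h = rng.normal(size=H)
W_a, W_h, v = rng.normal(size=(D, K)), rng.normal(size=(H, K)), rng.normal(size=K)
W_z, W_hh = rng.normal(size=(D, H)), rng.normal(size=(H, H))

alpha, z = soft_attention(features, h, W_a, W_h, v)
h_new = decoder_step(z, h, W_z, W_hh)
print(alpha.round(3), z.shape, h_new.shape)
```

Because the weights come from a softmax, they are non-negative and sum to one, so z is always a convex combination of the encoder features.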
  3. Input higher-level image features (Chen et al., 2015)

    • Reconstruct the visual representation
    • Stronger connection between images and texts
  4. 1950 Stochastische Texte … (Liu et al. 2018)

    $P(w_1, \dots, w_T) = \prod_{t=1}^{T} P(w_t \mid w_{t-1:1}) = P(w_1)\,P(w_2 \mid w_1)\,P(w_3 \mid w_1 w_2) \cdots$
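The chain-rule factorization above can be made concrete with a toy count-based model. The sketch below is an illustration under stated assumptions: the three-sentence corpus is invented, and each factor is approximated with a bigram model, so it conditions only on the previous word rather than on the full history.

```python
import math
from collections import Counter

# Invented toy corpus; each sentence is a list of tokens.
corpus = [
    ["the", "moon", "is", "bright"],
    ["the", "moon", "is", "cold"],
    ["the", "river", "is", "cold"],
]

def train_bigram(sentences):
    """Count how often each word follows each context word."""
    context_counts, bigram_counts = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence          # <s> marks the sentence start
        for prev, cur in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts

def sentence_log_prob(sentence, context_counts, bigram_counts):
    """Sum of log P(w_t | w_{t-1}): the chain rule under a bigram assumption."""
    tokens = ["<s>"] + sentence
    log_p = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        count = bigram_counts[(prev, cur)]
        if count == 0:                       # unseen pair, no smoothing here
            return float("-inf")
        log_p += math.log(count / context_counts[prev])
    return log_p

context_counts, bigram_counts = train_bigram(corpus)
log_p = sentence_log_prob(["the", "moon", "is", "cold"], context_counts, bigram_counts)
print(math.exp(log_p))  # 1 * 2/3 * 1 * 2/3
```

Working in log space avoids underflow when many small probabilities are multiplied; a real model would also need smoothing for unseen word pairs.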
  5. [Slide text not recoverable from the extraction]

    https://poem.msxiaobing.com/
  6. [Slide text not recoverable from the extraction]

  7. [Slide text not recoverable from the extraction]
  8. Case Study: [remaining slide text not recoverable from the extraction]
  9. [Slide text not recoverable from the extraction. Surviving numeric values, labels lost: 0.975, 0.005, 0.005, 0.004, 0.003; 0.586, 0.109, 0.053]
  10. [Slide title not recoverable from the extraction]

    • … (pretrained on Krizhevsky et al., 2012)
    • Input: image pixels; output: a noun or adjective label, using two CNN models.
    • Generate $P(C \mid I) = f(W_c * I)$, where $W_c$ are the CNN's parameters, $I$ is the input image, and $*$ stands for the operations of convolution, pooling, and activation.
    • Train via BPTT …
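To make the formula P(C | I) = f(Wc ∗ I) concrete, here is a minimal NumPy forward pass with one convolution, a ReLU activation, 2x2 max pooling, and a softmax output. It is a sketch under assumptions, not the pretrained network the slide refers to: the single 3x3 kernel, the output matrix `W_out`, the 8x8 single-channel input, and the four classes are all hypothetical, and the weights are random.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D cross-correlation (what deep learning libraries call convolution)."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool2x2(x):
    """Non-overlapping 2x2 max pooling; odd trailing rows or columns are dropped."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    return x[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def class_probabilities(image, kernel, W_out):
    """P(C | I): convolution, activation, pooling, then a linear layer and softmax."""
    feature_map = np.maximum(conv2d(image, kernel), 0.0)   # ReLU activation
    pooled = max_pool2x2(feature_map)
    logits = pooled.ravel() @ W_out
    return softmax(logits)

rng = np.random.default_rng(0)
image = rng.random((8, 8))            # hypothetical grayscale input I
kernel = rng.normal(size=(3, 3))      # part of Wc (random, not pretrained)
W_out = rng.normal(size=(9, 4))       # 8x8 -> conv 6x6 -> pool 3x3 = 9 features, 4 classes
probs = class_probabilities(image, kernel, W_out)
print(probs.round(3))
```

With two such models, one trained on noun labels and one on adjective labels, each image would yield two probability vectors, matching the "two CNN models" bullet.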
  11. [Slide text not recoverable from the extraction; surviving fragment: "by similarity"]

  12. [Slide text not recoverable from the extraction]
  13. [Slide text mostly not recoverable from the extraction; surviving fragments: "n-gram", "E.g. 4-gram", "30%", "(with beam search)"]
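Since this slide pairs an n-gram model with beam search, a small sketch of beam search may help. Everything here is an assumption made for illustration: the probability table is invented, it is a bigram table rather than the 4-gram mentioned on the slide, and the function name `beam_search` is hypothetical.

```python
import math

# Invented next-token probabilities, keyed by the previous token.
BIGRAM = {
    "<s>":   {"moon": 0.6, "river": 0.4},
    "moon":  {"light": 0.5, "rises": 0.5},
    "river": {"flows": 0.9, "light": 0.1},
    "light": {"</s>": 1.0},
    "rises": {"</s>": 1.0},
    "flows": {"</s>": 1.0},
}

def beam_search(model, beam_width=2, max_len=5):
    """Keep the beam_width best partial sequences at every step."""
    beams = [(0.0, ["<s>"])]            # (log probability, tokens so far)
    finished = []
    for _ in range(max_len):
        candidates = []
        for log_p, seq in beams:
            for token, p in model.get(seq[-1], {}).items():
                candidates.append((log_p + math.log(p), seq + [token]))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for log_p, seq in candidates[:beam_width]:
            if seq[-1] == "</s>":       # sequence is complete
                finished.append((log_p, seq))
            else:
                beams.append((log_p, seq))
        if not beams:
            break
    finished.sort(key=lambda c: c[0], reverse=True)
    return finished

results = beam_search(BIGRAM, beam_width=2)
best_log_p, best_seq = results[0]
print(best_seq, math.exp(best_log_p))
```

In this toy table a greedy decoder would start with "moon" (0.6) and finish with probability 0.3, while a beam of width 2 keeps "river" alive and finds "river flows" at 0.36, which is why a wider beam can beat greedy decoding.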
  14. [Slide text not recoverable from the extraction]

  15. [Slide text not recoverable from the extraction; a surviving fragment compares the probabilities of two candidate strings, P(…) > P(…)]

  16. [Slide text not recoverable from the extraction]
  17. [Slide text not recoverable from the extraction; surviving fragments: "charRNN", "0.8"]
  18. [Slide text not recoverable from the extraction]

  19. [Slide text not recoverable from the extraction]
  20. [Slide text not recoverable from the extraction; surviving fragments: "N-gram LM", "P(· | Θ)"]
  21. [Slide text not recoverable from the extraction; surviving fragments: "binary classifier", "900", "78%"]
  22. [Slide text not recoverable from the extraction]
  23. [Slide text mostly not recoverable from the extraction; surviving fragments: "General guideline", "A/B test, Latin square"]
  24. [Slide text not recoverable from the extraction]
  25. We are hiring!

    • Data Scientist
    • Data Engineer
    • Data Analyst
    • NLP Engineer