Blending Texture Features from Multiple Reference Images for Style Transfer - SIGGRAPH ASIA 2016 Technical Brief

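These are the slides presented at the SIGGRAPH ASIA 2016 Technical Brief.

In this work, we propose a method that learns the painting style common to a given set of images and transfers that style onto other images such as photographs. We assume that a style is not characterized by a single artwork but is defined by the features shared among multiple artworks painted in the same style. Under this assumption, our method transfers the style while preserving the color distribution of the original image better than previous methods.
To obtain the image sets, we constructed a new dataset containing more than 500,000 digital illustrations annotated with rich metadata.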

Dwango Media Village

December 07, 2016

Transcript

  1. BLENDING TEXTURE FEATURES FROM
    MULTIPLE REFERENCE IMAGES FOR STYLE TRANSFER
    Hikaru Ikuta*1,*2, Keisuke Ogaki*2, Yuri Odagiri*2
    *1 The University of Tokyo, *2 Dwango Co., Ltd.

  2. OUR SYSTEM
    “Watercolor”

    Objective: Obtain an image painted in the texture of a given set of images
    Input: Content Image and Texture Image Set
    Tested with a novel dataset “nico-illust” containing 500,000 digital paintings
    [Figure: Content Image, Texture Image Set, Output Image]

  3. Original Video
    Our Results
    The input style image set (50 images)

  4. MOTIVATION
    Gatys, et al. [1]
    [Figure: Content Image, Texture Image Set, Output Image]
    [1] Gatys, L. A., et al. 2016. Image style transfer using convolutional neural networks. CVPR

  5. MOTIVATION
    Gatys, et al. [1]
    [Figure: Content Image, Texture Image Set, Output Image]
    The color depends heavily on the input texture image
    The color correspondence of the original image is often lost
    Uses only one image, so…
    [1] Gatys, L. A., et al. 2016. Image style transfer using convolutional neural networks. CVPR

  6. MOTIVATION
    Gatys, et al. [1]
    [Figure: Content Image, Texture Image Set, Output Image]
    Is this really a “style” transfer?
    The color depends heavily on the input texture image
    The color correspondence of the original image is often lost
    Uses only one image, so…
    [1] Gatys, L. A., et al. 2016. Image style transfer using convolutional neural networks. CVPR

  7. MOTIVATION
    Gatys, et al. [1]
    [Figure: Content Image, Texture Image Set, Output Image]
    Is this really a “style” transfer?
    The color depends heavily on the input texture image
    The color correspondence of the original image is often lost
    Uses only one image, so…
    The color of one artwork ≠ The “style”
    [1] Gatys, L. A., et al. 2016. Image style transfer using convolutional neural networks. CVPR

  8. MOTIVATION
    Gatys, et al. [1]
    [Figure: Content Image, Texture Image Set, Output Image]
    Is this really a “style” transfer?
    The color depends heavily on the input texture image
    The color correspondence of the original image is often lost
    Uses only one image, so…
    [1] Gatys, L. A., et al. 2016. Image style transfer using convolutional neural networks. CVPR

  9. MOTIVATION
    Gatys, et al. [1]
    [Figure: Content Image, Texture Image Set, Output Image]
    Our Method
    [Figure: Content Image, Texture Image Set, Output Image]
    [1] Gatys, L. A., et al. 2016. Image style transfer using convolutional neural networks. CVPR

  10. MOTIVATION
    Gatys, et al. [1]
    [Figure: Content Image, Texture Image Set, Output Image]
    Our Method
    [Figure: Content Image, Texture Image Set, Output Image]
    Our method…
    Preserves the original color, while extracting the texture of the texture image set
    Therefore, it can transfer the same style onto different images
    [1] Gatys, L. A., et al. 2016. Image style transfer using convolutional neural networks. CVPR

  11. THE METHOD

  12. SYSTEM OVERVIEW (FORMER)
    Neural Style Transfer is a joint distance minimization problem
    [Diagram: Content Image → F → Content Feature; Texture Image → G → Texture Feature; Joint Distance Minimization → Output Image. Example feature vectors shown: (3.1, 4.1, 5.9, 2.6, …) and (2.7, 1.8, 4.2, 8.4, …)]

  13. SYSTEM OVERVIEW (FORMER)
    Neural Style Transfer is a joint distance minimization problem
    [Diagram: Content Image → F → Content Feature; Texture Image → G → Texture Feature; Joint Distance Minimization → Output Image. Example feature vectors shown: (3.1, 4.1, 5.9, 2.6, …) and (2.7, 1.8, 4.2, 8.4, …)]
    The output image tries to be close to the:
    - Content image in the “Content Feature” space
    - Texture image in the “Texture Feature” space
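
The joint distance minimization described on this slide can be written as a single objective. The form below is only a sketch: the weights $\alpha$ and $\beta$, the squared Euclidean distance, and the symbols $I_{\text{texture}}$ and $\hat{I}$ are assumptions introduced for illustration, not notation taken from the slides.

```latex
\hat{I} = \arg\min_{I} \;
  \alpha \, \bigl\lVert F(I) - F(I_{\text{content}}) \bigr\rVert^2
  + \beta \, \bigl\lVert G(I) - G(I_{\text{texture}}) \bigr\rVert^2
```

Here $F$ is the content feature function, $G$ is the texture feature function, and the two terms correspond to closeness in the “Content Feature” space and in the “Texture Feature” space respectively.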

  14. SYSTEM OVERVIEW (FORMER)
    Neural Style Transfer is a joint distance minimization problem
    [Diagram: Content Image → F → Content Feature; Texture Image → G → Texture Feature; Joint Distance Minimization → Output Image. Example feature vectors shown: (3.1, 4.1, 5.9, 2.6, …) and (2.7, 1.8, 4.2, 8.4, …)]

  15. SYSTEM OVERVIEW (FORMER)
    Neural Style Transfer is a joint distance minimization problem
    [Diagram: Content Image → F → Content Feature; Texture Image → G → Texture Feature; Joint Distance Minimization → Output Image. Example feature vectors shown: (3.1, 4.1, 5.9, 2.6, …), (2.7, 1.8, 4.2, 8.4, …) and (0.5, 7.7, 2.1, 5.6, …)]
    The Texture Feature function $G: \mathbb{R}^{\text{any} \times \text{any}} \to \mathbb{R}^{N_{\text{texture}}}$
    (Image of arbitrary size) → (Fixed size vector)

  16. $G: \mathbb{R}^{\text{any} \times \text{any}} \to \mathbb{R}^{N_{\text{texture}}}$
    [Diagram: Middle layer channels of the CNN → correlation matrix → Vectorize → feature vector, e.g. (3.1, 4.1, 5.9, 2.6, …)]
    Dimension: Number of filters in the CNN, squared
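
As a rough illustration of the steps named on this slide (middle-layer channels, correlation matrix, vectorize), the texture feature can be sketched with NumPy. The function name `texture_feature`, the random arrays standing in for CNN activations, and the division by the number of spatial positions are all assumptions of this sketch, not details confirmed by the slides.

```python
import numpy as np


def texture_feature(feature_maps: np.ndarray) -> np.ndarray:
    """Map feature maps of shape (channels, height, width) to a fixed-size vector."""
    channels = feature_maps.shape[0]
    # Flatten the spatial axes: one row per filter, one column per position.
    flat = feature_maps.reshape(channels, -1)
    # Channel-by-channel correlation matrix, averaged over positions.
    # Dividing by the number of positions is an assumption of this sketch.
    gram = flat @ flat.T / flat.shape[1]
    # Vectorize: the length is channels ** 2, independent of the image size.
    return gram.reshape(-1)


# Random stand-ins for the middle-layer activations of two images of different sizes.
rng = np.random.default_rng(0)
small = texture_feature(rng.standard_normal((8, 16, 16)))
large = texture_feature(rng.standard_normal((8, 40, 24)))
```

Both `small` and `large` have length 8 × 8 = 64, which matches the slide's point that an image of arbitrary size maps to a vector whose dimension is the number of filters squared.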

  17. SYSTEM OVERVIEW (FORMER)
    Neural Style Transfer is a joint distance minimization problem
    [Diagram: Content Image → F → Content Feature; Texture Image → G → Texture Feature; Joint Distance Minimization → Output Image. Example feature vectors shown: (3.1, 4.1, 5.9, 2.6, …) and (2.7, 1.8, 4.2, 8.4, …)]

  18. SYSTEM OVERVIEW (PROPOSED)
    Neural Style Transfer is a joint distance minimization problem
    [Diagram: Content Image → F → Content Feature; Optimal Blending → Optimal Texture Feature; Joint Distance Minimization → Output Image. Example feature vectors shown: (3.1, 4.1, 5.9, 2.6, …) and (2.7, 1.8, 4.2, 8.4, …)]
    Novelty of our system

  19. SYSTEM OVERVIEW (PROPOSED)
    Neural Style Transfer is a joint distance minimization problem
    [Diagram: Content Image → F → Content Feature; Optimal Blending → Optimal Texture Feature; Joint Distance Minimization → Output Image. Example feature vectors shown: (3.1, 4.1, 5.9, 2.6, …) and (2.7, 1.8, 4.2, 8.4, …)]
    Novelty of our system
    Generate a better texture feature that
    - Preserves the colors of the content
    - Represents the author/genre’s “style”

  20. SYSTEM OVERVIEW (PROPOSED)
    Neural Style Transfer is a joint distance minimization problem
    [Diagram: Content Image → F → Content Feature; Optimal Blending → Optimal Texture Feature; Joint Distance Minimization → Output Image. Example feature vectors shown: (3.1, 4.1, 5.9, 2.6, …) and (2.7, 1.8, 4.2, 8.4, …)]
    Novelty of our system
    Generate a better texture feature that
    - Preserves the colors of the content
    - Represents the author/genre’s “style”
    How? We use one key observation…

  21. KEY OBSERVATION
    Assume that: a concatenated image of some style belongs to the same style

  22. KEY OBSERVATION
    Assume that: a concatenated image of some style belongs to the same style
    [Figure: G applied to an image formed by concatenating several images of the same style]

  23. KEY OBSERVATION
    Assume that: a concatenated image of some style belongs to the same style
    $G([I_1\ I_2\ I_3]) \approx \frac{1}{3} G(I_1) + \frac{1}{3} G(I_2) + \frac{1}{3} G(I_3)$, where $[I_1\ I_2\ I_3]$ is the concatenation of three images of the same style
    It is the linear combination of the texture features of each image
    Approximation is due to padding at boundaries in the CNN
    Derived from the properties of $G$: Proof shown in paper

  24. KEY OBSERVATION
    Assume that: a concatenated image of some style belongs to the same style
    $G([I_1\ I_2\ I_3]) \approx \frac{1}{3} G(I_1) + \frac{1}{3} G(I_2) + \frac{1}{3} G(I_3)$, where $[I_1\ I_2\ I_3]$ is the concatenation of three images of the same style
    It is the linear combination of the texture features of each image
    Approximation is due to padding at boundaries in the CNN
    Derived from the properties of $G$: Proof shown in paper
    A linear combination of texture features from a certain style is again a texture feature of that style

  25. KEY OBSERVATION
    Assume that: a concatenated image of some style belongs to the same style
    $G([I_1\ I_2\ I_3]) \approx \frac{1}{3} G(I_1) + \frac{1}{3} G(I_2) + \frac{1}{3} G(I_3)$, where $[I_1\ I_2\ I_3]$ is the concatenation of three images of the same style
    It is the linear combination of the texture features of each image
    Approximation is due to padding at boundaries in the CNN
    Derived from the properties of $G$: Proof shown in paper
    A linear combination of texture features from a certain style is again a texture feature of that style
    Let us understand it geometrically…
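
The linear-combination property can be checked numerically under simplifying assumptions. In the sketch below, `gram` is a hypothetical stand-in for the texture feature (a position-averaged channel correlation matrix), and the concatenation is applied directly to random feature maps of equal size rather than to images passed through a CNN. Boundary padding is therefore not modeled, so the relation holds exactly here instead of approximately.

```python
import numpy as np


def gram(feature_maps: np.ndarray) -> np.ndarray:
    """Position-averaged channel correlation matrix of (channels, height, width) maps."""
    flat = feature_maps.reshape(feature_maps.shape[0], -1)
    return flat @ flat.T / flat.shape[1]


rng = np.random.default_rng(0)
# Three same-sized stand-ins for the feature maps of three images of one style.
maps = [rng.standard_normal((8, 16, 16)) for _ in range(3)]

# Concatenate side by side along the width axis: shape (8, 16, 48).
concatenated = np.concatenate(maps, axis=2)

lhs = gram(concatenated)              # feature of the concatenated maps
rhs = sum(gram(m) for m in maps) / 3  # equal-weight blend of the individual features
max_gap = np.max(np.abs(lhs - rhs))   # numerically zero in this padding-free setting
```

The equal weights of 1/3 come from the three maps having the same number of spatial positions; with unequal sizes the weights in this sketch would be proportional to each map's area.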

  26. BLENDING IN THE “STYLE SPACE”

  27. BLENDING IN THE “STYLE SPACE”
    [Figure: the axes are the dimensions in the Texture Feature $G$]

  28. BLENDING IN THE “STYLE SPACE”

  29. BLENDING IN THE “STYLE SPACE”
    - Weights are non-negative and sum up to one
    - Number of images < Number of feature dimensions
    The set of such weighted combinations represents a simplex

  30. BLENDING IN THE “STYLE SPACE”

  31. BLENDING IN THE “STYLE SPACE”
    The Watercolor “Style Space”

  32. BLENDING IN THE “STYLE SPACE”
    Content Image (not watercolor)

  33. BLENDING IN THE “STYLE SPACE”
    Our method: find the closest point within the watercolor style space

  34. BLENDING IN THE “STYLE SPACE”
    Find the optimal weights $r_k$:
    $\arg\min_{r} \sum_{l,i,j} \left( G^l_{ij}(I_{\text{content}}) - \sum_{k=1}^{K} r_k G^l_{ij}(I_k) \right)^2$
    s.t. $\sum_{k=1}^{K} r_k = 1, \quad 0 \le r_k \le 1 \quad (k = 1, \cdots, K)$

  35. BLENDING IN THE “STYLE SPACE”
    Find the optimal weights $r_k$:
    $\arg\min_{r} \sum_{l,i,j} \left( G^l_{ij}(I_{\text{content}}) - \sum_{k=1}^{K} r_k G^l_{ij}(I_k) \right)^2$
    s.t. $\sum_{k=1}^{K} r_k = 1, \quad 0 \le r_k \le 1 \quad (k = 1, \cdots, K)$
    Optimal Texture Feature:
    $\tilde{G}^l_{ij} = \sum_{k=1}^{K} r_k G^l_{ij}(I_k)$
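
Finding the weights is a least-squares problem constrained to the simplex. The sketch below solves it with projected gradient descent, which is one standard choice and not necessarily the solver used by the authors. The names `project_to_simplex` and `optimal_weights` are hypothetical, and the texture features are assumed to be already flattened into vectors (one row of `g_textures` per texture image). The synthetic target is built from known weights only so that the result is easy to inspect.

```python
import numpy as np


def project_to_simplex(v: np.ndarray) -> np.ndarray:
    """Euclidean projection of v onto {r : r >= 0, sum(r) = 1}."""
    sorted_desc = np.sort(v)[::-1]
    shifted_cumsum = np.cumsum(sorted_desc) - 1.0
    idx = np.arange(1, v.size + 1)
    keep = sorted_desc - shifted_cumsum / idx > 0
    theta = shifted_cumsum[keep][-1] / idx[keep][-1]
    return np.maximum(v - theta, 0.0)


def optimal_weights(g_content: np.ndarray, g_textures: np.ndarray,
                    steps: int = 2000) -> np.ndarray:
    """Minimize ||g_content - r @ g_textures||^2 over the simplex.

    g_content has shape (N,), g_textures has shape (K, N).
    """
    k = g_textures.shape[0]
    r = np.full(k, 1.0 / k)  # start at the center of the simplex
    # Step size 1/L, where L bounds the curvature of the quadratic objective.
    lipschitz = 2.0 * np.linalg.norm(g_textures, ord=2) ** 2
    for _ in range(steps):
        residual = r @ g_textures - g_content
        gradient = 2.0 * g_textures @ residual
        r = project_to_simplex(r - gradient / lipschitz)
    return r


# Synthetic check: a target feature built from known weights.
rng = np.random.default_rng(0)
g_textures = rng.standard_normal((4, 50))
true_r = np.array([0.1, 0.2, 0.3, 0.4])
g_content = true_r @ g_textures

r = optimal_weights(g_content, g_textures)
g_blend = r @ g_textures  # the blended ("optimal") texture feature
```

Because the weights are non-negative and sum to one, the upper bound $r_k \le 1$ holds automatically, so projecting onto the simplex is enough to enforce all the constraints. For a real content image the target generally lies outside the simplex, and the result is then the closest point within it.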

  36. PIPELINE
    Now that we have chosen a texture feature,
    for the rest of the style transfer, we use Gatys, et al.[1]
    [Diagram: Content Image → F → Content Feature; Optimal Blending → Optimal Texture Feature; Joint Distance Minimization → Output Image. Example feature vectors shown: (3.1, 4.1, 5.9, 2.6, …) and (2.7, 1.8, 4.2, 8.4, …)]

  37. PIPELINE
    Now that we have chosen a texture feature,
    for the rest of the style transfer, we use Gatys, et al.[1]
    [Diagram: Content Image → F → Content Feature; Optimal Blending → Optimal Texture Feature; Joint Distance Minimization → Output Image. Example feature vectors shown: (3.1, 4.1, 5.9, 2.6, …) and (2.7, 1.8, 4.2, 8.4, …)]
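
To make the data flow of the pipeline concrete, the sketch below evaluates a joint distance in which the texture target is the blended feature rather than the feature of a single image. Everything here is illustrative: `joint_distance`, the weights `alpha` and `beta`, and the tiny hand-written feature vectors are assumptions, and no CNN or image optimization is included.

```python
import numpy as np


def joint_distance(f_out, f_content, g_out, g_blend, alpha=1.0, beta=1.0):
    """Weighted sum of squared distances in the content and texture feature spaces."""
    content_term = np.sum((f_out - f_content) ** 2)
    texture_term = np.sum((g_out - g_blend) ** 2)
    return alpha * content_term + beta * texture_term


# Blend two toy texture features with weights that sum to one.
weights = np.array([0.5, 0.5])
g_textures = np.array([[1.0, 0.0],
                       [0.0, 1.0]])
g_blend = weights @ g_textures

f_content = np.array([1.0, 2.0])

# An output that matches both targets has zero joint distance.
loss_at_target = joint_distance(f_content, f_content, g_blend, g_blend)
# Shifting only the content feature of the output raises the content term.
loss_shifted = joint_distance(f_content + 1.0, f_content, g_blend, g_blend)
```

In a full system the output image would be updated iteratively to reduce this quantity, with `f_out` and `g_out` recomputed from the current image at each step.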

  38. PIPELINE
    Now that we have chosen a texture feature,
    for the rest of the style transfer, we use Gatys, et al.[1]
    [Diagram: Content Image → F → Content Feature; Optimal Blending → Optimal Texture Feature; Joint Distance Minimization → Output Image. Example feature vectors shown: (3.1, 4.1, 5.9, 2.6, …) and (2.7, 1.8, 4.2, 8.4, …)]
    We need a large image dataset with…
    - A large number of images
    - Annotations of the image’s style (“watercolor,” etc…)

  39. EXPERIMENTS

  40. Novel dataset: “nico-illust”[4]
    500,000 images, largely consisting of digital paintings,
    with rich annotations: tags, user comments, number of favorites, etc.
    (Tags include motif name, style name, etc.)
    [4] https://nico-opendata.jp/en/

  41. Author: Kariwo (ID:33341043) (50 images)
    Author: Morin (ID:10195867) (4 images)
    Author: Last Hunter (ID:23607472) (50 images)
    Author: Niichi (ID:13762409) (50 images)
    Tag: Watercolor (50 images)
    Tag: Pixel Art (10 images)

  43. Author: Kariwo (ID:33341043) (50 images)
    Author: Last Hunter (ID:23607472) (50 images)

  44. Author: Morin (ID:10195867) (4 images)
    Author: Niichi (ID:13762409) (50 images)

  45. Tag: Watercolor (50 images)
    Tag: Pixel Art (10 images)

  46. Tag: Watercolor (50 images)
    Tag: Pixel Art (10 images)
    Automatically captures the use of perpendicular lines in the style of pixel art

  47. OUR SYSTEM
    [Video panels: Original Video, “Watercolor” Style Transfer, Weights of each texture image]

  48. Texture image weights for each frame

  49. CONCLUSION

  50. CONCLUSION
    - We proved that the Texture Features proposed in Gatys, et al.[1] can be linearly combined to construct a valid texture feature
    - We proposed a novel dataset “nico-illust” that contains digital paintings
    - We proposed a neural style transfer method that preserves the color of the original input image, by taking a collection of images to represent a given style