Slide 1

Slide 1 text

Convolutional Neural Networks for Artistic Style Transfer
Harish Narayanan · @copingbear
harishnarayanan.org/writing/artistic-style-transfer
github.com/hnarayanan/artistic-style-transfer

Slide 2

Slide 2 text

Artistic style transfer is gorgeous and popular

Slide 3

Slide 3 text

Artistic style transfer is gorgeous and popular

Slide 4

Slide 4 text

Artistic style transfer is gorgeous and popular

Slide 5

Slide 5 text

Artistic style transfer is gorgeous and popular

Slide 6

Slide 6 text

Content image c · Style image s

Slide 7

Slide 7 text

Content image c · Style image s · Style-transferred image x

Slide 8

Slide 8 text

Content image c · Style image s · Style-transferred image x

Slide 9

Slide 9 text

Content image c · Style image s · Style-transferred image x

Slide 10

Slide 10 text

We formally pose it as an optimisation problem!

Slide 11

Slide 11 text

We formally pose it as an optimisation problem!

L_content(c, x) ≈ 0

Slide 12

Slide 12 text

We formally pose it as an optimisation problem!

L_content(c, x) ≈ 0,  L_style(s, x) ≈ 0

Slide 13

Slide 13 text

We formally pose it as an optimisation problem!

L(c, s, x) = α L_content(c, x) + β L_style(s, x)
x* = argmin_x L(c, s, x)

Slide 14

Slide 14 text

We formally pose it as an optimisation problem!

L(c, s, x) = α L_content(c, x) + β L_style(s, x)
x* = argmin_x L(c, s, x)

⭐ The content and style losses are defined not in terms of per-pixel differences, but in terms of higher-level semantic differences!

Slide 15

Slide 15 text

But then how does one write a program to perceive semantic differences?!?

Slide 16

Slide 16 text

But then how does one write a program to perceive semantic differences?!?

⭐ We don’t! We turn to machine learning.

Slide 17

Slide 17 text

But then how does one write a program to perceive semantic differences?!?

• Image classification problem (linear ➞ neural network ➞ convnet)

⭐ We don’t! We turn to machine learning.

Slide 18

Slide 18 text

But then how does one write a program to perceive semantic differences?!?

• Image classification problem (linear ➞ neural network ➞ convnet)
• Break

⭐ We don’t! We turn to machine learning.

Slide 19

Slide 19 text

But then how does one write a program to perceive semantic differences?!?

• Image classification problem (linear ➞ neural network ➞ convnet)
• Break
• Download a pre-trained convnet classifier, and repurpose it for style transfer

⭐ We don’t! We turn to machine learning.

Slide 20

Slide 20 text

But then how does one write a program to perceive semantic differences?!?

• Image classification problem (linear ➞ neural network ➞ convnet)
• Break
• Download a pre-trained convnet classifier, and repurpose it for style transfer
• Concluding thoughts

⭐ We don’t! We turn to machine learning.

Slide 21

Slide 21 text

Let’s start with a more basic problem to motivate our approach: The image classification problem

f(image) ➞ 99% Baby, 0.8% Dog, 0.1% Car, 0.1% Toothbrush

Slide 22

Slide 22 text

Let’s start with a more basic problem to motivate our approach: The image classification problem

f(image) ➞ 99% Baby, 0.8% Dog, 0.1% Car, 0.1% Toothbrush

Input dimension D = W × H × 3; output dimension K (number of classes)

Slide 23

Slide 23 text

Image classification is a challenging problem

Slide 24

Slide 24 text

Image classification is a challenging problem

Slide 25

Slide 25 text

⭐ There is a semantic gap between the input representation and the task at hand

Slide 26

Slide 26 text

The pieces that make up a supervised learning solution to the image classification problem

Slide 27

Slide 27 text

The pieces that make up a supervised learning solution to the image classification problem

Slide 28

Slide 28 text

The pieces that make up a supervised learning solution to the image classification problem

Slide 29

Slide 29 text

The pieces that make up a supervised learning solution to the image classification problem

Slide 30

Slide 30 text

The pieces that make up a supervised learning solution to the image classification problem

Slide 31

Slide 31 text

The simplest learning image classifier: The linear classifier

f(x; W, b) = Wx + b          (linear score function)
s_j = σ(f)_j = e^{f_j} / Σ_{k=1..K} e^{f_k}          (softmax)
θ = (W, b)          (parameters to learn)

Slide 32

Slide 32 text

The simplest learning image classifier: The linear classifier

f(x; W, b) = Wx + b          (linear score function)
s_j = σ(f)_j = e^{f_j} / Σ_{k=1..K} e^{f_k}          (softmax)
θ = (W, b)          (parameters to learn)
L_y(s) = -Σ_i y_i log(s_i)          (cross-entropy loss function)
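The score function, softmax, and cross-entropy loss above translate almost line-for-line into NumPy. A toy illustration (the shapes and random data are made up for the sketch; this is not the TensorFlow code used later):

```python
import numpy as np

def linear_scores(x, W, b):
    # Linear score function: f(x; W, b) = Wx + b
    return W @ x + b

def softmax(f):
    # Softmax: s_j = e^{f_j} / sum_k e^{f_k}
    # (shifting by max(f) for numerical stability)
    e = np.exp(f - np.max(f))
    return e / e.sum()

def cross_entropy(s, y):
    # Cross-entropy loss: L_y(s) = -sum_i y_i log(s_i)
    return -np.sum(y * np.log(s))

# Toy example: D = 4 input values, K = 3 classes
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))
b = np.zeros(3)
x = rng.normal(size=4)
y = np.array([0.0, 1.0, 0.0])  # one-hot true label

s = softmax(linear_scores(x, W, b))
loss = cross_entropy(s, y)
```

The softmax output is a valid probability distribution over the K classes, which is what lets the cross-entropy loss compare it against the one-hot label.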

Slide 33

Slide 33 text

A simplified look at gradient descent

[Plot: loss L(w) versus weight w]

Slide 34

Slide 34 text

A simplified look at gradient descent

[Plot: loss L(w) versus weight w, starting from an initial guess w_0]

Slide 35

Slide 35 text

A simplified look at gradient descent

w_1 = w_0 - η (dL/dw)(w_0)

Slide 36

Slide 36 text

A simplified look at gradient descent

w_1 = w_0 - η (dL/dw)(w_0)
w_2 = w_1 - η (dL/dw)(w_1)

Slide 37

Slide 37 text

A simplified look at gradient descent

w_1 = w_0 - η (dL/dw)(w_0)
w_2 = w_1 - η (dL/dw)(w_1)
…
w_optimal, where dL/dw = 0
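The update rule on this slide is easy to check on a toy one-dimensional loss. A hypothetical example with L(w) = (w - 3)², chosen so the optimum w = 3 is known in advance:

```python
# Gradient descent on L(w) = (w - 3)^2, whose minimum is at w = 3.
# Update rule from the slide: w_{k+1} = w_k - eta * (dL/dw)(w_k)
def dL_dw(w):
    return 2.0 * (w - 3.0)

w = 0.0        # w_0: initial guess
eta = 0.1      # learning rate
for _ in range(200):
    w = w - eta * dL_dw(w)

# w has now converged very close to the optimum, w = 3
```

Note that the learning rate matters: with eta too large (here, above 1.0) the iterates would diverge instead of converging.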

Slide 38

Slide 38 text

The linear image classifier in TensorFlow

github.com/hnarayanan/artistic-style-transfer

Slide 39

Slide 39 text

No content

Slide 40

Slide 40 text

No content

Slide 41

Slide 41 text

“Getting 92% accuracy on MNIST is bad. It’s almost embarrassingly bad.”
– TensorFlow Docs Authors

Slide 42

Slide 42 text

Moving to a nonlinear score function: Introducing the neuron

Slide 43

Slide 43 text

Moving to a nonlinear score function: Introducing the neuron

Slide 44

Slide 44 text

Moving to a nonlinear score function: Stacking neurons into a first neural network

x ➞ y_1 = W_1 x + b_1 ➞ h_1 = max(0, y_1) ➞ y_2 = W_2 h_1 + b_2 ➞ s = σ(y_2)

Slide 45

Slide 45 text

Moving to a nonlinear score function: Stacking neurons into a first neural network

x ➞ y_1 = W_1 x + b_1 ➞ h_1 = max(0, y_1) ➞ y_2 = W_2 h_1 + b_2 ➞ s = σ(y_2)

Slide 46

Slide 46 text

Moving to a nonlinear score function: Stacking neurons into a first neural network

x ➞ y_1 = W_1 x + b_1 ➞ h_1 = max(0, y_1) ➞ y_2 = W_2 h_1 + b_2 ➞ s = σ(y_2)
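The forward pass on this slide can be sketched in NumPy. A toy-sized illustration (the weights are random stand-ins, not trained values):

```python
import numpy as np

def softmax(f):
    e = np.exp(f - np.max(f))
    return e / e.sum()

def two_layer_forward(x, W1, b1, W2, b2):
    y1 = W1 @ x + b1            # y1 = W1 x + b1
    h1 = np.maximum(0.0, y1)    # h1 = max(0, y1): the ReLU nonlinearity
    y2 = W2 @ h1 + b2           # y2 = W2 h1 + b2
    return softmax(y2)          # s = softmax(y2)

rng = np.random.default_rng(1)
x = rng.normal(size=4)                           # toy input, D = 4
W1, b1 = rng.normal(size=(5, 4)), np.zeros(5)    # hidden layer of 5 units
W2, b2 = rng.normal(size=(3, 5)), np.zeros(3)    # K = 3 output classes
s = two_layer_forward(x, W1, b1, W2, b2)
```

The only change from the linear classifier is the hidden layer with its elementwise max(0, ·); that single nonlinearity is what gives the network its extra approximation power.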

Slide 47

Slide 47 text

A first neural network-based image classifier in TensorFlow

github.com/hnarayanan/artistic-style-transfer

Slide 48

Slide 48 text

An improved neural network-based classifier in TensorFlow

Slide 49

Slide 49 text

An improved neural network-based classifier in TensorFlow

⭐ Just because we can fit anything doesn’t mean our learning algorithm will find that fit!

Slide 50

Slide 50 text

An improved neural network-based classifier in TensorFlow

github.com/hnarayanan/artistic-style-transfer

Slide 51

Slide 51 text

Tinkering with neural network architectures to get a feeling for approximation capabilities

Example 1 · Example 2 · Example 3

playground.tensorflow.org

Slide 52

Slide 52 text

⭐ Neural networks can learn features we’d otherwise need to hand-engineer with domain knowledge.

Slide 53

Slide 53 text

Standard neural networks are not the best option when it comes to dealing with image data

They disregard the structure of the image: a 28 px × 28 px image is flattened into a 784-element vector.

Slide 54

Slide 54 text

Standard neural networks are not the best option when it comes to dealing with image data

The number of parameters they need to learn grows rapidly.
Linear: 784×10 + 10 = 7,850

Slide 55

Slide 55 text

Standard neural networks are not the best option when it comes to dealing with image data

The number of parameters they need to learn grows rapidly.
Neural network (1 hidden layer): 784×100 + 100 + 100×10 + 10 = 79,510

Slide 56

Slide 56 text

Standard neural networks are not the best option when it comes to dealing with image data

The number of parameters they need to learn grows rapidly.
Neural network (2 hidden layers): 784×400 + 400 + 400×100 + 100 + 100×10 + 10 = 355,110
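The parameter counts on these slides follow directly from the layer widths. A quick sanity check (the layer sizes are taken straight from the slides):

```python
def dense_params(sizes):
    # Total weights and biases in a stack of fully connected layers
    # with the given widths: each layer contributes n_in*n_out + n_out.
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(sizes, sizes[1:]))

linear = dense_params([784, 10])                # 7,850
one_hidden = dense_params([784, 100, 10])       # 79,510
two_hidden = dense_params([784, 400, 100, 10])  # 355,110
```

The counts grow with the product of adjacent layer widths, which is why fully connected layers become expensive so quickly on image-sized inputs.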

Slide 57

Slide 57 text

Convolutional neural networks to the rescue! Regular (Fully Connected) Neural Network ➞ Convolutional Neural Network

Slide 58

Slide 58 text

Core pieces of a convolutional neural network: The convolutional layer

K (filters) = 2, F (extent) = 3, S (stride) = 2, P (padding) = 1

Slide 59

Slide 59 text

Core pieces of a convolutional neural network: The convolutional layer

K (filters) = 2, F (extent) = 3, S (stride) = 2, P (padding) = 1
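These hyperparameters fix the spatial size of the layer's output through the standard formula O = (W - F + 2P)/S + 1, from the cs231n notes this slide draws on. A small check (the 5-wide input is a made-up example, not from the slide):

```python
def conv_output_size(w, f, s, p):
    # Spatial output size of a conv layer: O = (W - F + 2P) / S + 1
    out, rem = divmod(w - f + 2 * p, s)
    assert rem == 0, "hyperparameters must tile the input exactly"
    return out + 1

# With the slide's hyperparameters (F = 3, S = 2, P = 1) on a
# 5-wide input, each of the K = 2 filters produces a 3-wide map.
size = conv_output_size(5, f=3, s=2, p=1)
```

The divisibility check matters in practice: if (W - F + 2P) is not a multiple of S, the filter positions don't tile the input and the hyperparameters are invalid.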

Slide 60

Slide 60 text

Core pieces of a convolutional neural network: The pooling layer

F (extent) = 2, S (stride) = 2

Slide 61

Slide 61 text

Core pieces of a convolutional neural network: The pooling layer

F (extent) = 2, S (stride) = 2
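Max pooling is simple enough to sketch directly in NumPy. A minimal illustration with the slide's F = 2, S = 2 and a made-up 4×4 input:

```python
import numpy as np

def max_pool(x, f=2, s=2):
    # Max pooling over a 2-D array: take the maximum of each
    # f x f window, sliding with stride s.
    h, w = x.shape
    out_h, out_w = (h - f) // s + 1, (w - f) // s + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = x[i*s:i*s+f, j*s:j*s+f].max()
    return out

x = np.array([[1., 2., 0., 1.],
              [3., 4., 1., 0.],
              [0., 0., 5., 6.],
              [1., 2., 7., 8.]])
y = max_pool(x)   # 2x2 output: the max of each 2x2 block
```

Pooling has no learnable parameters; it just downsamples each feature map, which is part of why convnets need far fewer parameters than the fully connected networks above.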

Slide 62

Slide 62 text

An accurate convnet-based image classifier in TensorFlow

github.com/hnarayanan/artistic-style-transfer

Slide 63

Slide 63 text

Better understanding what a convnet-based classifier does with the MNIST data

transcranial.github.io/keras-js/#/mnist-cnn

Slide 64

Slide 64 text

⭐ Deep learning (and convnets in particular) is all about learning representations

Slide 65

Slide 65 text

No content

Slide 66

Slide 66 text

No content

Slide 67

Slide 67 text

Introducing a powerful convnet-based classifier at the heart of the Gatys style transfer paper

VGG Net: Networks systematically composed of 3×3 CONV layers. (ReLU not shown for brevity.)

Slide 68

Slide 68 text

Introducing a powerful convnet-based classifier at the heart of the Gatys style transfer paper

VGG Net: Networks systematically composed of 3×3 CONV layers. (ReLU not shown for brevity.)

Slide 69

Slide 69 text

Let’s start with a pre-trained VGG Net in Keras

138 million parameters (VGG16), trained on ImageNet

Slide 70

Slide 70 text

Let’s start with a pre-trained VGG Net in Keras

Keras coming to TensorFlow core in 1.2!

Slide 71

Slide 71 text

Fetching and playing with a pre-trained VGG Net in Keras

github.com/hnarayanan/artistic-style-transfer

Slide 72

Slide 72 text

Content image c · Style image s · Style-transferred image x

Slide 73

Slide 73 text

Recall the style transfer optimisation problem

L_content(c, x) ≈ 0,  L_style(s, x) ≈ 0

L(c, s, x) = α L_content(c, x) + β L_style(s, x)
x* = argmin_x L(c, s, x)

Slide 74

Slide 74 text

VGG Net has already learnt to encode perceptual and semantic information that we need to measure our losses!

Slide 75

Slide 75 text

VGG Net has already learnt to encode perceptual and semantic information that we need to measure our losses!

Slide 76

Slide 76 text

How we explicitly calculate the style and content losses

L^l_content(c, x) = (1/2) Σ_{i,j} (C^l_ij - X^l_ij)²

Slide 77

Slide 77 text

How we explicitly calculate the style and content losses

L^l_content(c, x) = (1/2) Σ_{i,j} (C^l_ij - X^l_ij)²

G_ij(A) = Σ_k A_ik A_jk
E_l(s, x) = 1/(4 N_l² M_l²) Σ_{i,j} (G_ij(S^l) - G_ij(X^l))²
L_style(s, x) = Σ_{l=0..L} w_l E_l(s, x)

Slide 78

Slide 78 text

The last remaining technical bits and bobs

Total variation loss to control smoothness of the generated image:
L_TV(x) = Σ_{i,j} ((x_{i,j+1} - x_{i,j})² + (x_{i+1,j} - x_{i,j})²)

Slide 79

Slide 79 text

The last remaining technical bits and bobs

Total variation loss to control smoothness of the generated image:
L_TV(x) = Σ_{i,j} ((x_{i,j+1} - x_{i,j})² + (x_{i+1,j} - x_{i,j})²)

L-BFGS is used as the optimisation algorithm, since we’re only generating one image.
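The total variation loss is likewise just a few lines of NumPy. A sketch (summing squared differences between each pixel and its right and lower neighbours over the interior is one standard way to implement the sum; the two test images are made up):

```python
import numpy as np

def total_variation_loss(x):
    # L_TV(x) = sum_{i,j} (x[i, j+1] - x[i, j])^2 + (x[i+1, j] - x[i, j])^2
    dh = x[:, 1:] - x[:, :-1]   # horizontal neighbour differences
    dv = x[1:, :] - x[:-1, :]   # vertical neighbour differences
    return np.sum(dh ** 2) + np.sum(dv ** 2)

flat = np.ones((4, 4))   # perfectly smooth image: zero TV loss
noisy = np.eye(4)        # many neighbouring jumps: large TV loss
```

A perfectly constant image has zero loss, and every jump between neighbouring pixels adds to it, which is how weighting this term trades detail against smoothness in the generated image.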

Slide 80

Slide 80 text

Concrete implementation of the artistic style transfer algorithm in Keras

github.com/hnarayanan/artistic-style-transfer

Slide 81

Slide 81 text

No content

Slide 82

Slide 82 text

No content

Slide 83

Slide 83 text

Let’s look at some examples over a range of styles

• c_w = 0.025, s_w = 5, t_v_w = 0.1
• c_w = 0.025, s_w = 5, t_v_w = 5
• c_w = 0.025, s_w = 5, t_v_w = 0.5
• c_w = 0.025, s_w = 5, t_v_w = 1

Slide 84

Slide 84 text

Let’s look at some examples over a range of styles

• c_w = 0.025, s_w = 5, t_v_w = 1
• c_w = 0.025, s_w = 5, t_v_w = 0.1
• c_w = 0.025, s_w = 5, t_v_w = 1
• c_w = 0.025, s_w = 5, t_v_w = 0.5

Slide 85

Slide 85 text

And over a range of hyperparameters

Slide 86

Slide 86 text

And over a range of hyperparameters

Slide 87

Slide 87 text

And over a range of hyperparameters

c_w = 0.025, s_w = 0.1–10, t_v_w = 0.1

Slide 88

Slide 88 text

Prisma · Us · Style

Slide 89

Slide 89 text

Prisma · Us · Style

Slide 90

Slide 90 text

Some broad concluding thoughts

Slide 91

Slide 91 text

Some broad concluding thoughts

• Turn to machine learning when you have general problems that seem intuitive to state, but where it’s hard to explicitly write down all the solution steps
• Note that this difficulty often stems from a semantic gap between the input representation and the task at hand

Slide 92

Slide 92 text

Some broad concluding thoughts

• Turn to machine learning when you have general problems that seem intuitive to state, but where it’s hard to explicitly write down all the solution steps
• Note that this difficulty often stems from a semantic gap between the input representation and the task at hand
• Just because a function can fit something doesn’t mean the learning algorithm will always find that fit

Slide 93

Slide 93 text

Some broad concluding thoughts

• Turn to machine learning when you have general problems that seem intuitive to state, but where it’s hard to explicitly write down all the solution steps
• Note that this difficulty often stems from a semantic gap between the input representation and the task at hand
• Just because a function can fit something doesn’t mean the learning algorithm will always find that fit
• Deep learning is all about representation learning: deep networks can learn features we’d otherwise need to hand-engineer with domain knowledge

Slide 94

Slide 94 text

… and closer to this evening’s workshop

Slide 95

Slide 95 text

… and closer to this evening’s workshop

• In studying the problem of cat vs. baby deeply, you’ve learnt how to see. You can repurpose this knowledge!

Slide 96

Slide 96 text

… and closer to this evening’s workshop

• In studying the problem of cat vs. baby deeply, you’ve learnt how to see. You can repurpose this knowledge!
• Convnets are really good at computer vision tasks, but they’re not infallible

Slide 97

Slide 97 text

… and closer to this evening’s workshop

• In studying the problem of cat vs. baby deeply, you’ve learnt how to see. You can repurpose this knowledge!
• Convnets are really good at computer vision tasks, but they’re not infallible
• TensorFlow is great, but Keras is what you likely want to be using to experiment quickly

Slide 98

Slide 98 text

… and closer to this evening’s workshop

• In studying the problem of cat vs. baby deeply, you’ve learnt how to see. You can repurpose this knowledge!
• Convnets are really good at computer vision tasks, but they’re not infallible
• TensorFlow is great, but Keras is what you likely want to be using to experiment quickly
• Instead of solving an optimisation problem, train a network to approximate solutions to it for a 1000× speedup

Slide 99

Slide 99 text

Questions?

Harish Narayanan · @copingbear
harishnarayanan.org/writing/artistic-style-transfer
github.com/hnarayanan/artistic-style-transfer

Slide 100

Slide 100 text

References and further reading

1. https://harishnarayanan.org/writing/artistic-style-transfer/; https://github.com/hnarayanan/artistic-style-transfer
2. http://prisma-ai.com; https://deepart.io; http://www.pikazoapp.com
3. https://arxiv.org/abs/1701.04928
4. http://www.artic.edu/aic/collections/artwork/80062
5. https://arxiv.org/abs/1508.06576
6. https://arxiv.org/abs/1508.06576
7. —
8. http://cs231n.github.io/classification/
9. http://cs231n.github.io/classification/
10. —
11. http://cs231n.stanford.edu/slides/winter1516_lecture2.pdf
12. http://cs231n.github.io/linear-classify/; https://www.tensorflow.org/tutorials/mnist/beginners/
13. http://cs231n.github.io/optimization-1/

Slide 101

Slide 101 text

References and further reading

14. https://github.com/hnarayanan/artistic-style-transfer/blob/master/notebooks/1_Linear_Image_Classifier.ipynb; https://www.tensorflow.org/tutorials/mnist/beginners/
15. http://cs231n.github.io/linear-classify/
16. https://www.tensorflow.org/get_started/mnist/pros
17. https://appliedgo.net/perceptron/
18. http://cs231n.github.io/neural-networks-1/; https://en.wikipedia.org/wiki/Universal_approximation_theorem
19. https://github.com/hnarayanan/artistic-style-transfer/blob/master/notebooks/2_Neural_Network-based_Image_Classifier-1.ipynb
20. https://github.com/hnarayanan/artistic-style-transfer/blob/master/notebooks/3_Neural_Network-based_Image_Classifier-2.ipynb
21. http://playground.tensorflow.org/; http://www.sciencedirect.com/science/article/pii/089360809190009T
22. —
23. http://cs231n.github.io/convolutional-networks/; https://www.youtube.com/watch?v=LxfUGhug-iQ
24. http://cs231n.github.io/convolutional-networks/
25. http://cs231n.github.io/convolutional-networks/
26. http://cs231n.github.io/convolutional-networks/

Slide 102

Slide 102 text

References and further reading

27. http://cs231n.github.io/convolutional-networks/
28. https://github.com/hnarayanan/artistic-style-transfer/blob/master/notebooks/4_Convolutional_Neural_Network-based_Image_Classifier.ipynb; https://www.tensorflow.org/get_started/mnist/pros
29. https://transcranial.github.io/keras-js/#/mnist-cnn
30. http://www.deeplearningbook.org/contents/intro.html; https://www.youtube.com/watch?v=AgkfIQ4IGaM; http://www.matthewzeiler.com/pubs/arxive2013/arxive2013.pdf
31. —
32. https://arxiv.org/abs/1409.1556; http://image-net.org/challenges/LSVRC/2014/results
33. http://www.image-net.org; http://www.fast.ai/2017/01/03/keras/; https://www.youtube.com/watch?v=UeheTiBJ0Io
34. https://github.com/hnarayanan/artistic-style-transfer/blob/master/notebooks/5_VGG_Net_16_the_easy_way.ipynb; https://keras.io/applications/
35. https://arxiv.org/abs/1508.06576
36. https://arxiv.org/abs/1508.06576
37. https://arxiv.org/abs/1508.06576; https://arxiv.org/abs/1603.08155
38. https://arxiv.org/abs/1508.06576
39. https://arxiv.org/pdf/1412.0035.pdf; https://en.wikipedia.org/wiki/Limited-memory_BFGS

Slide 103

Slide 103 text

References and further reading

40. https://github.com/hnarayanan/artistic-style-transfer/blob/master/notebooks/6_Artistic_style_transfer_with_a_repurposed_VGG_Net_16.ipynb; https://github.com/fchollet/keras/blob/master/examples/neural_style_transfer.py
41. —
42. —
43. —
44. —
45. https://arxiv.org/abs/1603.08155
46. —
47. —
48. https://harishnarayanan.org/writing/artistic-style-transfer/; https://github.com/hnarayanan/artistic-style-transfer
49. —
50. —
51. —
52. —