convolutional network trained by the VGG group at Oxford University
Transfer learning: re-using pre-trained networks
Deep learning tricks of the trade: tips to save you some time
API style:
Neural network toolkits: specify network in terms of layers
Expression compilers: specify network in terms of mathematical expressions; flexible and powerful
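As a rough illustration (not code from the talk), here is the same one-hidden-layer classifier written in both styles; the layer sizes and variable names are arbitrary choices for this sketch:

import numpy as np
import theano
import theano.tensor as T
import lasagne

x = T.matrix('x')

# Neural network toolkit style (Lasagne): stack layers
net = lasagne.layers.InputLayer((None, 784), input_var=x)
net = lasagne.layers.DenseLayer(net, num_units=256,
                                nonlinearity=lasagne.nonlinearities.rectify)
net = lasagne.layers.DenseLayer(net, num_units=10,
                                nonlinearity=lasagne.nonlinearities.softmax)
y_layers = lasagne.layers.get_output(net)

# Expression compiler style (Theano): write the same network as
# mathematical expressions over shared weight/bias variables
W1 = theano.shared(np.random.randn(784, 256).astype('float32') * 0.01)
b1 = theano.shared(np.zeros(256, dtype='float32'))
W2 = theano.shared(np.random.randn(256, 10).astype('float32') * 0.01)
b2 = theano.shared(np.zeros(10, dtype='float32'))
hidden = T.nnet.relu(T.dot(x, W1) + b1)
y_exprs = T.nnet.softmax(T.dot(hidden, W2) + b2)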
# Get all parameters from the final layer through the network
all_params = lasagne.layers.get_all_params(final_layer, trainable=True)

# Get parameters from pre-trained layers; give the top pre-trained layer
pretrained_params = lasagne.layers.get_all_params(
    vgg16.network['pool5'], trainable=True)

# The new parameters are those not in the pre-trained part of the network
new_params = [p for p in all_params if p not in pretrained_params]
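To train only the new layers, pass new_params (rather than all_params) to the update rule. A minimal sketch, assuming loss, input_var and target_var were defined when the network was built:

# Update only the new layers; the pre-trained VGG-16 weights stay frozen
updates = lasagne.updates.adam(loss, new_params, learning_rate=1e-3)
train_fn = theano.function([input_var, target_var], loss, updates=updates)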
mini-batch results in regularization (due to noise), reaching lower error rates in the end [Goodfellow16]. When using very small mini-batches, you need to compensate with a lower learning rate and more epochs.
Slow due to low parallelism: does not use all of the GPU's cores.
Low memory usage: fewer neuron activations kept in RAM.
rate as with smaller batches and may not learn at all.
Can be fast due to high parallelism: uses GPU parallelism (there are limits; gains only achievable if there are unused CUDA cores).
High memory usage: lots of neuron activations kept around; can run out of RAM on large networks.
lots of experiments use ~100.
Effective training: learns reasonably quickly – in terms of improvement per epoch – and reaches an acceptable error rate or loss.
Medium performance: acceptable in many cases.
Medium memory usage: fine for modest-sized networks.
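A minimal sketch (not the talk's code) of how the batch size enters the training loop; train_fn stands for a compiled Theano training function that takes a batch of inputs and targets:

import numpy as np

def iterate_minibatches(X, y, batch_size, shuffle=True):
    # Yield the dataset in chunks of `batch_size` samples
    idx = np.arange(len(X))
    if shuffle:
        np.random.shuffle(idx)
    for start in range(0, len(X) - batch_size + 1, batch_size):
        batch = idx[start:start + batch_size]
        yield X[batch], y[batch]

def train_epoch(train_fn, X, y, batch_size=100):
    # batch_size trades off parallelism and memory against per-epoch progress
    losses = [train_fn(Xb, yb) for Xb, yb in iterate_minibatches(X, y, batch_size)]
    return float(np.mean(losses))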
15] Train two networks: one generates an image from a random input; the other discriminates between a generated image and one from the training set.
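A compact sketch of the idea in Lasagne/Theano (the architecture and hyper-parameters here are illustrative assumptions, not the cited paper's): the generator maps a random vector to an image, the discriminator estimates whether an image came from the training set, and the two are trained with opposing losses by alternating the two compiled functions:

import theano
import theano.tensor as T
import lasagne
from lasagne.layers import InputLayer, DenseLayer, get_output, get_all_params
from lasagne.nonlinearities import rectify, sigmoid, tanh

# Generator: random vector z -> 28x28 image (flattened to 784 values)
z_var = T.matrix('z')
gen = InputLayer((None, 100), input_var=z_var)
gen = DenseLayer(gen, num_units=512, nonlinearity=rectify)
gen = DenseLayer(gen, num_units=784, nonlinearity=tanh)

# Discriminator: image -> probability that it came from the training set
x_var = T.matrix('x')
disc_in = InputLayer((None, 784))
disc = DenseLayer(disc_in, num_units=512, nonlinearity=rectify)
disc = DenseLayer(disc, num_units=1, nonlinearity=sigmoid)

fake = get_output(gen)
p_real = get_output(disc, inputs={disc_in: x_var})   # real images
p_fake = get_output(disc, inputs={disc_in: fake})    # generated images

# Opposing objectives: the discriminator separates real from fake,
# the generator tries to fool it
disc_loss = -(T.log(p_real) + T.log(1.0 - p_fake)).mean()
gen_loss = -T.log(p_fake).mean()

disc_updates = lasagne.updates.adam(
    disc_loss, get_all_params(disc, trainable=True), learning_rate=2e-4)
gen_updates = lasagne.updates.adam(
    gen_loss, get_all_params(gen, trainable=True), learning_rate=2e-4)

# Alternate calls to these two functions during training
train_disc = theano.function([x_var, z_var], disc_loss, updates=disc_updates)
train_gen = theano.function([z_var], gen_loss, updates=gen_updates)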
model [Simonyan14] and extract texture features from one of the convolutional layers, given a target style / painting as input. Use gradient descent to iterate the photo – not the weights – so that its texture features match those of the target image.
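A small, self-contained sketch of the key step, gradient descent on the photo's pixels rather than on any weights. A single fixed random convolution stands in for the pre-trained VGG layer here (an assumption for the sketch), and the Gram matrix of the layer's activations plays the role of the texture features:

import numpy as np
import theano
import theano.tensor as T
from theano.tensor.nnet import conv2d

rng = np.random.RandomState(0)

# Fixed "feature extractor" weights: a stand-in for a pre-trained conv layer
W = theano.shared(rng.normal(0, 0.1, (16, 3, 3, 3)).astype('float32'))

def gram_matrix(feats):
    # (1, channels, rows, cols) -> channel x channel correlations;
    # these capture texture rather than spatial layout
    f = feats[0].flatten(2)
    return T.dot(f, f.T) / T.cast(f.shape[1], 'float32')

# The photo being optimised is a shared variable, so the gradient
# updates its pixels rather than any network weights
photo = theano.shared(rng.uniform(-1, 1, (1, 3, 64, 64)).astype('float32'))
style = T.tensor4('style')   # the target painting

photo_feats = T.nnet.relu(conv2d(photo, W))
style_feats = T.nnet.relu(conv2d(style, W))

loss = T.mean((gram_matrix(photo_feats) - gram_matrix(style_feats)) ** 2)
grad = T.grad(loss, photo)   # gradient w.r.t. the image, not w.r.t. W

step = theano.function([style], loss, updates=[(photo, photo - 0.1 * grad)])

style_img = rng.uniform(-1, 1, (1, 3, 64, 64)).astype('float32')
for i in range(100):
    step(style_img)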