Slide 1

Slide 1 text

International Summer School on Deep Learning
Neuroevolution workshop
Michał Karzyński
Intel AI

Slide 2

Slide 2 text

Hello, my name is…
Michał Karzyński
Software Architect at Intel AI
@postrational
http://karzyn.com

Slide 3

Slide 3 text

Neuroevolution

Neuroevolution uses evolutionary algorithms to generate artificial neural networks: their parameters, topology and rules.

Stated goals of neuroevolution research:
• Evolve complex AI systems
• Democratize AI

[Diagram: Neuroevolution shown as part of AutoML, alongside Neural Architecture Search and Hyperparameter tuning]
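To make the idea concrete, here is a minimal, generic evolutionary-loop sketch of the kind neuroevolution builds on. The `random_genome`, `mutate`, and `fitness` callables are hypothetical placeholders (not from the slides); in neuroevolution the genome would describe a neural network and fitness would come from building and evaluating (or training) it.

```python
import random

def evolve(random_genome, mutate, fitness,
           population_size=50, generations=100):
    """Minimal truncation-selection evolutionary loop (sketch)."""
    population = [random_genome() for _ in range(population_size)]
    for _ in range(generations):
        # Evaluate every individual (in neuroevolution: build/train the ANN).
        scored = sorted(population, key=fitness, reverse=True)
        # Keep the fittest half as parents.
        parents = scored[:population_size // 2]
        # Refill the population with mutated copies of random parents.
        children = [mutate(random.choice(parents))
                    for _ in range(population_size - len(parents))]
        population = parents + children
    return max(population, key=fitness)
```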

Slide 4

Slide 4 text

Biological inspirations

Slide 5

Slide 5 text

Biological inspiration – Evolution
• Natural selection – survival of the fittest
• Genes conferring the best fitness will be passed on to the next generation
• Mutation – random changes can introduce new versions of genes which increase fitness
• Speciation – over time new specialized species will arise
Image source: Pixabay

Slide 6

Slide 6 text

Biological inspiration – Sexual reproduction
• Offspring produced by two individuals
• Each parent provides half of the genes
• During meiosis two chromosomes undergo crossover to exchange homologous genetic material
• Crossover creates genetic diversity
Image source: Wikipedia

Slide 7

Slide 7 text

Biological inspiration – Development
• Genetic information is not expressed directly
• Genes encode a program for the development of the organism
• Transcription factors control which genes are expressed in which cells, allowing reuse of genes at different stages and different places
• Similar sets of genes under a different developmental program can create different species
Image source: Wikipedia

Slide 8

Slide 8 text

Biological inspiration – Neurons and brain structure
• Inspiration for the entire field of Deep Learning
• The geometric, topological arrangement of neurons in the visual cortex lines up with the grid of photoreceptive cells in the retina
• The somatosensory representation of the body in the brain exhibits a similar geometry to the body itself
Image source: Wikipedia

Slide 9

Slide 9 text

NeuroEvolution of Augmenting Topologies (NEAT)

Slide 10

Slide 10 text

NeuroEvolution of Augmenting Topologies (NEAT)

Research started in 2002, led primarily by Kenneth O. Stanley and Risto Miikkulainen at The University of Texas at Austin.

NEAT algorithm features:
• Allows crossover
• Protects innovation through speciation
• Starts with a minimal genome (allowing evolution to add only needed features)

Stanley, K. O., & Miikkulainen, R. (2002). Evolving Neural Networks through Augmenting Topologies. Evolutionary Computation, 10(2), 99–127.

Slide 11

Slide 11 text

NEAT Genetic Encoding
• Encode nodes and connections
• Use historical markers
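A rough sketch of a NEAT-style genome: node genes plus connection genes, where each connection carries a historical marker (innovation number). The class and field names are illustrative, not taken from any particular NEAT implementation.

```python
from dataclasses import dataclass, field

@dataclass
class NodeGene:
    node_id: int
    node_type: str           # "input", "hidden" or "output"

@dataclass
class ConnectionGene:
    in_node: int
    out_node: int
    weight: float
    enabled: bool = True
    innovation: int = 0      # historical marker, shared across the population

@dataclass
class Genome:
    nodes: dict = field(default_factory=dict)        # node_id -> NodeGene
    connections: dict = field(default_factory=dict)  # innovation -> ConnectionGene
```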

Slide 12

Slide 12 text

NEAT Mutations
• Change weight
• Add/remove connection
• Add/remove node
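The three mutation types could look roughly as follows, building on the hypothetical `Genome` sketch above. Real NEAT implementations also keep a global innovation counter so that identical structural changes receive the same historical marker.

```python
import random

def mutate_weight(genome, scale=0.5):
    # Perturb the weight of a random existing connection.
    conn = random.choice(list(genome.connections.values()))
    conn.weight += random.gauss(0.0, scale)

def mutate_add_connection(genome, innovation):
    # Connect two nodes with a new random-weight connection.
    a, b = random.sample(list(genome.nodes), 2)
    genome.connections[innovation] = ConnectionGene(a, b, random.uniform(-1, 1),
                                                    True, innovation)

def mutate_add_node(genome, new_node_id, innovation):
    # Split an existing connection: disable it and insert a node in the middle.
    old = random.choice(list(genome.connections.values()))
    old.enabled = False
    genome.nodes[new_node_id] = NodeGene(new_node_id, "hidden")
    genome.connections[innovation] = ConnectionGene(old.in_node, new_node_id,
                                                    1.0, True, innovation)
    genome.connections[innovation + 1] = ConnectionGene(new_node_id, old.out_node,
                                                        old.weight, True,
                                                        innovation + 1)
```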

Slide 13

Slide 13 text

NEAT Crossover
• Uses historical markers to align genomes
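Crossover aligns the two parents' connection genes by innovation number: matching genes are inherited at random from either parent, while disjoint and excess genes come from the fitter parent. A sketch using the hypothetical `Genome` structure above:

```python
import copy
import random

def crossover(fitter, weaker):
    """Sketch of NEAT crossover between two Genome instances."""
    child = Genome(nodes=copy.deepcopy(fitter.nodes), connections={})
    for innovation, gene in fitter.connections.items():
        other = weaker.connections.get(innovation)
        if other is not None and random.random() < 0.5:
            child.connections[innovation] = copy.deepcopy(other)  # matching gene from weaker parent
        else:
            child.connections[innovation] = copy.deepcopy(gene)   # matching, disjoint or excess gene
    return child
```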

Slide 14

Slide 14 text

NEAT Speciation
• Protect innovation by performing selection within a “species” of similar individuals
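Species membership in NEAT is decided by a compatibility distance that counts excess and disjoint genes and averages the weight differences of matching genes; individuals closer than a threshold to a species representative join that species and compete only within it. A sketch of the distance (coefficients c1, c2, c3 as in the paper; code structure illustrative, assumes both genomes have at least one connection):

```python
def compatibility(g1, g2, c1=1.0, c2=1.0, c3=0.4):
    """NEAT-style compatibility distance between two Genomes (sketch)."""
    inno1, inno2 = set(g1.connections), set(g2.connections)
    matching = inno1 & inno2
    # Excess genes lie beyond the other genome's highest innovation number.
    cutoff = min(max(inno1), max(inno2))
    non_matching = inno1 ^ inno2
    excess = sum(1 for i in non_matching if i > cutoff)
    disjoint = len(non_matching) - excess
    weight_diff = (sum(abs(g1.connections[i].weight - g2.connections[i].weight)
                       for i in matching) / max(len(matching), 1))
    n = max(len(inno1), len(inno2), 1)
    return c1 * excess / n + c2 * disjoint / n + c3 * weight_diff
```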

Slide 15

Slide 15 text

NEAT Ablation study
• Evaluated on the double pole balancing task (commonly used for reinforcement learning)
• Evaluations: average number of individuals needed to complete the task
• Failure: no result within 1000 generations

Slide 16

Slide 16 text

Hypercube-based NEAT (HyperNEAT)

Slide 17

Slide 17 text

Hypercube-based NEAT (HyperNEAT)

Builds on the NEAT work, starting around 2009. HyperNEAT extends NEAT by adding:
• A mechanism simulating biological development
• Indirect evolution – the developmental program is evolved, not the ANN itself
• A substrate – the ability to take advantage of the spatial geometry of the problem domain

Stanley, K. O., D’Ambrosio, D. B., & Gauci, J. (2009). A Hypercube-Based Encoding for Evolving Large-Scale Neural Networks. Artificial Life, 15(2), 185–212.
Gauci, J., & Stanley, K. O. (2010). Autonomous Evolution of Topographic Regularities in Artificial Neural Networks. Neural Computation, 22(7), 1860–1898.

Slide 18

Slide 18 text

HyperNEAT Indirect encoding
• Compositional pattern producing network (CPPN) produces an ANN by defining connections in a geometric substrate
• NEAT is applied to evolve the CPPN
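In HyperNEAT the evolved CPPN is queried once per potential connection: it receives the substrate coordinates of the source and target neurons and outputs the connection weight, typically thresholded. A sketch with a hypothetical `cppn` callable (not a real library API):

```python
def build_substrate_weights(cppn, source_coords, target_coords, threshold=0.2):
    """Query a CPPN for every (source, target) pair of substrate coordinates.

    `cppn(x1, y1, x2, y2)` is assumed to return a value in [-1, 1].
    Connections whose magnitude falls below `threshold` are left out.
    """
    weights = {}
    for (x1, y1) in source_coords:
        for (x2, y2) in target_coords:
            w = cppn(x1, y1, x2, y2)
            if abs(w) > threshold:
                weights[(x1, y1, x2, y2)] = w
    return weights
```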

Slide 19

Slide 19 text

HyperNEAT Evolved connectivity patterns

Slide 20

Slide 20 text

HyperNEAT Connectivity Concepts

Slide 21

Slide 21 text

HyperNEAT Checkers
• A CPPN calculates the weight of connections between the board and a hidden layer (AB) or the hidden layer and the output layer (BC)
• Output of the ANN is the score of the move to this position

Slide 22

Slide 22 text

HyperNEAT Compression
• A relatively small CPPN can encode a much larger ANN

Slide 23

Slide 23 text

Deep Neuroevolution

Slide 24

Slide 24 text

Deep Neuroevolution
• Builds on the success of deep neural networks (AlexNet, VGG, GoogLeNet, ResNet, etc.)
• Steps away from building a custom feed-forward neural network edge-by-edge
• Genes now represent layers of a deep neural network and their attributes (e.g. type: Convolution; attributes: strides, padding, output channels, etc.)
• Mutations can add or remove layers, change their type or attribute values
• Combines two search algorithms: genetic algorithms for architecture search and backpropagation for training
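In this setting a gene might describe a whole layer rather than a single edge. The encoding below is purely illustrative (names are hypothetical) and shows how a genome could list layers and how a mutation could tweak one attribute; the actual weights would still be trained by backpropagation.

```python
import random

# A genome as an ordered list of layer descriptions (illustrative encoding).
genome = [
    {"type": "Convolution", "filters": 32, "kernel": 3, "stride": 1},
    {"type": "Convolution", "filters": 64, "kernel": 3, "stride": 2},
    {"type": "Dense", "units": 10},
]

def mutate_layer_attribute(genome):
    """Randomly double or halve the filter count of one convolutional layer."""
    layer = random.choice([g for g in genome if g["type"] == "Convolution"])
    layer["filters"] = max(1, layer["filters"] * 2 if random.random() < 0.5
                           else layer["filters"] // 2)
    return genome
```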

Slide 25

Slide 25 text

CoDeepNEAT

Research on NEAT continues into the Deep Neuroevolution era at Cognizant.

DeepNEAT – the NEAT approach is used, but each node in a chromosome now represents a network layer.

CoDeepNEAT – two populations, of modules and of blueprints, are evolved separately using DeepNEAT. The blueprint chromosome is a graph where each node contains a pointer to a particular module species. For fitness evaluation, the modules and blueprints are combined to create an ANN.

Miikkulainen, R. et al. (2017). Evolving Deep Neural Networks.
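The blueprint/module idea can be sketched as follows: a blueprint is a graph whose nodes point at module species, and assembly replaces each pointer with a concrete module sampled from that species before the resulting network is trained for fitness evaluation. The data structures and names below are illustrative, not from the paper.

```python
import random

# Module species: each species is a list of candidate modules (small layer graphs,
# here flattened to lists of layer names for brevity).
module_species = {
    "conv_block": [["Conv3x3-32", "ReLU"], ["Conv3x3-64", "BatchNorm", "ReLU"]],
    "reduce":     [["MaxPool2x2"], ["Conv3x3-64-stride2"]],
}

# Blueprint: an ordered chain of pointers to module species (a graph in general).
blueprint = ["conv_block", "reduce", "conv_block"]

def assemble(blueprint, module_species):
    """Replace each species pointer with a sampled module to get a full network."""
    network = []
    for species_name in blueprint:
        network.extend(random.choice(module_species[species_name]))
    return network  # a flat list of layer names, ready to be built and trained
```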

Slide 26

Slide 26 text

CoDeepNEAT Modules and blueprints

Slide 27

Slide 27 text

CoDeepNEAT CIFAR-10
• 92.7% accuracy on CIFAR-10
• Training time was limited, so evolutionary pressure created fast-training networks (120 epochs to converge)

Slide 28

Slide 28 text

CoDeepNEAT Custom LSTM

Slide 29

Slide 29 text

Neuroevolution research at Google

Teams at Google Brain and DeepMind are working on neural architecture search methods based on genetic algorithms. In 2017 they published an influential paper on evolving image classifiers and achieved 94.6% accuracy on CIFAR-10 with no human participation in neural network design.

Real, E. et al. (2017). Large-Scale Evolution of Image Classifiers.

Slide 30

Slide 30 text

Real, E. et al. 2017 – Mutations

Slide 31

Slide 31 text

Real, E. et al. 2017 – Evolution

Slide 32

Slide 32 text

Real, E. et al. 2017 – Results

Slide 33

Slide 33 text

Hierarchical representations

Inspired by reusable modules in human-created networks such as Inception, ResNet, etc. Build up more complex structures from simpler ones. Results in much faster evolution than a non-hierarchical representation. Similar in concept to the modules and blueprints of CoDeepNEAT.

Liu, H., Simonyan, K., Vinyals, O., Fernando, C., & Kavukcuoglu, K. (2017). Hierarchical Representations for Efficient Architecture Search.

Slide 34

Slide 34 text

Hierarchical representations

Slide 35

Slide 35 text

Google AutoML using Reinforcement Learning (NAS)

Zoph, B., & Le, Q. V. (2016). Neural Architecture Search with Reinforcement Learning.

Slide 36

Slide 36 text

Reinforcement Learning vs Neuroevolution

Real, E. et al. (2019). Regularized Evolution for Image Classifier Architecture Search.

Slide 37

Slide 37 text

Function-preserving mutations

New research performed at IBM, based on work by Chen et al. (ICLR, 2016), who proposed function-preserving transformations for transfer learning.

Chen, T., Goodfellow, I., & Shlens, J. (2016). Net2Net: Accelerating Learning via Knowledge Transfer.

Slide 38

Slide 38 text

Function-preserving mutations

Martin Wistuba at IBM described the following function-preserving transformations:
• Layer Widening – increase the number of filters in a convolutional layer
• Layer Deepening – deepen a network by inserting an additional convolutional or fully connected layer
• Kernel Widening – increase the kernel size in a convolutional layer by padding with zeros
• Insert Skip Connections – initialize connection weights to produce zeros
• Branch Layers – insert branching into the network

Wistuba, M. (2019). Deep Learning Architecture Search by Neuro-Cell-Based Evolution with Function-Preserving Mutations.
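Layer widening can be made function-preserving in the Net2Net style: new units are copies of existing ones, and the next layer's weights are divided by each unit's replication count so the network computes exactly the same function. A minimal NumPy sketch for two fully connected layers (the convolutional case is analogous; biases would be copied the same way; function and variable names are illustrative):

```python
import numpy as np

def widen_fc_layer(W1, W2, new_width):
    """Function-preserving widening of layer 1 (Net2WiderNet-style sketch).

    W1: (in_dim, width)  weights of the layer being widened.
    W2: (width, out_dim) weights of the layer that consumes its output.
    """
    width = W1.shape[1]
    # Map every new unit to an existing unit chosen at random.
    mapping = np.concatenate([np.arange(width),
                              np.random.randint(0, width, new_width - width)])
    counts = np.bincount(mapping, minlength=width)      # replication count per old unit
    W1_new = W1[:, mapping]                              # copy columns for the new units
    W2_new = W2[mapping, :] / counts[mapping][:, None]   # rescale so the output is unchanged
    return W1_new, W2_new
```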

Slide 39

Slide 39 text

Function-preserving mutations

He used them to create a set of function-preserving mutations:
• Insert Convolution
• Branch and Insert Convolution
• Insert Skip
• Alter Number of Filters
• Alter Number of Units
• Alter Kernel Size
• Branch Convolution

Slide 40

Slide 40 text

Function-preserving mutations Results
• Duration in GPU days

Slide 41

Slide 41 text

Function-preserving mutations Results

Slide 42

Slide 42 text

Takeaways
• AutoML is a powerful technique, starting to outperform human-designed networks
• Neuroevolution is a viable approach to AutoML
• Research in the area is ongoing and likely to generate results in the near future
• Computational costs are very high and put a limit on research

Slide 43

Slide 43 text

Takeaways
• Deep Neuroevolution combines evolutionary search with weight training by backpropagation
• Lamarckian evolution – passing on learned weights to the next generation improves search performance
• Function-preserving mutations make Lamarckian evolution possible
• Efficient methods perform multi-level evolution: smaller modules and larger networks
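Lamarckian inheritance here simply means that a child network starts from its parent's trained weights (carried over by a function-preserving mutation) instead of from a random initialization. An illustrative sketch of one generation step; `mutate_preserving`, `train` and `fitness` are hypothetical callables, not a specific library API:

```python
def next_generation(parents, mutate_preserving, train, fitness,
                    offspring_per_parent=2):
    """One Lamarckian step: children inherit trained weights from their parents."""
    children = []
    for parent in parents:
        for _ in range(offspring_per_parent):
            # The mutation copies the parent's learned weights and alters the
            # architecture without changing the function it computes.
            child = mutate_preserving(parent)
            train(child, epochs=5)   # short fine-tuning instead of training from scratch
            children.append(child)
    survivors = sorted(parents + children, key=fitness, reverse=True)
    return survivors[:len(parents)]
```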

Slide 44

Slide 44 text

Interesting research questions
• What is the best genome encoding for Neuroevolution?
• What other mutations can be added to the search?
• Can “substrate geometry” be used in Deep Neuroevolution (like in HyperNEAT)?
• Will training a controller network (the RL approach) be more efficient than evolutionary search?
• We need good open-source frameworks for Deep Neuroevolution research