
Master's Thesis

prabhant
February 24, 2021

Transcript

  1. Motivation
     The story of this thesis:
     - Apply transfer learning to ENAS to speed up the search
     - ENAS showed unexpected results
     - Thorough analysis of the ENAS controller
  2. Neural architecture search
     What is neural architecture search?
     - Automating the design of neural networks
     Why do we need neural architecture search?
     - To reduce the cost of designing novel architectures
     - Faster design time
  3.–4. Neural architecture search
     Elements of neural architecture search:
     • Search space
     • Search strategy
     • Performance estimation strategy
  5. NAS
     Neural architecture search with reinforcement learning:
     • Proposed by Google
     • First technique to match SOTA performance on CIFAR-10 and PTB
     • An agent samples architectures from the search space
     • Evaluates architectures by training them from scratch
     • Expensive: more than 24,000 GPU hours required to find the architectures
  6. ENAS
     • Proposed by Google
     • Uses a controller similar to NAS
     • Uses a weight-sharing performance estimation strategy for evaluation
     • Provides macro and micro search space options
     • Results comparable to NAS with just 24 GPU hours
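The weight-sharing idea can be sketched in a few lines of Python (a hypothetical toy, not the actual ENAS implementation): every candidate operation owns one shared weight entry, sampled child architectures reuse those entries instead of being trained from scratch, and the performance proxy simply reads off the shared weights a child touches.

```python
import random

# Toy sketch of ENAS-style weight sharing (illustrative names only):
# all candidate operations keep a single shared weight entry, and every
# sampled child architecture reuses those entries.
OPS = ["conv3x3", "conv5x5", "maxpool", "identity"]
NUM_NODES = 4

# One shared weight per (node, operation) pair for the whole search.
shared_weights = {(n, op): 0.0 for n in range(NUM_NODES) for op in OPS}

def sample_architecture(rng):
    """A child architecture picks one operation per node."""
    return [(n, rng.choice(OPS)) for n in range(NUM_NODES)]

def train_step(arch, lr=0.1):
    """'Training' a child only touches the shared weights it uses."""
    for key in arch:
        shared_weights[key] += lr  # stand-in for a gradient update

def estimate_performance(arch):
    """Proxy score: sum of the shared weights the child reuses."""
    return sum(shared_weights[key] for key in arch)

rng = random.Random(0)
for _ in range(100):               # alternate sampling and training
    train_step(sample_architecture(rng))

# Pick the best of a few candidates using the cheap shared-weight proxy.
best = max((sample_architecture(rng) for _ in range(10)),
           key=estimate_performance)
```

Note the bias this toy makes visible: operations that happen to be sampled and trained more often accumulate larger shared weights, so the proxy systematically favors them regardless of their true quality.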
  7. Experiments
     • Transfer learning from CIFAR-10 to CIFAR-100
     • Learning curve evaluation: evaluating architectures sampled at the initial and final epochs of ENAS
     • Performance estimation strategy: training 5 architectures sampled at epoch 155 from scratch
  8. Results: transfer learning (CIFAR-100, macro search space)
     Sampled epoch | Transfer applied | Accuracy
     310           | No               | 80.55
     155           | No               | 80.33
     100           | No               | 80.78
     310           | Yes              | 80.35
     155           | Yes              | 80.39
     100           | Yes              | 80.19
  9. Results: learning curve evaluation
     Dataset   | Search space | Sampled epoch | Accuracy (per run)
     CIFAR-10  | macro        | 1             | 96.69, 95.80, 95.71
     CIFAR-10  | macro        | 310           | 95.38, 95.81, 95.76
     CIFAR-100 | macro        | 1             | 80.75, 77.12, 80.55
     CIFAR-100 | macro        | 310           | 80.39, 80.07, 80.47
     CIFAR-100 | micro        | 1             | 79.59, 77.67
     CIFAR-100 | micro        | 310           | 80.50, 80.02
  10. Results: performance estimation strategy
     (Sampled epoch: 155, CIFAR-100, macro search space)
     Validation accuracy | Final accuracy
     41.41               | 80.33
     32.81               | 81.11
     28.12               | 81.12
     21.09               | 80.50
     17.97               | 80.81
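A quick sanity check on this table: if the shared-weight validation accuracy were a good proxy, it should rank architectures roughly the same way as their final from-scratch accuracy. Computing the Spearman rank correlation on the five pairs above (pure Python; the data has no ties):

```python
# Validation accuracy under weight sharing vs. final from-scratch
# accuracy, taken from the table on slide 10.
val_acc   = [41.41, 32.81, 28.12, 21.09, 17.97]
final_acc = [80.33, 81.11, 81.12, 80.50, 80.81]

def ranks(xs):
    """1-based ranks (valid here because the data has no ties)."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0] * len(xs)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

rv, rf = ranks(val_acc), ranks(final_acc)
d2 = sum((a - b) ** 2 for a, b in zip(rv, rf))
n = len(val_acc)
rho = 1 - 6 * d2 / (n * (n * n - 1))
print(rho)  # -0.2
```

A correlation of -0.2 means the proxy's ranking is essentially uninformative (even slightly anti-correlated) on this sample, which is consistent with the conclusion that the shared-weight estimator is a biased measure of final performance.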
  11. Conclusion
     • ENAS's performance is mainly due to its sophisticated search space.
     • The ENAS search strategy is no better than random search.
     • The ENAS performance estimation strategy is biased and overfits to models with more shared weights.
     Future work:
     • Apply the same experimental methodology to DARTS, PENAS, and NAO.
  12. Neural architecture search as an RL problem
     - Agent: LSTM
     - Action space: the search space
     - Actions: sampled architectures
     - Reward: input from the performance estimation strategy
     For ENAS: at the first step, the controller receives an empty input.
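The RL loop described above can be sketched as a minimal REINFORCE bandit. This is a toy stand-in, not the actual controller: a table of logits replaces the LSTM policy, and a fixed per-action reward table replaces the real performance estimation strategy — all operation names and reward values below are illustrative.

```python
import math
import random

# Toy action space and a made-up reward an estimator might return.
ACTIONS = ["conv3x3", "conv5x5", "maxpool", "identity"]
REWARD = {"conv3x3": 0.9, "conv5x5": 0.7, "maxpool": 0.4, "identity": 0.2}

logits = [0.0] * len(ACTIONS)   # stand-in for the LSTM controller policy

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

rng = random.Random(0)
baseline = 0.0                   # moving-average reward baseline
for _ in range(2000):
    probs = softmax(logits)
    i = rng.choices(range(len(ACTIONS)), weights=probs)[0]  # sample action
    reward = REWARD[ACTIONS[i]]                  # "evaluate" architecture
    advantage = reward - baseline
    baseline = 0.9 * baseline + 0.1 * reward
    # REINFORCE gradient for a categorical policy:
    # d log p(i) / d logit_j = 1[j == i] - probs[j]
    for j in range(len(logits)):
        grad = (1.0 if j == i else 0.0) - probs[j]
        logits[j] += 0.1 * advantage * grad

best = ACTIONS[max(range(len(ACTIONS)), key=lambda j: logits[j])]
print(best, softmax(logits))
```

Over the iterations the policy shifts probability mass toward high-reward actions; the same sample-evaluate-update cycle, with an LSTM emitting one architectural decision per step, is the controller loop used by NAS and ENAS.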