Pushing the boundaries of material design with RLlib

Slide 1

Slide 1 text

Lighter than a feather, stronger than a rock
Pushing boundaries of material design with RLlib
Tomasz Zaluska 1,2
1. Laboratory for Biosensors and Bioelectronics, Department of Information Technology, ETH Zürich
2. Cui Lab, Department of Chemistry, Stanford University

Slide 2

Slide 2 text

Outline
• Big picture: Optimizing material design
• Use case: Electrode optimization in neuroscience
• Neuroscience crash course
• Improving mesh electrodes
• RL framework
• Simulator
• Algorithm
• Results
• Why Ray and Conclusion

Slide 3

Slide 3 text

Searching for optimal design
Size requirement
Weight requirement
How to design it?
Big Picture Use case RL framework Results

Slide 4

Slide 4 text

Problem: General framework to improve material design
• Domain-specific knowledge
• Time-intensive iteration
• Heuristics

Slide 5

Slide 5 text

Opportunity: Increasing number of open-source libraries and scalable machine learning
• Domain-specific knowledge
• Time-intensive iteration
• Heuristics
• Fast open-source simulation
• Scalable machine learning

Slide 6

Slide 6 text

Crash course neuroscience
Top-down, Bottom-up
fMRI, Calcium imaging, Optogenetics, Electrophysiology
Figure: 4 neuron ensembles connected to each other
Figure: fMRI image of a healthy subject

Slide 7

Slide 7 text

Bottom-up: Electrophysiology
• We want to measure the electrical activity of single neurons
• Neurons communicate via an all-or-nothing electrochemical pulse called an action potential
• Approach: analyze the electrical activity of single neurons with electrodes

Slide 8

Slide 8 text

Use case
• We are building the next generation of neural electrodes
• Electrodes have a fishnet-like structure that embraces the neurons
• Problem: minimize the weight and surface area while remaining robust to physical stresses
Figure: Mesh electrode designed by Lieber lab

Slide 9

Slide 9 text

Solution: RL for optimizing design
Deployment:
RL pipeline: RLlib
Simulation:

Slide 10

Slide 10 text

Simulator
Precompiled (CPU): 40 seconds
• Loading physics engine
• Preparing environment
Run time (GPU): 1 second
• Creation of simulation object
• ~50 simulation steps
• Calculation of strain for each step
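
The timing split above suggests paying the ~40 s setup once per process and reusing it across many ~1 s runs. A minimal sketch of that pattern, assuming a hypothetical `MeshSimulator` class (the talk does not show the simulator's API, and the setup and run bodies below are placeholders):

```python
from functools import lru_cache


class MeshSimulator:
    """Hypothetical stand-in for the physics simulator (not the talk's code)."""

    instances_created = 0  # counts how often the expensive setup ran

    def __init__(self):
        # Placeholder for the expensive one-time work (~40 s in the talk):
        # loading the physics engine and preparing the environment.
        MeshSimulator.instances_created += 1

    def run(self, design, n_steps=50):
        # Placeholder for the cheap per-design work (~1 s in the talk):
        # create the simulation object, step it ~50 times, and record one
        # strain value per step. The "strain" here is a dummy number.
        filled = sum(design)
        return [filled / (step + 1) for step in range(n_steps)]


@lru_cache(maxsize=1)
def get_simulator():
    """Build the simulator once per process and reuse it afterwards."""
    return MeshSimulator()


def evaluate(design):
    """Return the peak dummy strain for one candidate design."""
    strains = get_simulator().run(tuple(design))
    return max(strains)
```

With this layout the setup cost is paid once per worker process rather than once per evaluated design.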

Slide 11

Slide 11 text

Custom RL environment
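
The slide gives only the title, so the following is an illustrative sketch of what a custom environment for mesh design could look like, not the environment used in the talk. The state (a list of mesh cells), the action (remove one cell), the reward, and the robustness check are all assumptions. The class follows the Gymnasium-style `reset`/`step` return convention but is kept dependency-free; in practice it would subclass `gymnasium.Env` and declare `observation_space` and `action_space`.

```python
import random


class MeshDesignEnv:
    """Toy mesh-design environment (illustrative only, not the talk's code).

    The mesh is a flat list of cells: 1 = material present, 0 = removed.
    Each action removes one cell. The episode ends after `max_steps` actions
    or as soon as the placeholder robustness check fails.
    """

    def __init__(self, n_cells=16, max_steps=8, min_material=0.5, seed=None):
        self.n_cells = n_cells
        self.max_steps = max_steps
        self.min_material = min_material  # assumed robustness threshold
        self.rng = random.Random(seed)
        self.mesh = [1] * n_cells
        self.steps = 0

    def _is_robust(self):
        # Placeholder for the physics simulation: the mesh "survives" while
        # at least `min_material` of the cells remain.
        return sum(self.mesh) / self.n_cells >= self.min_material

    def reset(self):
        self.mesh = [1] * self.n_cells
        self.steps = 0
        return list(self.mesh), {}

    def step(self, action):
        removed = self.mesh[action] == 1
        self.mesh[action] = 0
        self.steps += 1
        if not self._is_robust():
            # Design broke: penalize and terminate the episode.
            return list(self.mesh), -1.0, True, False, {}
        # Small positive reward for each cell of material actually removed.
        reward = 1.0 / self.n_cells if removed else 0.0
        truncated = self.steps >= self.max_steps
        return list(self.mesh), reward, False, truncated, {}

    def sample_action(self):
        return self.rng.randrange(self.n_cells)
```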

Slide 12

Slide 12 text

RL algorithms: PPO
• Works for both continuous and discrete action spaces
• Strikes a balance between ease of implementation and sample complexity
• Computes an update at each step that minimizes the cost function while ensuring the deviation from the previous policy is relatively small
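
The "small deviation from the previous policy" idea is commonly implemented through PPO's clipped surrogate objective. A minimal plain-Python sketch for one batch follows; the clip range `eps=0.2` is chosen for illustration only, and library implementations such as RLlib's add further terms (for example value-function and entropy losses) that are omitted here.

```python
import math


def ppo_clip_loss(new_logp, old_logp, advantages, eps=0.2):
    """Mean clipped-surrogate loss (to be minimized) over a batch."""
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s), from log-probabilities.
        ratio = math.exp(lp_new - lp_old)
        clipped = max(1.0 - eps, min(1.0 + eps, ratio))
        # Take the more pessimistic of the unclipped and clipped surrogates,
        # which removes the incentive to move the ratio outside [1-eps, 1+eps].
        total += min(ratio * adv, clipped * adv)
    return -total / len(advantages)
```

When the new and old policies agree, the ratio is 1 and the loss reduces to the negative mean advantage.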

Slide 13

Slide 13 text

Computational architecture
Tune
RLlib Tune config 1: Worker 1, Worker 2; Simulation 1 (GPU), Simulation 2 (GPU)
RLlib Tune config 2: Worker 1, Worker 2; Simulation 1 (GPU), Simulation 2 (GPU)
• Local (1 GPU, 2 CPUs): ~1 sec per simulation step
• Cluster (Stanford computing cluster, 32-CPU-core job): ~2 sec per simulation step
• Total training time (1M simulations): 17 h
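
A back-of-the-envelope check on the cluster figures. It assumes that "1m" means one million simulation runs, that 17 h is wall-clock time, and that the ~2 s figure is the cost of one full simulation run; the implied parallelism is an inference, not a number from the talk.

```python
TOTAL_SIMULATIONS = 1_000_000   # "1m simulation", read as one million runs
TRAINING_HOURS = 17             # reported total training time
SECONDS_PER_SIMULATION = 2.0    # reported cluster figure (~2 s), assumed per run

wall_clock_seconds = TRAINING_HOURS * 3600
throughput = TOTAL_SIMULATIONS / wall_clock_seconds          # runs per second
implied_parallelism = throughput * SECONDS_PER_SIMULATION    # runs in flight
serial_days = TOTAL_SIMULATIONS * SECONDS_PER_SIMULATION / 86_400

print(f"throughput: {throughput:.1f} simulations/s")
print(f"implied concurrent simulations: {implied_parallelism:.0f}")
print(f"serial run time: {serial_days:.1f} days")
```

Under these assumptions the run sustains about 16 simulations per second, or roughly 33 simulations in flight at once, which is in line with a 32-core job; run serially, the same workload would take about 23 days.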

Slide 14

Slide 14 text

Results
Weight of mesh decreases from 62% to 51%
[Bar chart: weight of mesh in percent of full solid, for Full Mesh, Baseline, and AI design (best performer)]
Reward flattens out after 80k epochs (each epoch is 200 simulations)
[Line chart: average reward per epoch, 0k to 80k epochs]
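
For scale, the 62% to 51% change can be expressed as a relative reduction. This short calculation assumes 62% is the baseline design and 51% is the best AI design, both measured against the full solid.

```python
baseline_weight = 0.62  # assumed: baseline mesh, fraction of full solid
ai_weight = 0.51        # assumed: best AI design, fraction of full solid

absolute_drop = baseline_weight - ai_weight        # in fraction of full solid
relative_drop = absolute_drop / baseline_weight    # relative to the baseline

print(f"absolute: {absolute_drop * 100:.0f} percentage points")
print(f"relative: {relative_drop * 100:.1f}% lighter than the baseline")
```

That is an 11 percentage point drop, or about 17.7% less weight than the baseline.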

Slide 15

Slide 15 text

Simulation results
Mesh fails twirl test
Mesh passes water bubble test

Slide 16

Slide 16 text

Why Ray?
• Benchmarking algorithms / hyperparameter tuning
• Quickly optimizing different parameters and algorithms
• Metrics: TensorBoard / Dashboard
• Helps with tracking down memory leaks
• Performance monitoring
• Easily scalable across different architectures
• Swiftly switching from a local environment for debugging to training on a cluster

Slide 17

Slide 17 text

Challenges
• Defining an appropriate reward function
• Reducing training time to scale up further
• Performance is not stable
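
To make the reward-design challenge concrete, here is one possible shape for a reward that trades weight against robustness. It is a hypothetical sketch, not the reward used in the talk: the parameter names, the strain limit, and the penalty scaling are all assumptions.

```python
def design_reward(weight_fraction, peak_strain, strain_limit=0.05,
                  failure_penalty=1.0):
    """Reward lighter meshes, but penalize designs that exceed a strain limit.

    All parameters are hypothetical: `weight_fraction` is mesh weight relative
    to the full solid (0..1), `peak_strain` is the largest strain seen in the
    simulation, and `strain_limit` is an assumed failure threshold.
    """
    if not 0.0 <= weight_fraction <= 1.0:
        raise ValueError("weight_fraction must be between 0 and 1")
    weight_saving = 1.0 - weight_fraction
    if peak_strain > strain_limit:
        # Scale the penalty with how far the design overshoots the limit,
        # so designs that are only slightly too weak are penalized less.
        overshoot = (peak_strain - strain_limit) / strain_limit
        return weight_saving - failure_penalty * (1.0 + overshoot)
    return weight_saving
```

The balance between `weight_saving` and `failure_penalty` is exactly the kind of choice that needs tuning: too small a penalty rewards fragile designs, too large a penalty discourages removing any material.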

Slide 18

Slide 18 text

Thank you for your attention!