
Pushing the boundaries of material design with RLlib


Abstract: Improving the design and properties of biomedical devices is fundamental to both academic research and commercialization. However, improving these designs and their physical properties often relies on heuristics, ad-hoc choices, or, at best, iterative topology optimization methods.
We combine material simulation and reinforcement learning to create new optimized designs. The reinforcement learner’s goal is to reduce the weight of an object that must nonetheless withstand various physical forces such as stretching, twisting, and compression. It does so by iteratively pruning a full block of material to reduce its weight. Because a considerable number of learning iterations is required, it is vital that the system simulate each iteration in as little time as possible.
The use of RLlib and Ray Tune enables broad-scale parallelization of the reinforcement learning pipeline and deployment on a decentralized computing platform. This allows us to cut training time by orders of magnitude, and the approach yields several unique designs that outperform the baseline case.
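As a rough illustration of the objective described above, the reward can be thought of as favoring lighter designs while penalizing designs that fail the simulated load tests. The following is a minimal sketch in that spirit; the function name, arguments, and thresholds are illustrative assumptions, not the reward actually used in the project.

```python
# Hedged sketch of a weight-vs-robustness reward (illustrative only).
def pruning_reward(weight_fraction, max_strain, strain_limit=1.0):
    """weight_fraction: remaining weight as a fraction of the full block.
    max_strain: worst strain observed across stretch/twist/compression tests."""
    if max_strain > strain_limit:   # the design broke under one of the load cases
        return -1.0
    return 1.0 - weight_fraction    # lighter surviving designs score higher
```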

Speaker: Tomasz Zaluska is a visiting graduate student at Stanford. He focuses on applying ML to neuroscience.

Anyscale

June 23, 2022


Transcript

  1. Lighter than a feather, stronger than a rock Pushing boundaries

    of material design with RLlib Tomasz Zaluska 1,2 1. Laboratory for Biosensors and Bioelectronics, Department of Information Technology, ETH Zürich 2. Cui Lab, Department of Chemistry, Stanford University
  2. Outline • Big picture: Optimizing material design • Use case:

    Electrode optimization in neuroscience • Neuroscience crash course • Improving mesh electrodes • RL framework • Simulator • Algorithm • Results • Why Ray and Conclusion 2
  3. Searching for optimal design Size requirement Weight requirement How to

    design it? 3 Big Picture Use case RL framework Results
  4. Problem: General framework to improve material design Domain specific knowledge

    Time intensive iteration Heuristics 4 Big Picture Use case RL framework Results
  5. Opportunity: Increasing number of open source libraries and scalable machine

    learning Domain specific knowledge Time intensive iteration Heuristics Fast open-source simulation Scalable machine learning 5 Big Picture Use case RL framework Results
  6. Crash course neuroscience Big Picture Use case RL framework Results

    6 Top-down Bottom-up fMRI Calcium Imaging Optogenetics Electrophysiology 4 neuron ensembles connected to each other fMRI image of a healthy subject
  7. Bottom-up: Electrophysiology Big Picture Use case RL framework Results 7

    • We want to measure the electrical activity of single neurons • Neurons communicate via an all-or-nothing electrochemical pulse called an action potential • Approach: analyze the electrical activity of single neurons with electrodes
  8. Use case • We are building the next generation neural

    electrodes • Electrodes have a fishnet-like structure to embrace neurons inside of them • Problem: minimize the weight and surface area while remaining robust to physical stresses (see the formulation sketch below) Mesh electrode designed by the Lieber lab 8 Big Picture Use case RL framework Results
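One way to write down the design problem this slide describes (our own notation, not taken from the talk) is as a constrained minimization over candidate designs d:

```latex
\min_{d} \; \mathrm{weight}(d)
\quad \text{subject to} \quad
\mathrm{strain}_k(d) \le \mathrm{strain}_{\max}
\quad \text{for each load case } k \in \{\text{stretch},\, \text{twist},\, \text{compress}\}
```

In the reinforcement learning setup, the constraint is presumably reflected in the reward rather than enforced as a hard constraint, as in the hedged reward sketch above.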
  9. Simulator • Precompiled (CPU), ~40 seconds: loading the physics engine and

    preparing the environment • Run time (GPU), ~1 second: creation of the simulation object, ~50 simulation steps, calculation of strain for each step Big Picture Use case RL framework Results 10
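To make the cost split above concrete, here is a minimal, self-contained sketch of an environment wrapper in the spirit the slide describes. The class name, the placeholder physics, and all thresholds are illustrative assumptions; the point is only that the expensive CPU setup is paid once per worker in `__init__`, while each step pays only the short GPU runtime.

```python
# Minimal sketch (not the project's actual simulator wrapper).
import numpy as np
import gym
from gym import spaces

class PruningEnvSketch(gym.Env):  # hypothetical name
    N_CELLS = 64  # the material block, discretized into prunable cells

    def __init__(self, config=None):
        # Precompiled (CPU) phase, ~40 s: in the real system this is where the
        # physics engine is loaded and the environment is prepared, once per worker.
        self.observation_space = spaces.MultiBinary(self.N_CELLS)
        self.action_space = spaces.Discrete(self.N_CELLS)

    def reset(self):
        self.cells = np.ones(self.N_CELLS, dtype=np.int8)  # start from a full block
        return self.cells.copy()

    def step(self, action):
        self.cells[action] = 0  # prune one cell of material
        # Run-time (GPU) phase, ~1 s: stands in for the ~50 simulation steps
        # that compute strain for the current design (placeholder arithmetic here).
        strain = 1.0 / max(int(self.cells.sum()), 1)
        failed = strain > 0.05  # placeholder failure criterion
        reward = -1.0 if failed else 1.0 - float(self.cells.mean())
        return self.cells.copy(), reward, failed, {}
```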
  10. RL algorithms: PPO Big Picture Use case RL framework Results

    12 PPO • Works for both continuous and discrete action spaces • Strikes a balance between ease of implementation and sample complexity • Computes an update at each step that minimizes the cost function while ensuring the deviation from the previous policy stays relatively small (see the sketch below)
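The "small deviation from the previous policy" is enforced by PPO's clipped surrogate objective. The following is a short illustrative sketch of that objective in NumPy, not code from the talk:

```python
# Sketch of PPO's clipped surrogate objective (to be maximized).
import numpy as np

def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Average clipped surrogate over a batch of (state, action) samples."""
    ratio = np.exp(new_logp - old_logp)  # pi_new(a|s) / pi_old(a|s)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # Taking the elementwise minimum keeps the update conservative: large
    # policy changes cannot be rewarded beyond the clipping range.
    return float(np.mean(np.minimum(unclipped, clipped)))
```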
  11. Computational architecture [Diagram: Ray Tune launches two RLlib tune configs; each

    config spawns two workers, and each worker runs its own simulation on a GPU] 13 Big Picture Use case RL framework Results • Local (1 GPU, 2 CPUs): ~1 sec per simulation step • Cluster (Stanford computing cluster, 32-CPU-core job): ~2 sec per simulation step • Total training time (1M simulations): 17 h
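As a sketch of how such a setup is expressed with Ray Tune and RLlib (the environment name, worker counts, and resource numbers below are placeholders, not the configuration actually used in the project):

```python
# Hedged sketch of launching PPO training with Ray Tune and RLlib (Ray 1.x style).
import ray
from ray import tune

ray.init()  # local machine; on a cluster, ray.init(address="auto") attaches to it

tune.run(
    "PPO",
    config={
        "env": "MeshPruningEnv",    # hypothetical registered environment name
        "framework": "torch",
        "num_workers": 2,           # parallel rollout workers
        "num_gpus": 0,              # GPUs reserved for the learner process
        "num_gpus_per_worker": 1,   # each worker drives one simulation GPU
        "train_batch_size": 4000,
    },
    stop={"training_iteration": 100},
)
```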
  12. Results Weight of mesh decreases from 62% to 51% [Bar chart: weight of

    mesh in percent of the full solid, comparing the full mesh, the baseline, and the best-performing AI design] [Line chart: average reward per epoch] Reward flattens out after 80k epochs (each epoch is 200 simulations) 14 Big Picture Use case RL framework Results
  13. Simulation results Mesh fails twirl test Mesh passes water bubble

    test 15 Big Picture Use case RL framework Results
  14. Why Ray? • Benchmarking algorithms / hyperparameter tuning • Optimizing

    different parameters and algorithms quickly • Metrics via TensorBoard / Dashboard • Helps track down memory leaks • Performance monitoring • Easily scalable across different architectures • Swiftly switching from a local environment for debugging to training on a cluster (see the sketch below) 16 Big Picture Use case RL framework Results
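For example, hyperparameter sweeps and the local-to-cluster switch both stay close to the single-machine code path; the values and names below are illustrative assumptions, not the project's actual search space:

```python
# Hedged sketch of a small hyperparameter sweep with Ray Tune.
import ray
from ray import tune

# ray.init() for local debugging; ray.init(address="auto") attaches the same
# script to an existing cluster without further code changes.
ray.init()

tune.run(
    "PPO",
    config={
        "env": "MeshPruningEnv",                    # hypothetical env name
        "lr": tune.grid_search([1e-4, 5e-5]),       # sweep learning rates
        "clip_param": tune.grid_search([0.1, 0.2]), # sweep PPO clip range
        "num_workers": 2,
    },
    stop={"training_iteration": 100},
)
# Results land under ~/ray_results by default and can be inspected with TensorBoard.
```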
  15. Challenges • Defining an appropriate reward function • Reducing training time to scale

    up further • Performance not stable 17 Big Picture Use case RL framework Results