
Pushing the boundaries of material design with RLlib


Abstract: Improving the design and properties of biomedical devices is fundamental to both academic research and the commercialization of such devices. However, improving designs and their physical properties often relies on heuristics, ad-hoc choices, or, at best, iterative topology optimization methods.
We combine material simulation and reinforcement learning to create new optimized designs. The reinforcement learner’s goal is to reduce the weight of an object while it withstands various types of physical forces such as stretching, twisting, and compressing. It does so by iteratively pruning a full block of material to reduce the weight. Because of the considerable number of learning iterations required, it is vital that the system simulates each iteration in as little time as possible.
The use of RLlib and Ray Tune enables broad-scale parallelization of the reinforcement learning pipeline and deployment on a decentralized computing platform. This allows us to cut the training time by orders of magnitude, and the resulting designs outperform the baseline case, producing several unique designs.
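The weight-versus-robustness objective described in the abstract can be sketched as a simple reward function. This is a hypothetical illustration only: the function name, the strain threshold, and the penalty value are assumptions, not the talk's actual implementation.

```python
def pruning_reward(weight_fraction, max_strain, strain_limit=1.0,
                   failure_penalty=100.0):
    """Hypothetical reward: favor lighter designs, but heavily penalize
    any design whose simulated strain exceeds the allowed limit."""
    if max_strain > strain_limit:
        # The design failed a physical test (stretch/twist/compress).
        return -failure_penalty
    # Lighter designs (smaller weight fraction) earn more reward.
    return 1.0 - weight_fraction

# A design at 51% of the solid block's weight that survives testing:
print(pruning_reward(0.51, 0.8))  # ~0.49
```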

Speaker: Tomasz Zaluska is a visiting graduate student at Stanford. He focuses on applying ML to neuroscience.

Anyscale

June 23, 2022


Transcript

  1. Lighter than
    a feather,
    stronger
    than a rock
    Pushing boundaries of material
    design with RLlib
    Tomasz Zaluska 1,2
    1. Laboratory for Biosensors and Bioelectronics, Department of Information Technology, ETH Zürich
    2. Cui Lab, Department of Chemistry, Stanford University


  2. Outline
    • Big picture: Optimizing material design
    • Use case: Electrode optimization in neuroscience
    • Neuroscience crash course
    • Improving mesh electrodes
    • RL framework
    • Simulator
    • Algorithm
    • Results
    • Why Ray and Conclusion


  3. Searching for optimal design
    Size requirement
    Weight requirement
    How to design it?
    Big Picture Use case RL framework Results


  4. Problem: General framework to improve
    material design
    Domain specific
    knowledge
    Time intensive
    iteration
    Heuristics


  5. Opportunity: Increasing number of open
    source libraries and scalable machine
    learning
    Domain specific
    knowledge
    Time intensive
    iteration
    Heuristics
    Fast open-source
    simulation
    Scalable machine
    learning


  6. Crash course: neuroscience
    Top-down
    Bottom-up
    fMRI
    Calcium imaging
    Optogenetics
    Electrophysiology
    [Figures: 4 neuron ensembles connected to each other; fMRI image of a healthy subject]


  7. Bottom-up: Electrophysiology
    • We want to measure the electrical activity of single neurons
    • Neurons communicate via an all-or-nothing electrochemical pulse called an action potential
    • Approach: analyze the electrical activity of single neurons with electrodes


  8. Use case
    • We are building the next generation of neural electrodes
    • Electrodes have a fish-net-like structure to embrace neurons inside of them
    • Problem: minimize the weight and surface area while remaining robust to physical stresses
    Mesh electrode designed by Lieber lab


  9. Solution: RL for optimizing design
    Deployment:
    RL pipeline: RLlib
    Simulation:


  10. Simulator
    Precompiled (CPU): 40 seconds
    • Loading physics engine
    • Preparing environment
    Run time (GPU): 1 second
    • Creation of simulation object
    • ~50 simulation steps
    • Calculation of strain for each step
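The 40 s / 1 s split on this slide suggests paying the engine-load cost once per worker and amortizing it over many short episodes. A toy sketch of that pattern; every class and method name here is a hypothetical stand-in, not the real simulator API:

```python
class FakeEngine:
    """Stand-in for the physics engine (the talk does not name it)."""
    def create_simulation(self):
        return FakeSimulation()

class FakeSimulation:
    def __init__(self):
        self.t = 0
    def step(self):
        self.t += 1
        return 0.01 * self.t  # pretend strain value for this step

class SimulatorWorker:
    """Loads the engine once (~40 s on CPU in the talk), then reuses
    it for every ~1 s episode instead of reloading each time."""
    def __init__(self, engine):
        self.engine = engine  # one-time load, shared across episodes
    def run_episode(self, n_steps=50):
        sim = self.engine.create_simulation()  # cheap per-episode cost
        return [sim.step() for _ in range(n_steps)]  # strain per step

worker = SimulatorWorker(FakeEngine())
strains = worker.run_episode()
print(len(strains))  # 50
```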


  11. Custom RL environment
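The slide names a custom RL environment without showing it. Below is a minimal, self-contained sketch of what such a pruning environment could look like, with a toy failure check standing in for the real physics simulator; every name and rule here is a hypothetical stand-in, not the talk's actual code.

```python
import numpy as np

class MeshPruningEnv:
    """Toy pruning environment: the agent removes one cell of
    material per step from an initially full block."""

    def __init__(self, size=8):
        self.size = size
        self.mesh = None

    def reset(self):
        self.mesh = np.ones((self.size, self.size), dtype=np.int8)
        return self.mesh.copy()

    def step(self, action):
        row, col = divmod(action, self.size)
        self.mesh[row, col] = 0  # prune this cell
        weight = float(self.mesh.mean())
        # Placeholder for the physics simulator: call the design
        # "failed" when an entire row has been removed.  The real
        # check runs ~50 simulation steps of stretching, twisting,
        # and compressing, and measures strain.
        failed = bool((self.mesh.sum(axis=1) == 0).any())
        reward = -100.0 if failed else (1.0 - weight)
        return self.mesh.copy(), reward, failed, {}

env = MeshPruningEnv(size=4)
obs = env.reset()
obs, reward, done, info = env.step(0)  # prune cell (0, 0)
```

A real version would subclass `gym.Env` and declare `action_space` and `observation_space` so RLlib can consume it directly.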


  12. RL algorithms: PPO
    • Works for both continuous and discrete action spaces
    • Strikes a balance between ease of implementation and sample complexity
    • Computes an update at each step that minimizes the cost function while ensuring the deviation from the previous policy stays relatively small
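In recent RLlib versions, a PPO setup along the lines described on this slide can be expressed with `PPOConfig`. The environment name and hyperparameter values below are illustrative assumptions, not the configuration used in the talk:

```python
from ray.rllib.algorithms.ppo import PPOConfig

config = (
    PPOConfig()
    .environment(env="MeshPruningEnv-v0")  # hypothetical registered env
    .framework("torch")
    .rollouts(num_rollout_workers=2)
    .training(
        lr=5e-5,
        clip_param=0.2,  # caps how far each update moves from the previous policy
        train_batch_size=4000,
    )
)
algo = config.build()  # requires the env to be registered first
```

The `clip_param` setting is PPO's mechanism for keeping each update close to the previous policy, as the bullet above describes.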


  13. Tune
    Computational architecture:
    RLlib Tune config 1: Worker 1 (Simulation 1, GPU), Worker 2 (Simulation 2, GPU)
    RLlib Tune config 2: Worker 1 (Simulation 1, GPU), Worker 2 (Simulation 2, GPU)
    • Local (1 GPU, 2 CPU): ~1 sec per simulation step
    • Cluster (Stanford computing cluster, 32 CPU core job): ~2 sec per simulation step
    • Total training time (1M simulations): 17 h
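The two-config layout on this slide maps naturally onto Ray Tune's trial model: each hyperparameter combination becomes one trial with its own rollout workers. A hedged sketch using the classic `tune.run` API; the environment name and resource numbers are assumptions:

```python
from ray import tune

# Each grid-search value becomes one trial ("config 1" / "config 2"),
# and each trial gets its own set of rollout workers.
tune.run(
    "PPO",
    config={
        "env": "MeshPruningEnv-v0",  # hypothetical registered env
        "num_workers": 2,            # Worker 1 / Worker 2 per trial
        "num_gpus": 1,               # the simulator runs on the GPU
        "lr": tune.grid_search([1e-4, 5e-5]),
    },
    stop={"training_iteration": 100},
)
```

The same script runs locally or on a cluster; only the resources passed to `ray.init()` change.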


  14. Results
    Weight of mesh decreases from 62% to 51%
    [Bar chart: weight of mesh in percent of the full solid, for the full mesh, the baseline, and the AI-designed best performer]
    [Line chart: average reward per epoch, from 0k to 80k epochs]
    Reward flattens out after 80k epochs (each epoch is 200 simulations)


  15. Simulation results
    [Left: mesh fails the twirl test. Right: mesh passes the water bubble test]


  16. Why Ray?
    • Benchmarking algorithms / hyperparameter tuning
    • Quickly optimizing different parameters and algorithms
    • Metrics via TensorBoard / the Ray Dashboard
    • Helps with memory leaks
    • Performance monitoring
    • Easily scalable across different architectures
    • Swiftly switching from a local environment for debugging to training on a cluster


  17. Challenges
    • Defining an appropriate reward function
    • Reducing training time to scale up further
    • Performance not yet stable


  18. Thank you for your attention!