Deep learning for protein engineering with Ray - Stanley Bishop DeepChem.io

Stanley Page 1 Ray & Protein Engineering @ DeepChem

Stanley Page 2 • Mathematician working as a machine learning scientist
• Developer at deepchem.io, an open-source project to democratize deep learning for science ✨come check us out✨
• Mostly made of proteins

Stanley Page 3 • The DeepChem project works to democratize deep learning for science
• The DeepChem project aims to create high-quality, open-source tools for drug discovery, materials science, quantum chemistry, and biology
• DeepChem projects are managed by a group of open-source contributors

Stanley Page 4 • Proteins are complex molecules that perform critical functions in our bodies
• Proteins are synthesized through the processes of transcription and translation
• Protein synthesis is the biological equivalent of a compilation process
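
To make the compiler analogy concrete, here is a minimal sketch of the two stages, assuming a toy codon table that holds only six entries of the standard genetic code. The function names `transcribe` and `translate` are illustrative, not taken from DeepChem, and real transcription and translation involve far more machinery than string substitution.

```python
# Toy "compiler": DNA -> mRNA (transcription) -> protein (translation).
# Only a handful of codons from the standard genetic code are listed here.
CODON_TABLE = {
    "AUG": "M",  # methionine, also the start codon
    "GCU": "A",  # alanine
    "GGU": "G",  # glycine
    "UGG": "W",  # tryptophan
    "AAA": "K",  # lysine
    "UAA": "*",  # stop codon
}


def transcribe(coding_dna: str) -> str:
    """Simplified transcription: copy the coding strand, replacing T with U."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read codons (triplets) from the first AUG until a stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    residues = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]  # KeyError if the codon is not in the toy table
        if amino_acid == "*":
            break
        residues.append(amino_acid)
    return "".join(residues)


protein = translate(transcribe("ATGGCTGGTTGGAAATAA"))
print(protein)  # MAGWK
```

The source text (DNA) passes through an intermediate representation (mRNA) before being emitted as the final product (a chain of amino acids), which is the sense in which the process resembles compilation.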

Stanley Page 5 • Proteins are made of amino acids
• Amino acid compositions are deterministic and combinatorial
• Imagine Lego bricks
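
The combinatorial point can be illustrated with a short calculation. This sketch assumes the 20 standard amino acids (written with their one-letter codes) and counts how many distinct chains of a given length exist; the helper name `sequence_space` is made up for this example.

```python
from itertools import product

# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def sequence_space(length: int) -> int:
    """Number of distinct chains of the given length: 20 ** length."""
    return len(AMINO_ACIDS) ** length


# Small enough to enumerate explicitly: every possible two-residue chain.
dipeptides = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]
print(len(dipeptides))       # 400
print(sequence_space(10))    # 10240000000000
```

Even a short chain of 10 residues already has about ten trillion possible sequences, so exhaustive enumeration stops being practical almost immediately.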

Stanley Page 6 • Protein structure is the result of a three-part process
• Until recently, these structures were largely observed experimentally
• AlphaFold has changed the game

Stanley Page 8 • Protein complexes can perform predictable actions in response to a stimulus
• Molecules activate a protein by ‘docking’, which changes its electrochemical dynamics
• The dynamics of this docking are of crucial importance to drug discovery and disease treatment

Stanley Page 9 • In the late 1950s and early 1960s, thalidomide prescribed to expectant mothers caused birth defects in more than 10,000 children
• The thalidomide-TBX5 docking complex yields reactive oxygen species
• With predictive AI technologies, reactions like these could be identified before human testing

Stanley Page 10 • The space of possible ligand molecules can contain hundreds of millions to billions of potential compounds
• Per-ligand computations are relatively expensive
• Protein-ligand docking presents a challenging distributed computing problem, one that has traditionally been solved with on-prem hardware
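
A back-of-envelope estimate shows why this becomes a distributed computing problem. The numbers below (one billion ligands, 10 seconds per ligand, 10,000 workers) are assumptions chosen for illustration, not figures from the talk, and the calculation assumes perfect parallel scaling.

```python
SECONDS_PER_DAY = 86_400


def wall_clock_days(n_ligands: int, seconds_per_ligand: float, n_workers: int) -> float:
    """Ideal wall-clock time in days, assuming work splits evenly across workers."""
    total_seconds = n_ligands * seconds_per_ligand
    return total_seconds / n_workers / SECONDS_PER_DAY


# Assumed, illustrative inputs.
single = wall_clock_days(1_000_000_000, 10.0, 1)
cluster = wall_clock_days(1_000_000_000, 10.0, 10_000)
print(f"{single:,.0f} days on one worker, {cluster:.1f} days on 10,000 workers")
```

Under these assumptions a single worker would need roughly three centuries, while a large cluster brings the same screen down to under two weeks.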

Stanley Page 11 Docking is a DAG!

Stanley Page 12 • Task 1: generate 3D structures

Stanley Page 13 • Task 2: generate features

Stanley Page 14 • Task 3: compute dock complex
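
The three tasks can be sketched as a small dependency graph. This is a minimal stand-in rather than the pipeline from the talk: the task bodies are hypothetical placeholders with a made-up score, the linear ordering of the three tasks is an assumption, and the standard library's `graphlib` and `concurrent.futures` are used here in place of Ray. In Ray, each function would instead be declared as a task with `@ray.remote`, and passing one task's result into the next call is what defines the graph.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter

# Assumed dependency graph: each task maps to the set of tasks it depends on.
DAG = {
    "generate_3d_structure": set(),
    "generate_features": {"generate_3d_structure"},
    "compute_dock_complex": {"generate_features"},
}


# Hypothetical placeholder tasks; real ones would call chemistry tooling.
def generate_3d_structure(smiles: str) -> dict:
    return {"smiles": smiles, "n_chars": len(smiles)}


def generate_features(structure: dict) -> list:
    return [structure["n_chars"], structure["smiles"].count("C")]


def compute_dock_complex(features: list) -> float:
    return -0.1 * features[0] - 0.5 * features[1]  # made-up score, not real physics


TASKS = {
    f.__name__: f
    for f in (generate_3d_structure, generate_features, compute_dock_complex)
}
ORDER = list(TopologicalSorter(DAG).static_order())


def run_pipeline(smiles: str) -> float:
    """Run the tasks for one ligand in dependency order."""
    value = smiles
    for name in ORDER:  # valid because the assumed DAG is a linear chain
        value = TASKS[name](value)
    return value


# Ligands are independent of each other, so their pipelines can run in parallel.
ligands = ["CCO", "CC(=O)O", "c1ccccc1"]
with ThreadPoolExecutor(max_workers=4) as pool:
    scores = dict(zip(ligands, pool.map(run_pipeline, ligands)))

print(ORDER)
print(scores)
```

The useful property is that dependencies exist only within a ligand's chain, never between ligands, so the workload fans out across as many workers as are available.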

Stanley Page 15

Stanley Page 16 • Active learning has the potential to accelerate molecular simulation
• Teams at Stanford and Harvard are using Ray for this purpose
• Early results indicate a 10x improvement in computational efficiency
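
The idea behind active learning can be sketched as a loop in which a cheap surrogate model decides which candidates are worth an expensive computation. Everything below is a toy under stated assumptions: `expensive_score` is a made-up stand-in for docking (lower is better), the surrogate is plain linear interpolation rather than a learned model, and the acquisition rule is purely greedy. It is not the method used by the teams mentioned above.

```python
from bisect import bisect_left


def expensive_score(x: float) -> float:
    """Stand-in for a costly docking computation (lower is better)."""
    return (x - 0.37) ** 2


def predict(x: float, known_x: list, known_y: list) -> float:
    """Cheap surrogate: linear interpolation between the nearest scored points."""
    j = bisect_left(known_x, x)
    if j == 0:
        return known_y[0]
    if j == len(known_x):
        return known_y[-1]
    x0, x1 = known_x[j - 1], known_x[j]
    y0, y1 = known_y[j - 1], known_y[j]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def active_learning(pool: list, initial: list, rounds: int, batch_size: int) -> dict:
    """Return {pool index: true score} for every candidate that was scored."""
    scored = {i: expensive_score(pool[i]) for i in initial}
    for _ in range(rounds):
        known = sorted(scored)
        known_x = [pool[i] for i in known]
        known_y = [scored[i] for i in known]
        candidates = [i for i in range(len(pool)) if i not in scored]
        # Greedy acquisition: score the candidates the surrogate likes best.
        candidates.sort(key=lambda i: predict(pool[i], known_x, known_y))
        for i in candidates[:batch_size]:
            scored[i] = expensive_score(pool[i])
    return scored


pool = [i / 100 for i in range(101)]  # 101 candidates on a 1-D descriptor
scored = active_learning(pool, initial=[0, 25, 50, 75, 100], rounds=3, batch_size=5)
best = min(scored, key=scored.get)
print(f"scored {len(scored)} of {len(pool)} candidates, best index = {best}")
```

In this toy run the loop locates the best candidate after scoring 20 of the 101 candidates. A realistic setup would use a trained model and an acquisition rule that also accounts for uncertainty, since a purely greedy rule can get stuck on less well-behaved score landscapes.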

• The DeepChem project is always looking for contributors. Check us out at DeepChem.io
• Bioinformatics is likely to soon lead the machine learning charge in terms of the data scale of models… so there will be a lot to build
• Deep gratitude to the Ray community for building such important infrastructure for the machine learning revolution
