ML for Genetic Engineering

We built a classification ML model to help plant breeders cultivate the healthiest, highest-yielding crops.

See winning solutions here:
https://irririceresearch.wixsite.com/annual-report-2019/innovate

The problem:

Phenotype Prediction Challenge

Build a machine learning model to predict the values of certain numeric phenotypes given high-dimensional genotype information.
You will be given a training dataset consisting of numerically encoded genotype data for genetically distinct rice varieties, along with records of certain agronomically important traits for the same varieties (such as grain yield, plant height, and days to flowering). Your goal is to implement a model that predicts the values of these phenotypes given the genotype data of an unseen rice sample. It can be a single model for all traits or one model for each trait. You may use any programming language or ML framework. We will also provide a baseline model against which you can test your model. Once your model(s) are ready for submission, you will need to run them on the test dataset, which we will provide. Submitted models will be evaluated using metrics such as mean absolute error and the R-squared score. For each phenotype, we will compare your model with a baseline linear model and the models of other participants.
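
The two evaluation metrics named in the challenge can be sketched in a few lines of plain Python. This is only an illustration, not the organizers' scoring code: the trait values, the variable names, and the use of a constant mean predictor as a stand-in baseline are all assumptions made for the example (the challenge itself compares against a baseline linear model).

```python
from statistics import mean


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Undefined when all observed values are identical (SS_tot == 0).
    """
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed plant heights (cm) for four held-out varieties.
observed = [100.0, 110.0, 120.0, 130.0]

# Assumed stand-in baseline: always predict the mean of the observed values.
baseline_pred = [115.0, 115.0, 115.0, 115.0]

# Hypothetical predictions from a candidate model.
candidate_pred = [102.0, 108.0, 123.0, 127.0]

for name, pred in [("baseline", baseline_pred), ("candidate", candidate_pred)]:
    mae = mean_absolute_error(observed, pred)
    r2 = r_squared(observed, pred)
    print(f"{name}: MAE={mae:.3f}, R^2={r2:.3f}")
```

A lower mean absolute error and an R-squared closer to 1 indicate a better fit. A model that only predicts the mean scores an R-squared of 0, which makes it a convenient floor when comparing per-trait models.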

https://hack4rice2019.irri.org/

Shanelle Recheta

March 19, 2021

Transcript

  1. EXPLORE | EXPERIMENT | EXECUTE

  2. PROBLEM: PLANT BREEDERS. INCREASING DEMAND OF FOOD. PLANT STRESS LOSSES.

  3. Current Trends on Genetic Engineering: Developing Climatically-resilient Cultivars; Incorporating Resistance Genes; Early Stress Prediction

  4. PROBLEM: RESEARCHERS. NON-STANDARDIZED DATA PROCESSING. UNDERUTILIZED GENOTYPING TOOLS. MISSING ATTRIBUTES.

  5. Latest Phenotype Prediction Technology: VERY HIGH MEMORY REQUIREMENT!

  6. Deep phenotyping: An open source deep learning tool for phenotype/genotype classification

  7. DEEP LEARNING: 3 Phases of the Pipeline* for High-Throughput Phenotyping: CLASSIFICATION into different groups; PREDICTION of different plant characteristics; IDENTIFICATION of genetic makeup of various plants. Multilayer Perceptron (MLP) and Artificial Neural Network (ANN)

  8. MLP RESULTS: FAST & ACCURATE. ACCURACY: 87%. PROCESSING TIME: 00:03:13

  9. ANN RESULTS: FAST & ACCURATE. ACCURACY: 78%. PROCESSING TIME: 00:02:18

  10. Future Work. Key features to be implemented: • Crowdsourced Phenotype Dataset • Crop Geotagging with KYC Verification • Automated Standardization • Hello World Phenotyping • Cross validation, Ensemble Models & Feature Engineering

  11. THE TEAM: Alec Xavier Manabat, Design Engineer, Analog Devices, Inc.; Shanelle Grace Recheta, Data Science Consultant, FTW Foundation | Cropital. TOWARDS INCLUSIVITY IN RICE SCIENCE!