Finding TF binding sites
SNPs and linkage analysis
Quality value
HMM ….
Not many practical neural network approaches before 2012, AFAIK
Brief Bioinform. 2006;7(1):86-112. doi:10.1093/bib/bbk007
before the 21st century.
What we do now in 2018
All the “magic” of deep learning in 5 lines:
Auto-differentiation
Back-propagation
Stochastic Gradient Descent
Published in 1990
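As a rough illustration of how these three ideas fit together, the sketch below implements a toy scalar auto-differentiation class in plain Python, back-propagates through a squared-error loss, and applies stochastic gradient descent to fit a line. The `Value` class, the `fit_line` helper, the target line y = 2x + 1, and the learning rate are all assumptions made for illustration. Real frameworks do the same bookkeeping on tensors rather than scalars.

```python
import random


class Value:
    """Scalar that remembers how it was computed so gradients can flow back."""

    def __init__(self, data, parents=(), grad_fns=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        # One function per parent: maps the output gradient to the parent's share.
        self._grad_fns = grad_fns

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data + other.data, (self, other),
                     (lambda g: g, lambda g: g))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data * other.data, (self, other),
                     (lambda g: g * other.data, lambda g: g * self.data))

    def backward(self):
        """Back-propagation: apply the chain rule in reverse topological order."""
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            for parent, grad_fn in zip(node._parents, node._grad_fns):
                parent.grad += grad_fn(node.grad)


def fit_line(steps=2000, lr=0.05, seed=0):
    """Fit y = w*x + b to noiseless toy data, one random example per step."""
    rng = random.Random(seed)
    w, b = Value(0.0), Value(0.0)
    data = [(x / 10.0, 2.0 * (x / 10.0) + 1.0) for x in range(-10, 11)]
    for _ in range(steps):
        x, y = rng.choice(data)           # "stochastic": a single sampled example
        err = w * x + b + (-y)
        loss = err * err                  # squared error
        w.grad = b.grad = 0.0             # clear gradients from the previous step
        loss.backward()                   # auto-differentiation fills w.grad, b.grad
        w.data -= lr * w.grad             # gradient descent update
        b.data -= lr * b.grad
    return w.data, b.data


w, b = fit_line()
print(w, b)  # should end up close to 2 and 1
```

The training loop never writes a derivative by hand: the loss is built from `+` and `*`, and `backward()` recovers the gradients from the recorded computation.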
the model
Gather training dataset, clean up data
Build model (typically a multi-layer NN + nonlinear activation)
Train model
Evaluate training results, training set / validation set
Re-train model if necessary
Apply model on the test set
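The workflow steps on this slide can be sketched end to end in a few dozen lines of plain Python. Everything here is a toy assumption: the data are random 20-mers labeled by GC content, a single sigmoid unit stands in for the multi-layer network, and the function names, split sizes, and the 0.9 validation threshold are hypothetical.

```python
import math
import random


def make_dataset(n, rng):
    """Toy stand-in for 'gather and clean data': random 20-mers labeled by GC content."""
    rows = []
    for _ in range(n):
        seq = "".join(rng.choice("ACGT") for _ in range(20))
        gc = sum(base in "GC" for base in seq) / len(seq)
        # Feature rescaled to [-1, 1]; label is 1 when more than half the bases are G or C.
        rows.append((2.0 * gc - 1.0, 1 if gc > 0.5 else 0))
    return rows


def predict(w, b, x):
    """'Build model': one linear unit followed by a sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))


def train(rows, epochs, lr, rng):
    """'Train model': per-example gradient steps on the cross-entropy loss."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        rng.shuffle(rows)  # shuffles the caller's list in place
        for x, y in rows:
            p = predict(w, b, x)
            w -= lr * (p - y) * x
            b -= lr * (p - y)
    return w, b


def accuracy(rows, w, b):
    """Fraction of examples whose thresholded prediction matches the label."""
    return sum((predict(w, b, x) > 0.5) == (y == 1) for x, y in rows) / len(rows)


def run_workflow(seed=0):
    rng = random.Random(seed)
    data = make_dataset(1000, rng)
    train_set, valid_set, test_set = data[:600], data[600:800], data[800:]

    w, b = train(train_set, epochs=20, lr=0.5, rng=rng)
    valid_acc = accuracy(valid_set, w, b)

    # 'Re-train if necessary': train longer when validation accuracy is poor.
    if valid_acc < 0.9:
        w, b = train(train_set, epochs=100, lr=0.5, rng=rng)
        valid_acc = accuracy(valid_set, w, b)

    # The test set is touched only once, after all training decisions are made.
    return {
        "sizes": (len(train_set), len(valid_set), len(test_set)),
        "valid_accuracy": valid_acc,
        "test_accuracy": accuracy(test_set, w, b),
    }


print(run_workflow())
```

Keeping the validation set separate from the test set is the point of the split: the decision to re-train is made on validation data, so the test accuracy remains an honest estimate.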
of “A”, “C”, “G”, and “T”
• Build Applications of Deep Learning in Genomics
• Convert strings of “A”, “C”, “G”, and “T” to “tensors”
• Some “ground truth” so we can train a network from sequence tensors to the answer.
https://www.cc.gatech.edu/~san37/post/dlhc-start/
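One common way to turn a DNA string into a tensor is a one-hot encoding with one row per base and one column per letter. The sketch below uses nested Python lists instead of a real tensor library, and the column order A, C, G, T and the all-zero row for unknown bases such as N are assumptions, not something the slide specifies.

```python
BASES = "ACGT"  # assumed column order


def one_hot(seq):
    """Encode a DNA string as a len(seq) x 4 nested list.

    Bases outside A/C/G/T (for example N) become an all-zero row.
    """
    return [[1.0 if base == letter else 0.0 for letter in BASES]
            for base in seq.upper()]


for row in one_hot("ACGTN"):
    print(row)
```

A training example is then a pair of such a tensor and its “ground truth” label, for instance `(one_hot("ACGT"), 1)`.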
Test data on new platform BGI-SEQ
• Training data preparation
• Software package management
• Post-variant call data processing
https://blog.dnanexus.com/2018-05-31-training-and-applying-genomic-deep-le
experiments, e.g., transcription factor binding with ChIP-seq, or a chromatin accessibility assay (ATAC-seq).
Want to know: sequence features in the reference that determine the corresponding experimental measurement outcome
Input tensor: “One-hot encoding” of candidate regions of the reference genome
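To make “sequence features” concrete: the first layer of such a network is often a convolution whose filters behave like motif detectors. The sketch below slides one hand-written filter along a sequence. The filter weights, the `scan_filter` name, and the example sequence are hypothetical; in a trained model the weights would be learned from the experimental labels. Indexing the filter by the observed base gives the same result as a dot product between the filter and the one-hot encoded window.

```python
BASE_INDEX = {"A": 0, "C": 1, "G": 2, "T": 3}  # assumed column order


def scan_filter(seq, weights):
    """Slide a (width x 4) filter along seq and return one score per offset."""
    width = len(weights)
    scores = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        # Picking weights[j][index of base] equals the dot product of the
        # filter with the one-hot encoded window; unknown bases contribute 0.
        scores.append(sum(weights[j][BASE_INDEX[base]]
                          for j, base in enumerate(window)
                          if base in BASE_INDEX))
    return scores


# Hypothetical filter that rewards the motif "TATA" (columns: A, C, G, T).
TATA_FILTER = [
    [0.0, 0.0, 0.0, 1.0],  # position 0 prefers T
    [1.0, 0.0, 0.0, 0.0],  # position 1 prefers A
    [0.0, 0.0, 0.0, 1.0],  # position 2 prefers T
    [1.0, 0.0, 0.0, 0.0],  # position 3 prefers A
]

scores = scan_filter("GGTATACC", TATA_FILTER)
print(scores)                      # [2.0, 0.0, 4.0, 0.0, 2.0]
print(scores.index(max(scores)))   # 2, the offset where "TATA" starts
```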
project has collected most of the sequence feature detection models that have been developed.
We are working to deploy Kipoi on the DNAnexus platform, which is designed to be closer to the genomic data.
http://kipoi.org
Žiga Avsec, Jun Cheng and Julien Gagneur, Technical University of Munich
Roman Kreuzhuber, Lara Urban and Oliver Stegle, European Bioinformatics Institute
Johnny Israeli, Avanti Shrikumar, Chuan Foo and Anshul Kundaje, Stanford University
start?
Build your own computer + GPU
AWS, Google Cloud, Microsoft Azure
DNAnexus platform
• Close to the genomic data
• Interactive work
• Batch processing