log0
June 09, 2014

Computational Challenges in Precision Medicine and Genomics - Gary Bader - 2014

Transcript

  1. PRECISION MEDICINE • TRADITIONAL MEDICINE, WITH MORE DATA • DIAGNOSIS:

    ASSIGNING PATIENTS TO GROUPS – BIOLOGY, DISEASE PROGRESSION, TREATMENT RESPONSE • PERSONALIZED, BUT NOT EVERYONE HAS A DIFFERENT DISEASE NATURE MEDICINE 19, 249 (2013) DOI:10.1038/NM0313-249
  2. NATIONAL COMPREHENSIVE CANCER NETWORK (NCCN) Breast Cancer: Noninvasive,

    Invasive, Lobular Carcinoma In Situ, Ductal Carcinoma In Situ, Lobular Carcinoma, Ductal Carcinoma, Inflammatory
  3. IMPROVING PRECISION WITH GENOMICS • BRCA1/BRCA2 MUTATIONS PREDICT RISK •

    COMMERCIAL PROGNOSTIC TESTS BASED ON GENE SIGNATURES HTTP://THEBIGCANDME.BLOGSPOT.CA/
  4. GENOMICS • NEW TECHNOLOGY FOR READING/WRITING DNA • MEASURE OUR

    GENETIC CODE AND SYSTEM STATE • LOTS OF VARIABLES – WHOLE GENOME, TRANSCRIPT AND PROTEIN EXPRESSION, SPLICING, CHROMATIN STRUCTURE, MOLECULAR INTERACTION, TRANSCRIPTION FACTOR, METHYLATION, METABOLITE, PATIENT PHENOTYPE
  5. 2

  6. HTTP://WWW.LHSC.ON.CA/ SOURCE CODE ON DISK LOAD TO ACTIVE MEMORY COMPILER

    RUNNING SOFTWARE ACTIVE MEMORY 4 LETTER CODE (DNA/RNA BASES) 20 LETTER CODE (AMINO ACIDS) MEEPQSDPSVEPPLSQETFSDLWKLLPEN… GATGGGATTGGGGTTTTCCCCTCCCAT…
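The step from the 4-letter code to the 20-letter code can be sketched with the standard genetic code. This is a minimal illustration, not material from the slide: the input string `example_dna` is a made-up sequence chosen so that it encodes the first five residues of the protein fragment shown above (MEEPQ), and the function and variable names are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence, codon by codon, until a stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical input: five codons followed by a stop codon.
example_dna = "ATGGAGGAGCCGCAGTAA"
example_protein = translate(example_dna)
print(example_protein)
```

The sketch ignores reading-frame detection, splicing and the RNA intermediate, all of which a real pipeline would have to handle.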
  7. DNA SEQUENCING • RECENT MASSIVE BREAKTHROUGH • CURRENT TECH: –

    ~10 HUMAN GENOMES, 1TB DATA/6 DAY RUN ILLUMINA, GEORGE CHURCH
  8. FEB. 1, 2013: DR. LEE HOOD RECEIVES HIS NATIONAL MEDAL

    OF SCIENCE FROM PRESIDENT OBAMA AT WHITE HOUSE CEREMONY
  9. MORE BREAKTHROUGHS COMING WWW.NANOPORETECH.COM 20-NODE INSTALLATION = COMPLETE HUMAN GENOME

    IN 15 MINUTES MINION = USB CONNECTION, MINIMAL SAMPLE PREPARATION, $1000 DEVICE + CONSUMABLES
  10. WHERE DOES THE DATA COME FROM? BARODA, INDIA TORONTO, CANADA

    VERMONT, USA CAMBRIDGE, UK MOLECULAR BIOLOGY LABS AROUND THE WORLD
  11. COMPUTING NEEDS: 1 HUMAN GENOME • ~125 BASE READ LENGTH

    X MILLIONS • >30X COVERAGE • ALIGNMENT TO REFERENCE GENOME • COMPUTE VARIANTS (MUTATIONS) • ANNOTATE VARIANTS • COMPUTE TIME: UP TO 2 DAYS/GENOME – OPTIMIZED 4 HOURS: 128G/2CPU/SSD, 3.1GHZ • MEDICALLY IMPORTANT TO BE FAST
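The figures on this slide can be tied together with a short back-of-envelope calculation. The read length, coverage and run times come from the slide; the genome size of roughly 3.1 billion bases is an assumption added here, and the function name is hypothetical.

```python
# Assumption (not on the slide): a human genome is roughly 3.1 billion bases.
GENOME_SIZE_BASES = 3_100_000_000
READ_LENGTH = 125   # bases per read, from the slide
COVERAGE = 30       # the slide asks for more than 30x; 30 is the lower bound


def reads_needed(genome_size, read_length, coverage):
    """Number of reads needed so that each base is covered `coverage` times on average."""
    total_bases = genome_size * coverage
    return -(-total_bases // read_length)  # integer ceiling division


n_reads = reads_needed(GENOME_SIZE_BASES, READ_LENGTH, COVERAGE)
print(f"{n_reads:,} reads")  # hundreds of millions of reads per genome

# Speed-up of the optimized pipeline quoted on the slide: 2 days versus 4 hours.
speedup = (2 * 24) / 4
print(f"{speedup:.0f}x faster")
```

Under these assumptions a single genome needs several hundred million reads, which is why alignment and variant calling dominate the compute time.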
  12. THE POWER OF GENOMICS IN MEDICINE • 7000 RARE MONOGENIC

    DISEASES – 50% HAVE A KNOWN GENE RESPONSIBLE – QUADRUPLED RATE OF IDENTIFICATION SINCE 2012 • BRAIN DOPAMINE-SEROTONIN VESICULAR TRANSPORT DISEASE AND ITS TREATMENT – TWO YEARS FROM DISEASE DEFINITION TO GENE IDENTIFICATION TO TREATMENT NAT REV GENET. 2013 OCT;14(10):681-91 N ENGL J MED. 2013 FEB 7;368(6):543-5
  13. CANCER GENOMICS • GERM LINE VS. SOMATIC MUTATIONS • AIM:

    IDENTIFY FREQUENT MUTATIONS IN CANCER • >11,000 TUMOUR GENOMES, 9M MUTATIONS HUMAN COLORECTAL CARCINOMA HTTPS://DCC.ICGC.ORG/
  14. COMPUTING CHALLENGES • EXPONENTIAL DATA GROWTH (>MOORE’S LAW) – BILLIONS

    OF GENOMES – SIZE: >100GB/HUMAN GENOME, 4GB PROCESSED, MBS (JUST MUTATIONS) • HETEROGENEOUS, NOISY, COMPLEX DATA – DATA SCIENTISTS, DOMAIN EXPERTS
  15. COMPUTATIONAL BIOLOGY • RESEARCH: USING COMPUTERS TO ANSWER BIOLOGICAL/BIOMEDICAL QUESTIONS

    • EXPLORE, INTERPRET AND DISCOVER: SEARCH • SPEED AND ACCURACY: ALGORITHMS • PREDICTING FUNCTIONAL MUTATIONS, PATIENT CLASSIFICATION: MACHINE LEARNING • PRIVACY: DIFFERENTIAL PRIVACY, ENCRYPTION • USABLE APPLICATIONS: SOFTWARE ENGINEERING
  16. MedSavant search engine for genetic variants WWW.MEDSAVANT.COM Developers: Marc Fiume,

    James Vlasblom, Ron Ammar, Orion Buske, Eric Smith, Andrew Brook, Misko Dzamba, Khushi Chachcha, Sergiu Dumitriu Scientific Advisors: Christian Marshall, Kym Boycott, Marta Girdea, Peter Ray, Gary Bader, Michael Brudno
  17. GLIOBLASTOMA MULTIFORME (N=215) GOLDENBERG, BRUDNO NATURE METHODS, 2014 IDENTIFY DISEASE

    SUBTYPE SURVIVAL CLUSTERING SPEED DATA FUSION (NON-LINEAR, MESSAGE PASSING), UNSUPERVISED CLUSTERING
  18. PREDICT TREATMENT RESPONSE • SUPERVISED MACHINE LEARNING E.G. RHEUMATOID ARTHRITIS

    METHOTREXATE RESPONSE • [Figure: Personal Medical Network. Nodes labelled A, B and New; legend: Responder, Non-Responder, New patient (Predicted Non-Responder), Weakly similar, Highly similar, Response to treatment, Similar e.g. SNP, smoking status.] SHIRLEY HUI, RUTH ISSERLIN, HUSSAM KACA, TABITHA KUNG, KATHY SIMINOVITC
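One simple way to read the patient-network idea on this slide is as a similarity-weighted vote: a new patient inherits the response label of the patients they most resemble. The sketch below is an illustration under that assumption only; it is not the method used by the authors, and the feature names, patient records and function names are all hypothetical.

```python
def shared_fraction(a, b):
    """Similarity = fraction of shared features on which two patients agree."""
    keys = a.keys() & b.keys()
    if not keys:
        return 0.0
    return sum(a[k] == b[k] for k in keys) / len(keys)


def predict_response(new_patient, known_patients, similarity=shared_fraction):
    """Return the response label with the largest total similarity weight."""
    scores = {}
    for features, label in known_patients:
        weight = similarity(new_patient, features)
        scores[label] = scores.get(label, 0.0) + weight
    return max(scores, key=scores.get)


# Hypothetical records: (features, observed response to treatment).
known = [
    ({"snp": "A", "smoker": True}, "non-responder"),
    ({"snp": "A", "smoker": False}, "non-responder"),
    ({"snp": "G", "smoker": False}, "responder"),
]
new_patient = {"snp": "A", "smoker": True}
prediction = predict_response(new_patient, known)
print(prediction)
```

A real classifier would be trained and validated on held-out patients; the vote here only shows how "highly similar" and "weakly similar" neighbours could contribute different weights.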
  19. EXPLAINING GENOMICS DATA • SNAPSHOTS OF SYSTEM STATE – E.G.

    CANCER VS. NORMAL • EXPLAIN WHY STATES DIFFER – E.G. REGULATOR PERTURBATION – CAUSAL MODELING – PRIOR KNOWLEDGE ABOUT MECHANISM: PATHWAYS WITT H ET AL. CANCER CELL. 2011 AUG 16;20(2):143-57
  20. [Figure: network of enriched gene-sets, with a zoom of the CNS-Development cluster.]

    Gene-set labels: Microtubule Cytoskeleton; Cell Projection & Cell Motility; Cell Proliferation; Glycosylation; Adhesion; Regulation of GTPase; Kinase Activity/Regulation; CNS Development; Intellectual Disability; Autism; GTPase/Ras Signaling; Regulation of cell proliferation; Positive regulation of cell proliferation; Tyrosine kinase; Vasculature development; Palate development; Organ Morphogenesis; Behavior; Heart development; RHO; Ras; Membrane; Kinase regulation; Cell Motility (stricter cluster); Centrosome; Nucleolus; Cell cycle; Regulation of hormone levels; Amino acid derivative / amine metabolism; Synaptic vesicle maturation; Reelin pathway; LIS1 in neuronal migration and development; Negative regulation of cell cycle; cKIT pathway; mTor pathway; Zn finger domain; Carboxyl esterase domain; Ras signaling; GTPase regulator; Neuron migration; Cell morphogenesis; Cell projection organization; Brain development; Neurite development; CNS neuron differentiation; Axonogenesis; Projection neuron axonogenesis; Cerebral cortex cell migration; SMC flexible hinge domain; Urea and amine group metabolism; MHC-I. Legend: Node type (gene-set): enriched in deletions (FDR 0% to 12.5%); known disease genes (ID, ASD, both); enriched only in disease genes. Edge type (gene-set overlap): from disease genes to enriched gene-sets; between gene-sets enriched in deletions; between sets enriched in deletions and in disease genes, or between disease sets only. Pinto et al. Functional impact of global rare copy number variation in autism spectrum disorders. Nature. 2010 Jun 9.
  21. Zoom of CNS-Development: Neuron migration; Cell Motility (stricter cluster);

    Cell morphogenesis; Cell projection organization; CNS development; Brain development; Neurite development; CNS neuron differentiation; Axonogenesis; Projection neuron axonogenesis; Cerebral cortex cell migration
  22. PATHWAY ASSOCIATION TEST (DANIELE MERICO) • IF WE HAVE AT LEAST

    ONE CNV AFFECTING AT LEAST ONE GENE IN A CERTAIN PATHWAY GSi, THEN WE HAVE A PERTURBATION POTENTIAL IN THAT PATHWAY • WE COUNT THE PRESENCE / ABSENCE OF SUCH PERTURBATION POTENTIAL IN PATIENTS

    [Figure: CNV-affected genes in pathway GSi. Patient #1: count = 1; Patient #2: count = 1; Patient #3: count = 1; Patient #i: count = 0.]

    Gene-set by patient count table:

        Gene-set   Patient #1   Patient #2   Patient #3   …   Patient #i   …   Patient #n
        GS1        1            1            1            …   0            …   0
        GS2        0            0            1            …   1            …   0
        GS3        0            0            0            …   0            …   0
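The presence / absence counting described on this slide can be sketched as a small table-building function. The gene names, gene-set contents and patient records below are hypothetical placeholders chosen so that the output matches the first three patient columns of the slide's table; the function name is also an assumption.

```python
def perturbation_matrix(gene_sets, patient_cnv_genes):
    """For each gene-set, mark 1 for every patient with at least one
    CNV-affected gene in that set, and 0 otherwise."""
    return {
        set_name: [
            1 if genes & affected else 0
            for affected in patient_cnv_genes.values()
        ]
        for set_name, genes in gene_sets.items()
    }


# Hypothetical gene-sets (pathways) and CNV-affected genes per patient.
gene_sets = {
    "GS1": {"GENE_A", "GENE_B"},
    "GS2": {"GENE_C"},
    "GS3": {"GENE_D"},
}
patient_cnv_genes = {
    "Patient #1": {"GENE_A"},
    "Patient #2": {"GENE_B", "GENE_X"},
    "Patient #3": {"GENE_A", "GENE_C"},
}

matrix = perturbation_matrix(gene_sets, patient_cnv_genes)
for set_name, row in matrix.items():
    print(set_name, row)
```

Each row records only whether a pathway is hit, not how many genes are hit, which matches the "at least one CNV affecting at least one gene" rule stated above.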
  23. PATHWAY ASSOCIATION TEST • DESCRIPTION: • THE SIGNIFICANCE OF A GENE-SET IS THEN ASSESSED USING

    FISHER’S EXACT TEST FOR ASSOCIATION • A SIGNIFICANT GENE-SET IS AFFECTED BY A MUTATION POTENTIAL MORE FREQUENTLY IN CASES THAN CONTROLS • THE FDR IS ESTIMATED BY SHUFFLING THE COLUMNS IN THE ‘GENE-SET BY PATIENT’ COUNT TABLE

    Contingency table for gene-set GSi:

                      Case         Control
        GSi           13           1
        Not in GSi    1146 - 13    889 - 1

    Gene-set by patient count table:

        Gene-set   Patient #1   Patient #2   Patient #3   …   Patient #i   …   Patient #n
        GS1        1            1            1            …   0            …   0
        GS2        0            0            1            …   1            …   0
        GS3        0            0            0            …   0            …   0
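The association test on this slide can be reproduced for the counts shown (13 of 1146 cases and 1 of 889 controls affected in gene-set GSi). The sketch below computes a one-sided Fisher's exact test from the hypergeometric distribution using only the standard library. The one-sided alternative ("more frequent in cases") is an assumption based on the wording of the slide, the function name is hypothetical, and the shuffling step used for the FDR is not covered here.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns the probability, with all margins fixed, of seeing a top-left
    count of `a` or more (hypergeometric upper tail).
    """
    row1 = a + b          # patients affected in the gene-set
    col1 = a + c          # all cases
    total = a + b + c + d
    denominator = comb(total, col1)
    k_max = min(row1, col1)
    numerator = sum(
        comb(row1, k) * comb(total - row1, col1 - k)
        for k in range(a, k_max + 1)
    )
    # Exact integer arithmetic; converted to float only at the end.
    return numerator / denominator


cases, controls = 1146, 889
affected_cases, affected_controls = 13, 1
p_value = fisher_exact_greater(
    affected_cases,
    affected_controls,
    cases - affected_cases,
    controls - affected_controls,
)
print(f"one-sided p = {p_value:.2e}")
```

Because many gene-sets are tested, the raw p-value is not interpreted on its own; the slide corrects for this with a permutation-based FDR.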
  24. BENEFITS OF SYSTEMS THINKING • IMPROVES STATISTICAL POWER – FEWER

    TESTS • MORE REPRODUCIBLE – E.G. GENE EXPRESSION SIGNATURES • EASIER TO INTERPRET – FAMILIAR CONCEPTS E.G. CELL CYCLE • IDENTIFIES MECHANISM – CAN EXPLAIN CAUSE VS. PARTS THINKING
  25. THE FACTOID PROJECT MAX FRANZ, IGOR RODCHENKOV, OZGUN BABUR, EMEK

    DEMIR, CHRIS SANDER HELPING AUTHORS DIGITIZE THEIR PUBLISHED KNOWLEDGE HTTP://FACTOID.BADERLAB.ORG/
  26. NETWORK VISUALIZATION AND ANALYSIS UCSD, ISB, AGILENT, MSKCC, PASTEUR, UCSF

    HTTP://CYTOSCAPE.ORG PATHWAY COMPARISON LITERATURE MINING GENE ONTOLOGY ANALYSIS ACTIVE MODULES COMPLEX DETECTION NETWORK MOTIF SEARCH
  27. GENE FUNCTION PREDICTION HTTP://WWW.GENEMANIA.ORG QUAID MORRIS (DONNELLY) RASHAD BADRAWI, OVI

    COMES, SYLVA DONALDSON, MAX FRANZ, CHRISTIAN LOPES, FARZANA KAZI, JASON MONTOJO, HAROLD RODRIGUEZ, KHALID ZUBERI • GUILT-BY-ASSOCIATION PRINCIPLE • BIOLOGICAL NETWORKS ARE COMBINED INTELLIGENTLY TO OPTIMIZE PREDICTION ACCURACY • ALGORITHM IS FASTER AND MORE ACCURATE THAN ITS PEERS
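The guilt-by-association principle can be illustrated with a deliberately simple scoring rule: a candidate gene scores highly when it is strongly linked to genes already known to have the function of interest. This is a toy sketch, not the GeneMANIA algorithm; it uses a single network with fixed edge weights, and the gene names, weights and function name are hypothetical.

```python
def rank_candidates(network, seed_genes):
    """Rank non-seed genes by the total weight of their edges to seed genes.

    `network` maps an undirected edge (gene1, gene2) to a weight.
    """
    scores = {}
    for (gene1, gene2), weight in network.items():
        for gene, neighbour in ((gene1, gene2), (gene2, gene1)):
            if neighbour in seed_genes and gene not in seed_genes:
                scores[gene] = scores.get(gene, 0.0) + weight
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical interaction network and genes with a known function.
network = {
    ("GENE_A", "GENE_B"): 0.9,
    ("GENE_B", "GENE_C"): 0.4,
    ("GENE_C", "GENE_D"): 0.7,
    ("GENE_A", "GENE_D"): 0.2,
}
seed_genes = {"GENE_A", "GENE_C"}

ranking = rank_candidates(network, seed_genes)
for gene, score in ranking:
    print(gene, round(score, 2))
```

Combining several networks, as the slide describes, would amount to choosing a weight per network before summing; how those weights are optimized is outside this sketch.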
  28. SOCIAL CHALLENGES • BIOETHICS AND DATA SHARING • ENGAGING RESEARCHERS

    – CROWDSOURCING: TCGA PAN CANCER, DREAM • ENCOURAGING RESEARCHERS TO EXPLORE UNCHARTED TERRITORY • NEED FOR QUANTITATIVE THINKING IN BIOLOGY – NEW PH.D. PROGRAM IN THE MOLECULAR GENETICS DEPARTMENT AT THE UNIVERSITY OF TORONTO NATURE. 2011 FEB 10;470(7333):163 WWW.NATURE.COM/TCGA/
  29. EPENDYMOMA • 3RD MOST COMMON BRAIN TUMOUR IN CHILDREN •

    INCURABLE IN UP TO 45% OF PATIENTS STEVE MACK, MICHAEL TAYLOR, RUTH ISSERLIN - CANCER CELL. 2011 AUG 16;20(2):143-57 GENE EXPRESSION PATIENT AGE OVERALL SURVIVAL
  30. EPENDYMOMA GENOMIC ANALYSIS • EPENDYMOMA BRAIN CANCER - MOST COMMON

    AND MORBID LOCATION FOR CHILDHOOD IS THE POSTERIOR FOSSA (PF = BRAINSTEM + CEREBELLUM) • TWO SUBTYPES BY GENE EXPRESSION: PFA - YOUNG, DISMAL PROGNOSIS, PFB - OLDER, EXCELLENT PROGNOSIS. • WHOLE GENOME SEQUENCING (47 SAMPLES) SHOWED ALMOST NO MUTATIONS, HOWEVER DNA METHYLATION ARRAYS SHOWED CLEAR CLUSTERING INTO PFA AND PFB (79 SAMPLES) • PFA MORE TRANSCRIPTIONALLY SILENCED BY CPG METHYLATION STEVE MACK, MICHAEL TAYLOR, SCOTT ZUYDERDUYN NATURE, FEB. 2014
  31. POLYCOMB REPRESSOR COMPLEX 2 – INHIBITED BY DZNEP AND GSK343

    – KILLED PFA CELLS NO KNOWN TREATMENT, SO NOW GOING TO CLINICAL TRIAL, COMPASSIONATE USE IN ONE PATIENT
  32. 2 MONTHS 3 MONTHS 3 CYCLES VIDAZA 9 YO WITH

    METASTATIC PF EPENDYMOMA TO LUNG TREATED WITH AZACYTIDINE TREATMENT OF METASTATIC PF EPENDYMOMA WITH VIDAZA MICHAEL TAYLOR
  33. ACKNOWLEDGEMENTS BADER LAB DOMAIN INTERACTION TEAM SHOBHIT JAIN BRIAN LAW

    JÜRI REIMAND MOHAMED HELMY ANDREA UETRECHT MARINA OLHOVSKY CANCER GENOMICS FLORENCE CAVALLI DAVID SHIH ASHA ROSTAMIANFAR PRECISION MEDICINE RON AMMAR SHIRLEY HUI FUNDING HTTP://BADERLAB.ORG PATHWAY AND NETWORK ANALYSIS RUTH ISSERLIN IGOR RODCHENKOV SCOTT ZUYDERDUYN RUTH WONG VERONIQUE VOISIN SHAHEENA BASHIR KHALID ZUBERI CHRISTIAN LOPES JASON MONTOJO MAX FRANZ HAROLD RODRIGUEZ