JBrowse and Inter-"Mine" Communication - IMDEV 2017

Lightning Talk about InterMine/JBrowse integration and extensions to Inter-"Mine" Communication, presented at the 2017 InterMine Developer Workshop and Hackathon (IMDEV 2017) held at the Joint Genome Institute (JGI) in Walnut Creek, CA

Presented on Thursday, March 30th 2017 by Vivek Krishnakumar

Transcript

  1. JBrowse & Inter-mine communication
    Vivek Krishnakumar
    J. Craig Venter Institute
    [email protected]
    2017 InterMine Workshop and Hackathon
    Thursday, March 30th @ 11:10am

  2. Overview
    ➔InterMine <-> JBrowse Integration
    ➔Extending FriendlyMines
    ➔MedicMine & ThaleMine
    ➔Summary
    ➔Acknowledgements
    ➔FIN

  3. InterMine <-> JBrowse integration
    ➔JBrowse & REST API spec
    ➔Components in InterMine <-> JBrowse integration
    ➔Issues with initial implementation
    ➔Fixes implemented in InterMine core and JBrowse

  4. JBrowse & REST API spec
    • JBrowse (http://jbrowse.org)
      • JavaScript-based portable genome browser tool developed by the GMOD community
      • Ships with two REST Store adapters: Features and Names
    • REST Features Store handles genomic features within a specified region:
      • :base/stats/global
      • :base/stats/region/:refseq?start=:start&end=:end
      • :base/stats/regionFeatureDensities/:refseq?start=:start&end=:end&basesPerBin=20000
      • :base/features/:refseq?start=:start&end=:end
    • REST Names Store handles search and retrieval of features by identifier:
      • :base/names/?startswith=:searchterm
      • :base/names/?equals=:searchterm
    • Accepted response type: application/json
    http://gmod.org/wiki/JBrowse_Configuration_Guide#Writing_JBrowse-compatible_Web_Services
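
The endpoint patterns on this slide can be sketched as small URL builders. This is only an illustration of the URL shapes listed above: the function names and the example base URL are hypothetical and are not part of JBrowse or InterMine.

```python
from urllib.parse import quote, urlencode


def features_url(base, refseq, start, end):
    """Build a URL of the form :base/features/:refseq?start=:start&end=:end."""
    query = urlencode({"start": start, "end": end})
    return f"{base.rstrip('/')}/features/{quote(refseq, safe='')}?{query}"


def region_stats_url(base, refseq, start, end):
    """Build a URL of the form :base/stats/region/:refseq?start=:start&end=:end."""
    query = urlencode({"start": start, "end": end})
    return f"{base.rstrip('/')}/stats/region/{quote(refseq, safe='')}?{query}"


def names_url(base, term, exact=False):
    """Build a Names Store URL; 'equals' for exact match, else 'startswith'."""
    key = "equals" if exact else "startswith"
    return f"{base.rstrip('/')}/names/?{urlencode({key: term})}"


if __name__ == "__main__":
    # Hypothetical base URL, used only to show the resulting URL shapes.
    base = "https://example.org/rest-store"
    print(features_url(base, "Chr1", 0, 50000))
    print(region_stats_url(base, "Chr1", 0, 50000))
    print(names_url(base, "AT1G010"))
```

A client would request these URLs and expect an application/json response, as noted on the slide.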

  5. Components in InterMine <-> JBrowse integration
    • populate-child-features postprocessor task
      • Processes SequenceFeature entities to ensure that the feature hierarchy is correctly structured (based on SO)
    • Genomic feature Engine (API)
      • Returns genomic feature data in JBrowse-compatible JSON format
    • Config generator service (API)
      • Returns feature tracks (trackList.json) & reference sequence listing (refSeqs.json)
    • Names service (API)
      • Returns list of genomic features matching the input search term
    • Embedded ReportDisplayer
      • Injects (via iframe) JBrowse visualization into InterMine report page
    http://intermine.readthedocs.io/en/latest/webapp/third-party-tools/jbrowse/
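
To make the "feature hierarchy" idea concrete, here is a minimal sketch of nesting flat parent/child records into a JSON structure with `subfeatures` lists, the general shape a JBrowse REST features response uses. It is not InterMine's postprocessor or feature engine (those are part of the InterMine code base); the `parent` key, the function name, and all coordinates are assumptions made up for illustration.

```python
import json


def nest_features(flat):
    """Nest flat records (each with a 'uniqueID' and optional 'parent') into a tree.

    Children are appended to their parent's 'subfeatures' list; records
    without a parent become top-level features.
    """
    nodes = {}
    for rec in flat:
        node = {k: v for k, v in rec.items() if k != "parent"}
        node["subfeatures"] = []
        nodes[rec["uniqueID"]] = node

    roots = []
    for rec in flat:
        node = nodes[rec["uniqueID"]]
        parent = rec.get("parent")
        if parent is None:
            roots.append(node)
        else:
            nodes[parent]["subfeatures"].append(node)
    return {"features": roots}


if __name__ == "__main__":
    # Invented coordinates; only the gene -> mRNA -> exon nesting matters here.
    flat = [
        {"uniqueID": "gene:1", "type": "gene", "start": 0, "end": 900, "parent": None},
        {"uniqueID": "mRNA:1", "type": "mRNA", "start": 0, "end": 900, "parent": "gene:1"},
        {"uniqueID": "exon:1", "type": "exon", "start": 0, "end": 300, "parent": "mRNA:1"},
        {"uniqueID": "exon:2", "type": "exon", "start": 600, "end": 900, "parent": "mRNA:1"},
    ]
    print(json.dumps(nest_features(flat), indent=2))
```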

  6. Issues with initial implementation
    • populate-child-features postprocessor
      • Failed to correctly construct Gene model JSON object
        |- gene:1
        |----- mRNA:1
        |----- exon:1
        |----- exon:2
        |----- mRNA:2
        |----- exon:3
    • Feature Engine & Config generator
      • Failed to properly convert InterMine Class names to SO terms
        e.g. AntisenseLncRNA → antisense_lncRNA
      • Failed to specify transcript type in track configuration
    • Data issues
      • Failed to represent Gene model CDS entity as disjointed parts due to CDS loading mechanism
        e.g. FlyBaseCDSFastaLoader, AraportCDSFastaLoader, ChadoSequenceProcessor

    Fixes made to InterMine & JBrowse core
    • populate-child-features postprocessor
      • Improved logic populates the childFeatures collection appropriately
        |- gene:1
        |----- mRNA:1
        |----- exon:1
        |----- exon:2
        |----- mRNA:2
        |----- exon:3
    • Feature Engine & Config generator
      • Automatically resolves IM Class to SO type
      • Ability to specify which tracks to display and define extra parameters via web.properties config
    • Fixes to ProcessedTranscript glyph (in JBrowse)
      • Automatically infers disjointed CDS parts from exon parts and total CDS span
    intermine/#1454
    intermine/#1426
    jbrowse/#872
    @sergiocontrino
    @justinccdev
    @vivekkrish
  7. Extending FriendlyMine Functionality
    ➔Current state of Inter-mine communication based on standard Homology data model
    ➔FriendlyMines extended to support alternate Homology data model(s)
    ➔Extending FriendlyMines to support new Entity types (e.g. SyntenyBlocks)

  8. Inter-mine communication using Homology data
    [Figure panels: Homologue data model; Friendly mine config in web.properties]

  9. Extending FriendlyMines to support alternate Homology model(s)
                      Standard Model                             PhytoMine Data Model
    Class Name        Homologue                                  Homolog
    Primary Entity    gene                                       gene1
    Related Entity    homologue                                  gene2
    Relationship      type (orthologue, paralogue, LDO, etc.)    type (NO VALUE)
    Method            NA                                         method (one-to-one, one-to-many, many-to-many)
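
The difference between the two models can be illustrated by mapping both onto one common record. This is a conceptual sketch only: the actual FriendlyMines link classes live in the InterMine code base, and the `HomologyLink` type, the converter functions, and the example gene identifiers below are hypothetical. Only the field names (gene, homologue, type, gene1, gene2, method) come from the slide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class HomologyLink:
    """Common view of a homology relation, independent of the source model."""
    gene: str
    homologue: str
    relationship: Optional[str]  # e.g. orthologue, paralogue (standard model)
    method: Optional[str]        # e.g. one-to-one (PhytoMine model)


def from_standard(row):
    """Standard model: Homologue with gene, homologue and type."""
    return HomologyLink(row["gene"], row["homologue"], row.get("type"), None)


def from_phytomine(row):
    """PhytoMine model: Homolog with gene1, gene2 and method; type has no value."""
    return HomologyLink(row["gene1"], row["gene2"], None, row.get("method"))


if __name__ == "__main__":
    # Placeholder identifiers, for illustration only.
    print(from_standard({"gene": "geneA", "homologue": "geneB", "type": "orthologue"}))
    print(from_phytomine({"gene1": "geneA", "gene2": "geneB", "method": "one-to-one"}))
```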

  10. FriendlyMines extended to support alternate Homology model(s)
    stable/e93905e
    @vivekkrish
    [Figure panels: Homolog data model; Friendly mine config in web.properties]

  11. Extend FriendlyMines to support new Entity type (Synteny Block)
    stable/7800633
    @vivekkrish
    @sammyjava

  12. Medicago Genome & Arabidopsis Information Portal
    ➔MedicagoGenome.org & MedicMine
    ➔Araport.org & ThaleMine
    ➔Additions to ThaleMine

  13. MedicagoGenome.org

  14. MedicMine - Data Warehouse
    Data Sources
    • Mt4.0: genome sequence, genome annotation
    • Orthologs (Phytozome, LIS)
    • eFP Browser (BAR)
    • RNA-Seq expression (NCBI SRA)
    • Publications (NCBI)
    • Pathways (KEGG)
    • Mt Gene Indices (DFCI)
    Source groupings shown in the diagram: Community, Real-time Data Centers
    Warehouse (MedicMine)
    • Gene List Analysis
    • Gene Report
    • Query, Web Services
    Krishnakumar et al. 2014, Plant Cell Physiol

  15. Araport.org

  16. ThaleMine - Data Warehouse
    Data Sources
    • Araport11: reannotation of Col-0 genome (Araport)
    • Orthologs (PhytoMine)
    • Co-expression (ATTED)
    • Protein Interactions (IntAct, BioGRID)
    • Gene expression (Araport, BAR)
    • Publications (NCBI, UniProt)
    • Pathways (KEGG)
    Source groupings shown in the diagram: Araport, Community, Real-time Data Centers
    Warehouse (ThaleMine)
    • Gene List Analysis
    • Gene Report
    • Query, Data Tables
    • Web Services
    Krishnakumar et al. 2016, Plant Cell Physiol

  17. Additions to ThaleMine
    [Screenshot labels]
    • Heatmap for 113 RNA-Seq Studies: Gene Report, Gene List Analysis (axes: RNA-seq study, Gene/Genes)
    • Gene Report (e.g. dicer-like 1/AT1G01040): Medicago, JGI PhytoMine, Human

  18. Summary
    • Issues plaguing the functionality of InterMine’s zero-config JBrowse integration have been resolved
    • DISCUSSION: Fold changes made to the JBrowse code base into InterMine (for maintainability)
    • FriendlyMine functionality extended by implementing link classes to support alternative Homology data model(s)
    • FriendlyMine functionality extended further to support Synteny-based interlinking of InterMine instances
    • Araport and MedicagoGenome.org implement InterMine warehouses, ThaleMine and MedicMine, respectively.

  19. Acknowledgments
    • JCVI
    • Christopher D. Town (PI: Araport, LegFed)
    • Agnes P. Chan (co-PI: Araport, LegFed)
    • Cambridge
    • Sergio Contrino (Araport Software Dev)
    • Gos Micklem (co-PI: Araport)
    • Justin Clark-Casey
    • Julie Sullivan
    • NCGR
    • Samuel Hokin (LIS Lead Software Dev)
    • Andrew Farmer (co-PI: LegFed)
    • JGI Phytozome
    • Joe Carlson (Lead Software Dev)
    • InterMine and GMOD community

  20. Thank you!
