Interoperation between InterMines - LegFed Project Kickoff Meeting

Overview of the InterMine infrastructure and its ability to interoperate with other InterMine instances via the IM 2.0 Staircase

Presented at the Legume Federation Project Kickoff Meeting, 2015/06/22 by Vivek Krishnakumar


Transcript

  1. Interoperation between InterMines
    Legume Federation, June 22, 2015
    Vivek Krishnakumar, Chris Town
    J. Craig Venter Institute

  2. InterMine in a nutshell
    • Open-source data warehouse software
    • Integration of complex biological data
    • Parsers for common biological data formats
    • Extensible framework for custom data
    • Cookie-cutter interface, highly customizable
    • Interact using sophisticated web query tools
    • Programmatic access using web-service API

  3. Open-source Project
    • Source code available online
    • Distributed with the GNU LGPL license
    • GitHub Repo: https://github.com/intermine/intermine
    • GitHub Organization: https://github.com/intermine
    [Screenshot: intermine / intermine repository listing: bio, biotestmine, config, flymine, humanmine, imbuild, intermine, testmodel, .gitignore, .travis.yml, LICENSE, LICENSE.LIBS, README.md, RELEASE_NOTES]

  4. InterMine system architecture
    [Figure: InterMine system architecture. Source: Richard N. Smith et al. Bioinformatics 2012;28:3163-3165]

  5. InterMine system architecture
    Web Application
    • Java Server Pages (JSP), HTML, JS, CSS
    • Interfaces with Java Servlets and IM web services
    Web Server
    • Tomcat 7.0.x, serves the Web application ARchive (WAR) file
    • Ant-based build system using Java SDK
    Database Server
    • PostgreSQL 9.2 or above
    • range query, btree, gist enabled (refer to the docs): http://intermine.readthedocs.org/en/latest/system-requirements/

  6. InterMine web services
    [Figure: InterMine web services; visible label: JBrowse. Source: Alex Kalderimis et al. Nucl. Acids Res. 2014;42:W468-W472]
    http://iodocs.labs.intermine.org

  7. Federated Authentication
    • Apart from the standard login scheme (username/password), InterMine supports industry-standard OAuth2-based login flows, implemented by Google, GitHub, Agave, etc.
    • ThaleMine (Arabidopsis) relies on this infrastructure to authenticate users against the araport.org tenant registered within the Agave infrastructure
    • Documentation available here: http://intermine.readthedocs.org/en/latest/webapp/properties/web-properties/#openauth2-settings-aka-openid-connect

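To make the OAuth2-based login flow above more concrete, here is a minimal sketch of the first leg of a generic authorization-code flow (RFC 6749): building the redirect URL and checking the `state` value on return. The endpoint, client id, and callback URL are hypothetical placeholders, not InterMine or Agave settings, and this is not how InterMine itself implements the flow; a configured mine handles these steps internally.

```python
import secrets
from urllib.parse import urlencode, urlparse, parse_qs


def build_authorization_url(authorize_endpoint, client_id, redirect_uri, scope, state=None):
    """Return (url, state) for the first leg of an OAuth2 authorization-code flow."""
    # `state` is an unguessable value that the provider echoes back; the client
    # compares it on return to guard against cross-site request forgery.
    state = state or secrets.token_urlsafe(16)
    params = {
        "response_type": "code",  # ask the provider for an authorization code
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
    }
    return f"{authorize_endpoint}?{urlencode(params)}", state


def state_matches(callback_url, expected_state):
    """Check that the provider echoed back the state value we sent."""
    returned = parse_qs(urlparse(callback_url).query).get("state", [None])[0]
    return returned == expected_state
```

After the user approves the login, the provider redirects back with a `code` parameter, which the server exchanges for an access token at the provider's token endpoint; that second leg is omitted here.
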
  8. Interoperability?
    • Ability of InterMine instances to communicate ‘automatically’ with each other
    • By way of leveraging web services
    • Questions to be answered:
      ◦ What do they say to each other?
      ◦ How do they say it?
      ◦ What mechanisms are used?
      ◦ Enabling these mechanisms…

  9. Data Model
    • Data Model === Schema of InterMine instance
    • Defined in XML format
    • Core data model (based on SO, the Sequence Ontology) can be extended to suit requirements
    • Access a mine's data model in JSON format: http://MINE_URL/service/model/?format=json
    • Compatibility of data models across mines ensures interoperability

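As a rough illustration of checking model compatibility between two mines, the sketch below fetches a model from the URL pattern given on the slide and computes the classes and attributes two models have in common. The JSON layout it reads (`model` → `classes` → `attributes`) is an assumption about the response shape and should be checked against a real mine; the function names are made up for this example.

```python
import json
from urllib.request import urlopen


def model_url(mine_url):
    """Build the model endpoint URL following the slide's pattern."""
    return mine_url.rstrip("/") + "/service/model/?format=json"


def fetch_model(mine_url):
    """Download and decode a mine's data model (requires network access)."""
    with urlopen(model_url(mine_url)) as response:
        return json.load(response)


def class_attributes(model_json):
    """Map each class name to the set of its attribute names.

    Assumes the layout {"model": {"classes": {name: {"attributes": {...}}}}}.
    """
    classes = model_json.get("model", {}).get("classes", {})
    return {name: set(spec.get("attributes", {})) for name, spec in classes.items()}


def shared_schema(model_a, model_b):
    """Return the classes both models define, with the attributes they share."""
    attrs_a, attrs_b = class_attributes(model_a), class_attributes(model_b)
    return {name: attrs_a[name] & attrs_b[name] for name in attrs_a.keys() & attrs_b.keys()}
```

A query that only touches classes and attributes present in `shared_schema(a, b)` has a reasonable chance of running unchanged against both mines.
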
  10. Advantages of common data model
    • Data mining scripts developed for one mine immediately compatible with others
    • Promotes crowdsourcing
      ◦ one/more groups write tools/widgets/parsers
      ◦ can be easily reused by others
    • Enables cross species analysis

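The first advantage above can be sketched as follows: one query definition is serialized once and then pointed at several mines. The mine URLs and gene symbol are placeholders, and the `genomic` model name, the query XML layout, and the `/service/query/results` path are assumptions about the web-service API that should be verified against the InterMine documentation before use.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def build_query_xml(root_class, views, constraints):
    """Serialize a path query as XML.

    `views` are field names relative to `root_class`; `constraints` is a
    list of (path, op, value) tuples, also relative to `root_class`.
    """
    query = ET.Element("query", {
        "model": "genomic",  # assumed model name
        "view": " ".join(f"{root_class}.{field}" for field in views),
    })
    for path, op, value in constraints:
        ET.SubElement(query, "constraint", {
            "path": f"{root_class}.{path}", "op": op, "value": value,
        })
    return ET.tostring(query, encoding="unicode")


def query_urls(mine_urls, query_xml):
    """Build one results URL per mine, all carrying the identical query."""
    params = urlencode({"query": query_xml, "format": "json"})
    return {
        name: f"{base.rstrip('/')}/service/query/results?{params}"
        for name, base in mine_urls.items()
    }
```

Because the query refers only to model paths such as `Gene.symbol`, the same XML is valid for every mine whose data model defines those paths; only the base URL changes.
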
  11. Available tools
    • Multi-mine search tool https://github.com/alexkalderimis/multimine-search-tool
      ◦ Based on InterMine Lucene-based search index
      ◦ Allows for interoperation when data models are different
    • Integration based on Homologs:
      ◦ Ontology integration using `dagify` https://github.com/intermine/dagify
      ◦ Pathway Integration by way of collating shared pathways
    • InterMine Staircase
      ◦ Powerful client-side interface enabling data analysis workflows and cross-mine integration via web services http://staircase.herokuapp.com

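A multi-mine keyword search can be thought of as fanning one search term out to several mines and merging the hits. The sketch below shows only the merge and faceting step, on already-fetched results. The hit fields (`type`, `id`, `relevance`) are assumed for illustration and do not describe the actual output of the multimine-search-tool.

```python
from collections import Counter


def merge_search_results(results_by_mine):
    """Combine per-mine hits into one list, highest relevance first.

    `results_by_mine` maps a mine name to a list of hit dicts, each assumed
    to carry "type", "id", and "relevance" keys.
    """
    merged = [
        {**hit, "mine": mine}  # remember which mine each hit came from
        for mine, hits in results_by_mine.items()
        for hit in hits
    ]
    merged.sort(key=lambda hit: hit["relevance"], reverse=True)
    return merged


def facet_counts(merged_hits, field):
    """Count hits per value of `field`, e.g. "type" or "mine"."""
    return Counter(hit[field] for hit in merged_hits)
```

Because hits are keyed only by a free-text match and a type label, this kind of merge does not require the mines to share a data model. Relevance scores from separate search indexes are not strictly comparable, so the combined ordering is approximate.
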
  12. InterMine Staircase

  13. InterMine Staircase: Configure access to multiple mines

  14. InterMine Staircase: Cross-mine search

  15. InterMine Staircase: Filter results by facets

  16. InterMine Staircase: Prepare and enrich lists

  17. InterMine Staircase: Perform mine-to-mine list conversions
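
One plausible way to picture a mine-to-mine list conversion is as a lookup through homolog pairs: each identifier in a list from the source mine is replaced by its homologs in the target mine. The sketch below is a toy illustration under that assumption, with made-up identifiers; it does not describe how Staircase implements the conversion.

```python
def convert_list(source_ids, homolog_pairs):
    """Translate source-mine gene ids into target-mine ids via homologs.

    `homolog_pairs` is an iterable of (source_id, target_id) tuples.
    Returns (converted ids in first-seen order, source ids with no homolog).
    """
    mapping = {}
    for source_id, target_id in homolog_pairs:
        mapping.setdefault(source_id, []).append(target_id)

    converted, unmatched = [], []
    seen = set()
    for source_id in source_ids:
        targets = mapping.get(source_id)
        if not targets:
            unmatched.append(source_id)
            continue
        for target_id in targets:
            if target_id not in seen:  # keep each target id only once
                seen.add(target_id)
                converted.append(target_id)
    return converted, unmatched
```

Returning the unmatched identifiers separately lets a user see which entries were lost in the conversion rather than having them silently dropped.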

  18. InterMine Staircase: App/tool compatibility

  19. InterMine Staircase: Application model (MedicMine, SoyMine, ....)

  20. Available Reference Mines
    • ThaleMine: https://github.com/Arabidopsis-Information-Portal/intermine/
      ◦ Integrates a variety of genomic datasets pertaining to Arabidopsis thaliana Col-0
      ◦ Leverages both data warehousing and federation methods
      ◦ Represents a wide variety of data: genes, proteins, function, expression, co-expression, interactions, pathways, homologs, alleles, polymorphism, stocks, germplasm, phenotypes
    • MedicMine: https://github.com/jcvi-plant-genomics/intermine/
      ◦ Warehouse for Medicago truncatula A17 genomic data
      ◦ Houses a variety of data: genes, proteins, function, expression
    • PhytoMine: https://github.com/JoeCarlson/intermine/
      ◦ Warehouse for 47 different Angiosperm genomes
      ◦ Developed on a Chado → InterMine migration path
      ◦ Houses a variety of data: genes, proteins, expression, homologs, protein families, variation
    • FlyMine: https://github.com/intermine/intermine/

  21. Recommendations and Challenges
    • Recommendations:
      ◦ Develop core plant InterMine model
      ◦ Follow InterMine guidelines
      ◦ Learn from prior initiatives - InterMOD
    • Challenges
      ◦ Users/developers are used to current way of doing things
      ◦ Time taken to adapt to common data model and/or software stack
      ◦ Difficult to arrive at consensus with diverse group

  22. Acknowledgments
    • InterMine Team
      ◦ Gos Micklem
      ◦ Julie Sullivan
      ◦ Alex Kalderimis
      ◦ Richard Smith
      ◦ Sergio Contrino
      ◦ Josh Heimbach
      ◦ et al.
    • Araport Team
      ◦ Chris Town
      ◦ Jason Miller
      ◦ Matt Vaughn
      ◦ Maria Kim
      ◦ Svetlana Karamycheva
      ◦ Erik Ferlanti
      ◦ Chia-Yi Cheng
      ◦ Benjamin Rosen
      ◦ Irina Belyaeva