Interoperation between InterMines - LegFed Project Kickoff Meeting

Overview of the InterMine infrastructure and its ability to interoperate with other InterMine instances via the IM 2.0 Staircase

Presented at the Legume Federation Project Kickoff Meeting, 2015/06/22 by Vivek Krishnakumar

Vivek Krishnakumar

June 22, 2015

Transcript

  1. Interoperation between InterMines
    Legume Federation, June 22, 2015
    Vivek Krishnakumar
    Chris Town
    J. Craig Venter Institute
  2. InterMine in a nutshell
    • Open-source data warehouse software
    • Integration of complex biological data
    • Parsers for common biological data formats
    • Extensible framework for custom data
    • Cookie-cutter interface, highly customizable
    • Interact using sophisticated web query tools
    • Programmatic access using web-service API
  3. Open-source Project
    • Source code available online
    • Distributed with the GNU LGPL license
    • GitHub Repo: https://github.com/intermine/intermine
    • GitHub Organization: https://github.com/intermine
    [Screenshot: top-level contents of intermine/intermine: bio, biotestmine, config, flymine, humanmine, imbuild, intermine, testmodel, .gitignore, .travis.yml, LICENSE, LICENSE.LIBS, README.md, RELEASE_NOTES]
  4. InterMine system architecture
    [Figure from Richard N. Smith et al. Bioinformatics 2012;28:3163-3165]

  5. InterMine system architecture
    Web Application
    • Java Server Pages (JSP), HTML, JS, CSS
    • Interfaces with Java Servlets and IM web-services
    Web Server
    • Tomcat 7.0.x, serves Web application ARchive (WAR) file
    • ant based build system using Java SDK
    Database Server
    • PostgreSQL 9.2 or above
    • range query, btree, gist enabled (refer docs here) http://intermine.readthedocs.org/en/latest/system-requirements/
  6. InterMine web services
    [Figure from Alex Kalderimis et al. Nucl. Acids Res. 2014;42:W468-W472]
    http://iodocs.labs.intermine.org
    JBrowse
  7. Federated Authentication
    • Apart from the standard login scheme (username/password), InterMine supports industry standard OAuth2 based login flows, implemented by Google, GitHub, Agave, etc.
    • ThaleMine (Arabidopsis) relies on this infrastructure to authenticate users against the araport.org tenant registered within the Agave infrastructure
    • Documentation available here: http://intermine.readthedocs.org/en/latest/webapp/properties/web-properties/#openauth2-settings-aka-openid-connect
  8. Interoperability?
    • Ability of InterMine instances to communicate ‘automatically’ with each other
    • By way of leveraging web services
    • Questions to be answered:
      ◦ What do they say to each other?
      ◦ How do they say it?
      ◦ What mechanisms are used?
      ◦ Enabling these mechanisms…
  9. Data Model
    • Data Model === Schema of InterMine instance
    • Defined in XML format
    • Core data model (based on SO) can be extended to suit requirements
    • Access a mine's data model in JSON format: http://MINE_URL/service/model/?format=json
    • Compatibility of data models across mines ensures interoperability
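The compatibility point on this slide can be illustrated with a minimal sketch that compares two mines' data models. It assumes the JSON returned by the model endpoint has the shape `{"model": {"classes": {ClassName: {"attributes": {...}}}}}`; that layout and the helper names `model_url`, `fetch_model`, `class_attributes` and `shared_model` are assumptions made for illustration, not something stated in the slides.

```python
import json
from urllib.request import urlopen


def model_url(mine_url):
    """Build the model URL given on the slide: MINE_URL/service/model/?format=json."""
    return mine_url.rstrip("/") + "/service/model/?format=json"


def fetch_model(mine_url, timeout=30):
    """Download and parse a mine's data model (needs network access)."""
    with urlopen(model_url(mine_url), timeout=timeout) as response:
        return json.load(response)


def class_attributes(model_json):
    """Map each class name to the set of its attribute names.

    Assumed layout: {"model": {"classes": {name: {"attributes": {...}}}}}.
    """
    classes = model_json.get("model", {}).get("classes", {})
    return {name: set(cls.get("attributes", {})) for name, cls in classes.items()}


def shared_model(model_a, model_b):
    """Return the classes present in both models, with the attributes they share."""
    attrs_a = class_attributes(model_a)
    attrs_b = class_attributes(model_b)
    return {name: attrs_a[name] & attrs_b[name]
            for name in attrs_a.keys() & attrs_b.keys()}
```

Calling `shared_model(fetch_model(url_a), fetch_model(url_b))` for two mine URLs would list the classes and attributes that a script could rely on in both mines, under the layout assumption above.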
  10. Advantages of common data model
    • Data mining scripts developed for one mine immediately compatible with others
    • Promotes crowdsourcing
      ◦ one/more groups write tools/widgets/parsers
      ◦ can be easily reused by others
    • Enables cross species analysis
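The first advantage can be sketched as one query definition reused against several mines. The example below builds an InterMine-style path query as XML and prepares a request URL per mine; the `/service/query/results` endpoint with `query` and `format` parameters, the `genomic` model name, the mine URLs and the gene symbol are all assumptions for illustration, and no request is actually sent.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET


def build_path_query(root_class, views, constraint_path, value):
    """Serialise a simple path query (one equality constraint) as XML."""
    query = ET.Element("query", {
        "model": "genomic",  # assumed model name
        "view": " ".join(f"{root_class}.{view}" for view in views),
    })
    ET.SubElement(query, "constraint",
                  {"path": constraint_path, "op": "=", "value": value})
    return ET.tostring(query, encoding="unicode")


def query_urls(mine_urls, query_xml):
    """Prepare the same query for every mine; only the base URL differs."""
    params = urlencode({"query": query_xml, "format": "json"})
    return {name: f"{base.rstrip('/')}/service/query/results?{params}"
            for name, base in mine_urls.items()}


# Hypothetical mines that share the Gene class and its attributes.
mines = {
    "mine_a": "http://mine-a.example.org/mine",
    "mine_b": "http://mine-b.example.org/mine",
}
xml_query = build_path_query("Gene", ["primaryIdentifier", "symbol"],
                             "Gene.symbol", "FT")
urls = query_urls(mines, xml_query)
```

Because the query refers only to class and attribute names, it carries over to any mine whose model defines `Gene.primaryIdentifier` and `Gene.symbol`; a mine with a diverging model would need its own query.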
  11. Available tools
    • Multi-mine search tool https://github.com/alexkalderimis/multimine-search-tool
      ◦ Based on InterMine Lucene-based search index
      ◦ Allows for interoperation when data models are different
    • Integration based on Homologs:
      ◦ Ontology integration using `dagify` https://github.com/intermine/dagify
      ◦ Pathway Integration by way of collating shared pathways
    • InterMine Staircase
      ◦ Powerful client-side interface enabling data analysis workflows and cross-mine integration via web services http://staircase.herokuapp.com
  12. InterMine Staircase
  13. InterMine Staircase: Configure access to multiple mines
  14. InterMine Staircase: Cross-mine search
  15. InterMine Staircase: Filter results by facets
  16. InterMine Staircase: Prepare and enrich lists
  17. InterMine Staircase: Perform mine-to-mine list conversions
  18. InterMine Staircase: App/tool compatibility
  19. InterMine Staircase: Application model (MedicMine, SoyMine....)

  20. Available Reference Mines
    • ThaleMine: https://github.com/Arabidopsis-Information-Portal/intermine/
      ◦ Integrates a variety of genomic datasets pertaining to Arabidopsis thaliana Col-0
      ◦ Leverages both data warehousing and federation methods
      ◦ Represents a wide variety of data: genes, proteins, function, expression, co-expression, interactions, pathways, homologs, alleles, polymorphism, stocks, germplasm, phenotypes
    • MedicMine: https://github.com/jcvi-plant-genomics/intermine/
      ◦ Warehouse for Medicago truncatula A17 genomic data
      ◦ Houses a variety of data: genes, proteins, function, expression
    • PhytoMine: https://github.com/JoeCarlson/intermine/
      ◦ Warehouse for 47 different Angiosperm genomes
      ◦ Developed on a Chado → InterMine migration path
      ◦ Houses a variety of data: genes, proteins, expression, homologs, protein families, variation
    • FlyMine: https://github.com/intermine/intermine/
  21. Recommendations and Challenges
    • Recommendations:
      ◦ Develop core plant InterMine model
      ◦ Follow InterMine guidelines
      ◦ Learn from prior initiatives - InterMOD
    • Challenges:
      ◦ Users/developers are used to current way of doing things
      ◦ Time taken to adapt to common data model and/or software stack
      ◦ Difficult to arrive at consensus with diverse group
  22. Acknowledgments
    • InterMine Team
      ◦ Gos Micklem
      ◦ Julie Sullivan
      ◦ Alex Kalderimis
      ◦ Richard Smith
      ◦ Sergio Contrino
      ◦ Josh Heimbach
      ◦ et al.
    • Araport Team
      ◦ Chris Town
      ◦ Jason Miller
      ◦ Matt Vaughn
      ◦ Maria Kim
      ◦ Svetlana Karamycheva
      ◦ Erik Ferlanti
      ◦ Chia-Yi Cheng
      ◦ Benjamin Rosen
      ◦ Irina Belyaeva