
Enabling knowledge generation and reproducible research by embedding provenance models in metadata stores


Reproducible research requires that information pertaining to all aspects of a research activity is captured and represented richly. However, most scientific domains, including neuroscience, capture only the pieces of information that are deemed relevant. In this talk, we provide an overview of the components necessary to create this information-rich landscape and describe a prototype platform for knowledge exploration. In particular, we focus on a technology-agnostic data provenance model as the core representation and on Semantic Web technologies that leverage such a representation. While the data and analysis methods relate to brain imaging, the same principles and architecture are applicable to any scientific domain.

Satrajit Ghosh

August 28, 2013

Transcript

  1. Enabling knowledge generation and reproducible research
    by embedding provenance models in metadata stores
    Satrajit S. Ghosh - [email protected]
    August 28, 2013
    Satrajit S. Ghosh - [email protected] Enabling knowledge generation and reproducible research by embedding provenance models
    August 28, 2013 1 / 50


  2. 1 Knowledge Generation and Reproducible Analysis
    2 Provenance and Semantic Web Tools
    3 A Prototype Platform
    4 Challenges & Future directions

  3. Knowledge Generation and Reproducible Analysis
    Knowledge Generation and Reproducible Analysis

  4. Knowledge Generation and Reproducible Analysis
    Huh! How did that happen?
    Source: Timothy Lebo - http://bit.ly/lebo_cogsci_issues_2011

  5. Knowledge Generation and Reproducible Analysis
    Structured questions
    From journal articles published between 2008 and 2010, retrieve all
    brain volumes and ADOS scores of persons with autism spectrum
    disorder who are right handed and under the age of 10.
    Rerun the analysis used in publication X on my data.
    Is the volume of the caudate nucleus smaller in persons with
    Obsessive Compulsive Disorder compared to controls?
    Find data-use agreements for open-accessible datasets used in articles
    by author Y.

  6. Knowledge Generation and Reproducible Analysis
    Why can’t we do this?
    There is no formal vocabulary to describe all entities, activities and
    agents in the domain, and vocabulary creation is a
    time-consuming process
    Standardized provenance tracking tools are typically not
    integrated into scientific software, making the curation process
    time consuming, resource intensive, and error prone
    Binary data formats do not provide standardized access to metadata
    The actual data can vary in size from 1-bit survey answers to
    terabytes
    In many research laboratories much of the derived data are
    deleted, keeping only the bits essential for publication
    There are no standards for computational platforms

  7. Knowledge Generation and Reproducible Analysis
    Why should we do this?
    A fundamental challenge in neuroscience is to integrate data across
    species, spatial scales (nanometers to inches), temporal scales
    (microseconds to years), instrumentation (e.g., electron microscopy,
    magnetic resonance imaging) and disorders (e.g., autism, schizophrenia).
    Datasets contain ad hoc metadata and are processed with methods
    specific to the sub-domain, limiting integration.
    The lack of shared and relevant metadata and the lack of provenance
    about data and computation in neuroscience precludes or complicates
    machine readability or reproducibility.
    Beyond the significant human effort to answer the previous queries,
    errors can happen from the lack of complete specification of data or
    methods, as well as from misinterpretation of methods

  8. Knowledge Generation and Reproducible Analysis
    Research workflow in Brain Imaging
    Reproducibility can mean many things.
    (Poline et al., 2012)

  9. Knowledge Generation and Reproducible Analysis
    Reproducibility is complicated
    Components to reproduce 1
    Participants
    Screening, inclusion and exclusion criteria
    Demographic matching
    Experimental setup
    Stimuli
    Experiment control software

  10. Knowledge Generation and Reproducible Analysis
    Components to reproduce 2
    Data acquisition
    MR scanner
    Pulse sequences and reconstruction algorithms
    Cognitive or neuropsychological assessments
    Data analysis:
    Software tools
    Environments
    Quality control/assurance
    Analysis scripts
    Figure creation

  11. Knowledge Generation and Reproducible Analysis
    Reproducibility is necessary
    “In my own experience, error is ubiquitous in scientific computing, and one
    needs to work very diligently and energetically to eliminate it. One needs a
    very clear idea of what has been done in order to know where to look for
    likely sources of error. I often cannot really be sure what a student or
    colleague has done from his/her own presentation, and in fact often
    his/her description does not agree with my own understanding of what has
    been done, once I look carefully at the scripts. Actually, I find that
    researchers quite generally forget what they have done and misrepresent
    their computations.” (Donoho, 2010)

  12. Knowledge Generation and Reproducible Analysis
    Aims of reproducible analysis
    Ability to reproduce analysis
    Increase accuracy
    Ability to verify analyses are consistent with intentions
    Ability to review analysis choices
    Increase clarity of communication
    Increased trustworthiness
    Ability for others to verify
    Extensibility
    Ability to easily modify and/or re-use existing analyses
    Contextualize
    Ability to establish bounds of a given application or within a given
    tolerance
    Source: https://github.com/jeromyanglim/rmarkdown-rmeetup-2012/blob/master/talk/talk.md

  13. Knowledge Generation and Reproducible Analysis
    Capturing research today
    The laboratory notebook (e.g., Documents, Google, Dropbox)
    Code
    Directories on filesystem
    Code repositories (e.g., Github, Sourceforge)
    Data (e.g., Databases, Archives)
    Environments
    Python requirements.txt
    Virtual Machines
    Cloud (e.g., Amazon Web Services, Azure, Rackspace)
    Supplementary information
    MIT DSpace
    Journal archives

  14. Provenance and Semantic Web Tools
    Provenance and Semantic Web Tools

  15. Provenance and Semantic Web Tools
    A central theme: Capturing information

  16. Provenance and Semantic Web Tools
    What will this platform look like and enable?
    Provide a decentralized linked data and computational network
    Encode information in standardized and machine accessible form
    View data from a provenance perspective
    as products of activities or transformations carried out by people,
    software or machines
    Allow any individual, laboratory, or institution to discover and share
    data and computational services, along with the provenance of that
    data
    Immediately re-test an algorithm, re-validate results or test a new
    hypothesis on new data
    Develop applications based on a consistent, federated query and
    update interface

  17. Provenance and Semantic Web Tools
    Some definitions
    What is Provenance?
    Provenance is information about entities, activities, and people involved in
    producing a piece of data or thing, which can be used to form assessments
    about its quality, reliability or trustworthiness. (source: w3c)
    What is a ‘data model’?
    A data model is an abstract conceptual formulation of information that
    explicitly determines the structure of data and allows software and people
    to communicate and interpret data precisely. (source: wikipedia)
    What is PROV-DM?
    PROV-DM is the conceptual data model that forms a basis for the W3C
    provenance (PROV) family of specifications.
    PROV-DM provides a generic basis that captures relationships associated
    with the creation and modification of entities by activities and agents.

  18. Provenance and Semantic Web Tools
    PROV-DM components
    1 Entities and activities, and the time at which they were created, used,
    or ended
    2 Derivations of entities from entities
    3 Agents bearing responsibility for entities that were generated or
    activities that happened
    4 A notion of bundle, a mechanism to support provenance of provenance
    5 Properties to link entities that refer to the same thing
    6 Collections forming a logical structure for its members
    Source: PROV-DM
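As a sketch of how these components fit together, the following plain-Python model records entities, activities, and agents plus a few of the relations above. This is not the W3C prov library; the class names and all ex: identifiers are hypothetical, chosen only to mirror PROV-DM's vocabulary.

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class Record:
    """One PROV-DM record: an entity, activity, or agent."""
    identifier: str
    attributes: dict = field(default_factory=dict)

class ProvSketch:
    def __init__(self):
        self.records = {}    # identifier -> Record
        self.relations = []  # (relation name, subject id, object id)

    def add(self, identifier, **attributes):
        self.records[identifier] = Record(identifier, attributes)

    def relate(self, name, subject, obj):
        self.relations.append((name, subject, obj))

    def provn(self):
        """Emit PROV-N-like statements for inspection."""
        return "\n".join(f"{name}({s}, {o})" for name, s, o in self.relations)

# A toy brain-imaging example: a segmentation activity, run by a software
# agent, uses a raw image and generates derived volumes.
doc = ProvSketch()
doc.add("ex:rawImage", type="entity")
doc.add("ex:segmentation", type="activity",
        startedAtTime=datetime(2013, 8, 28).isoformat())
doc.add("ex:freesurfer", type="agent", kind="SoftwareAgent")
doc.add("ex:brainVolumes", type="entity")
doc.relate("used", "ex:segmentation", "ex:rawImage")
doc.relate("wasGeneratedBy", "ex:brainVolumes", "ex:segmentation")
doc.relate("wasAssociatedWith", "ex:segmentation", "ex:freesurfer")
doc.relate("wasDerivedFrom", "ex:brainVolumes", "ex:rawImage")
```

The same four statements are what a real PROV serialization (PROV-N, PROV-O/RDF, or PROV-JSON) would carry, just with formal namespaces.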

  19. Provenance and Semantic Web Tools
    [Figure: PROV core "starting point" terms — the classes Entity, Activity,
    and Agent, linked by used, wasGeneratedBy, wasDerivedFrom, wasAttributedTo,
    wasAssociatedWith, actedOnBehalfOf, and wasInformedBy; startedAtTime and
    endedAtTime take xsd:dateTime values.]
    http://www.w3.org/TR/prov-o/diagrams/starting-points.svg

  20. Provenance and Semantic Web Tools
    [Figure: PROV expanded terms — adds Collection (hadMember), Bundle, Plan,
    and Location (atLocation); Agent subclasses Person, SoftwareAgent, and
    Organization; and further properties: generatedAtTime and invalidatedAtTime
    (xsd:dateTime), value, wasStartedBy/wasEndedBy, wasInvalidatedBy,
    wasInfluencedBy, wasQuotedFrom, wasRevisionOf, hadPrimarySource, and
    alternateOf/specializationOf.]
    http://www.w3.org/TR/prov-o/diagrams/expanded.svg

  21. Provenance and Semantic Web Tools
    Why PROV-DM?
    Provenance is not an afterthought
    Captures data and metadata (about entities, activities and agents)
    within the same context
    A formal, technology-agnostic representation of machine-accessible
    structured information
    Federated queries using SPARQL when represented as RDF
    A W3C recommendation simplifies app development and allows
    integration with other future services

  22. Provenance and Semantic Web Tools
    Semantic Web Tools
    The Semantic Web provides a common framework that allows data
    sharing and reuse, is based on the Resource Description Framework
    (RDF), and extends the principles of the Web from pages to machine
    useful data
    Data and descriptors are accessed using uniform resource identifiers
    (URIs)
    Unlike the traditional Web, the source and the target along with the
    relationship itself are unambiguously named with URIs and form a
    ‘triple’ of a subject, a relationship and an object
    nif:tbi rdf:type nif:mental_disorder .
    This flexible approach allows data to be easily added and for the
    nature of the relations to evolve, resulting in an architecture that
    allows retrieving answers to more complex queries
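A triple like the one above is just ordinary data. As a minimal hand-rolled sketch (the extra nif: terms beyond the single triple shown are illustrative), a graph can be a set of 3-tuples with wildcard pattern matching:

```python
# A triple is (subject, predicate, object); a graph is a set of them.
triples = {
    ("nif:tbi", "rdf:type", "nif:mental_disorder"),
    ("nif:ocd", "rdf:type", "nif:mental_disorder"),        # illustrative
    ("nif:tbi", "rdfs:label", "traumatic brain injury"),   # illustrative
}

def match(graph, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [(ts, tp, to) for ts, tp, to in graph
            if (s is None or ts == s)
            and (p is None or tp == p)
            and (o is None or to == o)]

# "Which subjects are typed as mental disorders?"
disorders = match(triples, p="rdf:type", o="nif:mental_disorder")
```

Adding a new relation is just adding a tuple, which is exactly the flexibility the slide describes: the graph grows without a schema migration.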

  23. Provenance and Semantic Web Tools
    RDF Example
    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix contact: <http://www.w3.org/2000/10/swap/pim/contact#> .

    <http://www.w3.org/People/EM/contact#me>
        rdf:type contact:Person ;
        contact:fullName "Eric Miller" ;
        contact:mailbox <mailto:em@w3.org> ;
        contact:personalTitle "Dr." .

  24. Provenance and Semantic Web Tools
    SPARQL Protocol and RDF Query Language (SPARQL)
    A query language for RDF
    Triples can be represented with compact syntaxes (e.g., Turtle)
    The queries are themselves similar in syntax
    SPARQL 1.1 (an official W3C Recommendation since March 2013)
    SPARQL allows users to write unambiguous queries
    Supports federation:
    a query can be distributed to multiple SPARQL endpoints, computed
    and results gathered
    A SPARQL client library can query a static RDF document or a
    SPARQL endpoint

  25. Provenance and Semantic Web Tools
    SPARQL examples
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?name ?email
    WHERE {
        ?person a foaf:Person .
        ?person foaf:name ?name .
        ?person foaf:mbox ?email .
    }

    PREFIX abc: <http://example.com/exampleOntology#>
    SELECT ?capital ?country
    WHERE {
        ?x abc:cityname ?capital ;
           abc:isCapitalOf ?y .
        ?y abc:countryname ?country ;
           abc:isInContinent abc:Africa .
    }
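To make the mechanics of the first query concrete, here is a hand-rolled basic-graph-pattern matcher over an in-memory list of triples. It is a sketch of what a SPARQL engine does for a simple SELECT, not a real engine; the graph contents (including the mailbox value) are hypothetical.

```python
# Terms beginning with "?" are variables; anything else must match exactly.
graph = [
    ("_:a", "rdf:type", "foaf:Person"),
    ("_:a", "foaf:name", "Eric Miller"),
    ("_:a", "foaf:mbox", "mailto:eric@example.org"),  # hypothetical value
]

def query(graph, patterns):
    """Solve a conjunction of triple patterns, returning variable bindings."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for binding in solutions:
            # Substitute variables already bound by earlier patterns.
            concrete = [binding.get(t, t) for t in pattern]
            for triple in graph:
                new, ok = dict(binding), True
                for term, value in zip(concrete, triple):
                    if term.startswith("?"):
                        new[term] = value      # bind the variable
                    elif term != value:
                        ok = False             # constant mismatch
                        break
                if ok:
                    next_solutions.append(new)
        solutions = next_solutions
    return solutions

# The same shape as: SELECT ?name ?email WHERE { ?person a foaf:Person . ... }
rows = query(graph, [
    ("?person", "rdf:type", "foaf:Person"),
    ("?person", "foaf:name", "?name"),
    ("?person", "foaf:mbox", "?email"),
])
```

Each successive pattern narrows the solution set, which is why shared variables (?person here) act as a join between patterns.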

  26. A Prototype Platform
    A Prototype Platform

  27. A Prototype Platform
    Requirements
    A standardized data model (NI-DM)
    Provenance tracking (Prov + workflow tools)
    Decentralized content creation and storage (Workflow tools, RDF
    triples, triple stores)
    Federated query (SPARQL)

  28. A Prototype Platform
    1. A standardized data model
    Neuroimaging Data Model (NI-DM) (Keator et al., 2013)
    Based on PROV-DM, hence borrows the PROV ontology (PROV-O)
    Structured information encoding
    Consistent vocabulary
    Metadata standards via domain specific object models

  29. A Prototype Platform
    NIDM components
    Terms
    A lexicon of all things brain imaging. (e.g., DICOM terms, software
    specific terms, statistic terms, paradigm terms)
    Object Models
    Structured information in brain imaging (e.g., directory structures,
    CSV/Tab delimited files, brain imaging file formats)
    Integrated provenance
    How are entities generated or derived, and by what or whom?
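For the CSV/tab-delimited object models mentioned above, lifting rows into triples is mechanical. A sketch with the standard library (the column names and the nidm: prefix here are illustrative, not NI-DM's actual vocabulary):

```python
import csv
import io

# A stand-in for a phenotype spreadsheet: one subject per row.
data = """ID,Age,Verbal_IQ,DX
s01,9,104,ASD
s02,11,98,control
"""

def csv_to_triples(text, prefix="nidm"):
    """One subject per row, one predicate per column."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = f"{prefix}:{row['ID']}"
        for column, value in row.items():
            triples.append((subject, f"{prefix}:{column}", value))
    return triples

triples = csv_to_triples(data)
```

Once rows are triples, the same SPARQL machinery that queries provenance can query the tabular data, which is the point of embedding object models in the common structure.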

  30. A Prototype Platform
    NIDM platform

  31. A Prototype Platform
    2. Provenance tracking

  32. A Prototype Platform
    Provenance tracking tools in Python
    1 IPython notebook
    2 Sumatra
    3 Synapse (with Python client)
    4 Prov Python library (with RDF extensions)
    Similar tools exist for other languages and some of the above systems
    allow HTTP based tracking with a RESTful service API.

  33. A Prototype Platform
    Workflow tools supporting W3C PROV
    1 Nipype
    A brain imaging focused workflow environment
    Flexible semantics for scripting complex workflows
    2 VisTrails
    Scientific Workflow and Provenance Management
    Manage rapidly evolving workflows
    Can be used via a graphical interface
    3 Taverna/Kepler/Galaxy
    (or support the precursor to PROV, the Open Provenance Model)

  34. A Prototype Platform
    Nipype: A Workflow environment for brain imaging
    (Gorgolewski et al., 2011)

  35. A Prototype Platform
    Attempts at provenance in Nipype
    Logging to file
    Restructured text output per interface
    Exporting the script
    Executable IPython notebooks
    Using Prov library and storing RDF in a file or triplestore

  36. A Prototype Platform
    3. Decentralized content creation and storage
    Create and expose metadata where you do analysis
    Register dataset with central authority
    Use robots

  37. A Prototype Platform
    Example storage of provenance
    From the Nipype analysis of a single participant we get:
    429 statements/triples from a single interface/function
    runtime dependencies
    inputs
    outputs
    md5/sha512 hashes and pointers to files
    6021 statements/triples from the workflow
    includes relations between processes
    includes links to shared input output entities
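The per-file hash entries mentioned above can be produced with the standard library alone. This is a sketch of the idea, not Nipype's actual API; the file and its contents are throwaway stand-ins.

```python
import hashlib
import os
import tempfile

def file_hashes(path, chunk_size=1 << 20):
    """Compute md5 and sha512 digests of a file in chunks — the fixity
    information a provenance record can attach to a file entity."""
    md5, sha512 = hashlib.md5(), hashlib.sha512()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            md5.update(chunk)
            sha512.update(chunk)
    return {"md5": md5.hexdigest(), "sha512": sha512.hexdigest()}

# Demonstrate on a throwaway file standing in for a derived image.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"derived data")
record = file_hashes(tmp.name)
os.unlink(tmp.name)
```

Hashing in chunks keeps memory flat even for multi-gigabyte imaging files, and storing the digest rather than the file lets the triplestore point at data it does not hold.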

  38. A Prototype Platform
    4. Federated Query using SPARQL on triplestores
    select ?id ?age ?vol ?viq ?dx
    where
    {
    ?c fs:subject_id ?id;
    prov:hadMember ?e1 .
    ?sc prov:wasDerivedFrom ?e1;
    a nidm:FreeSurferStatsCollection;
    prov:hadMember [ nidm:AnatomicalAnnotation ?annot;
    fs:Volume_mm3 ?vol] .
    FILTER regex(?annot, "Right-Amy")
    SERVICE {
    ?c2 nidm:ID ?id .
    ?c2 nidm:Age ?age .
    ?c2 nidm:Verbal_IQ ?viq .
    ?c2 nidm:DX ?dx .
    }
    } LIMIT 100
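What the SERVICE clause accomplishes can be sketched without a triplestore at all: run the sub-query against the remote endpoint, then join its rows with the local results on the shared variable (here the subject ID). Both "endpoints" below are hypothetical in-memory row lists, as are all the values.

```python
local_rows = [                       # e.g. local FreeSurfer volume statistics
    {"id": "s01", "vol": 1520.0},
    {"id": "s02", "vol": 1610.5},
]
remote_rows = [                      # e.g. rows from a phenotype SPARQL endpoint
    {"id": "s01", "age": 9, "viq": 104, "dx": "ASD"},
    {"id": "s02", "age": 11, "viq": 98, "dx": "control"},
    {"id": "s03", "age": 8, "viq": 110, "dx": "ASD"},
]

def federated_join(local, remote, key):
    """Inner-join two result sets on a shared variable, as a federated
    SPARQL engine does after gathering SERVICE results."""
    remote_by_key = {row[key]: row for row in remote}
    return [{**row, **remote_by_key[row[key]]}
            for row in local if row[key] in remote_by_key]

joined = federated_join(local_rows, remote_rows, "id")
```

Subject s03 drops out because it has no local volume row — the same pruning the shared ?id variable performs in the query above.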

  39. A Prototype Platform
    Example applications
    Extracting data
    Javascript
    Using PROV for determining relations
    Federated query
    Python + Javascript

  40. A Prototype Platform
    Javascript example
    select ?val (count(?s) as ?nsubjects) WHERE {
    ?c fs:subject_id ?s;
    prov:hadMember ?e1 .
    ?sc prov:wasDerivedFrom ?e1;
    a nidm:FreeSurferStatsCollection;
    prov:hadMember [ nidm:AnatomicalAnnotation ?annot;
    fs:Volume_mm3 ?val] .
    FILTER regex(?annot, "Right-Amy")
    }

  41. A Prototype Platform
    Output visualized directly via Javascript

  42. A Prototype Platform
    Federated query
    select ?id ?age ?vol ?viq ?dx
    where
    {
    ?c fs:subject_id ?id;
    prov:hadMember ?e1 .
    ?sc prov:wasDerivedFrom ?e1;
    a nidm:FreeSurferStatsCollection;
    prov:hadMember [ nidm:AnatomicalAnnotation ?annot;
    fs:Volume_mm3 ?vol] .
    FILTER regex(?annot, "Right-Amy")
    SERVICE {
    ?c2 nidm:ID ?id .
    ?c2 nidm:Age ?age .
    ?c2 nidm:Verbal_IQ ?viq .
    ?c2 nidm:DX ?dx .
    }
    } LIMIT 100

  43. A Prototype Platform
    Interactive csv browser
    App call: http://localhost:5000/u?url=http://bit.ly/1atAL00
    Scatterize: https://github.com/njvack/scatterize

  44. Challenges & Future directions
    Challenges & Future directions

  45. Challenges & Future directions
    What does all this buy us?
    Common vocabulary for communication
    Rich structured information including provenance
    Domain specific object models that are embedded in the common
    structure
    Data/Content can be repurposed differentially for applications
    Execution duration could be used to instrument schedulers
    Parametric failure modes can be tracked across large databases
    Determine “amount” of existing data on a particular topic

  46. Challenges & Future directions
    Where are we headed?
    Formalize NI-DM object models as extensions to PROV-O
    Common vocabulary across software
    Relate to the Linked Data Web
    Publications, authors, grants
    Re-use existing vocabularies and ontologies
    Integrate with existing databases
    App instrumentation and development with built in provenance
    tracking
    Publish more structured data
    Reproduce analysis on a VM with an existing analysis pathway

  47. Challenges & Future directions
    A lightweight decentralized architecture

  48. Challenges & Future directions
    Thanks
    International Neuroinformatics Coordinating Facility
    BIRN derived-data working group
    Neuroimaging in Python Community
    W3C PROV Working group
    NIH, INCF for support

  49. Challenges & Future directions
    The picture of the future
    (Bechhofer et al., 2013) http://www.researchobject.org/

  50. Challenges & Future directions
    Bechhofer, S., Buchan, I., De Roure, D., Missier, P., Ainsworth, J., Bhagat, J., Couch, P., et al. (2013). Why linked data is not
    enough for scientists. Future Generation Computer Systems, 29(2), 599–611. doi:10.1016/j.future.2011.08.004
    Donoho, D. L. (2010). An invitation to reproducible computational research. Biostatistics, 11(3), 385–388.
    Gorgolewski, K., Burns, C. D., Madison, C., Clark, D., Halchenko, Y. O., Waskom, M. L., & Ghosh, S. S. (2011). Nipype: a
    flexible, lightweight and extensible neuroimaging data processing framework in Python. Frontiers in Neuroinformatics, 5.
    Keator, D. B., Helmer, K., Steffener, J., Turner, J. A., Van Erp, T. G., Gadde, S., Ashish, N., et al. (2013). Towards structured
    sharing of raw and derived neuroimaging data across existing resources. NeuroImage.
    Poline, J.-B., Breeze, J. L., Ghosh, S., Gorgolewski, K., Halchenko, Y. O., Hanke, M., Haselgrove, C., et al. (2012). Data
    sharing in neuroimaging research. Frontiers in Neuroinformatics, 6.