
OHBM 'Big Data' Workshop: Dissecting Nipype Workflows


A description of the Nipype (nipy.org/nipype) platform for analyzing brain imaging data with Python workflows.

Satrajit Ghosh

June 16, 2013

Transcript

  1. Dissecting Nipype Workflows
    an fMRI processing story
    satrajit ghosh [email protected]
    massachusetts institute of technology


  2. CONTRIBUTORS
    FUNDING
    1R03EB008673-01 (NIBIB)
    5R01MH081909-02 (NIMH)
    INCF
    CONFLICT OF INTEREST
    TankThink Labs, LLC


  3. Outline
    What is Nipype?
    Unwrapping a workflow
    Where to go from here?


  4. Outline
    What is Nipype?
    Unwrapping a workflow
    Where to go from here?


  5. Neuroimaging Pipelines and Interfaces
    nipy.org/nipype
    Gorgolewski et al., 2010
    What is Nipype?
    [Diagram: Nipype at the intersection of Python-in-neuroscience tools and brain imaging software]


  6. Poline et al., 2012


  7. [Figure: timeline, 1990-2010, of neuroimaging software: AFNI, BrainVoyager, FreeSurfer, R, Caret, FMRISTAT, FSL, MVPA, NiPy, ANTS, SPM, BrainVisa]
    data source: pymvpa.org


  8. [Diagram: how do structural, diffusion, and functional images connect via transformations?]


  9. What is Nipype?
    A community-developed, open-source, lightweight Python library
    Exposes formal, common semantics
    Interactive exploration of brain imaging algorithms
    Workflow creation and execution
    Flexible and adaptive


  10. Nipype architecture


  11. Outline
    What is Nipype?
    Unwrapping a workflow
    Where to go from here?


  12. OpenfMRI workflow
    All datasets available at openfmri.org (also on Amazon S3)
    The workflow itself is part of the Nipype example workflows
    $ python fmri_openfmri.py --datasetdir ds107


  13. OpenfMRI data organization


  14. 30,000 ft view
    Data and info
    Preprocessing
    Task Modeling
    MNI Registration
    Storage


  15. Close-up view


  16. Nodes are workflows


  17. The workflow object
    >>> from nipype.pipeline.engine import Workflow
    >>> my_workflow = Workflow(name='concept')
    :: Each node of a Workflow can be a Workflow
    # 30,000 ft overview
    my_workflow.write_graph(dotfilename='ograph.dot', graph2use='orig')
    # close-up view
    my_workflow.write_graph(dotfilename='ograph.dot', graph2use='exec')


  18. What are these nodes?


  19. What are these nodes?


  20. The node object
    >>> from nipype.pipeline.engine import Node
    >>> from nipype.interfaces.spm import Segment
    >>> from nipype.interfaces.fsl import ImageMaths
    >>> from nipype.interfaces.freesurfer import MRIConvert
    :: All nodes encapsulate Interfaces, which wrap external programs.
    >>> segment = Node(Segment(), name='segmenter')
    >>> binarize = Node(ImageMaths(op_string='-nan -thr 0.5 -bin'), name='binarize')
    >>> convert = Node(MRIConvert(out_type='nii'), name='converter')


  21. How about them edges?
    :: One cannot connect two outputs to a single input.
    >>> convert = Node(MRIConvert(out_type='nii'), name='converter')
    >>> segment = Node(Segment(), name='segmenter')
    >>> binarize = Node(ImageMaths(op_string='-nan -thr 0.5 -bin'), name='binarize')
    >>> my_workflow.connect(convert, 'out_file', segment, 'data')
    >>> my_workflow.connect(segment, 'native_wm_image', binarize, 'in_file')
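The single-producer rule above can be sketched in plain Python. This is not the Nipype API, only an illustration of the constraint: each (node, input) pair may have at most one incoming edge, so the graph stays unambiguous about where every input comes from.

```python
# Pure-Python sketch (not Nipype) of the connect rule: each
# (dest_node, dest_input) pair maps to exactly one producer.
edges = {}  # (dest_node, dest_input) -> (src_node, src_output)

def connect(src, src_out, dest, dest_in):
    key = (dest, dest_in)
    if key in edges:
        # a second producer for the same input is rejected
        raise ValueError("input %r of %r is already connected" % (dest_in, dest))
    edges[key] = (src, src_out)

connect("converter", "out_file", "segmenter", "data")
connect("segmenter", "native_wm_image", "binarize", "in_file")
```

Trying `connect("converter", "out_file", "binarize", "in_file")` after the calls above would raise, mirroring the error Nipype gives when two outputs are wired to one input.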


  22. !
    Inputs and outputs


  23. Inputs and outputs
    >>> from nipype.interfaces.camino import DTIFit
    >>> from nipype.interfaces.spm import Realign
    >>> DTIFit.help()
    >>> Realign.help()
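`help()` works uniformly across interfaces because every interface declares its input and output traits. A rough pure-Python sketch of that kind of introspection (this is not Nipype's trait machinery, and the trait descriptions are made up for illustration):

```python
# Sketch: interfaces declare inputs/outputs as data, so a generic
# help() can list them without knowing anything tool-specific.
class SketchInterface:
    input_spec = {}
    output_spec = {}

    @classmethod
    def help(cls):
        lines = ["Inputs:"]
        lines += ["  %s -- %s" % (k, v) for k, v in sorted(cls.input_spec.items())]
        lines.append("Outputs:")
        lines += ["  %s -- %s" % (k, v) for k, v in sorted(cls.output_spec.items())]
        return "\n".join(lines)

class SketchRealign(SketchInterface):
    # hypothetical trait descriptions, for illustration only
    input_spec = {"in_files": "images to realign (mandatory)"}
    output_spec = {"realigned_files": "motion-corrected images"}

print(SketchRealign.help())
```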


  24. thus far
    Creating a Workflow
    Creating Nodes
    Connecting Nodes
    Finding inputs/outputs


  25. Running a workflow


  26. Running a workflow
    >>> my_workflow.run()


  27. Running a workflow
    >>> my_workflow.run()
    >>> my_workflow.base_dir = '/some/place'
    >>> my_workflow.run()


  28. Running a workflow
    >>> my_workflow.run()
    >>> my_workflow.base_dir = '/some/place'
    >>> my_workflow.run()
    >>> my_workflow.base_dir = '/some/place'
    >>> my_workflow.run(plugin='MultiProc',
                        plugin_args={'n_procs': 64})


  29. Running a workflow
    >>> my_workflow.run()
    >>> my_workflow.base_dir = '/some/place'
    >>> my_workflow.run()
    >>> my_workflow.base_dir = '/some/place'
    >>> my_workflow.run(plugin='MultiProc',
                        plugin_args={'n_procs': 64})
    >>> my_workflow.base_dir = '/some/place'
    >>> my_workflow.run(plugin='PBS',
                        plugin_args={'qsub_args': '-q max500'})


  30. Running a workflow
    >>> my_workflow.run()
    >>> my_workflow.base_dir = '/some/place'
    >>> my_workflow.run()
    >>> my_workflow.base_dir = '/some/place'
    >>> my_workflow.run(plugin='MultiProc',
                        plugin_args={'n_procs': 64})
    >>> my_workflow.base_dir = '/some/place'
    >>> my_workflow.run(plugin='PBS',
                        plugin_args={'qsub_args': '-q max500'})
    >>> my_workflow.base_dir = '/some/place'
    >>> my_workflow.run(plugin='PBSGraph',
                        plugin_args={'qsub_args': '-q max500'})


  31. Running a workflow
    >>> my_workflow.run()
    >>> my_workflow.base_dir = '/some/place'
    >>> my_workflow.run()
    >>> my_workflow.base_dir = '/some/place'
    >>> my_workflow.run(plugin='MultiProc',
                        plugin_args={'n_procs': 64})
    >>> my_workflow.base_dir = '/some/place'
    >>> my_workflow.run(plugin='PBS',
                        plugin_args={'qsub_args': '-q max500'})
    >>> my_workflow.base_dir = '/some/place'
    >>> my_workflow.run(plugin='PBSGraph',
                        plugin_args={'qsub_args': '-q max500'})
    Local: default and MultiProc runs
    Distributed: PBS and PBSGraph runs


  32. Running a workflow
    :: Nipype plugins define how a graph is executed.
    Current plugins: Linear, MultiProc, SGE/PBS/LSF, PBSGraph, Condor/CondorDAGMan, IPython
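The plugin idea is that the graph is fixed while the execution strategy is swappable. A minimal sketch of that dispatch, with illustrative names rather than Nipype's plugin classes:

```python
# Sketch: a plugin is a strategy for executing an ordered list of
# ready-to-run tasks; the workflow itself does not change.
def run_linear(tasks):
    # run everything serially in one process
    return [task() for task in tasks]

def run_multiproc(tasks, n_procs=2):
    # a real plugin would farm tasks out to a process pool;
    # this stand-in runs serially but accepts the parallelism knob
    return [task() for task in tasks]

PLUGINS = {"Linear": run_linear, "MultiProc": run_multiproc}

def run(tasks, plugin="Linear", plugin_args=None):
    # mirrors the run(plugin=..., plugin_args=...) call shape
    return PLUGINS[plugin](tasks, **(plugin_args or {}))

results = run([lambda: "preproc done", lambda: "model done"],
              plugin="MultiProc", plugin_args={"n_procs": 4})
```

Cluster plugins such as PBS work the same way conceptually, except each task is submitted as a job instead of called in-process.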


  33. things to do
    Run on many subjects
    Set parameters
    Rerun parts of a workflow
    Insert your own code


  34. Repeating subgraphs: iterables
    >>> subjects = ['sub001', 'sub002', 'sub003' ...]
    >>> infosource.iterables = [('subject_id', subjects)]


  35. Repeating subgraphs: iterables
    >>> subjects = ['sub001', 'sub002', 'sub003' ...]
    >>> infosource.iterables = [('subject_id', subjects),
                                ('model_id', [1, 2])]
    >>> smoothnode.iterables = [('fwhm', [0, 3, 10])]


  36. Repeating subgraphs: iterables
    >>> subjects = ['sub001', 'sub002', 'sub003' ...]
    >>> infosource.iterables = [('subject_id', subjects),
                                ('model_id', [1, 2])]
    >>> smoothnode.iterables = [('fwhm', [0, 3, 10])]
    :: iterables are like nested for-loops that you create simply by setting a property.
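The nested-for-loop behavior can be made explicit with the standard library: every combination of the iterable values becomes one run of the subgraph. A sketch, assuming three subjects as on the slide:

```python
# iterables expand into a parameter grid, like nested for-loops;
# itertools.product shows the expansion explicitly.
from itertools import product

subjects = ["sub001", "sub002", "sub003"]
model_ids = [1, 2]
fwhms = [0, 3, 10]

# one tuple per parameterized copy of the subgraph
runs = list(product(subjects, model_ids, fwhms))
len(runs)  # 3 subjects x 2 models x 3 smoothing kernels = 18
```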


  37. Rerun affected nodes only
    >>> subjects = ['sub001']
    >>> infosource.iterables = [('subject_id', subjects),
                                ('model_id', [1, 2])]
    >>> smoothnode.iterables = [('fwhm', [6])]
    :: Using content or timestamp hashing, Nipype tracks inputs and only reruns nodes with changed inputs.
    >>> subjects = ['sub001', 'sub002', 'sub003']
    >>> infosource.iterables = [('subject_id', subjects),
                                ('model_id', [1, 2])]
    >>> smoothnode.iterables = [('fwhm', [0, 6])]
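Content hashing for rerun detection can be sketched in a few lines. This is an illustration of the idea, not Nipype's actual hashing code: serialize a node's inputs deterministically, hash them, and rerun only when the hash differs from the one recorded on the previous run.

```python
# Sketch of content hashing: same inputs -> same hash -> skip node;
# changed inputs -> different hash -> rerun node.
import hashlib
import json

def inputs_hash(inputs):
    # sort_keys gives a stable serialization, so equal inputs
    # always produce an identical digest
    blob = json.dumps(inputs, sort_keys=True).encode("utf-8")
    return hashlib.md5(blob).hexdigest()

previous = inputs_hash({"subject_id": "sub001", "fwhm": 6})
unchanged = inputs_hash({"subject_id": "sub001", "fwhm": 6}) == previous  # skip
changed = inputs_hash({"subject_id": "sub001", "fwhm": 0}) != previous    # rerun
```

Timestamp hashing trades this content check for a cheaper file-modification-time check, which is faster on large images but less precise.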


  38. Running your own code
    >>> from nipype.interfaces.utility import Function
    def get_contrasts(base_dir, model_id):
        contrast_file = os.path.join(base_dir, 'models',
                                     'model%03d' % model_id,
                                     'task_contrasts.txt')
        ...
        return contrasts
    contrastgen = Node(Function(input_names=['base_dir', 'model_id'],
                                output_names=['contrasts'],
                                function=get_contrasts),
                       name='generate_contrasts')
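What a Function node does can be sketched without Nipype: bind named inputs, call the wrapped Python function, and expose named outputs so they connect like any other node's. The class and the stand-in function below are illustrative, not Nipype internals:

```python
# Sketch (not Nipype internals) of a Function-style node.
class SketchFunctionNode:
    def __init__(self, input_names, output_names, function, name):
        self.input_names = input_names
        self.output_names = output_names
        self.function = function
        self.name = name
        self.inputs = {}

    def run(self):
        # call the wrapped function with the bound inputs
        args = {k: self.inputs[k] for k in self.input_names}
        result = self.function(**args)
        if len(self.output_names) == 1:
            result = (result,)
        # map returned values onto the declared output names
        return dict(zip(self.output_names, result))

def get_contrast_path(base_dir, model_id):
    # hypothetical stand-in for locating task_contrasts.txt
    return "%s/models/model%03d/task_contrasts.txt" % (base_dir, model_id)

node = SketchFunctionNode(["base_dir", "model_id"], ["contrasts"],
                          get_contrast_path, "generate_contrasts")
node.inputs = {"base_dir": "/data/ds107", "model_id": 1}
outputs = node.run()
```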


  39. Nipype features covered
    Creating Nodes and Workflows
    Distributed computation
    Using iterables
    Function nodes
    Rerunning partial workflows


  40. Other Nipype features
    Using Nipype as an interface library
    Nipype caching
    Creating MapReduce-like nodes
    Connecting to XNAT
    Execution configuration options


  41. Outline
    What is Nipype?
    Unwrapping a workflow
    Where to go from here?


  42. What next?
    Explore interfaces and workflows:
    - PredictHD project workflows
    - CPAC workflows
    - BIPS workflows
    http://nipy.org/nipype/quickstart.html
    Contribute back:
    - Report issues
    - Create new interfaces and workflows
    - Write tests for existing code


  43. nipy.org/nipype
