
METASPACE Training

Andrew Palmer
February 16, 2018

Slides given at OurCon 2017

Transcript

  1. METASPACE training guide 2017 OurCon’V imzML workshop Theodore Alexandrov (EMBL,

    UCSD) Andy Palmer (EMBL) Vitaly Kovalev (EMBL) Artem Tarasov (EMBL) Adam Pruska, Andreas Roempp, Annabelle Fülöp, Anne Mette Handler, Benedikt Geier, Berin Boughton, Bernhard Spengler, Buck Achim, Carina Ramallo-Guevara, Charles Pineau, Chris Anderton, Christian Janfelt, Christina Burr, Claire Carter, Corinna Henkel, Cristina Gonzalez Lopez, Cristine Quiason, David Muddiman, Denis Sammour, Dhaka Bhandari, Dinaiz Thinagaran, Dirk Hoelscher, Don Nguyen, Dušan Velickovic, Eike Ulrich Brockmann, Emilia Sogin, Emrys Jones, Eric Weaver, Erin Gemperline, Guanshi Zhang, Gus Grey, Heath Patterson, Hidenobu Miyazawa, József Pánczél, James Langridge, James McKenzie, Jan-Hinrich Rabe, Janfelt Christian, Jens Soltwisch, Jialing Zhang, Josephine Bunch, Julian Griffin, Julien Delecolle, Kaija Schaepe, Klaus Dreisewerd, Konstantin Nagornov, Ksenija Radic, Kumar Sharma, Kyana Garza, Lavigne Regis, Lennart Huizing, Liebeke Manuel, Lingjun Li, Livia Eberlin, Logan Mackay, Luca Rappez, Marina Reuter, Mario Kompauer, Mark Bokhart, Marta Sans, Marty Paine, Mathieu Gaudin, Maureen Kane, Max Müller, Michael Becker, Michael Linscheid, Mikhail Belov, Na Sun, Neha Garg, Nicolas Desbenoit, Nicole Strittmatter, Oliver Lechtenfeld, Pegah Khamehgir-Silz, Renata Soares, Richard Caprioli, Richard Goodwin, Rima Ait-Belkacem, Ron Heeren, Samantha Walker, Sandra Schulz, Sarah Aboulmagd, Sergio Triana, Shane Ellis, Sheerin Latham, Sophie Jacobsen, Spencer Thomas, Stefanie Gerbig, Steve Castellino, Veronika Saharuka, Vitaly Kovalev, Yury Tsybin, Zoe Hall, Zoltan Takats @METASPACE2020 Contributors
  2. Part 1: Introduction Learning outcomes METASPACE project Bioinformatics for metabolite

    annotation Engine Knowledgebase Part 2: Tutorial Learning outcomes Data requirements Data submission Annotation browsing Interpretation Part 3: Export to imzML FTICR (Bruker) Orbitrap (Thermo) Other vendors Training Overview
  3. What we hope you will learn today • Ins and

    outs of metabolite annotation in HR imaging MS • Bioinformatics we developed ◦ Metabolite Signal Match (MSM) score ◦ False Discovery Rate estimation ◦ FDR-controlled annotation • Our METASPACE platform ◦ How to prepare data for submission to our service ◦ How to submit your data ◦ How to view molecular annotations in our webapp
  4. Outline • Input (data and metadata) • Online Software •

    Data Submission • Annotation Browsing • Use Cases a. mouse brain, MALDI-FTICR (U Rennes 1) b. human colorectal tumor, DESI-Orbitrap (ICL)
  5. Example 1 Mouse brain (MALDI-FTICR) Data provided by Regis Lavigne,

    Charles Pineau, University of Rennes 1 Select an annotation See the molecular distribution
  6. Example 2 Human colorectal tumor (DESI-Orbitrap) Data provided by James

    McKenzie, Zoltan Takats, Imperial College London Filter different datasets See data details View metadata
  7. Data Requirements Imaging mass spectrometry data - Any ionisation source

    - Any spatial resolution - Any tissue - One section per dataset
  8. Data Requirements Data Format - imzML Centroided - vendor preferred

    - http://metaspace2020.eu/imzml http://imzml.org/wp/introduction/
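One way to confirm that an export is centroided before upload is to look for the controlled-vocabulary term in the .imzML header. The sketch below assumes the file follows the mzML convention of tagging spectra with accession MS:1000127 (centroid spectrum) or MS:1000128 (profile spectrum). The sample XML and the function name `spectrum_type` are illustrative only and are not part of the METASPACE tooling.

```python
import xml.etree.ElementTree as ET

CENTROID = "MS:1000127"  # PSI-MS term: centroid spectrum
PROFILE = "MS:1000128"   # PSI-MS term: profile spectrum


def spectrum_type(imzml_text):
    """Return 'centroid', 'profile' or 'unknown' based on cvParam accessions."""
    root = ET.fromstring(imzml_text)
    accessions = {
        el.get("accession")
        for el in root.iter()
        if el.tag.endswith("cvParam")  # tolerate the XML namespace prefix
    }
    if CENTROID in accessions:
        return "centroid"
    if PROFILE in accessions:
        return "profile"
    return "unknown"


# Tiny hand-written header fragment, used only to exercise the function.
SAMPLE = """<mzML xmlns="http://psi.hupo.org/ms/mzml">
  <fileDescription><fileContent>
    <cvParam cvRef="MS" accession="MS:1000127" name="centroid spectrum"/>
  </fileContent></fileDescription>
</mzML>"""

print(spectrum_type(SAMPLE))
```

For a real file, the text of the .imzML file would be read from disk first; the .ibd file holds the binary data and is not inspected here.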
  9. Customised Processing Processing is tailored to your data! - Technical

    metadata - Resolving power - isotope prediction - Polarity - adducts R200 = 70K, R200 = 280K, [C41H78NO7P+K]+
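To see why resolving power and adduct matter for isotope prediction, the following sketch computes the m/z of the [C41H78NO7P+K]+ ion from the slide and the expected peak width at the two resolving powers shown. It assumes that resolving power quoted at m/z 200 falls off with the square root of m/z for Orbitrap-like analysers and linearly with m/z for FT-ICR-like analysers; the function names are hypothetical and this is not the METASPACE engine's code.

```python
# Monoisotopic masses in Da (standard values, rounded).
MASS = {"C": 12.0, "H": 1.00782503, "N": 14.0030740,
        "O": 15.9949146, "P": 30.9737620, "K": 38.9637065}
ELECTRON = 0.00054858


def singly_charged_mz(formula, adduct):
    """m/z of [M+adduct]+ for a neutral formula given as {element: count}."""
    neutral = sum(MASS[element] * count for element, count in formula.items())
    return neutral + MASS[adduct] - ELECTRON


def fwhm(mz, r200, exponent):
    """Peak width (FWHM) at mz, given the resolving power at m/z 200.

    exponent=0.5 models Orbitrap-like scaling, exponent=1.0 FT-ICR-like
    scaling (both are assumptions of this sketch).
    """
    resolving_power = r200 * (200.0 / mz) ** exponent
    return mz / resolving_power


mz = singly_charged_mz({"C": 41, "H": 78, "N": 1, "O": 7, "P": 1}, "K")
print(f"m/z = {mz:.4f}")
for r200 in (70_000, 280_000):
    print(f"R200 = {r200}: FWHM = {fwhm(mz, r200, 0.5):.4f}")
```

A fourfold increase in resolving power gives peaks that are four times narrower, which changes how neighbouring isotope peaks merge in the predicted pattern.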
  10. Data Requirements Your responsibility: - Data is processed ‘as is’

    - Check metadata is correct - Report resolving power accurately (check within data-set) - Low numbers of annotations often correspond to poor quality mass spectra - Calibration inaccuracy - Lock-mass errors
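Calibration problems can be spotted before submission by comparing a few known peaks against their theoretical m/z. The sketch below computes the mass error in parts per million; the reference peaks and the 3 ppm tolerance are made-up values for illustration, not METASPACE requirements.

```python
from statistics import median


def ppm_error(measured_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


# (measured, theoretical) pairs for peaks you trust; values are invented.
reference_peaks = [
    (766.5162, 766.5147),
    (798.5420, 798.5410),
    (184.0739, 184.0733),
]

errors = [ppm_error(measured, theory) for measured, theory in reference_peaks]
TOLERANCE_PPM = 3.0  # hypothetical threshold, adjust to your instrument

print(f"median error: {median(errors):.2f} ppm")
if abs(median(errors)) > TOLERANCE_PPM:
    print("calibration looks off; recalibrate before converting to imzML")
```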
  11. 1. Follow conversion instructions for your instrument 2. Select the

    centroided files, .imzML and .ibd 3. The dataset will be copied to the cloud storage (accessible only to our team) Data upload
  12. • Start typing to see suggestions • Please fill truthfully

    ◦ Don’t want to disclose? Just put ‘N/A’ • Click (top right) ◦ Enabled once the files have finished uploading Metadata form
  13. Dataset list processing is in progress queued finished the list

    can be filtered and exported to CSV Clicking on fields limits the list to datasets with the same value
  14. Sorting/filtering annotations Click on column headers to sort Add as

    many filters as you need Quickly add a filter by hovering over a cell and clicking the icon
  15. Quick search • search across all fields • works in

    both ‘Datasets’ and ‘Annotations’ tabs • supports* prefix match, OR operator, negation, e.g. Rennes -(rat | mouse | human) (*) ElasticSearch Simple Query
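The quick-search syntax follows Elasticsearch's simple query string rules, where `|` means OR, a leading `-` negates, parentheses group terms and a trailing `*` requests a prefix match. As a rough sketch, the request body for such a search could be built as below. How the webapp actually calls Elasticsearch is not stated in these slides, so the `default_operator` setting and the function name are assumptions.

```python
import json


def quick_search_body(text):
    """Build an Elasticsearch request body for a simple query string search."""
    return {
        "query": {
            "simple_query_string": {
                "query": text,
                # Assumption: with AND as the default, the negated group
                # restricts the "Rennes" hits instead of widening them.
                "default_operator": "and",
            }
        }
    }


body = quick_search_body("Rennes -(rat | mouse | human)")
print(json.dumps(body, indent=2))
```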
  17. Molecule search Search by name (partial name search) Or by

    molecular formula (exact match only) Click to edit
  18. Visual insight into MSM score assignment Exact m/z of each

    ion image Click and drag to zoom Ion images for each isotope peak Isotopic patterns Blue: theoretical abundance (at instrument resolving power) Red: measured image intensity
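The blue and red bars in the isotopic pattern plot can be compared numerically. The sketch below uses cosine similarity between the theoretical abundances and the measured total intensity of each isotope image. This is a simplified stand-in chosen for illustration and not the measure used inside the MSM score; the abundance and intensity values are invented.

```python
import math


def pattern_similarity(theoretical, measured):
    """Cosine similarity between two isotope patterns of equal length."""
    dot = sum(t * m for t, m in zip(theoretical, measured))
    norm = math.sqrt(sum(t * t for t in theoretical)) * \
        math.sqrt(sum(m * m for m in measured))
    return dot / norm if norm else 0.0


theoretical = [1.0, 0.45, 0.12, 0.02]   # predicted relative abundances
good_match = [2000.0, 900.0, 240.0, 40.0]   # same shape, different scale
poor_match = [2000.0, 100.0, 900.0, 40.0]   # second and third peaks disagree

print(round(pattern_similarity(theoretical, good_match), 3))
print(round(pattern_similarity(theoretical, poor_match), 3))
```

A value close to 1 means the measured intensities follow the predicted isotope envelope; interfering ions typically lower it.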
  19. Step-by-step search Export to CSV will save the current annotations

    table. Changing the filters will change which annotations are exported Always export annotations for comparison together (so they are at the same FDR)
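Once two exports have been saved at the same FDR, they can be compared with a few lines of code. The sketch below assumes the CSV has `formula`, `adduct`, `msm` and `fdr` columns; these column names and the sample rows are hypothetical, so check them against your own export.

```python
import csv
import io

# Invented sample exports standing in for two downloaded CSV files.
EXPORT_A = """formula,adduct,msm,fdr
C41H78NO7P,+K,0.92,0.05
C42H82NO8P,+Na,0.81,0.1
C40H80NO8P,+H,0.40,0.2
"""
EXPORT_B = """formula,adduct,msm,fdr
C41H78NO7P,+K,0.88,0.05
C40H80NO8P,+H,0.75,0.1
"""


def annotations_at(csv_text, max_fdr):
    """Return the set of (formula, adduct) ions with FDR <= max_fdr."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return {(r["formula"], r["adduct"]) for r in rows if float(r["fdr"]) <= max_fdr}


# Use the same FDR level for both datasets so the sets are comparable.
a = annotations_at(EXPORT_A, 0.1)
b = annotations_at(EXPORT_B, 0.1)
print("shared:", sorted(a & b))
print("only in A:", sorted(a - b))
```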
  20. Results Browsing Summary 1. Choose database 2. Choose data-set 3.

    Add molecule filter and type ‘PC’ a. molecular class filter 4. Type ‘PC(16:0/18:0)’ a. single metabolite filter 5. Select row of table a. single ion filter 6. Simple comparison of spatial distributions between adducts 7. Export of annotations to csv Also possible • Filter by m/z • Formula search • Comparison across datasets
  21. FDR Controlled Annotation False Discovery Rate - the fraction of

    incorrect annotations Control - request a set of annotations at a fixed estimated FDR Setting the level: - Adjust the number of molecules for follow-up analysis - When only limited numbers of molecules can be reviewed, adjust the FDR so that fewer/greater numbers of molecules are annotated - Compare annotations between datasets - A principled way of selecting molecules to compare between datasets Figure labels: True annotation, False discovery, MSM score, FDR = 0.1, FDR = 0.2. FDR = n_False / (n_False + n_True)
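The idea of requesting annotations at a fixed FDR can be illustrated with a toy ranking. The sketch below sorts annotations by MSM score and keeps the longest top-ranked list whose fraction of false annotations stays within the requested level. The true/false labels are invented for the example; in real data they are unknown and the FDR is estimated by the engine, so this shows the principle only.

```python
def annotations_at_fdr(scored, level):
    """Return the largest top-ranked set whose FDR does not exceed level.

    scored: list of (msm_score, is_true) pairs; the labels are known only
    in this toy example.
    """
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    best = 0
    n_false = 0
    for k, (_score, is_true) in enumerate(ranked, start=1):
        if not is_true:
            n_false += 1
        # FDR = n_False / (n_False + n_True), and k = n_False + n_True
        if n_false / k <= level:
            best = k
    return ranked[:best]


SCORED = [(0.95, True), (0.90, True), (0.85, True), (0.80, False),
          (0.75, True), (0.70, True), (0.60, False), (0.50, True),
          (0.40, False), (0.30, False)]

for level in (0.1, 0.2):
    print(level, len(annotations_at_fdr(SCORED, level)))
```

Relaxing the level from 0.1 to 0.2 admits more annotations at the cost of a higher expected share of false ones.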
  22. Choice of metabolite database synthesized/recorded 88M CAS registry biologically occurring/active

    50M PubChem compounds single biological system 40K HMDB sample specific 1K LC-MS
  23. Choice of metabolite database Impacts search and False-Discovery-Rate estimation •

    Use one that’s relevant • Larger database ◦ more false-hits --> fewer annotations at a fixed FDR • Different databases give different annotations ◦ even for molecules in both databases due to FDR control ◦ for data-set comparison, use the same database
  24. Annotating at level of molecular formula • Possibility of multiple

    metabolites per sum formula ◦ webapp shows all hits from the database search (learn the ambiguity!) ◦ other databases can be searched (e.g. PubChem) ◦ use enrichment analysis to get biological leads • Use an orthogonal technique for reporting individual metabolites ◦ not directly integrated (yet) ◦ use web-app results to help target MS/MS studies (e.g. purchase of standards)
  25. • The METASPACE platform putatively annotates* molecular formula along with

    several candidate metabolites • A set of annotations should be reported along with the FDR threshold selected. ◦ e.g. “Molecular annotation was performed using the METASPACE annotation engine (Palmer et al, Nature Methods 2017). 150 molecules were annotated against the LipidMAPS database at 10% FDR. Results are publicly available at annotate.metaspace2020.eu” • The export function of the website delivers a spreadsheet that can be included as supporting information for any publication. Reporting Results Metabolomics Standards Initiative identification levels Sumner et al, 2007, Metabolomics
  26. • Preparing data for submission ◦ imzML export ◦ metadata

    • Submitting data ◦ web-app, upload • Browsing knowledgebase ◦ web-app, annotations Learning Summary • METASPACE team: ◦ web: metaspace2020.eu ◦ email: [email protected] ◦ twitter: @metaspace2020 ◦ source code: https://github.com/METASPACE2020/ • FTICR data conversion ◦ SCiLS: [email protected] • Orbitrap data conversion ◦ Thermo Fisher Scientific: [email protected] How to get help?
  27. Export to METASPACE • Export your centroided high-resolution spectra in

    the imzML format • Available for “FT-ICR type” SCiLS Lab files from SCiLS Lab 2016b • Performance in version 2018b significantly increased (4x speedup, batch export) • Best results in METASPACE if peak list is required for centroiding • Two different Bruker data formats ◦ SQLite peak list data: Peak list provided during import ◦ FT-ICR profile data: Generate a peak list after import
  28. Create imzML file for METASPACE • In the objects tab,

    click the export symbol of the region to be exported and select “Export to METASPACE” • The Export Spectra dialog opens • Set your normalization of choice • Select your peak list of choice for example “Imported Peaks” in case of SQLite • Provide your scan polarity • Click OK to save imzML file
  29. SQLite peak list data • Data must have been acquired

    with on-the-fly centroid detection, i.e. there is a file called ‘peaks.sqlite’ within the .d folder of the data-set • In SCiLS Lab a peak list “Imported peaks” is available, containing the most frequent peaks (by default, all peaks appearing in more than 1% of spectra)
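The 1% rule for building a peak list can be sketched as follows. The code counts in how many spectra each centroided peak occurs and keeps those present in more than the given fraction. Grouping peaks by rounding m/z to three decimals is a simplification assumed here; it is not how SCiLS Lab is documented to bin peaks.

```python
from collections import Counter


def frequent_peaks(spectra, min_fraction=0.01, decimals=3):
    """Return sorted m/z values occurring in more than min_fraction of spectra.

    spectra: list of lists of centroided m/z values, one list per pixel.
    """
    counts = Counter()
    for mzs in spectra:
        # A set ensures each peak is counted once per spectrum.
        counts.update({round(mz, decimals) for mz in mzs})
    threshold = min_fraction * len(spectra)
    return sorted(mz for mz, n in counts.items() if n > threshold)


# 200 synthetic spectra: one peak everywhere, two rare peaks.
spectra = [
    [766.515] + ([500.1] if i < 2 else []) + ([600.2] if i < 3 else [])
    for i in range(200)
]
print(frequent_peaks(spectra))
```

In this toy data the peak seen in exactly 1% of spectra is dropped, while the one seen in 1.5% is kept.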
  30. FT-ICR profile data • Older Solarix Files do not directly

    contain a peak list to perform centroiding • Create peak list with Data Analysis SCiLS Lab Help Section 7.4 • Use METASPACE tool for peak finding https://spatialmetabolomics.github.io/centroidize/ • Use other external tools (mMass, …) • Import the external peak list into SCiLS Lab File > Import > m/z intervals from CSV or Clipboard
  31. Use METASPACE tool for peak finding • Select the overview

    spectrum CSV exported from SCiLS • Upload CSV file to METASPACE tool • Copy values to clipboard • Use File > Import > m/z intervals from CSV
  32. Export into imzML: Orbitrap data (.raw) Instructions: metaspace2020.eu/imzML Software tools:

    imageQuest / raw-converter - Recommended for: MALDI images (Thermo MALDI- / TransMIT AP-S-MALDI-) imzmlConverter - Recommended for: DESI/flowProbe with separate files per row Recommended for bioinformaticians: pyimzML (Python parser)
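For bioinformaticians, a converted file can be sanity-checked with pyimzML before upload. The sketch below relies on the `ImzMLParser` class with its `coordinates` attribute and `getspectrum` method; the helper names and the file name in the usage note are hypothetical. The import is deferred so the summary function can be tried without the package installed.

```python
def open_imzml(path):
    """Open an imzML file; requires the third-party pyimzML package."""
    from pyimzml.ImzMLParser import ImzMLParser
    return ImzMLParser(path)


def summarise(parser):
    """Return (number of spectra, total number of peaks) for a parsed file."""
    n_spectra = 0
    n_peaks = 0
    for idx, _coords in enumerate(parser.coordinates):
        mzs, _intensities = parser.getspectrum(idx)
        n_spectra += 1
        n_peaks += len(mzs)
    return n_spectra, n_peaks


class FakeParser:
    """Stand-in with the same two members, used only to demonstrate summarise."""
    coordinates = [(1, 1, 1), (2, 1, 1)]

    def getspectrum(self, idx):
        data = [([100.0, 200.0], [5.0, 7.0]), ([150.0], [3.0])]
        return data[idx]


print(summarise(FakeParser()))
```

With the package installed, `summarise(open_imzml("dataset.imzML"))` would report the same two numbers for a real export; a very high peak count per spectrum can hint that the data are still in profile mode.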
  33. .raw -> mzML -> imzML • MSConvert ◦ free (link)

    • imzMLConverter ◦ free ◦ requires registration ◦ http://www.cs.bham.ac.uk/~ibs/imzMLConverter Export into imzML: Generic
  34. This project has received funding from the European Union’s Horizon

    2020 research and innovation programme under grant agreement № 634402. Acknowledgments METASPACE R&D team at EMBL Theodore Alexandrov Vitaly Kovalev Artem Tarasov Andrew Palmer Dominik Fay SCiLS Dennis Trede Jan Hendrik Kobarg METASPACE data contributors Achim Buck, Adam Pruska, Andreas Roempp, Andrew Palmer, Annabelle Fülöp, Anne Mette Handler, Benedikt Geier, Berin Boughton, Bernhard Spengler, Carina Ramallo-Guevara, Charles Pineau, Chris Anderton, Christian Janfelt, Christina Burr, Claire Carter, Corinna Henkel, Cristina Gonzalez Lopez, Cristine Quiason, David Muddiman, Denis Sammour, Dhaka Bhandari, Dinaiz Thinagaran, Dirk Hoelscher, Don Nguyen, Dušan Velickovic, Eike Ulrich Brockmann, Emilia Sogin, Emrys Jones, Eric Weaver, Erin Gemperline, Guanshi Zhang, Gus Grey, Heath Patterson, Hidenobu Miyazawa, József Pánczél, James Langridge, James McKenzie, Jan-Hinrich Rabe, Janfelt Christian, Jens Soltwisch, Jialing Zhang, Josephine Bunch, Julian Griffin, Julien Delecolle, Kaija Schaepe, Klaus Dreisewerd, Konstantin Nagornov, Ksenija Radic, Kumar Sharma, Kyana Garza, Lavigne Regis, Lennart Huizing, Lingjun Li, Livia Eberlin, Logan Mackay, Luca Rappez, Manuel Liebeke, Marina Reuter, Mario Kompauer, Mark Bokhart, Marta Sans, Marty Paine, Mathieu Gaudin, Maureen Kane, Max Müller, Michael Becker, Michael Linscheid, Mikhail Belov, Na Sun, Neha Garg, Nicolas Desbenoit, Nicole Strittmatter, Oliver Lechtenfeld, Pegah Khamehgir-Silz, Rappez Luca, Regis Lavigne, Renata Soares, Richard Caprioli, Richard Goodwin, Rima Ait-Belkacem, Ron Heeren, Samantha Walker, Sandra Schulz, Sarah Aboulmagd, Sergio Triana, Shane Ellis, Sheerin Latham, Sophie Jacobsen, Spencer Thomas, Stefanie Gerbig, Steve Castellino, Theodore Alexandrov, Veronika Saharuka, Vitaly Kovalev, Yury Tsybin, Zoe Hall, Zoltan Takats