METASPACE training guide

metaspace2020
November 11, 2018

These are the tutorial slides on METASPACE. They cover the basics, how to use it, and recommendations on data interpretation. We normally use them in face-to-face trainings but they can be equally useful if you are new to METASPACE or would like to get updated on the newest features.

For more information on METASPACE, please visit the project website http://metaspace2020.eu, follow us on Twitter https://twitter.com/metaspace2020, or email at [email protected].

Transcript

  1. Training agenda

    1. Theory
    2. Tutorial and demo
    3. Q&A

    These slides: http://speakerdeck.com/metaspace2020
  2. Why take part and stay till the end?

    Useful for you:
    - METASPACE finds metabolites in imaging MS data in 10 min
    - You can access millions of metabolite images within seconds
    - You can share your data with lab members and collaborators

    Don’t miss out:
    - Over 200 people from 50 labs have already submitted >3000 datasets
    - 8 research papers using METASPACE were published in 2018 alone
    - METASPACE was referenced in 9 reviews in 2018 alone

    It’s free:
    - All open-source and free
    - Funded by EU Horizon2020, NIH
  3. The METASPACE team

    Part of the Alexandrov team @ EMBL Heidelberg, Germany
    Theodore Alexandrov, Vitaly Kovalev, Lachlan Stuart, Renat Nigmetzianov
    Alumni: Andrew Palmer, Ivan Protsyuk, Dominik Fay, Artem Tarasov, Sergey Nikolenko
    Interested in joining?
  4. Main view: http://metaspace2020.eu

    - Upload
    - List of datasets, yours and public ones
    - Projects (groups of datasets), yours and public ones
    - For submitting data, for access to private data
  5. Data requirements

    High-resolving imaging MS
    - Any ionisation source
    - Any spatial resolution
    - Any tissue
    - One section per dataset
  6. High mass resolution

    Requirement: R FWHM (200 m/z) > 70K
    Well aligned and calibrated, at least < 3 ppm

    - Orbitrap: Yes
    - FTICR: Yes
    - QTOF, TOF: Contact your vendor
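
The < 3 ppm calibration requirement above can be checked against a known reference peak. The following is a minimal sketch, not part of METASPACE; the function names and the example m/z values are illustrative assumptions.

```python
def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Signed mass error of a measured peak, in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(measured_mz: float, theoretical_mz: float,
                     max_ppm: float = 3.0) -> bool:
    """True if the absolute mass error is below max_ppm (3 ppm by default)."""
    return abs(ppm_error(measured_mz, theoretical_mz)) < max_ppm


# Hypothetical example values: a reference ion and two measured positions.
reference = 766.5147
print(within_tolerance(766.5160, reference))  # small shift
print(within_tolerance(766.5200, reference))  # larger shift
```

Checking a few well-known ions across the m/z range before upload gives a quick indication of whether the data meets the requirement.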
  7. We customize processing

    We customize processing w.r.t.
    - Resolving power: isotope pattern prediction
    - Polarity: adducts

    Make sure you enter it correctly!

    [Figure labels: R200 = 70K, R200 = 280K, [C41H78NO7P+K]+]
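
To see why the entered resolving power matters for isotope pattern prediction, the sketch below estimates peak width at a given m/z from the resolving power stated at m/z 200. It uses the standard textbook scaling laws (Orbitrap resolving power falls with the square root of m/z, FTICR linearly with m/z); it is an illustration under those assumptions, not METASPACE's own code, and the function names are hypothetical.

```python
import math


def resolving_power_at(mz: float, r200: float, analyzer: str = "orbitrap") -> float:
    """Approximate resolving power at `mz`, given the value at m/z 200."""
    if analyzer == "orbitrap":
        # Orbitrap: R is proportional to 1 / sqrt(m/z)
        return r200 * math.sqrt(200.0 / mz)
    if analyzer == "fticr":
        # FTICR: R is proportional to 1 / (m/z)
        return r200 * 200.0 / mz
    raise ValueError(f"unknown analyzer: {analyzer}")


def peak_fwhm(mz: float, r200: float, analyzer: str = "orbitrap") -> float:
    """Approximate peak width (FWHM, in m/z units) at `mz`."""
    return mz / resolving_power_at(mz, r200, analyzer)


# Compare the two settings shown on the slide at an example m/z of 800.
for r200 in (70_000, 280_000):
    print(r200, round(peak_fwhm(800.0, r200), 5))
```

A fourfold higher resolving power gives fourfold narrower predicted peaks, so entering the wrong value changes which isotope peaks are predicted as resolved.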
  8. Data format

    Accepted data format:
    - centroided imzML

    Export available:
    - Bruker FTICR (using SCiLS Lab)
    - AP-SMALDI-Orbitrap
    - Spectroglyph-Orbitrap

    Our instructions: http://project.metaspace2020.eu/imzml
  9. Select the dataset for overlaying an optical image

    To overlay an optical image on your dataset, click the “+”
  10. Dataset list

    - Status labels: processing is in progress, queued, finished
    - The list can be filtered and exported to CSV
    - Clicking on fields limits the list to datasets with the same value
  11. Sorting and filtering annotations

    - Click on column headers to sort
    - Add as many filters as you need
    - Quickly add a filter by hovering over a cell and clicking the icon
  12. Molecule search

    - Search by name (partial name search)
    - Or by molecular formula (exact match only)
    - Click to edit
  13. Quick search

    • search across all fields
    • works in both ‘Datasets’ and ‘Annotations’ tabs
    • supports* prefix match, OR operator, negation, e.g. Rennes -(rat | mouse | human)

    (*) ElasticSearch Simple Query
  15. Visual insight into MSM score assignment

    - Exact m/z of each ion image
    - Click and drag to zoom
    - Ion images for each isotope peak
    - Isotopic patterns
      - Blue: theoretical abundance (at instrument resolving power)
      - Red: measured image intensity
  16. Step-by-step search

    - Export to CSV will save the current annotations table
    - Changing the filters will change which annotations are exported
    - Always export annotations for comparison together (so they are at the same FDR)
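
Once annotations are exported, the CSV can be filtered to a common FDR level before datasets are compared. This is a minimal sketch using only the Python standard library; the column names (`formula`, `adduct`, `msm`, `fdr`), the `#` comment header, and the FDR being stored as a fraction are assumptions about the export layout, so check them against your own file.

```python
import csv

# Hypothetical excerpt of an exported annotations table.
SAMPLE_EXPORT = """# hypothetical export header
formula,adduct,msm,fdr
C41H78NO7P,+K,0.92,0.05
C6H12O6,+Na,0.61,0.1
C5H9NO4,+H,0.33,0.2
"""


def annotations_at_fdr(csv_text: str, max_fdr: float = 0.1):
    """Return (formula, adduct) pairs whose FDR is at or below max_fdr."""
    # Skip blank lines and comment lines (assumed to start with '#').
    lines = [ln for ln in csv_text.splitlines()
             if ln.strip() and not ln.startswith("#")]
    reader = csv.DictReader(lines)
    return [(row["formula"], row["adduct"])
            for row in reader
            if float(row["fdr"]) <= max_fdr]


print(annotations_at_fdr(SAMPLE_EXPORT, max_fdr=0.1))
```

Applying the same `max_fdr` to every exported file keeps the comparison at one FDR level, as the slide recommends.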
  17. Choice of FDR level

    False Discovery Rate = expected percentage of false annotations

    Setting the FDR level:
    - Makes results comparable
      - Between datasets, runs, experiments, labs
    - Changes the number of annotations
      - Low FDR (e.g. 5%) gives fewer but more reliable annotations
      - High FDR (e.g. 20%) reports more annotations but of lower quality
    - Default FDR is 10%
    - FDR 50% should be used with extreme caution

    [Figure labels: True annotation, False hit, MSM score, FDR 10%, FDR 20%, Annotations]

    Palmer et al, 2017, Nature Methods
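
The trade-off between FDR level and number of annotations can be illustrated with a generic target-decoy estimate. This is a simplified sketch, not the published METASPACE algorithm (see Palmer et al, 2017, for that); the scores are made up, and it assumes equally many target and decoy candidates.

```python
def estimate_fdr(target_scores, decoy_scores, threshold):
    """Estimated FDR among targets scoring at or above `threshold`."""
    n_target = sum(s >= threshold for s in target_scores)
    n_decoy = sum(s >= threshold for s in decoy_scores)
    if n_target == 0:
        return 0.0
    return min(1.0, n_decoy / n_target)


def annotations_at(target_scores, decoy_scores, max_fdr):
    """Largest number of top-scoring targets with estimated FDR <= max_fdr."""
    best = 0
    for t in sorted(set(target_scores), reverse=True):
        if estimate_fdr(target_scores, decoy_scores, t) <= max_fdr:
            best = sum(s >= t for s in target_scores)
    return best


# Made-up scores for ten target and ten decoy candidates.
targets = [0.95, 0.9, 0.85, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
decoys = [0.65, 0.45, 0.35, 0.25, 0.15, 0.1, 0.08, 0.05, 0.03, 0.01]

for level in (0.1, 0.2, 0.5):
    print(level, annotations_at(targets, decoys, level))
```

Raising the allowed FDR admits lower-scoring targets, so the annotation count grows while the expected share of false hits grows with it.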
  18. Choice of metabolite database

    Impacts search and False-Discovery-Rate estimation
    • Use one that’s relevant
    • Larger database
      ◦ more false hits --> fewer annotations at a fixed FDR
    • Different databases give different annotations
      ◦ even for molecules in both databases, due to FDR control
      ◦ for dataset comparison, use the same database
  19. Caution! Annotating at the level of molecular formula

    • Possibility of multiple metabolites per sum formula
      ◦ webapp shows all hits from the database search (learn the ambiguity!)
      ◦ other databases can be searched (e.g. PubChem)
      ◦ use enrichment analysis to get biological leads
    • Use an orthogonal technique for reporting individual metabolites
      ◦ not directly integrated (yet)
      ◦ use web-app results to help target MS/MS studies (e.g. purchase of standards)
  20. Reporting METASPACE results in a publication

    Example: Metabolite annotation of the data with METASPACE against the HMDB database resulted in 100 metabolite annotations at FDR 10%. The metabolite annotation was performed using the FDR-controlled algorithm (Palmer et al, Nature Methods 2017), delivering annotations at Level 2 (molecular formulas) according to the Metabolomics Standards Initiative (Sumner et al, 2007).

    If you make the results public, please add: The results are publicly available at http://metaspace2020.eu.

    You can create a public project with your datasets for publication and ask us ([email protected]) to create a public URL for your project.

    You can export your results as a CSV file and include it in the supplementary materials.
  21. Dos and Don’ts

    Do
    • Use centroided imzML
    • Check the correctness of metadata
      ◦ polarity, mass resolving power
    • Make sure the data is <3 ppm accuracy before submitting it
    • Use your work email address

    Don’ts or rather Caution notes
    • Don’t submit profile data
    • Don’t use FDR 50% unless you understand the implications
    • Don’t use EMBL-dev databases
    • Don’t use several emails
    • When reporting annotations, pay attention to all reported isomers and consider the relevance of the selected database
  22. Troubleshooting

    • Fewer than 10 annotations at FDR 10% is low and might indicate:
      ◦ Wrong polarity specified, wrong molecular database selected
      ◦ Data is miscalibrated (it should be at least <3 ppm), calibration is unstable, or there are lock-mass errors
      ◦ Data is improperly exported or not centroided
  23. More info?

    • METASPACE team:
      ◦ webapp: http://metaspace2020.eu
      ◦ email: [email protected]
      ◦ twitter: @metaspace2020
      ◦ source code: https://github.com/METASPACE2020/
      ◦ tutorial slides: http://speakerdeck.com/metaspace2020
    • Data conversion
      ◦ FTICR, contact SCiLS: [email protected]
      ◦ Orbitrap, contact Thermo: [email protected]
  24. Conversion of FTICR data into centroided imzML

    • Export your centroided high-resolution spectra in the imzML format
    • Available for “FT-ICR type” SCiLS Lab files from SCiLS Lab 2016b
    • Performance in version 2018b significantly increased (4x speedup, batch export)
    • Best results in METASPACE if peak list is required for centroiding
    • Two ways
      1. “Method 1”, for new data and software: SQLite peak list data: Peak list provided during import
      2. “Method 2”, for old data or software: FT-ICR profile data: Generate a peak list after import
  25. Method 1. Create imzML file for METASPACE

    • In the objects tab, click the export symbol of the region to be exported and select “Export to METASPACE”
    • The Export Spectra dialog opens
    • Set your normalization of choice
    • Select your peak list of choice, for example “Imported Peaks” in the case of SQLite
    • Provide your scan polarity
    • Click OK to save the imzML file
  26. Method 1. SQLite peak list data

    • Data must have been acquired with on-the-fly centroid detection, i.e. there is a file called ‘peaks.sqlite’ within the .d folder of the dataset
    • In SCiLS Lab a peak list “Imported peaks” is available, selecting the most frequent peaks
      ◦ By default, all peaks appearing in more than 1% of spectra
  27. Method 2. FT-ICR profile data

    • Older Solarix files do not directly contain a peak list to perform centroiding
    • Create peak list with Data Analysis (SCiLS Lab Help, Section 7.4)
    • Use METASPACE tool for peak finding: https://spatialmetabolomics.github.io/centroidize/
    • Use other external tools (mMass, …)
    • Import the external peak list into SCiLS Lab: File > Import > m/z intervals from CSV or Clipboard
  28. Method 2. Use METASPACE tool for peak finding

    • Select the overview spectrum CSV exported from SCiLS
    • Upload CSV file to METASPACE tool
    • Copy values to clipboard
    • Use File > Import > m/z intervals from CSV
  29. Export into imzML from .raw Orbitrap format

    Several possible methods:
    1. imageQuest / raw-converter
      ◦ Recommended for: Thermo MALDI- / TransMIT AP-S-MALDI-
    2. imzmlConverter
      ◦ Recommended for: DESI/flowProbe with separate files per row
    3. pyimzML Python parser
      ◦ Recommended for bioinformaticians

    For details, see our instructions: http://project.metaspace2020.eu/imzML
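
For the pyimzML route, a quick sanity check of an exported file is to walk through its spectra and count peaks. The sketch below is written against the two pyimzML parser members it relies on, `coordinates` and `getspectrum(index)`; the `summarize` function, the `FakeParser` stand-in, and the file name in the comment are hypothetical and only there to make the example self-contained.

```python
def summarize(parser):
    """Count spectra and peaks and report the m/z range of an imzML parser."""
    n_spectra = 0
    n_peaks = 0
    mz_min, mz_max = float("inf"), float("-inf")
    for index, _coordinate in enumerate(parser.coordinates):
        mzs, intensities = parser.getspectrum(index)
        n_spectra += 1
        n_peaks += len(mzs)
        if len(mzs) > 0:
            mz_min = min(mz_min, min(mzs))
            mz_max = max(mz_max, max(mzs))
    return {"spectra": n_spectra, "peaks": n_peaks,
            "mz_min": mz_min, "mz_max": mz_max}


class FakeParser:
    """Stand-in with the same two members, so the sketch runs without a file."""
    coordinates = [(1, 1, 1), (2, 1, 1)]
    _spectra = [([100.0, 200.5], [10.0, 5.0]), ([150.25], [7.0])]

    def getspectrum(self, index):
        return self._spectra[index]


print(summarize(FakeParser()))

# With pyimzML installed, a real file would be opened like this
# (the file name is a placeholder):
#   from pyimzml.ImzMLParser import ImzMLParser
#   print(summarize(ImzMLParser("example.imzML")))
```

A very large average number of peaks per spectrum can be a hint that the file contains profile rather than centroided data.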
  30. Method 3. Generic export into imzML

    .raw -> mzML -> imzML
    • MSConvert
      ◦ free (link)
    • imzMLConverter
      ◦ free, although requires registration
      ◦ http://www.cs.bham.ac.uk/~ibs/imzMLConverter
  31. Let’s say thanks

    METASPACE team: Vitaly Kovalev, Lachlan Stuart, Renat Nigmetzianov

    Alumni: Andrew Palmer, Artem Tarasov, Dominik Fay, Sergey Nikolenko, Ivan Protsyuk, Sergey Ryazanov

    METASPACE public data contributors: Alireza Abdolvahabi, Sarah Aboulmagd, Buck Achim, Rima Ait-Belkacem, Theodore Alexandrov, Christopher Anderton, Charlotte Bagger, Pierre Barbier Saint Hilaire, Michael Becker, Janine Beckmann, Mikhail Belov, Dhaka Bhandari, Arunima Bhattacharjee, Tanja Bien, Mark Bokhart, Berin Boughton, John Bowling, Eike Brockmann, Achim Buck, Josephine Bunch, Christina Burr, Richard Caprioli, Claire Carter, Steve Castellino, Janfelt Christian, Dave Clarke, Katharina Clitherow, Julien Delecolle, Nicolas Desbenoit, Domenic Dreisbach, Klaus Dreisewerd, Maria Duenas, Livia Eberlin, Shane Ellis, Isabelle Fournier, Neha Garg, Vannur Garikapati, Kyana Garza, Mathieu Gaudin, Benedikt Geier, Erin Gemperline, Stefanie Gerbig, Cristina Gonzalez Lopez, Richard Goodwin, Christian Greunke, Gus Grey, Julian Griffin, Katharina Halbach, Zoe Hall, Anne Mette Handler, Mitsuhiro Hayashi, Ron Heeren, Bram Heijs, Dimitri Heintz, Corinna Henkel, Dirk Hoelscher, Carsten Hopf, Lennart Huizing, Aslihan Inal, Sophie Jacobsen, Christian Janfelt, Asta Maria Joensen, Emrys Jones, Patrik Kadesch, Maureen Kane, Pegah Khamehgir-Silz, Mario Kompauer, Vitaly Kovalev, Mélanie Lagarrigue, James Langridge, Sheerin Latham, Regis Lavigne, Oliver Lechtenfeld, Young Jin Lee, Lingjun Li, Maunel Liebeke, Michael Linscheid, Logan Mackay, Liebeke Manuel, James McKenzie, Mira Merdas, Joris Meurs, Hidenobu Miyazawa, Astrid Moerman, David Muddiman, Max Müller, Konstantin Nagornov, Don Nguyen, Marty Paine, Andrew Palmer, József Pánczél, Heath Patterson, Robin Philip, Charles Pineau, Adam Pruska, Jusal Quanico, Cristine Quiason, Ksenija Radic, Luca Rappez, Lavigne Regis, Marina Reuter, Angelos Rigopoulos, Edita Ritmejeryte, Andreas Roempp, Livia S. Eberlin, Veronika Saharuka, Denis Sammour, Marta Sans, Kaija Schaepe, Julian Schneemann, Danielle Scott, Kumar Sharma, Bindesh Shrestha, Nicholas Sing, Renata Soares, Emilia Sogin, Jens Soltwisch, Berhard Spengler, Nicole Strittmatter, Na Sun, Zoltan Takats, Dinaiz Thinagaran, Spencer Thomas, Sergio Triana, Yury Tsybin, Lulu Tucker, Daa Van den Bosch, Quentin Vanbellingen, Dusan Velickovic, Claire Villette, Michael Waletzko, Samantha Walker, Eric Weaver, Jake White, Guanshi Zhang, Jialing Zhang