Bruce Ravel
August 22, 2015

XAS Data Interchange: A file format for a single XAS spectrum

This is my presentation on XDI for the XAFS16 satellite meeting "Data acquisition, treatment, storage – quality assurance in XAFS spectroscopy", held at DESY on August 21, 2015.

Transcript

  1. XAS Data Interchange: A file format for a single XAS spectrum

    Bruce Ravel (NIST and NSLS-II) and Matt Newville (University of Chicago)
    Data acquisition, treatment, storage – quality assurance in XAFS spectroscopy, DESY, 21-22 August 2015
    PDF of this talk: https://goo.gl/Nv1l3G
    Outline: need for XDI, XDI specification, XDI implementation
  2. Data challenges in the XAS world

    Even small data volumes and the simplest XAS experiments have persistent problems:

    1. Data archaeology: have you ever tried to extract data from the Farrel Lytle archive at IIT?
    2. Moving data from the beamline to the data analysis package
    3. Sharing data between different analysis packages
    4. Submitting supplemental data with a publication
    5. Building data-centered applications for the web, the desktop, and the palmtop (e.g. an editable archive of standards)
    6. Extracting XAS data from a multispectral data set
  3. Beamline data formats

    Every beamline has its own way of recording data, many of which are perfectly cromulent. Most use ASCII files; some use more complex data formats. Each beamline team has good reasons (one hopes!) for doing things their own way.

    NSLS XDAC:

        XDAC V1.4 Datafile V1
        "au.b04" created on 3/15/09 at 1:28:27 PM on X-23A2
        Diffraction element= Si (311). Ring energy= 2.80 GeV
        E0= 11919.00
        NUM_REGIONS= 4
        SRB= -200 -20 30 60 20k
        SRSS= 10 0.25 0.05k 0.05k
        SPP= 1 1 1 0.25k
        Settling time= 0.30
        Offsets= 122.00 85.78 0.00
        Gains= 8.00 8.00 1.00
        Au foil, NSLS X23A2, 20% Ar in Io and It with harmonic rejection mirror
        -----------------------------------------------------------
        Energy I0 It IntTime
        11719.00294 18352.0000 15872.2222 1.0000
        11728.99732 18380.0000 15934.2222 1.0000
        11739.00126 18381.0000 15980.2222 1.0000
        ...

    Photon Factory / SPring-8 / Aichi / SAGA:

        9809 KEK-PF BL12C
        G:hgcys-11.001 07.05.12 23:28 - 07.05.12 23:55
        Hg:H2Cys 1:2 pH=12.86, 100 mM, prep. at PF, 5mm Teflon, stirred 4 hr
        Ring : 2.5 GeV 348.8 mA - 342.8 mA
        Mono : SI(111) D= 3.13551 A Initial angle= 9.25969 deg
        BL12C Transmission( 2) Repetition= 6 Points= 818
        Param file : A:hgk16 energy axis(2) Block = 5
        Block Init-Eng final-Eng Step/eV Time/s Num
        1 12049.00 12150.00 6.00 1.00 17
        2 12150.00 12320.00 .35 1.00 486
        3 12320.00 12400.00 1.00 2.00 80
        4 12400.00 12600.00 2.50 3.00 80
        5 12600.00 13040.00 4.00 3.00 110
        Ortec(-1) NDCH = 3
        Angle(c) Angle(o) time/s 2 3
        Mode 0 0 1 2
        Offset 0 0 826.150 652.975
        9.44433 9.44420 1.00 252916 592687
        9.43958 9.43960 1.00 256349 604260
        9.43483 9.43480 1.00 256429 607846
        ...
  4. Problems with beamline formats

    They require additional processing in order to display µ(E), including:

    - Conversion to energy
    - Dead-time or other corrections
    - Merging of 10s, 100s, or 1000s of scans and/or detectors

    Their metadata is ambiguous. For instance:

    - How is the beamline identified?
    - What constitutes a user comment?
    - What describes the condition of the source or the beamline?

    XAS data analysis software and plotting software may have difficulty importing and interpreting the data.

    This data is probably not appropriate for submission to a journal as supplemental material.

    Data interchange: a standard for the interchange of µ(E) data would help address most of these concerns.
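
For a crystal monochromator, the "conversion to energy" step is Bragg's law. As a worked check (an illustration, not taken from the talk), use the Photon Factory example shown earlier: d = 3.13551 Å for Si(111) and the first tabulated angle, 9.44433°. This assumes a first-order reflection and hc ≈ 12398.42 eV·Å:

```latex
E = \frac{hc}{2 d \sin\theta}
  = \frac{12398.42\ \text{eV\,\AA}}{2 \times 3.13551\ \text{\AA} \times \sin(9.44433^\circ)}
  \approx 12049\ \text{eV}
```

The result agrees with the 12049.00 eV starting energy of Block 1 in that file's header. The conversion is only possible because the file happens to record the d-spacing.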
  5. Goals of a data interchange format

    The smallest unit of currency is the µ(E) spectrum.

    1. Be easy for a human to read. Be easy for a computer to read.
    2. Establish a common language for transferring µ(E) spectra between XAS experimenters, data analysis packages, web applications, journals and anything else that needs to process XAS data. This will enhance the user experience.
    3. Increase the relevance and longevity of experimental data by reducing the amount of data archaeology future interpretations of that data will require.
    4. Provide a mechanism for extracting and preserving a single XAS or XAS-like data set from a multispectral experiment or from a complex data structure.
    5. Be a building block for hierarchical or relational data structures.
  10. XDI: XAS Data Interchange

    XDI is an ad hoc format loosely based on the format of e-mail and structured in a way that looks like a familiar column data file.

        # XDI/1.0 MX/2.0
        # Beamline.name: 10ID
        # Beamline.harmonic_rejection: flat Rh-coated mirror
        # Facility.name: APS
        # Facility.xray_source: undulator A
        # Facility.energy: 7.00 GeV
        # Mono.name: Si 111
        # Mono.d_spacing: 3.1356 Angstrom
        # Element.symbol: Fe
        # Element.edge: K
        # Scan.edge_energy: 7112.00 eV
        # Scan.start_time: 2005-03-08T20:08:57
        # Column.1: energy eV
        # Column.2: mutrans
        # Column.3: i0
        # MX.Num-regions: 1
        # MX.SRB: 6900
        # MX.SRSS: 0.5
        # MX.SPP: 0.1
        # MX.Settling-time: 0
        # MX.Offsets: 11408.00 11328.00 13200.00 10774.00
        # MX.Gains: 8.00 7.00 7.00 9.00
        #///
        # Fe K-edge, Lepidocrocite powder on kapton tape, RT
        # 4 layers of tape
        # exafs, 20 invang
        #---
        # energy mutrans i0
        6899.9609 -1.3070486 149013.70
        6900.1421 -1.3006104 144864.70
        6900.5449 -1.3033816 132978.70
        6900.9678 -1.3059724 125444.70
        6901.3806 -1.3107085 121324.70
        (....etc....)
  11. XDI: XAS Data Interchange

    The data table is clearly organized into columns of numbers, with the abscissa (energy, in this case) as the left-most column. The non-data part of the file is clearly demarcated.

    This file can be imported as is into most common plotting and data processing tools (such as Excel, Origin, KaleidaGraph, and many others*).

    * such as that silly thing...
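
Because the table is plain whitespace-separated numbers, one row can be read with nothing but the C standard library. The sketch below is illustrative only: `parse_data_row` is a made-up helper, not part of the XDI library, and it assumes a row has already been read into a string.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not part of the XDI C library).
 * Reads up to `max` whitespace-separated numbers from one data row.
 * Returns the number of values read, or -1 if the row contains a
 * non-numeric token or more than `max` values. */
int parse_data_row(const char *row, double *out, int max)
{
    int n = 0;
    const char *p = row;

    while (n < max) {
        char *end;
        double v = strtod(p, &end);   /* skips leading whitespace itself */
        if (end == p)                 /* nothing converted: end of row or junk */
            break;
        out[n++] = v;
        p = end;
    }

    /* anything left other than whitespace is an error */
    while (*p == ' ' || *p == '\t' || *p == '\r' || *p == '\n')
        p++;
    return (*p == '\0') ? n : -1;
}
```

A reader could call this once per line after the header and compare the returned count against the number of `Column.N` headers to catch rows with the wrong number of columns.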
  12. XDI: XAS Data Interchange

    The version of the XDI format is identified in the first line, as is the application that wrote this specific file.
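
As a rough illustration of how a reader might use that first line, the sketch below splits `# XDI/1.0 MX/2.0` into the format version and the application string. `parse_version_line` is a hypothetical helper, not a function of the XDI library, and it assumes the line has exactly the shape shown in the example file.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of the XDI C library).
 * Splits a first line such as "# XDI/1.0 MX/2.0" into the XDI version
 * ("1.0") and the rest of the line ("MX/2.0"), which names the
 * application that wrote the file. Returns 1 on success, 0 otherwise. */
int parse_version_line(const char *line, char *version, size_t vlen,
                       char *apps, size_t alen)
{
    char v[32] = "";
    int consumed = 0;

    /* the scanset reads the version token up to the next whitespace;
       %n records how many characters were consumed */
    if (sscanf(line, "# XDI/%31[^ \t\r\n]%n", v, &consumed) != 1)
        return 0;
    if (vlen == 0 || alen == 0 || strlen(v) >= vlen)
        return 0;
    strcpy(version, v);

    /* whatever follows the version token is the application string */
    line += consumed;
    while (*line == ' ' || *line == '\t')
        line++;
    snprintf(apps, alen, "%s", line);
    apps[strcspn(apps, "\r\n")] = '\0';   /* drop a trailing newline */
    return 1;
}
```

A line that does not start with the `XDI/` token, such as an ordinary metadata header, makes the function return 0, which is a cheap first test that a file claims to be XDI at all.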
  13. XDI: XAS Data Interchange

    Metadata is clearly identified and grouped into useful “namespaces”. The data columns are identified and, where appropriate, units are given.

    For the programmers in the audience, XDI headers map directly onto an associative array. (Other programming languages call this a dictionary, symbol table, hash, or map.)

    The metadata dictionary defines 8 families: the six shown in the example file (Beamline, Facility, Mono, Element, Scan, Column) plus Detector and Sample.
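
To make the "associative array" remark concrete, here is a minimal sketch that splits one header line of the form `# Family.keyword: value` into its three parts. The struct and function names are invented for this example; the XDI library has its own data structures (the `meta_families`, `meta_keywords`, and `meta_values` arrays used in the validation slide).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One parsed header, "# Family.keyword: value" (hypothetical struct). */
typedef struct {
    char family[64];
    char keyword[64];
    char value[256];
} HeaderItem;

/* Hypothetical helper: returns 1 and fills `item` if `line` looks like
 * a metadata header, otherwise returns 0. */
int parse_header_line(const char *line, HeaderItem *item)
{
    /* family: up to the first '.'; keyword: up to ':'; value: rest of line */
    int n = sscanf(line, "# %63[^.: \t\r\n].%63[^: \t\r\n]: %255[^\r\n]",
                   item->family, item->keyword, item->value);
    return n == 3;
}
```

The (family, keyword) pair is then a natural key for whatever map type the host language provides. Note that this sketch rejects a header with an empty value, and that the version line and the `#///` delimiter do not match the pattern.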
  14. XDI: XAS Data Interchange

    Three pieces of metadata are required to be in the XDI file.

    The monochromator d-spacing is required if a correction needs to be made to the energy axis of the data.

    The symbol and edge of the element are required to unambiguously identify the data. For example, both Cr K and Ba LI have tabulated energies of 5989 eV, while Se K and Tl LIII are both at 12658 eV.
  15. XDI: XAS Data Interchange

    Five pieces of metadata are recommended to be in the XDI file. These establish the provenance of the data and tell the reader how to interpret the abscissa.
  16. XDI: XAS Data Interchange

    Extension metadata (metadata specific to a beamline, a data acquisition system, or a data processing program) is specified by “extension headers”. These use the same format as standard metadata headers, but with a domain-specific “namespace”.
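
One way a reader could tell standard headers from extension headers is to compare the family name against the eight families named earlier in the talk. The sketch below does exactly that; `is_standard_family` is a made-up name, and the exact-case comparison is an assumption of this example rather than something the slides state. In the example file, the `MX` headers would presumably be classified as extensions.

```c
#include <assert.h>
#include <string.h>

/* The eight families the talk says the metadata dictionary defines. */
static const char *const STANDARD_FAMILIES[] = {
    "Beamline", "Facility", "Mono", "Element",
    "Scan", "Column", "Detector", "Sample"
};

/* Hypothetical helper: returns 1 for a dictionary family and 0 for
 * anything else, i.e. an extension namespace.
 * Assumption: family names are compared case-sensitively. */
int is_standard_family(const char *family)
{
    size_t n = sizeof STANDARD_FAMILIES / sizeof STANDARD_FAMILIES[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(family, STANDARD_FAMILIES[i]) == 0)
            return 1;
    }
    return 0;
}
```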
  17. XDI: XAS Data Interchange

    User-supplied comments (typically, but not exclusively, entered at the time of data acquisition) are clearly demarcated by a line of slashes and a line of dashes.

    White space must be preserved. That is, a user comment like this must be preserved faithfully:

           --    --
          /  \  /  \
         /    \/    \
         \   Best   /
          \ Sample /
           \ Ever /
            \ !! /
             \  /
              \/
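
The two delimiter lines make it easy to track which part of the file a line belongs to. The following is a small sketch of that idea as a state machine, assuming the delimiters begin with `#///` and `#---` exactly as in the example file. The enum and function names are invented for illustration.

```c
#include <assert.h>
#include <string.h>

/* Sections of an XDI file as laid out in the example (hypothetical enum).
 * IN_TABLE covers everything after the line of dashes: the column-label
 * line and the numeric rows. */
typedef enum { IN_HEADERS, IN_COMMENTS, IN_TABLE } Section;

/* Hypothetical helper: given the current section and the next line,
 * return the section in effect from that line on. A delimiter line is
 * reported as belonging to the section it opens. */
Section next_section(Section current, const char *line)
{
    if (current == IN_HEADERS && strncmp(line, "#///", 4) == 0)
        return IN_COMMENTS;
    if (current != IN_TABLE && strncmp(line, "#---", 4) == 0)
        return IN_TABLE;
    return current;
}
```

Lines seen while the state is `IN_COMMENTS` should be stored unchanged after removing the leading comment marker, so that the user's white space survives a round trip.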
  18. Validation: Example code in C

         1  #include "xdifile.h"
            XDIFile *xdifile;
            int ret; long i, j;
         4  xdifile = malloc(sizeof(XDIFile));
            ret = XDI_readfile(xdifile, "mydata.xdi");

         7  /* test return code 'ret' for errors */

            /* test for required metadata */
        10  j = XDI_required_metadata(xdifile);
            if (j != 0) {
              printf("\n# (requirement code %ld):\n%s\n",
        13           j, xdifile->error_message);
            }
            /* test for recommended metadata */
        16  j = XDI_recommended_metadata(xdifile);
            if (j != 0) {
              printf("\n# (recommendation code %ld):\n%s\n",
        19           j, xdifile->error_message);
            }
            /* examine each individual metadata item */
        22  for (i = 0; i < xdifile->nmetadata; i++) {
              j = XDI_validate_item(xdifile,
                                    xdifile->meta_families[i],
        25                          xdifile->meta_keywords[i],
                                    xdifile->meta_values[i]);
              /* test return value 'j' */
        28  }
            /* ************************ */
            /* do stuff with the data */
        31  /* ************************ */
            XDI_cleanup(xdifile, ret);
            free(xdifile);

    1. Read the XDI file. Unless there are obvious errors (inconsistent number of data columns, non-numbers in the data table, etc.), the file content is stored in a struct. (Lines 1-6)
    2. Test whether the required metadata items are present. (Lines 10-14)
    3. Test whether the recommended metadata items are present. (Lines 16-20)
    4. Validate individual items against the dictionary. (Lines 22-28)

    Steps 2-4 are optional.
  19. What’s implemented?

    1. A specification
    2. A dictionary of metadata families and items with guidelines for appropriate values
    3. An API written in C
    4. An API written in Fortran
    5. Bindings to the C API in Python and Perl
    6. A test suite of valid and invalid XDI data
    7. Dynamic analysis with Valgrind: no memory leaks in the C library!

    XDI is ready for use. We hope it will be picked up by authors of data acquisition and data analysis software.
  20. What’s not implemented

    1. Dictionary definitions for metadata related to non-monochromatic sources, such as dispersive optics or plasma sources.
    2. Dictionary items related to grating monochromators could be stronger.
    3. Bindings for many popular languages: Matlab, R, IDL, LabView, Ruby, Lua, Mathematica, ... and whatever language you like best.

    Contribute new bindings! Fork the repository, add more language bindings, make a pull request. We’ll take all comers!