Xarray Update - PyData NYC 2019

A brief update on xarray for the NumFOCUS-sponsored projects session at PyData NYC.

Companion notebook: https://nbviewer.jupyter.org/gist/rabernat/3923ece693ae5e2d9867a3940bf2fc50

Ryan Abernathey

November 04, 2019

Transcript

  1. Data Have Labels - Use Them!

    Figure labels: time, longitude, latitude, elevation, land_cover
    • Data variables: used for computation
    • Coordinates: describe data
    • Indexes: align data
    • Attributes: metadata ignored by operations
    "netCDF meets pandas.DataFrame"
  2. Xarray Applications

    • Weather / Ocean / Climate
    • Fluid Mechanics / Turbulence
    • Geospatial Imagery
    • Microscopy
    • Neuroscience
    • Quantitative Finance
  3. Xarray Core Features

    • Apply operations over dimensions by name: x.sum('time')
    • Select values by label (or logical location) instead of integer location: x.sel(time='2014-01-01')
    • Mathematical operations (e.g., x - y) vectorize across multiple dimensions (array broadcasting) based on dimension names, not shape.
    • Deep dask.array integration
    • Easily use the split-apply-combine paradigm with groupby: x.groupby('time.dayofyear').mean()
    • Database-like alignment based on coordinate labels that smoothly handles missing values: x, y = xr.align(x, y, join='outer')
    • Keep track of arbitrary metadata in the form of a Python dictionary: x.attrs
  4. Xarray New Features

    • Rich HTML repr in Jupyter notebooks
    • Flexible array support (NEP18)
    • Support for non-standard calendars via CFTimeIndex
    • Improved zarr integration (storage)
    • Improved rasterio integration (geospatial rasters)
    • New array methods: coarsen, integrate, quantile (groupby), @, map_blocks, etc.
    • Lots of documentation improvements
    Nov 2018 ➡ Nov 2019: Xarray v0.11.0 ➡ v0.14.1
  5. Xarray Funding!

    • A community platform for Big Data geoscience
    • Supporting part-time development work at several different locations (Anaconda, NCAR, UW)
    • New openings for full-time software engineers at Columbia!
  6. What Xarray Project Needs

    • More documentation improvements
    • f"Xarray for {discipline}" tutorials
    • Larger gallery of examples
    • Internal refactoring:
      • Entry points to make things more pluggable / customizable
    • Marketing! Xarray is underused in pydata ecosystem…
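
The core features listed on slide 3 can be tried out with a small, self-contained sketch. This is not taken from the talk or its companion notebook: the toy temperature array, the grid, and every variable name below are hypothetical, chosen only to exercise the calls named on the slide (sum over a named dimension, sel, name-based broadcasting, groupby, xr.align, attrs).

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical toy data: four daily values on a 2 x 3 lat/lon grid.
time = pd.date_range("2014-01-01", periods=4, freq="D")
temp = xr.DataArray(
    np.arange(24, dtype=float).reshape(4, 2, 3),
    dims=("time", "lat", "lon"),
    coords={"time": time, "lat": [10.0, 20.0], "lon": [0.0, 1.0, 2.0]},
    name="temperature",
    attrs={"units": "degC"},  # arbitrary metadata kept in a plain dict
)

# Apply an operation over a dimension by name rather than by axis number.
total = temp.sum("time")

# Select values by label instead of integer position.
first_day = temp.sel(time="2014-01-01")

# Arithmetic broadcasts by dimension name: a 1-D "lat" array lines up with
# the "lat" dimension of the 3-D array regardless of axis order.
lat_weight = xr.DataArray([1.0, 2.0], dims="lat", coords={"lat": [10.0, 20.0]})
weighted = temp * lat_weight

# Split-apply-combine with groupby on a datetime component.
# The dimension is passed explicitly so the result does not depend on the
# version-specific default of .mean().
by_day = temp.groupby("time.dayofyear").mean("time")

# Database-like alignment: an outer join takes the union of the "x" labels
# and fills positions missing from either array with NaN.
a = xr.DataArray([1.0, 2.0, 3.0], dims="x", coords={"x": [0, 1, 2]})
b = xr.DataArray([10.0, 20.0, 30.0], dims="x", coords={"x": [1, 2, 3]})
a_outer, b_outer = xr.align(a, b, join="outer")

# Plain arithmetic aligns on the shared labels (an inner join) by default.
overlap = a + b

print(total.dims)          # dimensions left after summing over "time"
print(a_outer.values)      # NaN where "a" has no label
print(temp.attrs["units"])
```

The same calls apply to a Dataset holding several such arrays; the sketch uses a single DataArray only to stay short.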