Slide 1

Slide 1 text

Ryan Abernathey on behalf of Xarray Core Devs
 PyData NYC 2019

Slide 2

Slide 2 text

Data Have Labels - Use Them!
Diagram labels: time, longitude, latitude, elevation, land_cover
• Data variables: used for computation
• Coordinates: describe data
• Indexes: align data
• Attributes: metadata ignored by operations
"netCDF meets pandas.DataFrame"
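As a rough illustration of the data model on this slide, the sketch below builds a small `xr.Dataset`. The variable name `temperature`, the grid sizes, and all values are hypothetical, and treating `elevation` and `land_cover` as non-index coordinates is an assumption about the slide's diagram rather than something the slide text states.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical toy data: 3 time steps on a 2 x 4 latitude/longitude grid.
rng = np.random.default_rng(0)

ds = xr.Dataset(
    # Data variables: the arrays that computations act on.
    data_vars={
        "temperature": (
            ("time", "latitude", "longitude"),
            rng.normal(size=(3, 2, 4)),
        ),
    },
    # Coordinates: labels that describe the data.
    coords={
        "time": pd.date_range("2014-01-01", periods=3),
        "latitude": [10.0, 20.0],
        "longitude": [0.0, 90.0, 180.0, 270.0],
        # Assumed here to be per-grid-cell descriptive coordinates.
        "elevation": (
            ("latitude", "longitude"),
            rng.uniform(0, 1000, size=(2, 4)),
        ),
        "land_cover": (
            ("latitude", "longitude"),
            np.array(
                [
                    ["forest", "water", "water", "grass"],
                    ["grass", "water", "forest", "forest"],
                ]
            ),
        ),
    },
    # Attributes: free-form metadata.
    attrs={"title": "toy example"},
)

# Reducing over "time" keeps the coordinates that do not depend on it.
time_mean = ds["temperature"].mean("time")
```

The reduced array keeps the `latitude` and `longitude` dimensions, and the `elevation` and `land_cover` coordinates travel with it.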

Slide 3

Slide 3 text

Xarray Applications
• Weather / Ocean / Climate
• Fluid Mechanics / Turbulence
• Geospatial Imagery
• Microscopy
• Neuroscience
• Quantitative Finance

Slide 4

Slide 4 text

Xarray Core Features
• Apply operations over dimensions by name: x.sum('time')
• Select values by label (or logical location) instead of integer location: x.sel(time='2014-01-01')
• Mathematical operations (e.g., x - y) vectorize across multiple dimensions (array broadcasting) based on dimension names, not shape.
• Deep dask.array integration
• Easily use the split-apply-combine paradigm with groupby: x.groupby('time.dayofyear').mean()
• Database-like alignment based on coordinate labels that smoothly handles missing values: x, y = xr.align(x, y, join='outer')
• Keep track of arbitrary metadata in the form of a Python dictionary: x.attrs
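The calls named on this slide can be tried on a toy array. This is a minimal sketch: the array contents, the `space` dimension, and the helper arrays `a` and `b` are hypothetical, and dask integration is left out because it needs an extra dependency.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical 4-day x 3-location array with labeled axes and metadata.
x = xr.DataArray(
    np.arange(12.0).reshape(4, 3),
    dims=("time", "space"),
    coords={
        "time": pd.date_range("2014-01-01", periods=4),
        "space": ["a", "b", "c"],
    },
    attrs={"units": "K"},
)

# Apply an operation over a dimension by name.
total = x.sum("time")

# Select by label instead of integer position.
first_day = x.sel(time="2014-01-01")

# Arithmetic matches dimensions by name, not by axis order:
# x.T has dims (space, time) but still lines up with x.
diff = x - x.T

# Split-apply-combine: group by day of year, then average each group.
daily_mean = x.groupby("time.dayofyear").mean()

# Database-like alignment: an outer join takes the union of labels
# and fills the positions that are missing with NaN.
a = xr.DataArray([1.0, 2.0, 3.0], dims="space", coords={"space": ["a", "b", "c"]})
b = xr.DataArray([10.0, 20.0], dims="space", coords={"space": ["b", "d"]})
a2, b2 = xr.align(a, b, join="outer")

# Arbitrary metadata lives in a plain dictionary.
units = x.attrs["units"]
```

After the outer join, both `a2` and `b2` share the labels `a` through `d`, with NaN wherever one of the inputs had no value.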

Slide 5

Slide 5 text

Xarray New Features
• Rich HTML repr in Jupyter notebooks
• Flexible array support (NEP18)
• Support for non-standard calendars via CFTimeIndex
• Improved zarr integration (storage)
• Improved rasterio integration (geospatial rasters)
• New array methods: coarsen, integrate, quantile (groupby), @, map_blocks, etc.
• Lots of documentation improvements
Nov 2018 ➡ Nov 2019: Xarray v0.11.0 ➡ v0.14.1
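Three of the newer array methods listed above can be sketched on a one-dimensional array. The data are hypothetical, and the calls reflect the xarray API as I understand it; keyword names may differ between releases, so positional arguments are used where possible. `map_blocks` is omitted because it requires dask.

```python
import numpy as np
import xarray as xr

# Hypothetical data: y = x sampled at x = 0, 1, ..., 5.
da = xr.DataArray(np.arange(6.0), dims="x", coords={"x": np.arange(6.0)})

# coarsen: block-average every 2 consecutive points along "x".
block_mean = da.coarsen(x=2).mean()

# integrate: trapezoidal integration along the "x" coordinate.
area = da.integrate("x")

# @: dot product over the shared dimension.
dot = da @ da
```

Here the coarsened array has three points, and both `area` and `dot` are zero-dimensional arrays.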

Slide 6

Slide 6 text

Xarray Funding!
A community platform for Big Data geoscience
Supporting part-time development work at several different locations (Anaconda, NCAR, UW)
New openings for full-time software engineers at Columbia!

Slide 7

Slide 7 text

What the Xarray Project Needs
• More documentation improvements
• f"Xarray for {discipline}" tutorials
• Larger gallery of examples
• Internal refactoring:
  • Entry points to make things more pluggable / customizable
• Marketing! Xarray is underused in the PyData ecosystem…