
XGCM: General Circulation Model Postprocessing in Python (SciPy 2019)

Ryan Abernathey

July 11, 2019

Transcript

  1. General Circulation Model Postprocessing in Python

    Ryan Abernathey (Columbia University), Julius Busecke (Princeton University)
  2. Computer architecture: Simulation

    • Ocean simulations are great at targeting parallel architecture of high-performance computers.
    • We can already simulate the ocean at incredibly high fidelity.
    • I could easily produce a petabyte of scientifically useful data within a week or two.

    [Image: Lawrence Livermore National Laboratory's Sierra supercomputer. Credit: Randy Wong/LLNL]
  3. Computer architecture: Analysis and Visualization

    • Infrastructure for data analysis / viz has not been a high priority.
    • Standard approach: download data to personal computer, use MATLAB to analyze.
    • Unable to scale with simulation capacity.
  4. General Circulation Models (GCMs)
  5. Xarray is Awesome!

    “netCDF meets pandas.DataFrame”

    • Data variables: used for computation
    • Coordinates: describe data
    • Indexes: align data
    • Attributes: metadata ignored by operations

    [Figure: xarray data model, with labels time, longitude, latitude, elevation, land_cover. Credit: Stephan Hoyer]
  6. xarray makes science easy

        import xarray as xr
        ds = xr.open_dataset('NOAA_NCDC_ERSST_v3b_SST.nc')
        ds

        <xarray.Dataset>
        Dimensions:  (lat: 89, lon: 180, time: 684)
        Coordinates:
          * lat      (lat) float32 -88.0 -86.0 -84.0 -82.0 -80.0 -78.0 -76.0 -74.0 ...
          * lon      (lon) float32 0.0 2.0 4.0 6.0 8.0 10.0 12.0 14.0 16.0 18.0 20.0 ...
          * time     (time) datetime64[ns] 1960-01-15 1960-02-15 1960-03-15 ...
        Data variables:
            sst      (time, lat, lon) float64 nan nan nan nan nan nan nan nan nan ...
        Attributes:
            Conventions:  IRIDL
            source:       https://iridl.ldeo.columbia.edu/SOURCES/.NOAA/.NCDC/.ERSST/...
  7. xarray: label-based selection

        # select and plot data from my birthday
        ds.sst.sel(time='1982-08-07', method='nearest').plot()
  8. xarray: label-based operations

        # zonal and time mean temperature
        ds.sst.mean(dim=('time', 'lon')).plot()
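  9. xarray: grouping and aggregation

        sst = ds.sst  # dataset opened on the earlier slide
        # monthly climatology and anomalies relative to it
        sst_clim = sst.groupby('time.month').mean(dim='time')
        sst_anom = sst.groupby('time.month') - sst_clim
        # box average over lat/lon, then 3-month running mean
        # (rolling reductions always act along the rolling dimension)
        nino34_index = (sst_anom.sel(lat=slice(-5, 5), lon=slice(190, 240))
                        .mean(dim=('lon', 'lat'))
                        .rolling(time=3).mean())
        nino34_index.plot()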
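The slide code depends on the NOAA ERSST file. As a minimal self-contained sketch of the same selection and group-by pattern, the snippet below uses a synthetic monthly series in place of the real data. The three-year length, the seasonal cycle, and the one-degree warm year are all made up for illustration.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic monthly "SST" (assumption: stands in for ds.sst from the slides).
time = pd.date_range("2000-01-01", periods=36, freq="MS")
seasonal_cycle = 5.0 * np.cos(2 * np.pi * (time.month.values - 1) / 12)
warm_year = np.where(time.year.values == 2001, 1.0, 0.0)  # +1 degree in 2001
sst = xr.DataArray(
    20.0 + seasonal_cycle + warm_year,
    coords={"time": time},
    dims="time",
    name="sst",
)

# Label-based selection: pick the time step nearest to a given date.
nearest = sst.sel(time=pd.Timestamp("2001-08-07"), method="nearest")

# Monthly climatology and anomalies, following the slide.
sst_clim = sst.groupby("time.month").mean(dim="time")
sst_anom = sst.groupby("time.month") - sst_clim

# Three-month running mean of the anomaly.
sst_anom_smooth = sst_anom.rolling(time=3).mean()
```

Because the climatology is the mean over the three years, the anomalies for each calendar month average to zero, and the artificially warm year stands out as a constant positive anomaly.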
  10. xarray

    • label-based indexing and arithmetic
    • interoperability with the core scientific Python packages (e.g., pandas, NumPy, Matplotlib)
    • out-of-core computation on datasets that don’t fit into memory (thanks dask!)
    • wide range of input/output (I/O) options: netCDF, HDF, geoTIFF, zarr
    • advanced multi-dimensional data manipulation tools such as group-by and resampling

    https://github.com/pydata/xarray
  11. GCM Grid Cells

    [Figure: C-grid, horizontal view and vertical view (grid2d_hv.svg)]

    • Tracers located at cell centers, e.g. temperature, pressure
    • Vectors located at cell faces, e.g. velocity, heat flux
    • Other quantities located at cell corners, e.g. vorticity

    [Photo: Akio Arakawa]
  12. Finite Volume Calculus

    [Figure: C-grid, horizontal view and vertical view (grid2d_hv.svg)]

    Fundamental Operations

    Interpolation: $\overline{\phi}^{x} = \frac{1}{2}\left(\phi_{i+1/2} + \phi_{i-1/2}\right)$
    Difference: $\delta_x \phi = \phi_{i+1/2} - \phi_{i-1/2}$

    These move us from one grid position to another.

    Example Derived Quantities

    Vorticity: $\zeta = -\frac{\partial u}{\partial y} + \frac{\partial v}{\partial x}$
    $\zeta = \left(-\delta_y \Delta x_c u + \delta_x \Delta y_c v\right) / A_\zeta$
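The interpolation and difference operators on this slide, and the discrete vorticity built from them, can be sketched in plain NumPy. This is an illustrative sketch, not xgcm's implementation. It assumes a uniform, doubly periodic C-grid stored as `[y, x]` arrays, and the helper names (`diff_to_left`, `interp_to_left`) and metric names (`dx_c`, `dy_c`, `area_z`) are hypothetical.

```python
import numpy as np

def diff_to_left(f, axis):
    """Difference of neighbouring values, placed on the lower/left face (periodic)."""
    return f - np.roll(f, 1, axis=axis)

def interp_to_left(f, axis):
    """Average of neighbouring values, placed on the lower/left face (periodic)."""
    return 0.5 * (f + np.roll(f, 1, axis=axis))

# Uniform grid (assumption): ny x nx cells of size dx by dy.
ny, nx = 6, 8
dx = dy = 1.0
x_c = (np.arange(nx) + 0.5) * dx  # cell centers in x
y_c = (np.arange(ny) + 0.5) * dy  # cell centers in y

# Solid-body rotation u = -omega * y, v = omega * x, so vorticity is 2 * omega.
omega = 0.5
u = np.broadcast_to(-omega * y_c[:, None], (ny, nx))  # on x faces, y centers
v = np.broadcast_to(omega * x_c[None, :], (ny, nx))   # on x centers, y faces

# Grid metrics (hypothetical names): face lengths and corner cell area.
dx_c = np.full((ny, nx), dx)
dy_c = np.full((ny, nx), dy)
area_z = np.full((ny, nx), dx * dy)

# zeta = (-delta_y(dx_c * u) + delta_x(dy_c * v)) / A_zeta, located at cell corners.
zeta = (-diff_to_left(dx_c * u, axis=0) + diff_to_left(dy_c * v, axis=1)) / area_z

# The flow is not actually periodic, so only interior corners are meaningful.
zeta_interior = zeta[1:, 1:]
```

Away from the wrap-around row and column, `zeta_interior` equals `2 * omega` everywhere, which is the analytic vorticity of solid-body rotation.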
  13. Finite Volume Calculus

    [Figure: C-grid, horizontal view and vertical view (grid2d_hv.svg)]

    Fundamental Operations

    Interpolation: $\overline{\phi}^{x} = \frac{1}{2}\left(\phi_{i+1/2} + \phi_{i-1/2}\right)$
    Difference: $\delta_x \phi = \phi_{i+1/2} - \phi_{i-1/2}$

    https://mitgcm.readthedocs.io/en/latest/algorithm/algorithm.html#flux-form-momentum-equations
  14. Central Problem

    Xarray doesn’t understand grid cells. Xgcm to the rescue!
  15. XGCM Design Principles

    • Consume and produce xarray data structures (never “leave” xarray)
    • Operate eagerly on NumPy inputs and lazily on Dask inputs
    • Follow existing metadata standards. (Be as flexible as possible about variable or dimension names)
    • Keep it as simple as possible! Solve one problem well. (NOT a visualization library!)

    https://xgcm.readthedocs.io/
  16. XGCM Concepts: Axis

    [Figure: axis positions (axis_positions.svg)]

        position   values
        center     f[0] f[1] … f[n-1]
        left       f[0] f[1] … f[n-1]
        right      f[0] f[1] … f[n-1]
        inner      f[0] … f[n-2]
        outer      f[0] f[1] … f[n-1] f[n]

    An Axis is a set of Xarray dimensions that lie along the same axis of a locally orthogonal coordinate system. Each dimension of the Axis has a position, which describes how the data are located w.r.t. the cell center.
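To make the five positions concrete, here is a small NumPy sketch that builds the coordinate of each position for `n` uniform cells. The function name `axis_position_coordinates` and the uniform spacing are assumptions for illustration and are not part of xgcm.

```python
import numpy as np

def axis_position_coordinates(n, dx=1.0):
    """Coordinates of each axis position for n uniform cells of width dx."""
    edges = np.arange(n + 1) * dx  # all n + 1 cell edges
    return {
        "center": 0.5 * (edges[:-1] + edges[1:]),  # n points, one per cell
        "left": edges[:-1],                        # n points, left edge of each cell
        "right": edges[1:],                        # n points, right edge of each cell
        "inner": edges[1:-1],                      # n - 1 points, interior edges only
        "outer": edges,                            # n + 1 points, every edge
    }

positions = axis_position_coordinates(4)
sizes = {name: len(coords) for name, coords in positions.items()}
```

For four cells, `sizes` gives 4 points for `center`, `left` and `right`, 3 for `inner`, and 5 for `outer`, matching the index ranges listed on the slide.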
  17. XGCM Concepts: Grid

    A Grid is a group of one or more Axis objects.
    This is the primary user interaction point with xgcm.
  18. XGCM Usage

    https://github.com/xgcm/xgcm
  19. XGCM Usage

    https://github.com/xgcm/xgcm
  20. XGCM Roadmap

    • Provide high-level calculus methods (div, grad, curl, integral, etc.)
    • Support for more models (this is a metadata / standards problem)
    • Extend concepts to unstructured grids (see Chris Barker’s excellent gridded project)
    • This is a [small] community open-source project. Get involved!

    https://github.com/xgcm/xgcm
  21. Other “Micro-Packages”

    • xrft: https://github.com/xgcm/xrft
      Lazy, multidimensional, coordinate aware Fourier transforms for Xarray data structures
    • xhistogram: https://github.com/xgcm/xhistogram
      Lazy, multidimensional, coordinate aware histograms for Xarray data structures