Jupyter meets the Earth:
from geophysical inversions to open, collaborative geoscience
Lindsey Heagy
UC Berkeley
@lindsey_jh
Slide 2
Slide 2 text
hello (a bit about me)
geophysical inversions
open-source software
open research & education
geoscience + data science
+
Slide 3
Slide 3 text
what drives progress in geoscience?
Observations / Data
Theory & Ideas
Simulations, Computation
After Hamman, 2018
EMAG2: Earth Magnetic Anomaly Grid (2-arc-minute resolution). Image credit: Dom Fournier (toolkit.geosci.xyz)
Slide 4
Slide 4 text
what drives progress in geoscience?
Slide 5
Slide 5 text
imaging the subsurface: some important problems
Slide 6
Slide 6 text
forward and inverse problems in geophysics
Slide 7
Slide 7 text
forward and inverse problems in geophysics
Numerical simulations:
predict data
Optimization:
estimate a model
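A minimal sketch of the two directions named on this slide, assuming a toy linear problem in which each datum is a weighted sum of the model cells. The operator `G`, the sizes, and the function names are hypothetical and chosen for illustration only; real geophysical simulations solve a PDE rather than multiply by a random matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear physics: 20 data, 5 model cells.
n_data, n_cells = 20, 5
G = rng.normal(size=(n_data, n_cells))

def forward(m):
    """Numerical simulation: predict data from a model."""
    return G @ m

def invert(d_obs):
    """Optimization: estimate the model that best fits the data (least squares)."""
    m_est, *_ = np.linalg.lstsq(G, d_obs, rcond=None)
    return m_est

m_true = np.array([1.0, 0.0, -2.0, 0.5, 3.0])
d_obs = forward(m_true)   # noise-free synthetic data
m_est = invert(d_obs)     # recovers m_true because the toy problem is overdetermined
```

With noise-free data and more data than unknowns, plain least squares is enough. Most geophysical problems are underdetermined and noisy, which is why the inversion slides add a regularization term.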
Slide 8
Slide 8 text
tools by and for researchers
● Modular, multi-physics
○ Gravity
○ Magnetics
○ Direct current resistivity
○ Induced Polarization
○ Electromagnetics
■ Frequency Domain
■ Time Domain
○ Fluid Flow
■ Richards Equation
https://simpeg.xyz
3D Airborne Time Domain EM
Slide 9
Slide 9 text
simulations: create a mesh
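A rough illustration of what "create a mesh" involves, assuming a simple 2D tensor mesh defined by cell widths along each axis, with fine core cells and padding cells that grow by a fixed factor. This uses only NumPy; the helper `tensor_mesh_1d` and all sizes are made up for the example and are not the SimPEG mesh API.

```python
import numpy as np

def tensor_mesh_1d(widths, origin=0.0):
    """Return node and cell-center locations for cells of the given widths."""
    widths = np.asarray(widths, dtype=float)
    nodes = origin + np.concatenate(([0.0], np.cumsum(widths)))
    centers = 0.5 * (nodes[:-1] + nodes[1:])
    return nodes, centers

# x-axis: four 10 m core cells with three padding cells on each side
# that expand by a factor of 1.5 (15, 22.5, 33.75 m).
core = np.full(4, 10.0)
padding = 10.0 * 1.5 ** np.arange(1, 4)
hx = np.concatenate((padding[::-1], core, padding))
nodes_x, centers_x = tensor_mesh_1d(hx, origin=-hx.sum() / 2)

# y-axis: three uniform 10 m cells.
hy = np.full(3, 10.0)
nodes_y, centers_y = tensor_mesh_1d(hy)

# Tensor product: every (x, y) pair of 1D cell centers is a 2D cell center.
X, Y = np.meshgrid(centers_x, centers_y, indexing="ij")
cell_centers = np.column_stack((X.ravel(), Y.ravel()))
cell_volumes = np.outer(hx, hy).ravel()   # same ordering as cell_centers
```

Padding cells push the boundary far from the region of interest without paying for fine cells everywhere.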
Slide 10
Slide 10 text
simulations: discretize & solve
DC resistivity
discrete equations
A
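As a sketch of "discretize & solve", assume a 1D version of the DC resistivity problem: potentials phi live at cell centers, conductivity sigma is constant in each cell, and phi = 0 at both ends of the domain. Discretizing then gives a linear system A phi = q with a current source and sink in q. The function name, the 1D simplification, and the numbers are assumptions for illustration; this is not the SimPEG implementation.

```python
import numpy as np

def dc_system_1d(h, sigma):
    """Assemble A for -d/dx(sigma dphi/dx) = q on cell centers, phi = 0 at both ends."""
    h = np.asarray(h, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    half_r = h / (2.0 * sigma)                   # resistance of half a cell
    c = np.empty(h.size + 1)                     # conductance across each face
    c[1:-1] = 1.0 / (half_r[:-1] + half_r[1:])   # interior faces: two half-cells in series
    c[0] = 1.0 / half_r[0]                       # boundary faces (phi = 0 outside)
    c[-1] = 1.0 / half_r[-1]
    A = np.diag(c[:-1] + c[1:])                  # each cell: left face + right face
    A -= np.diag(c[1:-1], k=1) + np.diag(c[1:-1], k=-1)
    return A

n = 50
h = np.ones(n)                 # uniform cell widths
sigma = np.full(n, 0.01)       # uniform conductivity
q = np.zeros(n)
q[15], q[34] = 1.0, -1.0       # current source and sink, placed symmetrically

A = dc_system_1d(h, sigma)
phi = np.linalg.solve(A, q)    # potentials at cell centers
```

Because the conductivity is uniform and the source and sink are symmetric about the center, the computed potential is antisymmetric, which is a quick sanity check on the assembly.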
Slide 11
Slide 11 text
inversions
Data Misfit
Regularization
Inverse Problem
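One common way to combine these two terms is a Tikhonov-style objective, phi(m) = phi_d(m) + beta * phi_m(m), where phi_d is the data misfit, phi_m is the regularization, and beta trades one against the other. The sketch below assumes a linear forward operator `G`, a diagonal data weighting from a single standard deviation, and a simple "smallness" regularization toward a reference model. All names and numbers are hypothetical.

```python
import numpy as np

def tikhonov_inversion(G, d_obs, beta, std=1.0, m_ref=None):
    """Minimize ||Wd (G m - d_obs)||^2 + beta * ||Wm (m - m_ref)||^2 for a linear problem."""
    n_data, n_cells = G.shape
    m_ref = np.zeros(n_cells) if m_ref is None else m_ref
    Wd = np.eye(n_data) / std      # data weighting: 1 / standard deviation
    Wm = np.eye(n_cells)           # smallness regularization
    H = G.T @ Wd.T @ Wd @ G + beta * Wm.T @ Wm
    rhs = G.T @ Wd.T @ Wd @ d_obs + beta * Wm.T @ Wm @ m_ref
    m = np.linalg.solve(H, rhs)    # normal equations of the objective
    phi_d = np.sum((Wd @ (G @ m - d_obs)) ** 2)
    phi_m = np.sum((Wm @ (m - m_ref)) ** 2)
    return m, phi_d, phi_m

# Underdetermined toy problem: 10 noisy data, 30 model cells.
rng = np.random.default_rng(0)
G = rng.normal(size=(10, 30))
m_true = np.zeros(30)
m_true[10:15] = 1.0
d_obs = G @ m_true + 0.05 * rng.normal(size=10)

m_small, phi_d_small, phi_m_small = tikhonov_inversion(G, d_obs, beta=1e-3, std=0.05)
m_large, phi_d_large, phi_m_large = tikhonov_inversion(G, d_obs, beta=1e5, std=0.05)
```

A small beta fits the data closely at the cost of a larger model norm; a large beta does the opposite. Choosing beta is part of the inverse problem.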
Slide 12
Slide 12 text
a better approach: CORE science*
Collaborative
Open
Reproducible
Extensible
* With a nod to the FAIR principles of open data
Slide 13
Slide 13 text
Collaborative?
Slide 14
Slide 14 text
Collaborative scientific community
● Contributors, users: academic & industry
● Applications: mining, groundwater, tectonic studies, …
Slide 15
Slide 15 text
Open?
Slide 16
Slide 16 text
Dimensions of openness
● Open source code
● Open (FAIR) data
● Open access publications & artifacts
● Open standards: interoperability (even with proprietary tools)
● Open community: all welcome!
● …
Slide 17
Slide 17 text
Reproducible?
the foundation of collaboration
Slide 18
Slide 18 text
the science more than the paper
An article about computational science in a scientific
publication is not the scholarship itself, it is merely
advertising of the scholarship. The actual scholarship is the
complete software development environment and the
complete set of instructions which generated the figures.
-- Buckheit and Donoho (paraphrasing Claerbout)
WaveLab and Reproducible Research, 1995
Slide 19
Slide 19 text
the science more than the paper
An article about computational science in a scientific
publication is not the scholarship itself, it is merely
advertising of the scholarship. The actual scholarship is the
complete software development environment and the
complete set of instructions which generated the figures.
(and a place to run the code?)
-- Buckheit and Donoho (paraphrasing Claerbout)
WaveLab and Reproducible Research, 1995
Slide 20
Slide 20 text
mybinder.org
shareable, interactive, reproducible
environments from your public git repository
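For a public GitHub repository, a Binder launch link follows the pattern `https://mybinder.org/v2/gh/<org>/<repo>/<ref>`, optionally with a `urlpath` query parameter to open a specific interface or notebook. The helper below is a small sketch of building such a link; the function name and the example organization and repository are placeholders, not a real project.

```python
from urllib.parse import quote

def binder_url(org, repo, ref="HEAD", urlpath=None):
    """Build a mybinder.org launch link for a public GitHub repository."""
    url = f"https://mybinder.org/v2/gh/{org}/{repo}/{ref}"
    if urlpath is not None:
        # The path is percent-encoded so it survives as a single query value.
        url += "?urlpath=" + quote(urlpath, safe="")
    return url

# Placeholder repository names, for illustration only.
basic_link = binder_url("example-org", "example-repo")
lab_link = binder_url("example-org", "example-repo", ref="main",
                      urlpath="lab/tree/index.ipynb")
```

The repository also needs a description of its environment (for example a `requirements.txt` or `environment.yml`) so that Binder can build it.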
Slide 21
Slide 21 text
http://bit.ly/black-holes-woop
Black holes! LIGO, Sept 14, 2015
Slide 22
Slide 22 text
We have access to all the same tools
Slide 23
Slide 23 text
Extensible?
Slide 24
Slide 24 text
inversion
gravity data: invert for density
magnetic data: invert for mag. susceptibility
Berkeley Data Science Education: Fall 2018
Data 8: ~1,300 students
Data 100: ~800 students
Slide 36
Slide 36 text
https://blog.jupyter.org/teaching-and-learning-with-jupyter-c1d965f7b93a
Teaching and Learning with Jupyter
Slide 37
Slide 37 text
Earth Data Science: Open Education
An Introduction to Earth and Environmental Data Science
https://earth-env-data-science.github.io/intro
R. Abernathey (Columbia/Lamont, Oceanography, Pangeo co-Lead)
PyEarth: A Python Introduction to Earth Science (JupyterBook)
N. Swanson-Hysell (Berkeley EPS)
Slide 38
Slide 38 text
enabling new science
● Integrating multiple geophysical data types for richer geological models
● Physics + machine learning in electromagnetics
Slide 39
Slide 39 text
large-scale magnetic vector inversions
Satellite data: 2-arc-minute EMAG v2
Divide and conquer
Global problem
Tiled forward
D. Fournier
J. Capriotti
OcTree Meshes!
B. Sullivan
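The "divide and conquer" idea can be sketched for a linear problem: split the global set of data into tiles, let each tile compute its own predicted data from the model, and stitch the results back together. The example below assumes a dense random sensitivity matrix and splits it by rows. The per-tile local OcTree meshes used in the actual work are not shown, and all names are hypothetical.

```python
import numpy as np

def tiled_forward(G_tiles, m):
    """Predict data tile by tile; each product could run on a separate worker."""
    return np.concatenate([G_t @ m for G_t in G_tiles])

rng = np.random.default_rng(1)
G = rng.normal(size=(12, 6))               # global problem: 12 data, 6 model cells
m = rng.normal(size=6)

G_tiles = np.array_split(G, 3, axis=0)     # three tiles of 4 data each
d_tiled = tiled_forward(G_tiles, m)
d_global = G @ m                           # same result, computed in one piece
```

Each tile only needs its own block of the sensitivity matrix in memory, which is what makes the global problem tractable.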
Slide 40
Slide 40 text
joint inversion of QUEST data
Susceptibility model, Density model
Jae Deok Kim, Jiajia Sun
modeling electromagnetics on cylindrical meshes
Validating the physics
● Kaufman (1990): Charges, currents, electric fields
Slide 44
Slide 44 text
time-domain EM response
Slide 45
Slide 45 text
physics + machine learning
ML to estimate a source term for the correction
True solution, Error
At DC: can replace well with solid cylinder
Slide 46
Slide 46 text
scaling computation & communities
Slide 47
Slide 47 text
Interactive exploration at scale?
Slide 48
Slide 48 text
Harnessing the power of cloud computing to study the whole Earth interactively
Interactivity
Distributed computing
Data models / numerics
Slide 49
Slide 49 text
Jupyter meets the Earth: an NSF grant ($2M over 3 years)!
Fernando Pérez, Joe Hamman, Laurel Larsen, Kevin Paul, Lindsey Heagy, Chris Holdgraf, Yuvi Panda
Research use-cases
● Climate data analysis
● Hydrology
● Geophysics
Tech developments
● Data discovery
● Interactivity
● Cloud/HPC infrastructure
For more: http://bit.ly/jupytearth
Slide 50
Slide 50 text
Next-generation LIDAR satellite
https://icesat-2.gsfc.nasa.gov
A. Arendt, J. Scheik, L. Heagy, M. Siegfried, F. Pérez
Slide 51
Slide 51 text
Open, collaborative geoscience
● Data volumes & computational needs:
● Bridging across technical & disciplinary lines:
○ Interoperability of tools
○ Resources for communicating ideas
○ Interplay between research & education
● Challenges are bigger than an individual: need an open ecosystem of tools and collaborative communities.
● Problems of major societal impact: close the scientist / public / policymaker gap.