
Heiland Lecture at Colorado School of Mines

Lindsey Heagy
November 20, 2019

Jupyter meets the Earth: from geophysical inversions to open, collaborative geoscience

Today’s most critical questions in the geosciences, from climate studies to the management of natural resources, require that we integrate knowledge and methods across domains. With the widespread availability of large-scale computational resources, and the unprecedented quality and quantity of scientific data being collected today, we have opportunities to ask in-depth questions and perform analyses that would have been impossible only a few years ago. These cross-cutting questions also require that we bridge the technical and social barriers that exist between disciplines. Furthermore, these topics involve a complex spectrum of stakeholders, from individual researchers to policy makers and the general public. The adoption of open-source tools, such as those in the Python ecosystem, is one mechanism for fostering communities and making progress on these challenges.

Within geophysics, I have been part of one of the teams leading a transition towards open-science practices. In 2013, we started the SimPEG project, an open-source software package that integrates multiple methods (e.g., gravity, magnetics, electromagnetics, fluid flow) into a common framework. SimPEG offers both an architecture that fosters technical interoperability of algorithms, and a community approach that encourages multidisciplinary collaboration. Growing these collaborations to include geologists, hydrologists, geochemists, engineers and others motivates the need for educational resources that provide context for how geophysical techniques fit into the broader goal of answering a question about the subsurface. SimPEG is the research foundation atop which we’ve built GeoSci.xyz, a collection of open educational resources for the geosciences. SimPEG and GeoSci.xyz exist within a broad ecosystem of open tools that are now transforming the practices of research, education and scientific communication. We use (and contribute to) Project Jupyter, and we now participate in the Pangeo initiative. Pangeo helps geoscientists explore petabyte-scale datasets and simulations interactively on modern computational environments (from HPC centers to the cloud).

In this talk I will outline my own trajectory in our efforts to develop an open, collaborative, and reproducible model that we think is needed for geoscience to tackle the scientific and social challenges that lie ahead.

Transcript

  1. Jupyter meets the Earth: from geophysical inversions to open, collaborative geoscience Lindsey Heagy UC Berkeley @lindsey_jh
  2. hello (a bit about me) geophysical inversions open-source software open research & education geoscience + data science +
  3. Observations / Data After Hamman, 2018 Theory & Ideas EMAG2: Earth Magnetic Anomaly Grid (2-arc-minute resolution). Image credit: Dom Fournier (toolkit.geosci.xyz) what drives progress in geoscience? Simulations, Computation
  4. tools by and for researchers • Modular, multi-physics ◦ Gravity ◦ Magnetics ◦ Direct current resistivity ◦ Induced Polarization ◦ Electromagnetics ▪ Frequency Domain ▪ Time Domain ◦ Fluid Flow ▪ Richards Equation https://simpeg.xyz 3D Airborne Time Domain EM
  5. Collaborative scientific community • Contributors, users: Academic & industry • Applications: mining, groundwater, tectonic studies, …
  6. Dimensions of openness • Open source code • Open (FAIR) data • Open access publications & artifacts • Open standards: interoperability (even with proprietary tools) • Open community: all welcome! • …
  7. the science more than the paper An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship. The actual scholarship is the complete software development environment and the complete set of instructions which generated the figures. -- Buckheit and Donoho (paraphrasing Claerbout) WaveLab and Reproducible Research, 1995
  8. An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship. The actual scholarship is the complete software development environment and the complete set of instructions which generated the figures. (and a place to run the code?) the science more than the paper -- Buckheit and Donoho (paraphrasing Claerbout) WaveLab and Reproducible Research, 1995
  9. R. Abernathey Columbia/Lamont Oceanography Pangeo co-Lead An Introduction to Earth and Environmental Data Science https://earth-env-data-science.github.io/intro PyEarth: A Python Introduction to Earth Science JupyterBook N. Swanson-Hysell Berkeley EPS Earth Data Science: Open Education
  10. enabling new science • Integrating multiple geophysical data types for richer geological models • Physics + machine learning in electromagnetics
  11. large-scale magnetic vector inversions Satellite data: 2-arc minute EMAG v2 Divide and conquer Global problem Tiled forward D. Fournier J. Capriotti OcTree Meshes! B. Sullivan
  12. modeling electromagnetics on cylindrical meshes • Finite volume discretization ◦ cylindrically symmetric ◦ 3D cylindrical meshes • DC, FDEM, TDEM Heagy & Oldenburg, 2018
  13. modeling electromagnetics on cylindrical meshes Validating the physics • Kaufman (1990): Charges, currents, electric fields
  14. physics + machine learning ML to estimate a source term for the correction True solution Error At DC: can replace well with solid cylinder
  15. Harnessing the power of cloud computing to study the whole Earth interactively Interactivity Distributed computing Data models / numerics
  16. Jupyter meets the Earth: an NSF grant (2M / 3Y)! Fernando Pérez Joe Hamman Laurel Larsen Kevin Paul Lindsey Heagy Chris Holdgraf Yuvi Panda Research use-cases Tech developments • Climate data analysis • Hydrology • Geophysics • Data discovery • Interactivity • Cloud/HPC infrastructure For more: http://bit.ly/jupytearth
  17. Open, collaborative geoscience • Data volumes & computational needs: • Bridging across technical & disciplinary lines: ◦ Interoperability of tools ◦ Resources for communicating ideas ◦ Interplay between research & education • Challenges are bigger than an individual: need an open ecosystem of tools and collaborative communities. • Problems of major societal impact: close the scientist / public / policymaker gap.