Verifying the Forecast: How Climate Models are Developed and Tested
Keynote Talk given at the ESEC/FSE-2017, the 11th joint meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering.
Paderborn, Germany, Sept 7, 2017
Outline:
1. What are climate models? In which we meet a 19th Century Swedish chemist and a famous computer scientist, and find out if butterflies cause hurricanes
2. What is their purpose? In which we perform several dangerous experiments on the life support systems of planet Earth, and live to tell the tale
3. What Software Engineering practices are used? In which we politely suggest that the question “does it work with FORTRAN?” helps keep the snake oil salesmen away…
4. Are they fit for purpose? In which we measure a very low bug density, lose faith in software metrics, and encounter two remarkably effective V&V practices.
1896: Svante Arrhenius, a Swedish chemist working in Stockholm, builds an energy balance model to test his hypothesis that the ice ages were caused by a drop in CO2; the same model predicts a global temperature rise of 5.7°C if we double CO2.
Arrhenius, S. (1896). On the Influence of Carbonic Acid in the Air upon the Temperature of the Ground. Philosophical Magazine and Journal of Science, 41(251).
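For readers unfamiliar with the term, the simplest (zero-dimensional) form of an energy balance model just balances absorbed sunlight against emitted infrared radiation; the textbook equation below is a generic illustration of the idea, not Arrhenius's own, far more detailed, calculation:

$$ (1 - \alpha)\,\frac{S_0}{4} \;=\; \epsilon\,\sigma\,T^4 $$

where $S_0$ is the incoming solar flux, $\alpha$ the planetary albedo, $\sigma$ the Stefan–Boltzmann constant, and $\epsilon$ an effective emissivity that greenhouse gases such as CO2 reduce; lowering $\epsilon$ forces the surface temperature $T$ upward to restore the balance, which is the kind of reasoning behind a doubled-CO2 warming estimate.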
John von Neumann develops a killer app for the first programmable electronic computer, ENIAC: weather forecasting. He imagines uses in weather control, geo-engineering, etc.
Lynch, P. (2008). The ENIAC Forecasts: A Recreation. Bulletin of the American Meteorological Society, 1–11.
2. What is their purpose? In which we perform several dangerous experiments on the life support systems of planet Earth, and live to tell the tale
• To explore the consequences of a current theory
• To test a hypothesis about the observational system
• To test a hypothesis about the calculational system
• To provide homogenized datasets (e.g. re-analysis)
• To conduct thought experiments about different climates
• To act as a comparator when debugging another model
• To provide inputs to assessments that inform policymaking
[Diagram: 1) study the Calculational System… 2) to gain insights into the Theoretical System… 3) …to make sense of the Observational System]
et al. (2014). Arctic cryosphere response in the Geoengineering Model Intercomparison Project G3 and G4 scenarios. Journal of Geophysical Research: Atmospheres, 119(3), 1308–1321.
[Figure panels: global average near-surface temperature (°C); Arctic sea ice extent (millions of km²)]
3. What Software Engineering practices are used? In which we politely suggest that the question “does it work with FORTRAN?” helps keep the snake oil salesmen away…
Alexander, K., & Easterbrook, S. (2015). The software architecture of climate models: a graphical comparison of CMIP5 and EMICAR5 configurations. Geoscientific Model Development, 8, 1221–1232.
❍ Superstructure: component data structures and methods for coupling model components
❍ Infrastructure: field data structures and methods for building model components, and utilities for coupling
[Diagram: the ESMF “sandwich”: ESMF Superstructure (AppDriver; Component Classes: GridComp, CplComp, State) on top, User Code in the middle, ESMF Infrastructure (Data Classes: Bundle, Field, Grid, Array; Utility Classes: Time, Clock, LogErr, DELayout, VM, Config) underneath]
❍ Model: wraps a model of one physical domain, e.g. atmosphere, ocean, wave, ice.
❍ Mediator: scientific coupling code (flux calculations, accumulation, averaging, etc.) between (potentially multiple) Models.
❍ Connector: connects pairs of components in one direction, e.g. Model to Model, Model to/from Mediator; executes simple transforms (regridding, units).
❍ Driver: provides a harness for Models, Mediators, and Connectors (supporting hierarchies); coordinates initialize and run sequences.
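The division of labour above can be illustrated with a minimal sketch of the coupling pattern. It is written in Python for brevity (real coupled models use Fortran framework APIs), and every class and method name below is an illustrative assumption rather than the framework's actual interface; the Mediator role is omitted to keep it short.

```python
# Toy sketch of the Model / Connector / Driver coupling pattern described above.
# All names are illustrative; this is not the real ESMF/coupler API.

class Model:
    """Wraps one physical domain (e.g. a toy atmosphere or ocean)."""
    def __init__(self, name):
        self.name = name
        self.exports = {}   # fields this component produces
        self.imports = {}   # fields supplied by other components

    def initialize(self):
        self.exports[self.name + "_state"] = 0.0

    def run(self):
        # Advance the domain one coupling interval, using imported fields as forcing.
        forcing = sum(self.imports.values())
        self.exports[self.name + "_state"] += 1.0 + 0.1 * forcing


class Connector:
    """One-way link between two components; applies a simple transform (e.g. regrid, units)."""
    def __init__(self, src, dst, transform=lambda x: x):
        self.src, self.dst, self.transform = src, dst, transform

    def run(self):
        for field, value in self.src.exports.items():
            self.dst.imports[field] = self.transform(value)


class Driver:
    """Harness that coordinates initialize and run sequences for all components."""
    def __init__(self, components, connectors):
        self.components, self.connectors = components, connectors

    def run(self, n_steps):
        for comp in self.components:
            comp.initialize()
        for _ in range(n_steps):
            for conn in self.connectors:   # exchange fields first
                conn.run()
            for comp in self.components:   # then advance each model
                comp.run()


# Couple a toy atmosphere and ocean in both directions and run a few steps.
atm, ocn = Model("atm"), Model("ocn")
driver = Driver([atm, ocn], [Connector(atm, ocn), Connector(ocn, atm)])
driver.run(n_steps=4)
print(atm.exports, ocn.exports)
```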
4. Are they fit for purpose? In which we measure a very low bug density, lose faith in software metrics, and encounter two remarkably effective V&V practices.
❍ Developers are users and experts
• Slow, cautious development process
❍ Rigorous Development Process
• Code changes as scientific experiments, with peer review
❍ Narrow Usage Profile
• And hence potential for brittleness
❍ Intrinsic Defect Sensitivity / Tolerance
• Bugs are either obvious or irrelevant
❍ Successful Disregard (and hence higher technical debt)
• Scientists tolerate poor code & workarounds, if they don’t affect the science
Pipitone, J., & Easterbrook, S. (2012). Assessing climate model software quality: a defect density analysis of three models. Geoscientific Model Development, 5(4), 1009–1022.
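The defect-density figure behind this slide is simply reported defects normalised by code size. The sketch below shows that arithmetic only, with invented project names and counts; it is not the scripts used in the Pipitone & Easterbrook study.

```python
# Illustrative defect-density calculation (defects per thousand source lines, KSLOC).
# Project names and counts are invented for the example.

projects = {
    # name: (defects recorded in the bug tracker, source lines of code)
    "model_A": (24, 380_000),
    "model_B": (11, 120_000),
    "model_C": (57, 560_000),
}

for name, (defects, sloc) in projects.items():
    density = defects / (sloc / 1000)   # defects per KSLOC
    print(f"{name}: {density:.3f} defects/KSLOC")
```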
❍ Obvious errors (caught in testing):
• Model won’t compile / won’t run
• Model crashes during a run
• Model runs, but variables drift out of tolerance
• Runs don’t bit-compare (when they should)
❍ Subtle errors (model runs appear “valid”):
• Model does not simulate the intended physical processes (e.g. incorrect model configuration)
• The right results for the “wrong reasons” (e.g. over-tuning)
❍ “Acceptable Imperfections”:
• All models are wrong!
• Processes omitted due to computational constraints
• Known errors tolerated because the effect is “close enough!”
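The first group of “obvious” errors is what automated regression tests catch: compare a candidate run against a trusted reference, bit-for-bit when the change should be answer-neutral, or within a tolerance when small numerical differences are expected. The sketch below uses NumPy and hypothetical file names to show the idea; it is not any particular model's test harness.

```python
# Sketch of a regression check against a trusted reference run.
# File names and the tolerance are hypothetical; real harnesses compare many
# fields, restart files, and processor decompositions.
import numpy as np

def check_run(candidate, reference, bit_reproducible=True, rtol=1e-9):
    """Compare one output field from a candidate run against a reference run."""
    if bit_reproducible:
        # e.g. a pure refactoring that should not change answers at all
        return np.array_equal(candidate, reference)
    # e.g. a compiler or platform change: allow small relative differences
    return np.allclose(candidate, reference, rtol=rtol, atol=0.0)

reference = np.load("reference_surface_temp.npy")   # hypothetical reference output
candidate = np.load("candidate_surface_temp.npy")   # hypothetical candidate output

if not check_run(candidate, reference, bit_reproducible=True):
    raise SystemExit("Run does not bit-compare with the reference; investigate before merging")
```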
❍ Software Benchmarking:
• A benchmark “defines” the research paradigm (in the Kuhnian sense)
• Benchmarks (can) cause rapid scientific progress
• Benefits are both sociological and technological
❍ A software benchmark comprises:
• A Motivating Comparison
• A Task Sample
• Performance Measures (not necessarily quantitative)
❍ Critical Success Factors:
• Collaborative development of the benchmark
• Open, transparent & critical evaluation of tools against the benchmark
• Retirement of old benchmarks to prevent over-fitting
Sim, S. E., Easterbrook, S. M., & Holt, R. C. (2003). Using benchmarking to advance research: a challenge to software engineering. In 25th IEEE International Conference on Software Engineering (ICSE’03).
Reichler, T., & Kim, J. (2008). How Well Do Coupled Models Simulate Today’s Climate? Bulletin of the American Meteorological Society, 89(3), 303–311.
For more MIPs see: http://www.clivar.org/clivar-panels/former-panels/aamp/resources/mips
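Evaluations of this kind boil down to comparing a simulated field against an observational analysis with a handful of summary statistics (mean bias, RMSE, pattern correlation). The sketch below computes those statistics on synthetic data; a real evaluation would also area-weight grid cells and use many variables and seasons.

```python
# Summary statistics typically used when scoring a model field against observations.
# The "observed" and "modelled" fields here are synthetic, for illustration only.
import numpy as np

rng = np.random.default_rng(0)
obs = rng.normal(loc=288.0, scale=15.0, size=(90, 180))        # fake observed temperature field (K)
model = obs + rng.normal(loc=0.5, scale=2.0, size=obs.shape)   # fake model field: small bias + noise

bias = float(np.mean(model - obs))                              # mean bias
rmse = float(np.sqrt(np.mean((model - obs) ** 2)))              # root-mean-square error
corr = float(np.corrcoef(model.ravel(), obs.ravel())[0, 1])     # pattern correlation

print(f"bias = {bias:.2f} K, RMSE = {rmse:.2f} K, pattern correlation = {corr:.3f}")
```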
[Diagram: from models to modeling systems. A chain runs from Selection & Configuration to Running Model to Interpretation of Results to Papers & Reports; the scope of typical model evaluations covers only the running model, while fitness-for-purpose validation of a modeling system spans the whole chain, asking “Is this model configuration appropriate to the question?” and “Are the model outputs used appropriately?”]
❍ Developers = users
• Bottom-up decision-making; experts control technical direction
• Shared ownership and commitment to quality
❍ Openness (code & data freely available*)
❍ Core set of effective SE tools
• Version control; bug tracking; automated testing; continuous integration
❍ Experiment-Driven Development
• Hypothesis testing, peer review, etc.
❍ Model Intercomparisons & ensembles
• …a form of software benchmarking (a toy illustration follows below)
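A model intercomparison has the same ingredients as a software benchmark: a shared, prescribed experiment (the task sample) and agreed diagnostics (the performance measures), with each group contributing its own model. The toy sketch below makes that concrete; the “models”, their sensitivities, and the diagnostic are all invented.

```python
# Toy illustration of a model intercomparison as a benchmark: every "model"
# runs the same prescribed experiment, and the results are compared as an ensemble.
import statistics

def run_experiment(climate_sensitivity, co2_doublings=1.0):
    """Toy 'model': warming = sensitivity x number of CO2 doublings."""
    return climate_sensitivity * co2_doublings

# Each group contributes its own model; the experimental protocol is shared.
models = {"model_A": 2.1, "model_B": 3.0, "model_C": 4.4}   # invented sensitivities (K per doubling)
results = {name: run_experiment(s) for name, s in models.items()}

mean = statistics.mean(results.values())
spread = max(results.values()) - min(results.values())
print(results)
print(f"ensemble mean = {mean:.1f} K, spread = {spread:.1f} K")
```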