Biodiversity data cubes: spatial aggregation and uncertainty

Biodiversity Building Blocks for policy
Damiano Oldoni, Ward Langeraert, Toon Van Daele, Tim Adriaens, Peter Desmet, Quentin Groom
Research Institute Nature and Forest (INBO), Belgium
Biodiversity data cubes: spatial aggregation and uncertainty
Open Earth Monitor – Global workshop, 2023/10/06, Bolzano

Hi!

The B-Cubed project

ABOUT
The global biodiversity crisis requires rapid, reliable and repeatable biodiversity monitoring data which decision makers can use to evaluate policy.
Such information – from local to global level and within relevant timescales – calls for an improved integration of data on biodiversity from different sources.
B-Cubed is standardising access to biodiversity data, empowering policymakers to address the impacts of biodiversity change.
Challenges, Opportunities, Aim

APPROACH

WORKFLOWS
To improve access to rapid biodiversity data at a low cost, B-Cubed is packaging known methods together into standardised workflows. They can be run by anyone for any region and can be updated according to advances in data, methods and models.
• Repeatable workflows to create data cubes
• Automated workflows to calculate indicators from biodiversity data cubes
• Deep learning to discover long-term spatiotemporal dependencies in species distribution models
Exemplar workflows, Deep learning, Automated workflows

CONSORTIUM

WHY?

Why occurrence data cubes?
• Address the ongoing biodiversity crisis
• Essential Biodiversity Variables (EBVs): a global system of harmonized observations, Pereira et al. (2013)
• Aggregated “data cubes” to build EBVs of species distribution and abundance at a global scale, Kissling et al. (2018)
• Repeatable? Scalable? Automated?

WHAT?

Biodiversity occurrence
• Evidence of the occurrence of a species (or other taxon) at a particular place on a specified date
• Occurrences are events in a 3-dimensional space
  • Taxonomic (what)
  • Temporal (when)
  • Spatial (where)

From occurrences to occurrence cubes
• Aggregate occurrences to partition the 3-dimensional space:
  • Taxonomic (e.g. at species level)
  • Temporal (e.g. at year level)
  • Spatial (e.g. at 1 x 1 km level, EEA reference grid)
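As a minimal sketch of the spatial partitioning, the helper below labels the 1 km cell that contains a projected coordinate. It assumes coordinates are already in metres in ETRS89-LAEA (EPSG:3035) and that a cell code is built from the cell's lower-left corner in whole kilometres, as codes such as 1kmE3809N3113 in this deck suggest. The function name `eea_cell_code_1km` is hypothetical.

```r
# Hypothetical helper: label the 1 km EEA reference grid cell containing
# a point given in ETRS89-LAEA metres (EPSG:3035).
# Assumption: the code is the cell's lower-left corner in whole kilometres.
eea_cell_code_1km <- function(x, y) {
  paste0("1kmE", floor(x / 1000), "N", floor(y / 1000))
}

# A point 250 m east and 900 m north of the cell's lower-left corner.
eea_cell_code_1km(3809250, 3113900)
# "1kmE3809N3113"
```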

From occurrences to occurrence cubes: tabular representation

year  eea_cell_code  speciesKey  n
2000  1kmE3809N3113  2889173     1
2000  1kmE3809N3135  2889173     1
...   ...            ...         ...
2014  1kmE3886N3121  2889173     51
2014  1kmE3886N3122  2889173     109
...   ...            ...         ...
2018  1kmE4047N3067  2889173     1

HOW?

From occurrences to occurrence cubes: overview
1. Specify constraints (what, when, where) and granularity
2. Assess data quality, harvest occurrences (e.g. from GBIF)
3. Solve uncertainty
4. Aggregate

From occurrences to occurrence cubes: step 1
• Specify constraints: taxonomic (what), time (when), spatial (where)
• Specify granularity: time, taxonomic, spatial

From occurrences to occurrence cubes: step 2
• Harvest occurrences
• Assess data quality
  • A priori
  • A posteriori, e.g. by listing the issues for which occurrences are discarded:

    issue_to_discard <- c(
      "ZERO_COORDINATE",
      "COORDINATE_OUT_OF_RANGE",
      "COORDINATE_INVALID",
      "COUNTRY_COORDINATE_MISMATCH"
    )
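The a posteriori check can be sketched as a filter on harvested occurrences. The toy data frame and its column names are assumptions made for illustration. In particular, the sketch assumes each occurrence carries its issue flags as one semicolon-separated string in a column called `issue`.

```r
# Issues that disqualify an occurrence (taken from the slide).
issue_to_discard <- c(
  "ZERO_COORDINATE",
  "COORDINATE_OUT_OF_RANGE",
  "COORDINATE_INVALID",
  "COUNTRY_COORDINATE_MISMATCH"
)

# Toy occurrences; the semicolon-separated `issue` column is an assumption.
occ <- data.frame(
  occurrence_id = 1:4,
  issue = c("", "ZERO_COORDINATE", "GEODETIC_DATUM_ASSUMED_WGS84",
            "COORDINATE_ROUNDED;COUNTRY_COORDINATE_MISMATCH"),
  stringsAsFactors = FALSE
)

# Flag rows carrying at least one disqualifying issue.
has_bad_issue <- vapply(
  strsplit(occ$issue, ";", fixed = TRUE),
  function(flags) any(flags %in% issue_to_discard),
  logical(1)
)

# Keep only occurrences without any disqualifying issue.
occ_clean <- occ[!has_bad_issue, ]
occ_clean$occurrence_id
```

Occurrences with other, less severe flags are kept; only the listed issues lead to a discard.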

From occurrences to occurrence cubes: step 3
• Solve taxonomic uncertainty: via taxonomic backbone services, e.g. GBIF backbone

scientificName                                   taxonRank  species              taxonomicStatus
Reynoutria japonica Houtt.                       SPECIES    Reynoutria japonica  ACCEPTED
Fallopia japonica (Houtt.) Ronse Decraene        SPECIES    Reynoutria japonica  SYNONYM
Fallopia compacta (Hook.fil.) G.H.Loos & P.Keil  SPECIES    Reynoutria japonica  SYNONYM
Fallopia japonica var. japonica                  VARIETY    Reynoutria japonica  DOUBTFUL
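The effect of taxonomic resolution can be illustrated with a plain lookup. This sketch does not call any backbone service. It only reuses the four rows shown on the slide as a local table, and the occurrence names are made up for the example.

```r
# Lookup table copied from the slide: each recorded name points to the
# species it resolves to.
backbone <- data.frame(
  scientificName = c(
    "Reynoutria japonica Houtt.",
    "Fallopia japonica (Houtt.) Ronse Decraene",
    "Fallopia compacta (Hook.fil.) G.H.Loos & P.Keil",
    "Fallopia japonica var. japonica"
  ),
  taxonRank = c("SPECIES", "SPECIES", "SPECIES", "VARIETY"),
  species = "Reynoutria japonica",
  taxonomicStatus = c("ACCEPTED", "SYNONYM", "SYNONYM", "DOUBTFUL"),
  stringsAsFactors = FALSE
)

# Toy occurrence names (illustrative only).
occ_names <- c(
  "Fallopia japonica (Houtt.) Ronse Decraene",
  "Reynoutria japonica Houtt.",
  "Fallopia japonica var. japonica"
)

# Resolve each recorded name to its species via the lookup table.
resolved <- backbone$species[match(occ_names, backbone$scientificName)]
table(resolved)
```

Synonyms and lower ranks end up under the same species, so they are counted together when aggregating.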

From occurrences to occurrence cubes: step 3
• Solve temporal uncertainty: trivial for most typical aggregation levels

From occurrences to occurrence cubes: step 3
• Solve spatial uncertainty:
  • Directly assigning coordinates to the grid can lead to huge spatial bias
  • Random assignment to the grid within the uncertainty circle
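Random assignment within the uncertainty circle could look like the sketch below. It assumes a uniform distribution over the circle's area, projected coordinates in metres, and a 1 km cell label built from the lower-left corner in kilometres. The helper name `random_point_in_circle` is hypothetical.

```r
# Hypothetical helper: draw one point uniformly at random inside a circle
# centred on (x, y) with radius `uncertainty` (all in metres).
random_point_in_circle <- function(x, y, uncertainty) {
  n <- length(x)
  # sqrt() on the radius draw keeps the density uniform over the area.
  r <- uncertainty * sqrt(runif(n))
  theta <- runif(n, min = 0, max = 2 * pi)
  data.frame(x = x + r * cos(theta), y = y + r * sin(theta))
}

set.seed(42)
# Five draws for one occurrence sitting on a cell corner, 500 m uncertainty.
pts <- random_point_in_circle(rep(3886000, 5), rep(3121000, 5), 500)

# Assumed 1 km cell label: lower-left corner in whole kilometres.
cell <- paste0("1kmE", floor(pts$x / 1000), "N", floor(pts$y / 1000))
cell
```

Because the point lies on a corner, repeated draws can land in any of the four neighbouring cells, which is why the same occurrences can produce different cubes.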

From occurrences to occurrence cubes: step 3 overview
• Taxonomic: synonyms, lower ranks
• Temporal: trivial for most typical aggregation levels
• Spatial: random assignment

From occurrences to occurrence cubes: step 4
• Aggregate: number of occurrences of a specific taxon in a specific cell and in a specific time interval

year  eea_cell_code  speciesKey  n    min_coord_uncertainty
2014  1kmE3886N3121  2889173     51   10
2014  1kmE3886N3122  2889173     109  10
...   ...            ...         ...  ...
2018  1kmE4047N3067  2889173     1    2828
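A minimal sketch of the aggregation step in base R follows. The toy occurrences and the column name `coord_uncertainty` are assumptions for illustration; the output columns follow the table above.

```r
# Toy occurrences already assigned to a year, grid cell and species
# (values are illustrative, not taken from a real download).
occ <- data.frame(
  year = c(2014, 2014, 2014, 2018),
  eea_cell_code = c("1kmE3886N3121", "1kmE3886N3121",
                    "1kmE3886N3122", "1kmE4047N3067"),
  speciesKey = 2889173,
  coord_uncertainty = c(10, 250, 10, 2828),
  stringsAsFactors = FALSE
)

dims <- occ[c("year", "eea_cell_code", "speciesKey")]

# n: number of occurrences per (year, cell, species) combination.
counts <- aggregate(data.frame(n = occ$coord_uncertainty),
                    by = dims, FUN = length)
# Smallest coordinate uncertainty found in each combination.
min_unc <- aggregate(
  data.frame(min_coord_uncertainty = occ$coord_uncertainty),
  by = dims, FUN = min
)

cube <- merge(counts, min_unc, by = names(dims))
cube <- cube[order(cube$year, cube$eea_cell_code), ]
cube
```

The first cell receives two occurrences with uncertainties of 10 m and 250 m, so its row reports n = 2 and a minimum coordinate uncertainty of 10.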

HOW TO USE?

Using the occurrence cube: visualization purposes
• Random assignment step generates different cubes from the same occurrences
• Random assignment means that we cannot blindly create a map from the cube (NO!)
• Add a map of the minimum coordinate uncertainty of the grid cells

Using the occurrence cube: data quality filtering
• How to deal with the intrinsic spatial uncertainty?
• Solution 1: make cubes with precise enough data only (data quality step)
• Solution 2: remove cells with “high” min_coord_uncertainty
• Downside: is enough data left? (Van Eupen, 2021)
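Solution 2 amounts to a row filter on the cube, sketched below. The rows are copied from the step 4 table. The 1000 m cut-off is a hypothetical value chosen for the example, not a recommendation from the talk.

```r
# Cube rows in the layout of the step 4 table.
cube <- data.frame(
  year = c(2014, 2014, 2018),
  eea_cell_code = c("1kmE3886N3121", "1kmE3886N3122", "1kmE4047N3067"),
  speciesKey = 2889173,
  n = c(51, 109, 1),
  min_coord_uncertainty = c(10, 10, 2828),
  stringsAsFactors = FALSE
)

# Hypothetical cut-off in metres; what counts as "high" depends on the
# grid resolution and on the analysis.
max_uncertainty <- 1000

cube_precise <- cube[cube$min_coord_uncertainty <= max_uncertainty, ]

# Share of rows retained, to check whether enough data is left.
nrow(cube_precise) / nrow(cube)
```

Here the 2018 cell, whose best record has an uncertainty of 2828 m, is dropped, which illustrates the downside noted above.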

Using the occurrence cube: stability of statistics
• Random assignment step generates different cubes from the same occurrences
• How stable are summary statistics such as the observed occupancy, i.e. the number of grid cells occupied by a species?
• What is the minimum number of cubes needed to robustly infer the average observed occupancy and its uncertainty?
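For a single cube, the observed occupancy can be computed as the number of distinct occupied cells per species and year. The sketch below uses toy rows in the tabular layout of this deck; the second species key (1234567) is made up.

```r
# Toy cube (values illustrative; species key 1234567 is hypothetical).
cube <- data.frame(
  year = c(2014, 2014, 2014, 2018),
  eea_cell_code = c("1kmE3886N3121", "1kmE3886N3122",
                    "1kmE3886N3121", "1kmE4047N3067"),
  speciesKey = c(2889173, 2889173, 1234567, 2889173),
  n = c(51, 109, 3, 1),
  stringsAsFactors = FALSE
)

# Observed occupancy: distinct occupied cells per year and species.
occupancy <- aggregate(
  data.frame(observed_occupancy = cube$eea_cell_code),
  by = cube[c("year", "speciesKey")],
  FUN = function(cells) length(unique(cells))
)
occupancy
```

Repeating this over many cubes generated from the same occurrences gives the distribution whose stability is in question.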

HOW MANY CUBES?

Monte Carlo simulations with synthetic data (10 points, 1000 cubes)
• Input
• Distributions for some grid cells
• Distribution of observed occupancy
• Probability of occupancy
• Convergence at grid cell level
• Convergence of observed occupancy
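A Monte Carlo experiment of this kind can be sketched in a few lines. The sketch below is not the simulation behind the figures. It assumes 10 synthetic points with made-up positions and uncertainty radii, a 1 km grid, and uniform random assignment within each circle.

```r
set.seed(2023)

# Synthetic input: 10 points with a position (metres) and an uncertainty
# radius (metres). All values are made up for this sketch.
n_points <- 10
pts <- data.frame(
  x = runif(n_points, 0, 5000),
  y = runif(n_points, 0, 5000),
  uncertainty = runif(n_points, 100, 1500)
)

# Build one cube: move every point uniformly within its circle, snap it
# to a 1 km cell and return the observed occupancy (occupied cell count).
simulate_occupancy <- function(pts) {
  r <- pts$uncertainty * sqrt(runif(nrow(pts)))
  theta <- runif(nrow(pts), 0, 2 * pi)
  cell_x <- floor((pts$x + r * cos(theta)) / 1000)
  cell_y <- floor((pts$y + r * sin(theta)) / 1000)
  length(unique(paste(cell_x, cell_y)))
}

n_cubes <- 1000
occupancy <- replicate(n_cubes, simulate_occupancy(pts))

# Running mean: how the average observed occupancy settles as cubes are added.
running_mean <- cumsum(occupancy) / seq_along(occupancy)

mean(occupancy)
sd(occupancy)
```

Plotting `running_mean` against the number of cubes shows how quickly the average observed occupancy stabilises for a given input.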

WORK IN PROGRESS

What’s going on now
• GBIF is building a service to produce and download occurrence cubes according to user preferences
• Further study of the convergence of observed occupancy on real data and other synthetic data
  • Preliminary studies: real data seem to converge fast
• Random assignment using a different distribution: normal distribution for data acquired with GPS technology, although not strictly a Gaussian process (Specht2020)

Thank you!
This project receives funding from the European Union’s Horizon Europe Research and Innovation Programme (ID No 101059592). Views and opinions expressed are those of the author(s) only and do not necessarily reflect those of the European Union or the European Commission. Neither the EU nor the EC can be held responsible for them.
Damiano Oldoni
Open science lab for biodiversity (oscibio)
Research Institute Nature and Forest (INBO)
Photo by Viridiflavus - Own work, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=4956453
b-cubed.eu, @BCubedProject, B-Cubed Project
