Community-Supported Data Repositories in Paleoecoinformatics: Building the Middle Tail

Research Data Services
February 17, 2016

Presentation given as part of the RDS Holz Brown Bag series, February 2016.

Transcript

  1. Community-Supported Data Repositories in Paleoecoinformatics: Building the Middle Tail

    Jack Williams | Department of Geography | UW-Madison
    Simon Goring | Department of Geography | UW-Madison
  2. Community-Supported Data Repositories in Paleoecoinformatics: Building the Middle Tail

    Jack Williams, Dept. Geography & Nelson Center for Climatic Research (@IceAgeEcologist)
    Simon Goring, Dept. Geography (@sjGoring)
    Neotoma DB www.neotomadb.org
    Part 1: Understanding the Data, Framing the Challenge
    Part 2: Connecting Users, Data, & Repositories
    NSF EarthCube
  3. Paleoecology – the quick overview

    Paleoecologists use geological and historical data to understand the processes governing the functioning of species and ecosystems, for states of the earth system and time scales that are inaccessible to direct observation.
  4. Strongly motivated by climate change & species responses to climate change

    Figures: Projected Temperature Rises (IPCC 2013 AR5 WGI Chap. 12 Fig. 12.5); Integrated Biodiversity Science (Dawson et al. 2011 Science)
  5. The Quaternary: a model system for studying & modeling biotic responses to climate change

    Repeated, large, and rapid climate variations
    Figure: Greenland temperature vs. age (10^3 years before 2005) (IPCC 2007 WG1 Ch6 Fig. 6.3)
  6. The Quaternary: a model system for studying & modeling biotic responses to climate change

    Data-rich: ice cores, loess, ocean sediments, speleothems, tree rings, lakes
  7. The last deglaciation

    Figure: Temperature Variations Since the Last Glacial Maximum, GISP2 Ice Core (Greenland) (Grootes et al. 1993 Nature). Axes: difference from present (°C) vs. age (years before present [BP]). Labeled intervals: Pleistocene, Holocene, Bølling-Allerød.
    • Global temperature: rose ~5°C
    • Ice sheets melted
    • Sea level: rose by 120 m
    • Atmospheric CO2: rose from 190 to 280 ppm
  8. Species responses to past climate change: Lessons from the Past

    • Migration
    • Adaptation in situ
    • Extinction
    Figure: Woodrat body size, 21,000 yr BP to present
  9. Paleodata Work Cycle

    Fieldwork → Lab Work → Data Analysis & Publication → Data Deposition → Data Synthesis → New questions, hypotheses
    Figure: Picea (Spruce), 21,000 yr BP and 1,000 yr BP
  10. Paleoecological Data: Key characteristics

    • ‘Long Tail’: Collected in the field by small scientific teams. Workers vary with respect to data management expertise, capacity, and interest
    • Commonality & Heterogeneity: All geological data, various measurements & methods
    • Long Shelf Life: specimens & samples collected decades ago are still analyzed
    • Scientific expertise distributed by proxy type, region, time period, and/or taxonomic group
  11. Many of our field’s Big Questions require assembly of individual records into larger networks

    • Do global temperatures lead or lag CO2 during deglaciations? Figure: Global temperatures & CO2, 22 ka to 0 ka (Shakun et al. (2012) Nature)
    • How far and fast can species migrate when climates change? Figure: Spruce distributions, last glacial maximum to present; maps of % spruce pollen at 21,000, 15,000, 11,000, 7,000, and Modern, with ice and no-data areas marked (Williams et al. (2004) Ecological Monographs)
  12. Community Data Repositories have emerged to tackle these bigger questions

    Neotoma DB www.neotomadb.org
    Paleobiology DB paleobiodb.org
    Key Characteristics
    • Open Data
    • Curated by Community
    • Standardized Taxonomy
    • Time: Age Controls and Age Models
  13. Moving up the Value Chain: Generic Depositories vs. Community-Led Repositories

    Modified from K. Lehnert
    Diagram labels: small data to BIG DATA; Generic Depositories to Community-Led Repositories; findable, accessible, re-usable, interoperable; identification, persistence; authorization, protocols; context, provenance; harmonized; community governance & input
    “… data have no value or meaning in isolation; they exist within a knowledge infrastructure — an ecology of people, practices, technologies, institutions, material objects, and relationships.” - C.L. Borgman
  14. Neotoma Paleoecology Database: Design Concepts

    Neotoma DB www.neotomadb.org
    • Spatiotemporal database: species occurrences & abundances in space and time
    • Age controls and age models stored
    • Centralized IT and Distributed Scientific Governance. Neotoma composed of several constituent databases (e.g. North American Pollen Database, FAUNMAP)
    • Open data accessible via Explorer, APIs, R Neotoma
    • Broad user community: Paleoecologists, ecosystem modellers, paleoclimatologists, biogeographers, educators, …
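The design concept of storing age controls alongside an age model can be illustrated with a minimal Python sketch. It assumes the simplest possible age model, linear interpolation between dated depths in a core. The names `AgeControl` and `interpolate_age` and all sample values are hypothetical illustrations; they are not Neotoma's schema, API, or data.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class AgeControl:
    """A dated depth in a sediment core (hypothetical structure)."""
    depth_cm: float
    age_yr_bp: float  # years before present


def interpolate_age(controls, depth_cm):
    """Estimate the age at depth_cm by linear interpolation between controls."""
    pts = sorted(controls, key=lambda c: c.depth_cm)
    if len(pts) < 2:
        raise ValueError("need at least two age controls")
    depths = [c.depth_cm for c in pts]
    if any(b <= a for a, b in zip(depths, depths[1:])):
        raise ValueError("age-control depths must be distinct")
    if not depths[0] <= depth_cm <= depths[-1]:
        raise ValueError("depth lies outside the dated interval")
    # Index of the control just below the sample, clamped to the last segment.
    i = min(bisect_right(depths, depth_cm), len(pts) - 1)
    lo, hi = pts[i - 1], pts[i]
    frac = (depth_cm - lo.depth_cm) / (hi.depth_cm - lo.depth_cm)
    return lo.age_yr_bp + frac * (hi.age_yr_bp - lo.age_yr_bp)


# Made-up age controls for illustration only.
controls = [
    AgeControl(depth_cm=0.0, age_yr_bp=0.0),
    AgeControl(depth_cm=100.0, age_yr_bp=5000.0),
    AgeControl(depth_cm=300.0, age_yr_bp=21000.0),
]

# A pollen sample at 200 cm sits halfway between the two deepest controls.
print(interpolate_age(controls, 200.0))  # 13000.0
```

Keeping the raw age controls separate from the derived ages means a sample's age can be recomputed if a different age model is adopted later; real age models are typically more sophisticated than this linear sketch.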
  15. “... Careful data collection and measurement are important. Data analysis is the glamour [child] of statistics, but you can’t do much if your data are no good.”

    Andrew Gelman
  16. Credits

    From top to bottom: NOAA Okeanos Explorer Program (CC BY-SA 2.0), NASA/Kathryn Hansen (CC BY 2.0), and Canyonlands National Park/Neal Herbert (CC BY-NC-SA 2.0).
  17. Research Coordination Networks (RCNs)

    • iSamples: Internet of Samples in Earth Sciences. iSamples RCN aims to dramatically improve discovery, access, sharing, analysis, and curation of physical samples and data generated by their study.
    • Cyber4Paleo: Collaboration & Cyberinfrastructure for Paleogeosciences. C4P RCN focuses on development of standards for aggregation & dissemination of paleogeoscience data, to facilitate research on Earth-Life history. https://www.youtube.com/user/cyber4paleo
    • EC3: Earth-Centered Communication for Cyberinfrastructure: Challenges of Field Data Collection, Management & Integration. EC3 network aims to facilitate dialogue between field-based geologists and computer and social scientists to address problems faced by the field-based geological community.
  18. Building Blocks (BB)

    • Earth System Bridge: Spanning Scientific Communities with Interoperable Modeling Frameworks. Earth System Bridge will allow interoperable modeling frameworks, enabling communities to collaborate and advance earth system science.
    • BCube: A Broker Framework for Next Generation Geosciences. Building tools to improve data brokering & improving access to valuable data by developing web crawlers.
    • GeoDeepDive: A Cognitive Computer Infrastructure for Geoscience. Developing capabilities in machine reading to benefit scientists in all domains & creating infrastructure to lower barriers to text and data mining activities.
  19. Talk to EarthCube Participants! Attend EarthCube Workshops!

    Governance chart: Leadership Council; Science Committee; Technology & Architecture Committee; Liaison Team; Office; Council of Data Facilities; Engagement Team
    • Mailing List - earthcube.org
    • Twitter - @earthcube
    • Funding - EC Travel Grants & Distinguished Lecturers