Lightning Talk by Megan Sheffield - Planning for Research Data Storage at Clemson University

Transcript

  1. Planning for Research Data Storage at Clemson University
     Megan Sheffield, [email protected]
     E-Science Librarian
     Science Bootcamp SE, July 16-18, 2014
  2. About Clemson University
     • Mid-size public land-grant university
     • Over 20k FTE with over 1300 faculty
     • Aiming for Research 1 Carnegie Classification
     • Over $100 million in research grants annually
     • Home of the Palmetto Cluster supercomputer
  3. About Clemson Libraries
     • Employs 28 faculty librarians, 62 staff, 78 students
     • Main library building plus 4 satellite locations/branches
     • $13.8 million total budget
     • 1.3 million volumes
     • Traditional organizational structure
  4. Data Needs Survey
     • Surveyed science faculty and graduate students in 2011-2012
     • Notable findings:
       – Most are unaware of their current options
       – Most are unaware of data management best practices… REALLY unaware!
       – Almost no one knows what metadata is
       – Most researchers need only moderate amounts of storage space
  5. Data Management Services Group
     • Formed this year to determine best practices and workflows
     • Completely within the library
     • Membership:
       – Liaisons for sciences, social sciences, and humanities
       – Metadata specialist
       – Library technology specialist
       – University Archivist
  6. • TigerPrints is a bepress Digital Commons repository (http://tigerprints.clemson.edu)
     • New ability to add data to the repository with some limitations
     • Data Pioneers group through bepress to discuss issues with datasets
  7. Test Case: Biology Data Set
     • From a graduate student who studies how animals walk/swim
     • Contains many file types, very intricate workflows, transformed data sets, and thousands of files in total
     • Sample set? Representative trial? Summary of data?
     • How to approach metadata?
  8. Test Case: Biology Data Set
     • Used summary data (76 KB spreadsheet)
     • 3 Major Considerations:
       – TigerPrints existing schema (i.e. discipline & sub-discipline categorization)
       – DataCite Metadata Schema v3.0
       – Recommended metadata fields from the bepress Data Management Toolkit
     • It’s a work in progress!
  9. Future Plans
     • Data Management Plans: add more liaisons, train using RDM Rose
     • Solicit more data sets from researchers
     • Involve other stakeholders on campus
     • Dreaming BIG: Collaborate with our IT unit to make a standalone repository that could also be used for backups
  10. Links of Interest
     • TigerPrints
     • DataCite Metadata Schema
     • Data Management Toolkit from bepress
     • RDM Rose – Data Management Training Program for Librarians
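
Slide 8 lists three metadata sources to reconcile for the test data set. As a rough illustration only (not the workflow the Clemson group used), the sketch below checks a draft record against the five properties that DataCite Metadata Schema 3.0 marks as mandatory: identifier, creator, title, publisher, and publication year. The flat dictionary layout, the function name `missing_mandatory`, and every value in the sample record are hypothetical placeholders; only the 76 KB summary spreadsheet comes from the slides.

```python
# DataCite Metadata Schema 3.0 mandatory properties, named after the
# schema's XML elements. Representing them as flat dictionary keys is a
# simplification made for this sketch.
DATACITE_MANDATORY = (
    "identifier",
    "creators",
    "titles",
    "publisher",
    "publicationYear",
)


def missing_mandatory(record):
    """Return the mandatory DataCite properties that are absent or empty."""
    return [name for name in DATACITE_MANDATORY if not record.get(name)]


# Hypothetical draft record for the summary spreadsheet. All values are
# placeholders, not real metadata from TigerPrints.
draft = {
    "titles": ["Summary data for an animal locomotion study (placeholder)"],
    "creators": ["Graduate Student (placeholder)"],
    "publisher": "Clemson University (assumed)",
    "publicationYear": "2014",
    # Local repository fields that sit alongside the DataCite properties.
    "discipline": "Biology",
    "files": [{"name": "summary.xlsx", "size_kb": 76}],
}

# No identifier (for example a DOI) has been assigned yet, so it is reported.
print(missing_mandatory(draft))  # ['identifier']
```

A check like this only flags gaps. Deciding how the repository's discipline categories and the bepress-recommended fields map onto DataCite properties remains a manual, judgment-based step.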