- even just briefly! - rather than forgo them entirely
• Hopefully you’ll get ideas for concepts and tools to explore later
• Expectations and best practices are often field-specific, so it’s tough to generalize
useful + put it into practice
you’ll share your top data management tips with us
you’ll tell me what you want to know more about for future workshops!
you’ll be reenergized enough by the topic to find something else that works for you and / or
in the scientific community as necessary to validate research findings.”
INCLUDES: code, figures, statistics, interviews, transcripts
EXCLUDES: preliminary analyses, drafts of papers, plans for further research, communication + peer reviews, physical samples
- OMB Circular, White House
all federal funding agencies.
Office of Science and Technology Policy (OSTP) memo
– Released spring 2013; took effect fall 2015
– Requires open sharing of published articles and data
– Publication repository is provided; data repository is not
– Applies to agencies with $100M+ in R&D
few skills to manage it effectively
Movement toward openness, impacted by OSTP and spurred by early-career researcher expectations
Disciplinary culture shifts toward data reuse + reproducibility
Need for multi-purpose online spaces to collaborate, share, store, and archive research outputs (including data)
meaningfully or jumbled together? Do you know where your data is?
Documentation
• How much contextual information accompanies your data? Can you understand it? Can a stranger understand it?
Storage & backup
• Where is your data stored and backed up? Could you recover from hardware failure or accidental deletion?
Media obsolescence
• Do you know how the software, hardware, and file formats you use will impact your data’s readability in the future?
readme file. (Good example located here: http://hdl.handle.net/2022/17155)
– Document any data processing and analyses.
– Don’t forget written notes.
Item-level
– Remember the importance of file names for conveying descriptive information.
– Find and adhere to disciplinary metadata standards
  • XML
  • Dublin Core
information for people associated with the project
• List of files, including a description of their relationship to one another
• Copyright + licensing information
• Limitations of the data
• Funding sources / institutional support

tl;dr: Any information necessary for someone with no knowledge of your research to understand and / or replicate your work.
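One way to make the checklist above routine is to start every project folder with a readme skeleton. The sketch below is only an illustration: the section headings paraphrase the list above, and the function names and the `README.txt` file name are assumptions, not part of any standard.

```python
from pathlib import Path

# Headings paraphrase the readme checklist; the exact wording is illustrative.
README_SECTIONS = [
    "People associated with the project",
    "List of files and how they relate to one another",
    "Copyright and licensing information",
    "Limitations of the data",
    "Funding sources and institutional support",
]


def build_readme(project_title, sections=README_SECTIONS):
    """Return plain-text readme content with one TODO stub per section."""
    lines = [project_title, "=" * len(project_title), ""]
    for heading in sections:
        lines += [heading, "-" * len(heading), "TODO", ""]
    return "\n".join(lines)


def write_readme(folder, project_title):
    """Create README.txt in `folder`, leaving any existing readme untouched."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    target = folder / "README.txt"
    if not target.exists():  # never overwrite notes someone already wrote
        target.write_text(build_readme(project_title), encoding="utf-8")
    return target
```

Calling `write_readme("my_project", "My Project")` (a hypothetical folder and title) produces a stub to fill in by hand; the script only provides the prompts, the contextual information still has to come from you.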
access regularly and change frequently. In general, losing your storage means losing current versions of the data.
backup = regular process of copying data to a location separate from storage. You don’t really need it until you lose data, but when you need to restore a file it will be the most important process you have in place.
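As a rough illustration of backup as a process rather than a place, the sketch below copies a working folder into a timestamped folder and checks every copy against its original with a checksum. The function names and folder layout are assumptions made for this example; a real setup would normally run on a schedule and write to a separate device or service.

```python
import hashlib
import shutil
from datetime import datetime
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(storage_dir, backup_root):
    """Copy storage_dir into a new timestamped folder under backup_root.

    backup_root should live outside storage_dir, ideally on another device.
    """
    storage_dir, backup_root = Path(storage_dir), Path(backup_root)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    destination = backup_root / f"{storage_dir.name}_{stamp}"
    shutil.copytree(storage_dir, destination)  # fails if destination exists
    # Verify each copied file so a silent copy error is caught right away.
    for original in storage_dir.rglob("*"):
        if original.is_file():
            copy = destination / original.relative_to(storage_dir)
            if sha256_of(original) != sha256_of(copy):
                raise IOError(f"Checksum mismatch for {copy}")
    return destination
```

Because each run creates a new timestamped folder, older versions stay available for restoring a file, at the cost of extra disk space.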
become obsolete through business deals, new versions, or a gradual decline in user base. (Consider WordPerfect.)
• Anticipate average lifespan of media to be 3-5 years. Migrate your files every few years, if not more frequently!
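Before migrating anything, it helps to know which file formats you actually have. The following is a minimal sketch of such an inventory; the function names and the watch list of extensions are hypothetical, and which formats count as at-risk is a judgment you would make for your own field.

```python
from collections import Counter
from pathlib import Path


def inventory_formats(folder):
    """Count files under `folder` by lowercase extension."""
    counts = Counter()
    for path in Path(folder).rglob("*"):
        if path.is_file():
            counts[path.suffix.lower() or "(no extension)"] += 1
    return counts


def flag_for_review(counts, watch_list):
    """Return only the extensions that appear on a user-supplied watch list."""
    return {ext: n for ext, n in counts.items() if ext in watch_list}
```

For example, `flag_for_review(inventory_formats("my_project"), {".wpd"})` would report how many WordPerfect documents sit in a hypothetical `my_project` folder, which is a reasonable starting point for a migration list.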
data organized meaningfully or jumbled together? Do you know where your data is?
Documentation
• How much contextual information accompanies your data? Can you understand it? Can a stranger understand it?
Storage & backup
• Where is your data stored and backed up? Could you recover from hardware failure or accidental deletion?
Media obsolescence
• Do you know how the software, hardware, and file formats you use will impact your data’s readability in the future?
your data!
– Institutional + disciplinary repositories
– Data papers/journals
• If your research is federally funded, remember that you’ll now have to share your data
• Raw data (facts) is generally not copyrightable; best practice is to apply a Creative Commons Zero (CC0) public domain dedication
• There’s even a documented citation advantage to sharing your data*
*Piwowar HA, Vision TJ. (2013) Data reuse and the open data citation advantage. PeerJ 1:e175 https://dx.doi.org/10.7717/peerj.175
practices will impact your ability to access your data days/weeks/years from now.
• If organizing retroactively, prioritize your most important research.
• Managing digital stuff requires a LOT of decision making, so embrace it!
• Any plan is better than no plan at all. Start today. Ask for help.
data management plan compiled by project leaders.
The plan should cover:
• Organization + naming
• Documentation + metadata
• Storage + sharing
• Any and all other pertinent details. (The more the better; it’ll save you headaches later.)
The plan should be actively revisited and adapted as needed throughout the project.