Advancing educational research through research data management 20180216

ANurnberger
February 16, 2018

Emerging trends in educational research and practice rely ever more heavily on data, whether big data in the form of organizational data or the data that make learning analytics and personalized learning possible. But where do those data come from, and how do they work together? In this talk, Amy Nurnberger addresses research data management issues in education research and what they mean for our students, our practices, and our intellectual legacy.

Transcript

  1. NSF NE Big data for education spoke, 2018-02-16 Advancing Educational Research through Research Data Management

    Amy Nurnberger
    Program Head, Data Management Services, MIT
    Adj Asst Professor, Learning Analytics, Teachers College, Columbia University
    @ANurnberger
    ORCID: 0000-0002-5931-072X
  2. The plan:

    • Thank you, Ryan Baker & NE BDES community meeting organizers!
    • Defining data
    • Research Data Management (RDM)
    • FAIR data
    • Sharing data
    • Teaching RDM
  3. Warning: “data” usage

    Singular? Plural? I will use it both ways…
    http://phdcomics.com/comics.php?f=1816
  4. The Data of your research

    Material or information "on which an argument, theory, test or hypothesis, or another research output is based."
    Queensland University of Technology. Manual of Procedures and Policies. Section 2.8.3. http://www.mopp.qut.edu.au/D/D_02_08.jsp

    “(i) Research data is defined as the recorded factual material commonly accepted in the [research] community as necessary to validate research findings…”
    http://www.whitehouse.gov/omb/circulars_a110#36
  5. Research Data comes from:

    • Experiment
    • Observation
    • Simulation/Models
    • Compilation/Derivation
    • Reference data
    • Documenting the research process
  6. Data exists in many formats

    • Text
    • Numeric
    • Audio
    • Video
    • Image
    • Code: Software, analysis, models
    structured AND unstructured
  7. Data types

    • Non-digital text (lab books, field notebooks, archival texts)
    • Digital texts or digital copies of text
    • Spreadsheets & log files
    • Audio, video
    • Computer Aided Design/CAD
    • Statistics (SPSS, SAS, R)
    • Databases
    • Geographic Information Systems (GIS) and spatial data
    • Digital copies of images
    • Non-digital images
    • Matlab files & Models
    • Metadata & Paradata
    • Data visualizations
    • Analysis code (R, Python)
    • Standard operating procedures and protocols
    • Artistic products
    • Web files
    • Curricular materials
    • Collection of digital objects acquired and generated during research
    Adapted from: Georgia Tech, http://libguides.gatech.edu/content.php?pid=123776&sid=3067221
  8. Data happens all the time

    Data: Raw or Primary • Cleaned • Processed • Analyzed • Visualized
    Research: Proposal / Planning • Preparing • Collecting • Analyzing • Publicizing • Wrapping it up
  9. So, why does data management matter?

    Data Sharing and Management Snafu in 3 Short Acts, by NYU Health Sciences Library, CC By
    https://www.youtube.com/watch?v=N2zK3sAtr-4
  10. So, why does data management matter?

    Data Sharing and Management Snafu in 3 Short Acts
    https://www.youtube.com/watch?v=N2zK3sAtr-4
    …because none of us want to be this bear #somanyboxes
  11. If the data you need still exists;

    If you found the data you need;
    If you understand the data you found;
    If you trust the data you understand;
    If you can use the data you trust;
    Someone did a good job of data management.
    - Rex Sanders
  12. Research Data Management: Protects your intellectual legacy

    http://dx.doi.org/10.1890/1051-0761(1997)007%5B0330:NMFTES%5D2.0.CO;2
    http://www.slideshare.net/shlake/documentation-metadatadentonlake
  13. Research Data Management: Protects research integrity

    http://www.slideshare.net/carolegoble/ismb2013-keynotecleangoble #17
  14. Research Data Management: Increases your impact

    Figure 1. Distribution of 2004–2005 citation counts of 85 trials by data availability. Piwowar HA, Day RS, Fridsma DB (2007) Sharing Detailed Research Data Is Associated with Increased Citation Rate. PLoS ONE 2(3): e308. https://doi.org/10.1371/journal.pone.0000308
  15. Research Data Management: Increases your impact

    Figure 3: Results and Analyses, Focus on 2010. Dorch, B.F, et al. (2015) Evidence that data sharing increases citation impact from astrophysics [slides]. LIBER. http://www.liber2015.org.uk/wp-content/uploads/2015/03/4.1-Evidence-that-Data-Sharing-Increases-Citation-Impact.pdf
    Figure 1. Distribution of 2004–2005 citation counts of 85 trials by data availability. Piwowar HA, Day RS, Fridsma DB (2007) Sharing Detailed Research Data Is Associated with Increased Citation Rate. PLoS ONE 2(3): e308. https://doi.org/10.1371/journal.pone.0000308
  16. Research Data Management: Increases your impact

    Figure 3: Results and Analyses, Focus on 2010. Dorch, B.F, et al. (2015) Evidence that data sharing increases citation impact from astrophysics [slides]. LIBER. http://www.liber2015.org.uk/wp-content/uploads/2015/03/4.1-Evidence-that-Data-Sharing-Increases-Citation-Impact.pdf
    Figure 1. Distribution of 2004–2005 citation counts of 85 trials by data availability. Piwowar HA, Day RS, Fridsma DB (2007) Sharing Detailed Research Data Is Associated with Increased Citation Rate. PLoS ONE 2(3): e308. https://doi.org/10.1371/journal.pone.0000308
    “Very significant increases in research, teaching and studying efficiency were realised by the users as a result of their use of the data centres;” (for economic & social, archaeology, and atmospheric data)
    Beagrie, N. and Houghton J.W. (2014) The Value and Impact of Data Sharing and Curation: A synthesis of three recent studies of UK research data centres, Jisc. http://repository.jisc.ac.uk/5568/1/iDF308_-_Digital_Infrastructure_Directions_Report%2C_Jan14_v1-04.pdf
  17. …also, Research Data Management:

    • Simplifies your life
    • Saves time
    • Increases efficiency
  18. Managed data is FAIR data

    http://datafairport.org/
  19. Managed data is FAIR data

    Findable
    Accessible
    Interoperable
    Reusable
    http://datafairport.org/
  20. What makes data Findable?

    http://www.phdcomics.com/comics/archive.php?comicid=1323
  21. What makes data Findable?

    Description / Documentation!
    • Lab notebooks
      – GitHub
      – Open Science Framework
    • Data descriptions / code book / readMe
    • File naming
      – Consistency: Pick a system, write it down, & stick with it
      – Identify necessary elements & consider their order
      – Create brief, understandable names
      – Date: YYYY-MM-DD
      – Version: v01, v02,…FINAL
      In general, try to stay away from spaces in filenames as well as the following characters: . \ / : * ? “ < > | [ ] & $
    • File / directory structure
    Make a system. Share the system. Follow the system.
    http://www.dcc.ac.uk/resources/metadata-standards
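
    The file-naming advice on this slide can be sketched in a few lines of Python. This is only an illustration, not a convention from the talk: the element order (project, description, date, version), the underscore and hyphen separators, and the function names are all assumptions.

```python
import re
from datetime import date

# Characters the slide suggests avoiding in filenames, plus whitespace.
_AVOID = re.compile(r'[\s.\\/:*?"“”<>|\[\]&$]+')


def clean_part(text: str) -> str:
    """Replace each run of discouraged characters with a single hyphen."""
    return _AVOID.sub("-", text.strip()).strip("-")


def build_filename(project: str, description: str, when: date,
                   version: int, ext: str) -> str:
    """Join the name elements in a fixed order: project, description, date, version."""
    parts = [
        clean_part(project),
        clean_part(description),
        when.isoformat(),      # YYYY-MM-DD
        f"v{version:02d}",     # v01, v02, ...
    ]
    return "_".join(parts) + "." + ext.lstrip(".")


print(build_filename("NE BDES", "survey responses: wave 1",
                     date(2018, 2, 16), 1, "csv"))
```

    Whatever order and separators you pick, the slide's point still applies: write the system down and follow it.
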
  22. What makes data Findable?

    ✓ local resources, e.g. https://libraries.mit.edu/data-management/
  23. What makes data Accessible?

    • Open file formats
    • Program/platform/script availability
    • Machine readability / computational accessibility
    • Due ethical and legal considerations
  24. What makes data Accessible? Open file formats

    • Non-proprietary
    • Open, documented standard
    • Standard representation (e.g., ASCII, Unicode)
    • Common, or commonly used by the research community (e.g. FITS, CIF)
    • Unencrypted
    • Uncompressed

    X Some commonly recognized formats to avoid for storage include: Word [.doc(x)], SPSS [.sav], Excel [.xls(x)], STATA [.dta], Access [.mdb, .accdb], JPEG [.jpg], .gif, Quicktime [.mov], SAS [.sas]
    ✓ Some commonly recognized formats meeting these criteria: ASCII [e.g., .csv, .txt], PDF [.pdf], FLAC, TIFF, JPEG2000 [.jp2], MPEG-4 [.mp4], XML [.xml, .odf, .rdf], R [.r]

    Not sure about the extension? Check https://www.nationalarchives.gov.uk/PRONOM/default.htm
    http://www.data-archive.ac.uk/media/2894/managingsharing.pdf
    http://www.digitalpreservation.gov/formats/index.shtml?PHPSESSID=c26c5e5101396d5f5ebacedb13cae6e3
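
    As a rough illustration, the two format lists on this slide can be turned into a quick check on a file's extension. The sets below copy the extensions the slide names. The `.flac`, `.tif` and `.tiff` entries are assumptions, since the slide gives only the format names FLAC and TIFF, and the function name and return strings are hypothetical.

```python
from pathlib import Path

# Extensions the slide lists as formats to avoid for storage.
AVOID = {".doc", ".docx", ".sav", ".xls", ".xlsx", ".dta",
         ".mdb", ".accdb", ".jpg", ".gif", ".mov", ".sas"}

# Extensions the slide lists as meeting the open-format criteria.
# .flac, .tif and .tiff are assumed here; the slide names only the formats.
PREFERRED = {".csv", ".txt", ".pdf", ".flac", ".tif", ".tiff",
             ".jp2", ".mp4", ".xml", ".odf", ".rdf", ".r"}


def storage_advice(filename: str) -> str:
    """Classify a file by extension, case-insensitively."""
    ext = Path(filename).suffix.lower()
    if ext in AVOID:
        return "avoid for storage"
    if ext in PREFERRED:
        return "meets the criteria"
    return "unknown: check a format registry"


for name in ["Results.XLSX", "results.csv", "analysis.R", "notes.xyz"]:
    print(name, "->", storage_advice(name))
```

    An extension is only a hint about what a file contains, which is why the slide points to a format registry for anything uncertain.
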
  25. What makes data Accessible?

    • Open file format
    • Program/platform/script availability
    • Machine readable & computational access
    • Due ethical & legal considerations
      – Restricted data = Responsible sharing (ICPSR resources)
      – Salo, D. and Jones, K. M. L. (2017). Learning analytics and the academic library: professional ethics commitments at a crossroads. College & Research Libraries (forthcoming). https://ssrn.com/abstract=2955779
      – https://www.imsglobal.org/learning-data-analytics-key-principles
      – https://pervade.umd.edu/about/general/
  26. What makes data Interoperable?

    • Tidy data
    • Taxonomies & Ontologies
    • Common/Shared data elements
  27. What makes data Interoperable? Tidy data

    Experiment X  Ver_2x  CrV33  Ver_7b  CrV33  Ttby-7  Ttyl-42  1337x  10Q
    1x_44 6       0.025   1.33   Grn     3700   NA      0.136    32_xx  1
    4y_3ub        0.025   2.5    Blu     --     3xLg    0.365    32_xx  1
    55-ertt       0.024   1.33   Yel     3700   2ee3    0.159    0.0    1
    3o_44 4       0.02    3.2    Cha     3700   NA      0.220    32_xx  1
    411           0.024   xxx    Brn     3700   5tr6    0.302    33_xx  1
    B2W_u         0.023   3.14   Blu     3700   3xe3    ----     32_xx  1
    2G2BT         0.025   ?      Gol     0      xxx     0.254    33_xx  1
    88            0.33    1.33   Gol     3.700  NA      0.162    33_xx  1
    …...          …       …      …       …      …       …        …      …

    "I can't send you the original data because I don't remember what my excel file names mean anymore" #overlyhonestmethods
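
    One problem in the table above is that missing values seem to be written several different ways. The sketch below maps them to a single representation. It assumes that `NA`, `--`, `----`, `?` and `xxx` all mean "missing", which the slide does not say; that ambiguity is exactly what a codebook would resolve. The function name is hypothetical.

```python
# Tokens ASSUMED to mean "missing" in the example table; the slide does not
# document them, so this mapping is a guess that a codebook should confirm.
MISSING_MARKERS = {"NA", "--", "----", "?", "xxx", ""}


def normalize_cell(cell: str):
    """Return None for missing markers, a float for numbers, else the text."""
    value = cell.strip()
    if value in MISSING_MARKERS:
        return None
    try:
        return float(value)
    except ValueError:
        return value


row = ["2G2BT", "0.025", "?", "Gol", "0", "xxx", "0.254", "33_xx", "1"]
print([normalize_cell(c) for c in row])
```

    Even after this cleanup the column names and codes such as `33_xx` stay opaque, so normalizing values does not replace documentation.
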
  28. What makes data Interoperable?

    • Tidy data
      – Wickham, H. (2014). Tidy Data. Journal of Statistical Software, 59(10), 1 - 23. doi:http://dx.doi.org/10.18637/jss.v059.i10
    • Taxonomies, ontologies & controlled vocabularies
      – Common Education Data Standards: https://ceds.ed.gov
      – Schools Interoperability Framework (SIF) & Lightweight Interoperability Standard for Schools (LISS)
    • Common/Shared data elements
      – e.g., https://www.nlm.nih.gov/cde/
      – Lessons to be learned from health
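
    For readers new to the idea, here is a minimal sketch of reshaping a "wide" table into tidy form, where each row is one observation and each column is one variable. The student and quiz data are invented for illustration, and the `melt` helper is a hypothetical stand-in written with the standard library only.

```python
def melt(rows, id_field, var_name, value_name):
    """Turn one wide row per subject into one tidy row per observation."""
    tidy = []
    for row in rows:
        for key, value in row.items():
            if key == id_field:
                continue
            tidy.append({id_field: row[id_field],
                         var_name: key,
                         value_name: value})
    return tidy


# Hypothetical wide data: one column per assessment.
wide = [
    {"student": "s01", "quiz1": 8, "quiz2": 9},
    {"student": "s02", "quiz1": 6, "quiz2": 7},
]

for record in melt(wide, "student", "assessment", "score"):
    print(record)
```

    In the tidy result the assessment name is a value in its own column instead of being hidden in a column header, which makes datasets from different sources easier to combine.
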
  29. What makes data Reusable?

    • Trustworthiness
    • Licensing
    • Due ethical considerations
  30. Trust: Documentation & contextual data

    00100100 00111111 01101010 10001000 10000101 10100011 00001000 11010011
    00010011 00011001 10001010 00101110 00000011 01110000 01110011 01000100
    10100100 00001001 00111000 00100010 00101001 10011111 00110001 11010000
    00001000 00101110 11111010 10011000 11101100 01001110 01101100 10001001
    ???
  31. Trust: Documentation & contextual data

    Description / Documentation (No, the article is probably not sufficient)
    Methods
    • What was done
    • How it was done
    • Instrumentation
    • Limitations
    Code
    • All of the meanings
    Labels (w/ units!)
    • Codebook
    • Data dictionary
    C d
    00100100 00111111 01101010 10001000 10000101 10100011 00001000 11010011
    00010011 00011001 10001010 00101110 00000011 01110000 01110011 01000100
    10100100 00001001 00111000 00100010 00101001 10011111 00110001 11010000
    00001000 00101110 11111010 10011000 11101100 01001110 01101100 10001001
  32. Trust: Documentation & contextual data

    Description / Documentation (No, the article is probably not sufficient)
    Methods
    • What was done
    • How it was done
    • Instrumentation
    • Limitations
    Code
    • All of the meanings
    Labels (w/ units!)
    • Codebook
    • Data dictionary
    C d π
    00100100 00111111 01101010 10001000 10000101 10100011 00001000 11010011
    00010011 00011001 10001010 00101110 00000011 01110000 01110011 01000100
    10100100 00001001 00111000 00100010 00101001 10011111 00110001 11010000
    00001000 00101110 11111010 10011000 11101100 01001110 01101100 10001001
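
    The bit string on these slides makes the case for documentation: without context it is noise. The sketch below uses one reading, suggested by the π label but not stated in the transcript: that the bits are the binary digits of π after the point, with the integer part 3 supplied separately. The function name is hypothetical.

```python
import math

# First eight bytes of the bit string shown on the slides.
BITS = ("00100100 00111111 01101010 10001000 "
        "10000101 10100011 00001000 11010011")


def decode_fraction(bits: str, integer_part: int = 3) -> float:
    """Read the bits as a binary fraction and add the assumed integer part."""
    digits = bits.replace(" ", "")
    return integer_part + int(digits, 2) / 2 ** len(digits)


value = decode_fraction(BITS)
print(value)
print(math.isclose(value, math.pi))
```

    The interpretation (binary fraction, integer part 3) is the kind of information a data dictionary would record, and the bits cannot be decoded without it.
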
  33. What makes data Reusable?

    • Trustworthiness
    • Licensing
      – ✓ local resources!
      – Personal recommendation: CC 0, or equivalent
    • Due ethical considerations (see previous)
  34. Managed data is FAIR data

    Findable
    Accessible
    Interoperable
    Reusable
    http://datafairport.org/
  35. Managed data is shareable data

    Proposal / Planning • Preparing • Collecting • Analyzing • Publicizing • Wrapping it up
  36. Happy researcher: YOU

    http://openarchaeologydata.metajnl.com/about/
  37. Data publication & citation

    • Data Citation
    • Identifiers
      – ORCiD
      – Persistent URLs (e.g., doi)
    • Publication in Repositories
      – Guidelines
      – Disciplinary repository
      – Institutional repository
    • Data selection for publication
  38. Data citation

    Table of citation elements:
    1. A: Authors & Contributors
    2. T: Title
    3. Ei: Electronic ID, DOI
    4. Pd: Publication date
    5. P: Publisher / Distributor

    • Track reuse
    • Measure impact
    • Support reproducibility
    https://www.force11.org/group/joint-declaration-data-citation-principles-final
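
    The five citation elements on this slide can be combined into a citation string, as in the sketch below. The output order and punctuation are assumptions, since repositories and style guides differ, and the class name and every value in the example record are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class DataCitation:
    """The five elements from the slide's table of citation elements."""
    authors: str            # A: Authors & Contributors
    title: str              # T: Title
    electronic_id: str      # Ei: Electronic ID, DOI
    publication_date: str   # Pd: Publication date
    publisher: str          # P: Publisher / Distributor

    def render(self) -> str:
        # Ordering and punctuation are an assumed style, not a standard.
        return (f"{self.authors} ({self.publication_date}). {self.title}. "
                f"{self.publisher}. {self.electronic_id}")


# Hypothetical record, for illustration only.
citation = DataCitation(
    authors="Doe, J., & Roe, R.",
    title="Hypothetical classroom log data",
    electronic_id="https://doi.org/10.1234/example",
    publication_date="2018",
    publisher="Example Repository",
)
print(citation.render())
```

    Keeping the elements as separate fields instead of one free-text string makes it easier to track reuse and measure impact.
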
  39. Data publication & citation

    • Data Citation
    • Identifiers
      – ORCiD
      – Persistent URLs (e.g., doi)
    • Publication in Repositories
      – Guidelines
      – Disciplinary repository
      – Institutional repository
    • Data selection for publication
  40. a note on author identification:

    Ryan Baker
  41. Data publication & citation

    • Data Citation
    • Identifiers
      – ORCiD
      – Persistent URLs (e.g., doi)
    • Publication in Repositories
      – Guidelines
      – Disciplinary repository
      – Institutional repository
    • Data selection for publication
  42. Repository considerations

    1. Will your community look for it there?
    2. Can your data be uploaded in a format useful to others?
    3. Can you restrict access to your data as needed?
    4. Can the data be cited and found in a unique and persistent way?
    5. Are preservation actions being taken to maintain the integrity of your data?
    6. How long are your data to be retained? What happens then?
    7. Is there support for data documentation / data deposit?
    8. Are the rights of the repository and your rights as depositor clear?
    9. Are the rights and licenses under which your data can be accessed and used clear?
    10. What will it cost to have your data deposited in this repository?
    Adapted from http://dms.data.jhu.edu/data-management-resources/publish-and-share/find-a-repository/selecting-a-repository-for-data-deposit/ CC by 4.0
  43. Data publication & citation

    • Data Citation
    • Identifiers
      – ORCiD
      – Persistent URLs (e.g., doi)
    • Publication in Repositories
      – Guidelines & Repositories
      – DataShop: https://pslcdatashop.web.cmu.edu
      – http://www.laceproject.eu
    • Data selection for publication
  44. Teaching RDM

    • Inform
    • Model
    • Explicate
    • Integrate
  45. Teaching RDM: Inform

    Know your institution’s Research Data Management Services:
    • https://libraries.mit.edu/data-management/
    • https://scholcomm.columbia.edu/data-management/
    • http://dms.data.jhu.edu/
    • https://www.wpi.edu/research/support/academic-research-computing/data-management
    • https://libraries.psu.edu/research/research-data-services/data-management
    Not here? Check with your local information professionals, aka Librarians
  46. Teaching RDM: Model

    As instructors, we model the scholarship ethos and practices of our disciplines for our students. We model how to plan ahead and make informed decisions about:
    • Research procedures
    • Formats
    • Tools
    http://theautismhelper.com/teaching-skill-imitation/
  47. Teaching RDM

    • Inform
    • Model
    • Explicate
    • Implement
  48. Teaching RDM: Explicate

    • Critically consider both the assumed and desired competencies associated with assignments that deal with data
    • Communicate these clearly to students
  49. Teaching RDM: Explicate

    Syllabus & Course activities
    Shorish, Yasmeen, "Data Information Literacy and Undergraduates: A Critical Competency" (2015). Libraries. Paper 27. http://commons.lib.jmu.edu/letfspubs/27
    Carlson, J. (2015, October 9). Threshold Concepts and Data Information Literacy | e-Science Community. Retrieved from http://esciencecommunity.umassmed.edu/2015/10/09/threshold-concepts-and-data-information-literacy/
  50. Teaching RDM: Integrate

    “providing comprehensive replication documentation for research involving statistical data should be as ubiquitous and routine as providing a list of references”
    http://www.projecttier.org
  51. Teaching RDM

    https://www.flickr.com/photos/roberlan/14358725053 CC By-ND 2.0 Roberlan Borges
  52. Let’s talk!

    Amy Nurnberger, Program Head, Data Management Services, MIT
    Adj Asst Prof, Learning Analytics, Teachers College
    Phone: 617.258.5596
    E-mail: [email protected] or [email protected]
    ORCID: 0000-0002-5931-072X
    Twitter: @ANurnberger
    Thank you!
  53. Sources & Resources

    DCC Metadata Catalog: http://www.dcc.ac.uk/resources/metadata-standards
    ORCiD: https://orcid.org
    DCC Data Selection: http://www.dcc.ac.uk/resources/how-guides/appraise-select-data
    Open Science Framework: https://osf.io

    This work is licensed under CC-By International 4.0. Please respect the rights of others whose work is used here.
    Please cite this work as: Nurnberger, A. L. (2018). Advancing educational research through research data management. Presentation at the NSF Northeast Big data for education spoke. 2018-02-16. https://speakerdeck.com/anurnberger/advancing-educational-research-through-research-data-management-20180216

    Icons licensed for use from Noun Project, unless otherwise noted.
    Lightbulb. https://en.wikipedia.org/wiki/Wikipedia:Featured_picture_candidates/compact-fluorescent-light-bulb2
    Stones. http://lssacademy.com/wp-content/uploads/2009/04/simplicity.jpg
    Alarm Clocks. https://commons.wikimedia.org/wiki/File:Trento-Mercatino_dei_Gaudenti-alarm_clocks.jpg
    Heart photo frame by Alex80. https://pixabay.com/en/heart-photo-frame-love-sign-pink-2055206/ CC 0