FISH 6002: Week 6 - Collecting and Managing Tidy Data

Week 6 lecture for FISH 6002 updated 10 Oct

MI Fisheries Science

October 20, 2017

Transcript

  1. Week 6: Collecting and managing tidy data FISH 6000: Science

    Communication for Fisheries Brett Favaro 2017 This work is licensed under a Creative Commons Attribution 4.0 International License
  2. This week: 2 Parts Part 1: • Collecting data safely

    • Data management Part 2: • Building a database
  3. This week: 2 parts Part 1: • Collecting data safely

    • Data management Part 2: • Building a database This part may deal with uncomfortable subject matter
  4. Fieldwork can be physically dangerous https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1035215/ “It is concluded that

    fishing is one of the most hazardous occupations in terms of mortality related to work”
  5. Fieldwork can be physically dangerous http://www.csls.ca/reports/csls2006-04.pdf

    From 1996-2005, fishing and trapping was the 3rd most dangerous job in Canada (1 / 2800 workers died).
    #1: Mining, quarrying, and oil wells
    #2: Logging and forestry
    Newfoundland, specifically, had the second-highest industrial death rate in Canada (behind the territories).
  6. Fieldwork can be physically dangerous

    1. Get mentored. On your first field experience, go with someone knowledgeable, or link up with a trusted person in the field.
    2. Plan logistics in detail: consider food (especially if you have allergies), medication (anti-nausea?), etc. Survival suits? PFDs?
    3. When working with new partners: DO NOT ASSUME safety is valued. Create a written field plan. Assume they won’t have anything and bring it yourself.
  7. Fieldwork can be financially impactful - Including liability

    1. Make sure you have all required licenses, approvals, and insurance coverage before embarking on any fieldwork.
    2. Clearly discuss what field expenses will and will not be covered by your supervisor in advance of the project. No two projects are alike.
  8. Academia – and especially fieldwork - is not free of

    harassment or bullying http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0102172
  9. “A majority (64%, N = 423/658) of all survey respondents, stated that

    they had personally experienced sexual harassment: i.e. inappropriate or sexual remarks, comments about physical beauty, cognitive sex differences, or other such jokes” “Over 20% of respondents reported that they had personally experienced sexual assault: i.e. physical sexual harassment, unwanted sexual contact, or sexual contact in which they could not or did not give consent, or felt it would be unsafe to fight back or not give consent (N = 140/644, 21.7%)”
  10. “Respondents typically had limited awareness of workplace policies or mechanisms

    for reporting. Fewer than half of survey respondents recalled ever encountering a code of conduct at any of the field sites at which they had worked (N = 251/666, 37.7%). Fewer than one fourth of respondents recalled having ever worked at a field site with a sexual harassment policy (148/666, 22.2%).”
  11. “The theme of clarity regarding appropriate behavioral expectations and rules,

    and the repercussions for breaking established rules, emerged from the interviews” “Examples of testing behavior included, but were not limited to: going on long, strenuous hikes while refusing to tell the respondent how long they would be gone from camp; not permitting the respondent food, water, or urination breaks during data collection; and sharing pornographic images with the respondent and gauging her or his reaction. Many of the performances of physical feats were not required for the successful completion of data collection” http://onlinelibrary.wiley.com/doi/10.1111/aman.12929/epdf
  12. Non-sexual, non-academic complaint procedures:

    • Anyone (student, employee, or non-university person) complaining against student: Student Code of Conduct https://www.mun.ca/student/sscm/conduct/code_of_conduct.php
    • Student complaining against employee: Non-Academic Appeals https://www.mun.ca/main/non_academic_appeals.php
    • Employee complaining against employee: Procedure for Resolution of a Formal Respectful Workplace Complaint http://www.mun.ca/policy/site/procedure.php?id=519
    All sexual harassment: http://www.mun.ca/policy/site/procedure.php?id=348
    MUN Sexual Harassment Office: https://www.mun.ca/sexualharassment/
  13. MUN policies Harassment – means comments or conduct which are

    abusive, offensive, demeaning or vexatious that are known or ought reasonably to be known to be unwelcome and which may be intended or unintended. Types of harassment include Harassment based on Prohibited Grounds of Discrimination and Personal Harassment. Harassment may occur during a single incident or a series of single incidents. Whether or not a single incident constitutes harassment will depend on the nature and type of incident(s). Harassment, for example, does not include: a. Interpersonal conflict or disagreement, which is expressed in a respectful manner; or b. Performance management, attendance management or workplace discipline, which is expressed in a respectful and appropriate manner. http://www.mun.ca/policy/site/policy.php?id=167
  14. http://www.mun.ca/policy/site/policy.php?id=192 Sexual Harassment - Comments or conduct of a sexual

    nature and/or abusive conduct based on gender, gender identity, sex (including pregnancy and breast feeding) or sexual orientation directed at an individual or group of individuals by a person or persons of the same or opposite sex, who knows or ought reasonably to know that such comments or conduct is unwelcome and/or unwanted. Comments or conduct constitute sexual harassment when: a. submission to such comments or conduct is made either explicitly or implicitly a term or condition of an individual's employment, academic status, academic accreditation, or b. submission to or rejection of such comments or conduct by an individual is used as the basis for employment, or for academic performance, status or accreditation decisions affecting such individual, or c. such comments or conduct interferes with, or adversely affects, directly or indirectly, an individual's work or academic environment or performance, or d. such comments or conduct calls attention to the gender, gender identity, sex (including pregnancy and breast feeding) or sexual orientation of an individual or individuals in a manner that creates an intimidating, hostile or offensive work/study environment).
  15. http://www.mun.ca/policy/site/policy.php?id=192 Sexual Harassment - Comments or conduct of a sexual

    nature and/or abusive conduct based on gender, gender identity, sex (including pregnancy and breast feeding) or sexual orientation directed at an individual or group of individuals by a person or persons of the same or opposite sex, who knows or ought reasonably to know that such comments or conduct is unwelcome and/or unwanted. Comments or conduct constitute sexual harassment when: a. submission to such comments or conduct is made either explicitly or implicitly a term or condition of an individual's employment, academic status, academic accreditation, or b. submission to or rejection of such comments or conduct by an individual is used as the basis for employment, or for academic performance, status or accreditation decisions affecting such individual, or c. such comments or conduct interferes with, or adversely affects, directly or indirectly, an individual's work or academic environment or performance, or d. such comments or conduct calls attention to the gender, gender identity, sex (including pregnancy and breast feeding) or sexual orientation of an individual or individuals in a manner that creates an intimidating, hostile or offensive work/study environment).
  16. http://onlinelibrary.wiley.com/doi/10.1111/aman.12929/full Sexual Harassment includes but is not limited to:

    • unwelcome sexual invitations or requests;
    • demands for sexual favours;
    • unnecessary touching or patting of a person's body;
    • leering at a person's body;
    • unwelcome and repeated innuendos or taunting about a person's gender, gender identity, sex (including pregnancy and breast feeding) or sexual orientation;
    • unwelcome remarks or verbal abuse of a sexual nature;
    • visual displays of sexual images perceived to be degrading or offensive;
    • unwelcome remarks or verbal abuse based on gender, gender identity, sex (including pregnancy and breast feeding) or sexual orientation which are demeaning or degrading;
    • threats of a sexual nature;
    • sexual assault and;
    • any other unwanted verbal or physical conduct of a sexual nature.
    Sexual harassment may occur during a single incident, or a series of single incidents. Whether or not a single incident constitutes sexual harassment will depend on the nature and type of incident(s). Sexual harassment may occur between individuals of the same sex or between the sexes.
  17. Reporting sexual harassment: All processes, informal or formal, start here

    Forms: https://www.mun.ca/sexualharassment/reporting/forms.php
  18. Reporting non-sexual harassment, non-academic issues:

    Anyone against student: Written complaint to the Student Conduct Officer. Angie Clarke, Director, Student Affairs, [email protected], Tel: (709) 778-0565, Office: W3017
    Student against employee:
    1. Complain informally (orally or in writing) to the employee’s immediate supervisor or administrative head. CFER: Tom Brown; CSAR: Paul Winger; CASD: Heather Manuel; SOF, not in a centre (or if the complaint is about a director): Fred Anstey
    2. Formal, written complaint to the Director of Student Support Services. Dr. Jennifer Massey, [email protected], 864-8312
    Employee against employee: Formal, written complaint to the Department of Faculty Relations (regular MUN) or HR (MI)
  19. Other diversity/equality literature

    • Men ask more questions than women at a scientific conference http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0185534
    • Your science conference should have a code of conduct http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0160015
    • Diversity and inclusion in conservation: a proposal for a marine diversity network https://www.frontiersin.org/articles/10.3389/fmars.2017.00234/full
    • Not “Pulling up the ladder”: Women who organize conference symposia provide greater opportunities for women to speak at conservation conferences http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0160015
  20. This week: 2 parts Part 1: • Collecting data safely

    • Data management Part 2: • Building a database
  21. Data Management

    • Data Management Plans (DMPs) are documents that describe data, explain how they will be stored, and clarify ownership and dissemination rights
    • DMPs cover four areas (from https://libraries.mit.edu/data-management/plan/write/):
      1. Project, experiment, and data description
      2. Documentation, organization, and storage
      3. Access, sharing, and re-use
      4. Archiving
  22. https://dmptool.org/

    DMPTool is a web form that provides DMP templates for specific funders (explore the site). It also describes the requirements of major funders and includes sample DMPs: https://dmptool.org/guidance
  23. Why do a DMP?

    • Some funders require it
    • Common IP issues are clarified:
      • Who keeps the data once your degree is complete? (especially relevant to Ph.D. students heading to post-docs)
      • What happens if you quit the lab (e.g., by changing supervisors) before the end of your degree?
      • Who gets to co-author manuscripts based on the data?
    • Avoid data loss
    • Project planning: How much data storage space will be needed?
    • Involve Marine Institute ICT
    http://www.icpsr.umich.edu/icpsrweb/content/datamanagement/dmp/elements.html
  24. MINOR ASSIGNMENT 2

    Please download FISH6002-DMP.docx from the class website. Complete as much of it as you can. Point-form answers are fine. Upload your plan to Teams.
    Grading: /2.5 for each of the 4 sections. 0 = Section ignored. 2.5 = Specific answers provided to all questions in the section.
    If an answer is “I don’t know”: When will you know, and how? What’s your plan to get it answered?
    Due Mon Oct 28
  25. Recall: Tidy data

    1. One column = one variable
    2. One row = one observation
    3. One cell = one value
    4. One column = one data type
    Grolemund and Wickham (2016), Fig 12.1
  26. You will work with two types of data

    • Data given to you by others (last week): data that have already been coded and stored in a table
    • Data you collect yourself (this week): fieldwork, physical datasheets, surveys, data from figures; data that have NOT been coded, where you can design the spreadsheet yourself
  27. Recall: Four steps

    1. Did data load correctly? (Check rows and columns)
    2. Are data types what they should be? (Correct as needed)
    3. Numbers: Are there impossible values? (Check each number)
    4. Factors: Are factor levels correct? (Check each factor)
    But how do we get our data into a spreadsheet in the first place?
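The four checks can be sketched in base R as follows. This is only an illustration: the `catch` data frame, its column names, and the planted errors are invented for the example and are not the course dataset.

```r
# Hypothetical catch data standing in for a freshly loaded file
catch <- data.frame(
  net_no  = c(1, 1, 2, 2),
  fish_no = c(1, 2, 3, 4),
  species = c("PW", "PW", "pw", "PW"),  # planted error: inconsistent level "pw"
  wt      = c(12.1, 10.4, -3.0, 11.8),  # planted error: negative weight
  stringsAsFactors = FALSE
)

# 1. Did the data load correctly? Check rows and columns
dim(catch)

# 2. Are data types what they should be? Correct as needed
str(catch)
catch$species <- as.factor(catch$species)

# 3. Numbers: are there impossible values?
summary(catch$wt)
bad_wt <- catch[catch$wt <= 0, ]  # rows to check against the paper datasheet

# 4. Factors: are factor levels correct?
levels(catch$species)
catch$species <- factor(toupper(as.character(catch$species)))
```

Rows flagged in `bad_wt` would normally be checked against the original datasheet rather than silently deleted.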
  28. Scenario: You are a biologist and you are attempting to

    reconstruct the above dataset from datasheets Part A) Create a spreadsheet and enter data
  29. Consider:

    • Spreadsheet layout: wide (good for people) or long (good for computers)?
    • Multiple people entering data at the same time? Or just one?
    • Quality checks?
    First, let’s draw it together on the white board: net number, fish number, biological data
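To make the wide-versus-long distinction concrete, here is a small sketch using `tidyr::pivot_longer()`. The `wide` data frame and its column names are invented for illustration; a real datasheet would have its own layout.

```r
library(tidyr)

# Hypothetical wide layout: one row per net, one weight column per fish
wide <- data.frame(
  net_no    = c(1, 2),
  fish_1_wt = c(12.1, 9.8),
  fish_2_wt = c(10.4, 11.2)
)

# Long layout: one row per fish, so each row is a single observation
long <- pivot_longer(
  wide,
  cols      = c(fish_1_wt, fish_2_wt),
  names_to  = "fish",
  values_to = "wt"
)

long
```

The wide sheet is easier to read and fill in by hand. The long one matches the tidy-data rules and is usually what R functions expect.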
  30. Next, let’s make a spreadsheet, so many people can enter data.

    Recommended workflow:
    1. Plan the project layout on a white board, by hand
    2. Make a shared OneDrive folder. Lay it out like a proper R project. (Note: I have a completed one on the course website)
    3. By hand, draw how the spreadsheet should be laid out
    4. In an /excel subfolder, create a spreadsheet
    5. Work together online to populate that spreadsheet
    *Show Demo*
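A project skeleton like this can also be created from R. In the sketch below only the `excel` subfolder comes from the workflow above. The project name and the other folder names are assumptions, and the example writes into a temporary directory rather than a real OneDrive folder.

```r
# Hypothetical project root; in practice this would be the shared OneDrive folder
project_root <- file.path(tempdir(), "FISH6002-project")

# "excel" is named in the workflow; the other subfolders are assumed examples
subfolders <- c("excel", "data", "scripts", "figures")

for (folder in subfolders) {
  dir.create(file.path(project_root, folder),
             recursive = TRUE, showWarnings = FALSE)
}

list.dirs(project_root, full.names = FALSE, recursive = FALSE)
```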
  31. *Demo*

    Make a folder in OneDrive. Share it via a link that allows anyone to edit. Post the link in Teams. Show how it can be opened in the browser.
    I have provided you with paper records of some catch data. Please record everything.
  32. Scenario: You are a biologist and you are attempting to

    reconstruct the above dataset from datasheets Part A) Create a spreadsheet and enter data Part B) You’ve been given more data! Append your data to existing data
  33. Download PygmyWFBC-PartB.csv from the course website

    Now: Combine data from Parts A and B into a single sheet
    - Put it into a sensible spot in your class project folder
    - Need to start an R Script!
    - Please put the script in a sensible location
  34. Data manipulation: - First: Change variable names so they are equivalent

    # A method for changing variable names
    df2_fixed <- df2 %>%
      mutate(net_no = Netno) %>%  # Make a new variable called net_no
      select(-Netno) %>%          # Remove Netno.
      mutate(week = Week) %>%
      select(-Week) %>%
      mutate(wt = Weight) %>%
      select(-Weight)

    [Slide images: df1, df2, and the result, e.g. df2_fixed]
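As an alternative to the mutate-then-select pattern on this slide, `dplyr::rename()` changes column names in a single step. This is a different technique from the one shown on the slide, and the `df2` values below are made up; only the column names follow the slide.

```r
library(dplyr)

# Hypothetical stand-in for df2, using the column names from the slide
df2 <- data.frame(Netno = c(1, 2), Week = c(1, 1), Weight = c(12.1, 9.8))

# rename() takes new_name = old_name pairs and keeps the column order
df2_fixed <- df2 %>%
  rename(net_no = Netno, week = Week, wt = Weight)

names(df2_fixed)
```

The mutate/select version moves each renamed column to the end of the data frame, whereas `rename()` leaves columns where they were. Either way, the names then match `df1`.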
  35. Data manipulation: - First: Change variable names so they are

    equivalent - Second: Employ bind_rows to add rows! Do it now
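A minimal sketch of the `bind_rows()` step, using two tiny invented data frames in place of the Part A and Part B data:

```r
library(dplyr)

# Hypothetical stand-ins for the Part A data and the renamed Part B data
df1       <- data.frame(net_no = c(1, 1), week = c(1, 1), wt = c(12.1, 10.4))
df2_fixed <- data.frame(net_no = c(2, 2), week = c(2, 2), wt = c(9.8, 11.2))

# Naming the inputs and setting .id records which file each row came from
combined <- bind_rows(PartA = df1, PartB = df2_fixed, .id = "source")

combined
```

`bind_rows()` matches columns by name and fills columns missing from one input with `NA`, which is why the variable names need to be made equivalent first.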
  36. Scenario: You are a biologist and you are attempting to

    reconstruct the above dataset from datasheets Part A) Create a spreadsheet and enter data Part B) Append your data to existing data Part C) Data have been discovered in an obscure file format! - Unlock, and append to existing data
  37. Download PygmyWFBC-PartC.sav from the course website

    Use the rio package. Recoding will be needed. Save it somewhere sensible.
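A rough sketch of reading an SPSS `.sav` file with `rio::import()` and recoding a numerically coded column. So that it runs on its own, the example first writes a small invented file. The column names and the meaning of the codes (1 = F, 2 = M) are assumptions and do not describe the contents of PygmyWFBC-PartC.sav.

```r
library(rio)

# Write a small hypothetical .sav file so the example is self-contained
tmp <- tempfile(fileext = ".sav")
export(data.frame(net_no = c(3, 3), sex = c(1, 2), wt = c(10.9, 12.4)), tmp)

# import() chooses the reader from the file extension
part_c <- import(tmp)

# Recode numeric codes into factor levels (assumed coding: 1 = F, 2 = M)
part_c$sex <- factor(part_c$sex, levels = c(1, 2), labels = c("F", "M"))

str(part_c)
```

For the real file, check which columns arrive as codes before deciding what to recode, then rename and `bind_rows()` as in Part B.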
  38. Scenario: You are a biologist and you are attempting to

    reconstruct the above dataset from datasheets Part A) Create a spreadsheet and enter data Part B) Append your data to existing data Part C) Data have been discovered in an obscure file format! - Unlock, and append to existing data Part D) We found a new variable! Add new data to the datasheet
  39. Download PygmyWFBC-OtolithAges.csv from the course website. Save it somewhere sensible.

    # Use a JOIN to attach
    final <- left_join(df, otoliths, by = "fish_no")

    If done right, you should have a data frame with 369 rows and 11 columns!
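The behaviour of `left_join()` can be checked on a toy example before running it on the real data. The two data frames below are invented. Only the key name `fish_no` comes from the slide.

```r
library(dplyr)

# Hypothetical fish records and otolith ages; fish 3 has no age reading
df       <- data.frame(fish_no = c(1, 2, 3, 4), wt = c(12.1, 10.4, 9.8, 11.2))
otoliths <- data.frame(fish_no = c(1, 2, 4), age = c(3, 2, 4))

# left_join keeps every row of df and adds the matching columns from otoliths
final <- left_join(df, otoliths, by = "fish_no")

final
```

When each `fish_no` appears at most once in `otoliths`, the row count of `final` equals that of `df`, and fish without an age reading get `NA`. A higher row count usually means there are duplicated keys in the second table.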