SOC 4650 & SOC 5650 - Lecture 02

Slides for Lecture 02 of the Saint Louis University course Introduction to GIS. These slides go deeper into the concept of analysis development, introduce different types of data in GIS work, and introduce the ggplot2 package in R.

Christopher Prener

January 29, 2018

Transcript

  1. WELCOME! GETTING STARTED

    Check Slack’s #_news channel for a link to this week’s entry ticket.
    Install GitHub Desktop (icon in folder on desktop).
    Your seat tonight is your “assigned” seat for the first part of the semester. Choose wisely!
    Install the tidyverse package using RStudio: install.packages("tidyverse")
  2. WORKING WITH DATA (PART 1)

    INTRO TO GIS / SPRING, 2018 / CHRISTOPHER PRENER, PH.D. / LECTURE 02 / WEEK 03 / Revised Version
  3. AGENDA

    1. Front Matter
    2. Open Data
    3. Zen and the Art of Data Analysis
    4. Analysis Development
    5. Visualizing Data
    6. Back Matter

    INTRO TO GIS / WEEK 03 / LECTURE 02
  4. 1. FRONT MATTER / ANNOUNCEMENTS

    Next week is the first GIS & Public Policy discussion.
    The pace is beginning to pick up; make sure you reach out if you have questions.
    LP-01 grades returned via GitHub.
    Final Project memo due next Monday.
    Lab 01 and Lecture Prep 03 due next Monday.
    SOC 5650 students need to sign up for GIS & Public Policy days.
  5. KEY TERM

    “Open data are government data typically provided for free, in a machine readable format, and with minimal restrictions on reuse.” Source: Johnson et al. (2017)
  6. 3. ZEN AND THE ART OF DATA ANALYSIS / HABITS OF MIND

    Soft Skills / Hard Skills
  7. 3. ZEN AND THE ART OF DATA ANALYSIS / THREADS IN SLACK

    Use threads to respond if someone posts a question that you also had, to ask a clarification question, or to thank someone for posting! Hover your mouse over a message to reveal a mini toolbar.
  8. 3. ZEN AND THE ART OF DATA ANALYSIS / EMOJIS IN SLACK

    Emojis can be used to respond quickly to people’s posts. They are absolutely encouraged! Hover your mouse over a message to reveal a mini toolbar.
  9. 3. ZEN AND THE ART OF DATA ANALYSIS / #WEEKLY-WINS

    If something works right, if you learn something new or something that you’re excited about, or if someone else was particularly helpful… share it!
  11. WELCOME! GETTING STARTED

    Install the here package: install.packages("here")
    Install the knitr package: install.packages("knitr")
    Install the rmarkdown package: install.packages("rmarkdown")
    If you did not complete Lecture Prep 02, go to https://github.com/slu-soc5650/Lecture-02 and download the repo, extract its contents, and use the replication file.
  13. 4. ANALYSIS DEVELOPMENT / KEY QUESTIONS

    ▸ How do you organize files?
    ▸ Do you keep different versions of files as your assignment or project progresses?
    ▸ If you needed your files in 5 years, could you find them?
    ▸ If you needed your files in 5 years, could you open them?
    ▸ Do you ever back up files?
    ▸ If your house was robbed or burned down, would your backup also be destroyed?

    Git & GitHub can help you address all of these key questions/issues!
  14. 4. ANALYSIS DEVELOPMENT / GIT WORKFLOW / TYPICAL WORKFLOW

    Commits are snapshots of files that are saved at particular points in time.
  15. 4. ANALYSIS DEVELOPMENT / TYPICAL WORKFLOW

    “OK, so why the $#&% did I save a second copy?!?!?! And why the $#&% was the first copy edited after the second copy!?!?!?”
  16. 4. ANALYSIS DEVELOPMENT / GIT WORKFLOW

    Local repos can “sync” with a “remote” repo, making backup and sharing easy.
  18. 4. ANALYSIS DEVELOPMENT / GIT WORKFLOW

    Copying data for the first time from GitHub is called making a “clone”.
    Clone / Make changes / Commit / Sync
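
The clone, change, commit, sync cycle can also be sketched from the R console. This is only an illustration and not the course's method: the course uses GitHub Desktop, while this snippet assumes the git command-line tool is installed and that R is running on a Unix-like system. The helper `run_git()`, the file name `notes.txt`, and the temporary folder standing in for a GitHub remote are all hypothetical.

```r
# Sketch only: assumes the `git` command-line tool is installed.
# A bare repository in a temporary folder stands in for the GitHub remote.
remote <- tempfile("remote-")
local  <- tempfile("local-")

# Hypothetical helper: run one git command and return TRUE if it succeeded
run_git <- function(args) {
  status <- suppressWarnings(
    system2("git", args, stdout = FALSE, stderr = FALSE)
  )
  status == 0
}

git_available <- nzchar(Sys.which("git"))
steps <- logical(0)

if (git_available) {
  # Set up the stand-in remote
  steps["init"] <- run_git(c("init", "--bare", shQuote(remote)))

  # 1. Clone: copy the remote repo for the first time
  steps["clone"] <- run_git(c("clone", shQuote(remote), shQuote(local)))

  # 2. Make changes: add a file to the local copy
  writeLines("Lab 01 notes", file.path(local, "notes.txt"))
  steps["add"] <- run_git(c("-C", shQuote(local), "add", "notes.txt"))

  # 3. Commit: save a snapshot of the files at this point in time
  steps["commit"] <- run_git(c(
    "-C", shQuote(local),
    "-c", "user.name=Student",
    "-c", "user.email=student@example.com",
    "commit", "-m", shQuote("Add lab notes")
  ))

  # 4. Sync: send the new commit to the remote
  steps["sync"] <- run_git(c("-C", shQuote(local), "push", "origin", "HEAD"))
}

steps
```

Each element of `steps` is TRUE when the corresponding git command succeeded. In GitHub Desktop the same steps are buttons, so nothing here needs to be typed for the course.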
  19. 4. ANALYSIS DEVELOPMENT / THE FIRST RULE OF GIS…

    USE ONE AND ONLY ONE COURSE DIRECTORY STRUCTURE, SAVED ON AN EXTERNAL DEVICE.
  20. 4. ANALYSIS DEVELOPMENT / THE SECOND RULE OF GIS…

    USE ONE AND ONLY ONE COURSE DIRECTORY STRUCTURE, SAVED ON AN EXTERNAL DEVICE.
  21. 4. ANALYSIS DEVELOPMENT / SOME PRINCIPLES OF ANALYSIS DEVELOPMENT

    1. Use one and only one course directory structure, saved on an external device.
    2. Commit changes early and often.
    3. Use R Projects for all assignments requiring R.
    4. Always use the project directory structure.
    5. Write notebooks for humans, not computers.
  24. LET US CHANGE OUR TRADITIONAL ATTITUDE TO THE CONSTRUCTION OF PROGRAMS: INSTEAD OF IMAGINING THAT OUR MAIN TASK IS TO INSTRUCT A COMPUTER WHAT TO DO, LET US CONCENTRATE RATHER ON EXPLAINING TO HUMANS WHAT WE WANT THE COMPUTER TO DO.

    Donald Knuth, Ph.D., Literate Programming (1984)
  25. 5. VISUALIZING DATA / GETTING STARTED

    Install the reprex package: install.packages("reprex")
    Double check that stlData is installed. If it is not:
    install.packages("devtools")
    devtools::install_github("chris-prener/stlData")
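
One way to "double check" that a package is installed is sketched below. The helper name `is_installed()` is hypothetical; `requireNamespace()` is part of base R. The install step is wrapped in `interactive()` as a precaution, so that running the script non-interactively never triggers a download.

```r
# Hypothetical helper: TRUE if `pkg` can be loaded from the current library
is_installed <- function(pkg) {
  requireNamespace(pkg, quietly = TRUE)
}

# Install stlData from GitHub only when it is missing
if (interactive() && !is_installed("stlData")) {
  install.packages("devtools")
  devtools::install_github("chris-prener/stlData")
}

# Packages that ship with R, such as stats, are always found
is_installed("stats")
```

The same check works for reprex or any other package name passed as a string.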
  26. 5. VISUALIZING DATA / ASSIGNMENT OPERATOR

    <-
    Using the stlLead data from stlData:
    > leadData <- stlLead
    Assignments are, by convention, typically made from right to left!
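
A minimal sketch of right-to-left assignment follows. Because stlData may not be installed, it uses a small made-up data frame, `exampleLead`, as a stand-in for stlLead; the column `tract` and all values are invented for illustration.

```r
# Hypothetical stand-in for stlLead; the values are made up for illustration
exampleLead <- data.frame(
  tract = c("A", "B", "C"),
  pctElevated = c(2.5, 10.1, 7.3)
)

# Conventional form: the value on the right is stored under the name on the left
leadData <- exampleLead

# The same assignment written left to right; valid R, but rarely used
exampleLead -> leadData2

# Both names now refer to identical copies of the data
identical(leadData, leadData2)
```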
  27. 5. VISUALIZING DATA / VIEWING DATA

    View(.data)
    Available in utils. Included in base R installations.
    Parameters:
    ▸ .data is a data frame object in the global environment
  29. 5. VISUALIZING DATA / VIEWING DATA

    View(.data)
    Using the stlLead data from stlData:
    > View(stlLead)
    This will open up a second window with a spreadsheet-like view of your data frame.
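
The sketch below shows `View()` alongside two console-based alternatives from the same utils package, `head()` and `str()`. The data frame `exampleLead` is a made-up stand-in for stlLead, and the `interactive()` guard is an added precaution because `View()` needs a session that can open a viewer window.

```r
# Hypothetical stand-in for stlLead; the values are made up for illustration
exampleLead <- data.frame(
  tract = c("A", "B", "C"),
  pctElevated = c(2.5, 10.1, 7.3)
)

# Open the spreadsheet-like viewer only when a person is at the console
if (interactive()) {
  View(exampleLead)
}

# Console alternatives: the first rows, and a compact summary of each column
firstRows <- head(exampleLead, n = 2)
firstRows
str(exampleLead)
```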
  30. 5. VISUALIZING DATA / CREATING A GGPLOT OBJECT

    ggplot()
    Available in ggplot2. Download it via the tidyverse package on CRAN.
  31. 5. VISUALIZING DATA / CREATING A GGPLOT OBJECT

    ggplot()
    This will produce an empty plot object:
    > ggplot()
    If you look at the book or online, you will see that there are arguments that can be supplied here. We’ll keep things simple for now, though!
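
As a small sketch, the empty plot object can be stored under a name and inspected. The name `emptyPlot` is hypothetical, and the snippet assumes ggplot2 is already installed.

```r
library(ggplot2)

# Store the empty plot object instead of printing it straight away
emptyPlot <- ggplot()

# The stored object is a ggplot object, ready to have layers added with +
inherits(emptyPlot, "ggplot")
```

Typing `emptyPlot` at the console would then draw the blank plot.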
  32. 5. VISUALIZING DATA / ADDING A GEOM

    geom_histogram(.data, mapping = aes(aesthetic))
    Parameters:
    ▸ .data is a data frame object in the global environment
    ▸ aesthetic controls what variables are displayed and how
  33. 5. VISUALIZING DATA / ADDING A GEOM

    geom_histogram(.data, mapping = aes(aesthetic))
    Using the pctElevated variable from stlData’s stlLead data:
    > ggplot() + geom_histogram(stlLead, mapping = aes(pctElevated))
    There are many different geoms available for ggplot2.
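
The histogram call can be tried without stlData using the sketch below. It assumes ggplot2 is installed and uses a made-up data frame, `exampleLead`, whose pctElevated values are invented for illustration. The data frame is passed by name (`data =`) here purely for readability, and `bins = 10` is an arbitrary choice.

```r
library(ggplot2)

# Hypothetical stand-in for stlLead; the values are made up for illustration
exampleLead <- data.frame(
  pctElevated = c(0.5, 1.2, 2.5, 3.1, 4.8, 5.0, 7.3, 8.8, 10.1, 12.4,
                  13.0, 15.6, 2.2, 3.9, 6.1, 9.4, 1.8, 4.4, 11.7, 0.9)
)

# Add a histogram layer to an empty plot object
leadHistogram <- ggplot() +
  geom_histogram(data = exampleLead, mapping = aes(x = pctElevated), bins = 10)

# layer_data() returns the bins ggplot2 computed for the first layer;
# their counts add up to the number of rows in the data
binCounts <- layer_data(leadHistogram)$count
sum(binCounts)
```

Swapping `geom_histogram()` for another geom changes how the same variable is drawn.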
  34. 5. VISUALIZING DATA / SAVING A GGPLOT OBJECT

    ggsave("filePath")
    For example:
    > ggsave("results/leadHistogram.png")
    This will save the last plot you created. By default, the plot is 7” by 7”. This can be changed, however.
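
A sketch of saving a plot with non-default dimensions follows. It assumes ggplot2 and a PNG graphics device are available, writes to a temporary `results` folder so that nothing in a real project is touched, and reuses a made-up data frame in place of stlLead. The 5 by 4 inch size is an arbitrary example.

```r
library(ggplot2)

# Hypothetical stand-in for stlLead; the values are made up for illustration
exampleLead <- data.frame(
  pctElevated = c(0.5, 1.2, 2.5, 3.1, 4.8, 5.0, 7.3, 8.8, 10.1, 12.4)
)

leadHistogram <- ggplot() +
  geom_histogram(data = exampleLead, mapping = aes(x = pctElevated), bins = 5)

# The target folder must exist before saving; a temporary one is used here
outDir <- file.path(tempdir(), "results")
dir.create(outDir, showWarnings = FALSE)
outFile <- file.path(outDir, "leadHistogram.png")

# Name the plot explicitly and set the size in inches
ggsave(outFile, plot = leadHistogram, width = 5, height = 4, units = "in")

file.exists(outFile)
```

Passing `plot =` explicitly avoids relying on whichever plot happened to be created last.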
  35. 6. BACK MATTER / AGENDA REVIEW

    2. Open Data
    3. Zen and the Art of Data Analysis
    4. Analysis Development
    5. Visualizing Data
  36. 6. BACK MATTER / REMINDERS

    Next week is the first GIS & Public Policy discussion.
    The pace is beginning to pick up; make sure you reach out if you have questions.
    LP-01 grades returned via GitHub.
    Final Project memo due next Monday.
    Lab 01 and Lecture Prep 03 due next Monday.
    SOC 5650 students need to sign up for GIS & Public Policy days.