
SOC 4650 & SOC 5650 - Lecture 04

Slides for Lecture 04 of the Saint Louis University Course Introduction to GIS. These slides introduce basic mapping tasks in both R and ArcGIS.

Christopher Prener

February 12, 2018

Transcript

  1. WELCOME! GETTING STARTED

    If you have not already done so, look for the Mac and Windows specific links for software downloads on #_news in Slack. Install the rlang and sf packages from CRAN. Install the development version of ggplot2: devtools::install_github("tidyverse/ggplot2")
  2. THE NATURE OF SPATIAL DATA (PART 2) INTRO TO GISC

    CHRISTOPHER PRENER, PH.D. SPRING, 2018 WEEK 05 LECTURE 04
  3. AGENDA 1. Front Matter 2. GISc & Public Policy 3.

    Representing Spatial Data 4. Map Design Basics 5. Basic Maps in R 6. Basic Maps in ArcGIS 7. Back Matter INTRO TO GISC / WEEK 05 / LECTURE 04
  4. 1. FRONT MATTER ANNOUNCEMENTS

    Lab-03 and PS-02 from this lecture and LP-05 for next lecture are due next Monday (February 19th). Final project groups have been posted to our GitHub organization in the finalProject repository. Final project workgroups should briefly meet and divide up tasks; a meeting report should be posted in your Slack channel by next Monday as well.
  5. ▸ We’ve now identified bugs with two packages, including: •

    here - sporadic errors with working directories; alternate file path specification is up on website. • reprex - errors with pandoc (this is most likely on a lab computer); GitHub issue opened and fix added to the development version - re-install reprex from GitHub if you get this error! 1. FRONT MATTER BUG REPORTS
  6. ▸ We’ve also stumbled into known bugs in two packages:

    • reprex - errors writing to the clipboard; install the development version of reprex from GitHub! • janitor - errors with the clean_names() function for Windows users; install the development version of janitor from GitHub! 1. FRONT MATTER BUG REPORTS
  7. ▸ Finally, I stumbled into a known bug in ggplot2

    this weekend while prepping this lecture: • The error says (in part): Error in grid.Call…polygon edge not found • If you get this error, re-execute the code chunk (you may have to do this multiple times). • I updated an existing bug report but no fix yet. 1. FRONT MATTER BUG REPORTS
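The "re-execute the chunk" advice can be automated. A minimal sketch in base R, assuming the failure is intermittent; `retry()` and `flaky_plot()` are hypothetical names used for illustration and are not part of ggplot2:

```r
# retry(): call `f` until it succeeds or `times` attempts have been used.
# Hypothetical helper; not part of ggplot2 or any course package.
retry <- function(f, times = 5) {
  for (attempt in seq_len(times)) {
    result <- tryCatch(f(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    message("Attempt ", attempt, " failed: ", conditionMessage(result))
  }
  stop("All ", times, " attempts failed; last error: ", conditionMessage(result))
}

# Stand-in for a plotting call that fails intermittently: it errors on the
# first two calls and succeeds on the third.
calls <- 0
flaky_plot <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("polygon edge not found")
  "plot drawn"
}

outcome <- retry(flaky_plot)
```

With a real map, the function passed to `retry()` would most likely need to print the plot (for example `function() print(my_map)`, where `my_map` is a hypothetical ggplot object), since the error appears when the plot is drawn rather than when it is defined.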
  8. 1. FRONT MATTER NEW RELEASE

    ▸ The naniar package was updated to version 0.2.0 on Thursday ▸ The update changed how both miss_var_summary() and miss_case_summary() behave ▸ They now both require the order = TRUE option to reproduce the behavior that was shown in last week’s slides ▸ Change is not documented

    > library(stlData)
    > library(naniar)
    > miss_var_summary(stlMurders, order = TRUE)
    # A tibble: 11 x 4
      variable n_miss pct_miss n_miss_cumsum
      <chr>     <int>    <dbl>         <int>
    1 address      15     1.08            15
    2 id            0     0                0
    3 fullDate      0     0                0
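To see what an ordered missing-data summary contains, the same kind of table can be built by hand. This is a base R sketch on a made-up data frame (`toy` is hypothetical and stands in for stlMurders); it is not naniar's implementation, only an illustration of counting missing values per variable and listing the most incomplete variables first:

```r
# Toy data frame standing in for stlMurders (values are made up).
toy <- data.frame(
  id = 1:4,
  address = c("100 Main St", NA, NA, "400 Oak Ave"),
  fullDate = c("2016-01-01", "2016-01-02", NA, "2016-01-04"),
  stringsAsFactors = FALSE
)

# Count and percentage of missing values per variable.
n_miss <- colSums(is.na(toy))
summary_tbl <- data.frame(
  variable = names(n_miss),
  n_miss = as.integer(n_miss),
  pct_miss = 100 * as.numeric(n_miss) / nrow(toy),
  stringsAsFactors = FALSE
)

# Sort so the variables with the most missing values come first.
summary_tbl <- summary_tbl[order(-summary_tbl$n_miss), ]
```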
  9. 1. FRONT MATTER TRADE OFFS Using (and teaching with) open

    source software does come with some significant trade offs. POWER STABILITY FLEXIBILITY CLARITY REPRODUCIBLE “EASY”
  10. ▸ We want to use standardized approaches to numeritizing place

    names whenever possible ▸ ISO 3166-1 • 2 letter country codes (US) • 3 letter country codes (USA) • 3 digit numeric codes (840) ▸ The R package countrycode has tools for working with country names, abbreviations, and ISO codes 3. REPRESENTING SPATIAL DATA REPRESENTING NAMES
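A short sketch of the conversion described above, assuming the countrycode package is installed from CRAN. It converts one English country name into the three ISO 3166-1 forms listed on the slide:

```r
library(countrycode)

# Convert an English country name to the three ISO 3166-1 representations.
iso2 <- countrycode("United States", origin = "country.name", destination = "iso2c")
iso3 <- countrycode("United States", origin = "country.name", destination = "iso3c")
iso_num <- countrycode("United States", origin = "country.name", destination = "iso3n")
```

The `origin` and `destination` arguments name the coding schemes to convert from and to, so the same call pattern works in the other direction (for example from `"iso3c"` back to `"country.name"`).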
  11. ▸ We want to use standardized approaches to numeritizing place

    names whenever possible ▸ FIPS aka ANSI Codes 3. REPRESENTING SPATIAL DATA REPRESENTING NAMES

    Division    Label         Label
    STATE       Postal Code   Missouri
    STATEFP     FIPS Code     29
    COUNTYFP    County FIPS   510
    COUNTY      County Name   St. Louis City
  12. ▸ We want to use standardized approaches to numeritizing place

    names whenever possible ▸ FIPS aka ANSI Codes ▸ The R package tigris (which we’ll use later this semester) has tools for working with state and county-level FIPS codes. 3. REPRESENTING SPATIAL DATA REPRESENTING NAMES
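FIPS codes are identifiers, not quantities, so leading zeros matter. A minimal base R sketch of combining a state code and a county code into a five-digit county identifier; the variable names are illustrative only:

```r
# Zero-pad numeric FIPS components and concatenate them into a county GEOID.
state_fips <- 29L    # Missouri (STATEFP)
county_fips <- 510L  # St. Louis City (COUNTYFP)

county_geoid <- paste0(sprintf("%02d", state_fips), sprintf("%03d", county_fips))

# The padding matters for low-numbered codes: Alabama is state 1 and
# Autauga County is county 1, which must become "01001", not "11".
autauga_geoid <- paste0(sprintf("%02d", 1L), sprintf("%03d", 1L))
```

Storing the result as a character string (rather than a number) keeps the leading zeros intact when the data are saved or joined.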
  13. ST. LOUIS CITY CENSUS TRACTS 29510119300 State County Tract

    Census Tracts are typically updated by the U.S. Census Bureau for each decennial census. They range in size from 1,200 to 8,000 persons with an optimal population of 4,000 individuals.
  14. ST. LOUIS CITY CENSUS TRACTS 29510119300 State County Tract

    This tract had a population of 5,454 individuals in 2010.
  15. SLU CENSUS TRACT 29510119300 State County Tract Clocktower Streets shown

    within Tract for reference Lindell West Pine North Grand LaClede Vandeventer
  16. SLU CENSUS BLOCK GROUPS 295101193002 State County Tract Clocktower Lindell

    West Pine North Grand LaClede Vandeventer BG Block groups range in size from 600 to 3,000 persons.
  17. SLU CENSUS BLOCK GROUPS 295101193002 State County Tract Clocktower Lindell

    West Pine North Grand LaClede Vandeventer BG This block group had a population of 1,600 in 2010.
  18. SLU CENSUS BLOCKS West Pine Mall Lindell Vandeventer North Grand

    Clocktower 295101193002 State County Tract BG
  19. SLU CENSUS BLOCKS 295101193002004 State County Tract BG Block Lindell

    West Pine Mall Vandeventer Clocktower This block had a population of 479 in 2010.
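The nesting of state, county, tract, block group, and block in the identifiers above can be unpacked with fixed character positions. This is a base R sketch; `parse_block_geoid()` is a hypothetical helper, and it assumes the standard 15-character census block identifier in which the block code is four digits and its first digit is the block group:

```r
# Split a 15-character census block GEOID into its nested components.
# Hypothetical helper for illustration.
parse_block_geoid <- function(geoid) {
  stopifnot(is.character(geoid), all(nchar(geoid) == 15))
  data.frame(
    state = substr(geoid, 1, 2),         # 2 digits
    county = substr(geoid, 3, 5),        # 3 digits
    tract = substr(geoid, 6, 11),        # 6 digits
    block_group = substr(geoid, 12, 12), # first digit of the block code
    block = substr(geoid, 12, 15),       # 4 digits
    stringsAsFactors = FALSE
  )
}

parts <- parse_block_geoid("295101193002004")
```

Truncating the identifier recovers the coarser geographies: the first 11 characters give the tract (29510119300) and the first 12 give the block group (295101193002).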
  20. LOCAL PLACE NAMES Lindell West Pine Mall Clocktower Morrissey Hall

    McGannon Hall Queen’s Daughters Hall Wuller Hall
  21. STREET ADDRESSES Lindell West Pine Mall Clocktower 3700 Lindell Blvd

    3750 Lindell Blvd 3730 Lindell Blvd 3711 West Pine Mall
  22. 3. REPRESENTING SPATIAL DATA GEODETIC DATUM

    ▸ Set of reference points for modeling the approximate spherical shape of Earth ▸ Earth is not a perfect ellipsoid, so we need different datums for approximating its shape in different regions ▸ WGS 1984 is a general reference datum for all of Earth ▸ NAD 1983 is a more accurate datum for North America
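In sf, the datum is part of an object's coordinate reference system. A small sketch, assuming the sf package is installed and using the widely used EPSG codes 4326 (geographic coordinates on WGS 1984) and 4269 (geographic coordinates on NAD 1983); the point coordinates are an approximate location in St. Louis chosen for illustration:

```r
library(sf)

# A single point near St. Louis, stored as longitude/latitude on WGS 1984.
pt_wgs84 <- st_sfc(st_point(c(-90.2, 38.63)), crs = 4326)

# Re-express the same location on the NAD 1983 datum.
pt_nad83 <- st_transform(pt_wgs84, crs = 4269)

# Inspect the EPSG code attached to each object.
epsg_before <- st_crs(pt_wgs84)$epsg
epsg_after <- st_crs(pt_nad83)$epsg
```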
  23. WHAT IS A SHAPEFILE? We often describe shapefiles in the

    singular, as if they were one file on our computer. That is how ArcGIS sees them. Our computer sees things differently, however: data.shp (geometry) data.shx (shape index) data.dbf (attributes) data.sbn (spatial index) data.sbx (spatial index) data.shp.xml (metadata) data.cpg (character encoding) data.prj (projection)
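Because a shapefile is really a set of files, it can help to check that the pieces travelled together. A base R sketch; `missing_components()` is a hypothetical helper, and treating .shp, .shx, .dbf, and .prj as the required set is an assumption made for this example:

```r
# Return the paths of any expected sidecar files that do not exist.
# Hypothetical helper; the `required` set is an assumption.
missing_components <- function(shp_path, required = c("shp", "shx", "dbf", "prj")) {
  stem <- sub("\\.shp$", "", shp_path)
  candidates <- paste0(stem, ".", required)
  candidates[!file.exists(candidates)]
}

# Demonstration with empty placeholder files in a temporary directory;
# the .prj file is deliberately left out.
demo_dir <- tempfile("shp_demo")
dir.create(demo_dir)
file.create(file.path(demo_dir, c("data.shp", "data.shx", "data.dbf")))

missing <- missing_components(file.path(demo_dir, "data.shp"))
```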
  24. WHAT IS A GEODATABASE? Geodatabases are designed to overcome weaknesses

    of shapefiles, and may contain a large number of feature classes. cityData.gdb Boundary_City Demographics_Tracts Hydro_MajorLakes Hydro_MajorRivers PublicSaftey_PoliceStations PublicSaftey_FireStations Trans_Interstates Trans_StreetCenterlines
  25. PLANNING WITH PURPOSE ▸ “What information is being mapped?” ▸

    “Who will be reading the map?” ▸ “Is the map content coordinated with written text or other graphics?” ▸ “What size and medium will be used to display the map?” ▸ “What are the time and budget constraints on map production?” ▸ “Who is the audience?” 4. MAP DESIGN BASICS
  26. BASEMAP What are the data types (land cover, imagery, vector)

    that best support the information you are sharing with your audience?
  27. BASEMAP What reference features (bodies of water, roadways, point locations,

    boundaries) will your audience need to help orient themselves?
  28. VISUAL HIERARCHY The importance of features on your map is

    implied with color, size, the weight of lines, and other cues.
  29. SCALE & RESOLUTION The display of the final map should

    be considered when selecting data - mismatches may result in distracting or difficult to interpret map products.
  30. ▸ sf (or “simple features”) for its tibble-like spatial data

    structure that combines the tabular and geometric data into a single object ▸ ggplot2 for plotting simple features • The development version from GitHub must be used - the geom_sf() geom is not available in the CRAN release! 5. BASIC MAPS IN R PACKAGES
  31. ▸ fileName is the name of the file you wish

    to import; part of a broader filePath statement that should point to your data folder ▸ stringsAsFactors should always be set to FALSE so that string data are not manipulated when they are imported Available in sf, download via CRAN 5. BASIC MAPS IN R IMPORT SHAPEFILES Parameters: st_read("data/fileName.shp", stringsAsFactors = FALSE) f(x)
  32. ▸ fileName is the name of the file you wish

    to import; part of a broader filePath statement that should point to your data folder ▸ stringsAsFactors should always be set to FALSE so that string data are not manipulated when they are imported 5. BASIC MAPS IN R IMPORT SHAPEFILES Parameters: st_read("data/fileName.shp", stringsAsFactors = FALSE) f(x)
  33. f(x) 5. BASIC MAPS IN R IMPORT SHAPEFILES

    st_read("data/fileName.shp", stringsAsFactors = FALSE) Importing the example file STL_BOUNDARY_City.shp: > city <- st_read("data/STL_BOUNDARY_City.shp", stringsAsFactors = FALSE) sf objects can be edited just like tibbles using dplyr and other tidyverse tools
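A brief sketch of editing an sf object with dplyr, assuming both packages are installed. To stay self-contained it reads the North Carolina counties shapefile that ships with sf rather than the course data:

```r
library(sf)
library(dplyr)

# Example shapefile that ships with the sf package (North Carolina counties).
nc <- st_read(system.file("shape/nc.shp", package = "sf"),
              stringsAsFactors = FALSE, quiet = TRUE)

# Ordinary dplyr verbs work; the geometry column stays attached.
ashe <- nc %>%
  filter(NAME == "Ashe") %>%
  select(NAME, AREA)
```

Even though `select()` names only two columns, the result is still an sf object with its geometry, so it can be passed straight to a mapping function.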
  34. f(x) 5. BASIC MAPS IN R IMPORT SHAPEFILES

    st_read("data/fileName.shp", stringsAsFactors = FALSE) Importing a hypothetical file MO_HYDRO_Rivers.shp: > river <- st_read("data/MO_HYDRO_Rivers.shp", stringsAsFactors = FALSE) Make sure you copy all needed files from the DataLibrary into the data directory in your project folder. Be careful not to miss a shapefile component.
  35. ▸ Use subfolders if you are working with more than

    1 or 2 shapefiles ▸ Name shapefiles and subfolders identically ▸ Use a clear naming system • Prefix for the geographic extent (e.g. STL, MO) • Use categories to group like files (e.g. HYDRO, TRANS) 5. BASIC MAPS IN R ORGANIZING DATA
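The naming rules above make file paths predictable enough to generate. A base R sketch; `shapefile_path()` is a hypothetical helper that assumes each shapefile lives in a subfolder with the same name inside a `data` directory:

```r
# Build a shapefile path from extent prefix, category, and layer name.
# Hypothetical helper following the EXTENT_CATEGORY_Name convention.
shapefile_path <- function(extent, category, name, root = "data") {
  stem <- paste(extent, category, name, sep = "_")
  file.path(root, stem, paste0(stem, ".shp"))
}

hydro_path <- shapefile_path("STL", "HYDRO", "MajorBodies")
```

The resulting string, `data/STL_HYDRO_MajorBodies/STL_HYDRO_MajorBodies.shp`, could then be passed to `st_read()`.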
  36. ▸ .data should be a simple features object ▸ for

    polygons: • fill is used to define the fill color of the object • color is used to define the outline color of the object ▸ hex should be a hexadecimal color value (e.g. dark gray = #5d5d5d) Available in ggplot2, download via GitHub (not CRAN!) 5. BASIC MAPS IN R MAP SIMPLE FEATURES Parameters: geom_sf(data = .data, fill = "hex", color = "hex") f(x)
  37. ▸ .data should be a simple features object ▸ for

    polygons: • fill is used to define the fill color of the object • color is used to define the outline color of the object ▸ hex should be a hexadecimal color value (e.g. dark gray = #5d5d5d) 5. BASIC MAPS IN R MAP SIMPLE FEATURES Parameters: geom_sf(data = .data, fill = "hex", color = "hex") f(x)
  38. f(x) 5. BASIC MAPS IN R MAP SIMPLE FEATURES

    geom_sf(data = .data, fill = "hex", color = "hex") Using the example STL_BOUNDARY_City data stored in city: > ggplot() + geom_sf(data = city, fill = "#5d5d5d", color = "#5d5d5d") The data argument needs to be explicitly stated or current development versions of ggplot2 will return an error.
  39. f(x) 5. BASIC MAPS IN R MAP SIMPLE FEATURES

    geom_sf(data = .data, fill = "hex", color = "hex") Using the example STL_BOUNDARY_City data stored in city: > ggplot() + geom_sf(data = city, fill = "#5d5d5d", color = "#5d5d5d") If you get Error in grid.Call…polygon edge not found errors, try executing the code block again.
  40. 5. BASIC MAPS IN R MAP SIMPLE FEATURES

    # load dependencies
    library(ggplot2)
    library(sf)

    # load data
    city <- st_read("data/STL_BOUNDARY_City/STL_BOUNDARY_City.shp")
    econ <- st_read("data/STL_ECON_HUDEconZones/STL_ECON_HUDEconZones.shp")
    hydro <- st_read("data/STL_HYDRO_MajorBodies/STL_HYDRO_MajorBodies.shp")

    # map 1 - city boundary
    ggplot() +
      geom_sf(data = city, fill = "#5d5d5d", color = "#5d5d5d")
  41. 5. BASIC MAPS IN R MAP SIMPLE FEATURES

    # map 2 - city boundary with hydro
    ggplot() +
      geom_sf(data = city, fill = "#5d5d5d", color = "#5d5d5d") +
      geom_sf(data = hydro, fill = "#72bcd4", color = "#72bcd4")
  42. The Mississippi River layer is on top because it is the second geom in our ggplot() call.

    Notice how the extent of the city boundary peeks out from under the Mississippi River.
  43. 5. BASIC MAPS IN R MAP SIMPLE FEATURES

    # map 3a - city boundary with hydro and HUD economic development districts
    ggplot() +
      geom_sf(data = city, fill = "#5d5d5d", color = "#5d5d5d") +
      geom_sf(data = hydro, fill = "#72bcd4", color = "#72bcd4") +
      geom_sf(data = econ, fill = "#d48a72", color = "#d48a72")
  44. Notice how the Economic Development Zones are on top of the Mississippi River.

    The same phenomenon occurs with both the Mississippi and the River Des Peres at their confluence.
  45. 5. BASIC MAPS IN R MAP SIMPLE FEATURES

    # map 3b - city boundary with hydro and HUD economic development districts, properly layered
    ggplot() +
      geom_sf(data = city, fill = "#5d5d5d", color = "#5d5d5d") +
      geom_sf(data = econ, fill = "#d48a72", color = "#d48a72") +
      geom_sf(data = hydro, fill = "#72bcd4", color = "#72bcd4")
  46. f(x) 5. BASIC MAPS IN R SAVING PLOTS AT SPECIFIC DPI

    ggsave("filepath", dpi = 300) Using a hypothetical plot: > ggsave("results/leadHistogram.png", dpi = 300) Export at 300 dots per inch (dpi) for coursework.
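The dpi setting only fixes the final image size once the physical size is known: pixels equal inches multiplied by dots per inch. A small base R sketch of that arithmetic; `pixel_dims()` is a hypothetical helper and the 7 by 5 inch size is an arbitrary example:

```r
# Pixel dimensions of an image exported at a given physical size and dpi.
# Hypothetical helper for illustration.
pixel_dims <- function(width_in, height_in, dpi = 300) {
  c(width = width_in * dpi, height = height_in * dpi)
}

dims <- pixel_dims(7, 5)

# The matching export call would set the physical size explicitly, e.g.:
# ggsave("results/leadHistogram.png", width = 7, height = 5, units = "in", dpi = 300)
```

A 7 by 5 inch plot at 300 dpi is therefore 2100 by 1500 pixels.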
  47. 6. BASIC MAPS IN ARCGIS WHICH TOOL IS BEST? R

    is better for tidying and transforming data, previewing maps, and producing stand alone choropleth maps. ArcGIS is better for creating reference maps and more complex map layouts.
  48. 6. BASIC MAPS IN ARCGIS USING DATA FRAMES Data frames

    in ArcGIS are independent maps within a map document. These can be helpful for constructing map layouts.
  49. USING DATA FRAMES 6. BASIC MAPS IN ARCGIS

    Insert > Data Frame This will add a new data frame to your table of contents.
  50. 6. BASIC MAPS IN ARCGIS BROKEN CONNECTIONS If you either

    open a previously working map document or copy and paste layers into a new data frame, you may see this:
  51. BROKEN CONNECTIONS 6. BASIC MAPS IN ARCGIS

    Double click on layer > Source > Set Data Source… Once you get to the dialogue box with your computer’s file system, navigate to where your data is stored and select the appropriate file as the data source. This happens most often because you have moved the map document. It also happens occasionally when you create new data frames. Try turning relative paths off, saving the document, copying a single layer, saving the document again, turning relative paths back on, and saving the document again(!).
  52. AGENDA REVIEW 2. GISc & Public Policy 3. Representing Spatial

    Data 4. Map Design Basics 5. Basic Maps in R 6. Basic Maps in ArcGIS 7. BACK MATTER
  53. REMINDERS 7. BACK MATTER Final project workgroups should briefly meet

    and divide up tasks; a meeting report should be posted in your Slack channel by next Monday as well. Lab-03 and PS-02 from this lecture and LP-05 for next lecture are due next Monday (February 19th). Final project groups have been posted to our GitHub organization in the finalProject repository