not already done so, look for the Mac- and Windows-specific links for software downloads on #_news in Slack. WELCOME! GETTING STARTED Install the rlang and sf packages from CRAN
a meeting report should be posted in your Slack channel by next Monday as well. 1. FRONT MATTER ANNOUNCEMENTS Lab-03 and PS-02 from this lecture and LP-05 for next lecture are due next Monday (February 19th). Final project groups have been posted to our GitHub organization in the finalProject repository
here - sporadic errors with working directories; alternate file path specification is up on website. • reprex - errors with pandoc (this is most likely on a lab computer); GitHub issue opened and fix added to the development version - re-install reprex from GitHub if you get this error! 1. FRONT MATTER BUG REPORTS
• reprex - errors writing to the clipboard; install the development version of reprex from GitHub! • janitor - errors with the clean_names() function for Windows users; install the development version of janitor from GitHub! 1. FRONT MATTER BUG REPORTS
this weekend while prepping this lecture: • The error says (in part): Error in grid.Call…polygon edge not found • If you get this error, re-execute the code chunk (you may have to do this multiple times). • I updated an existing bug report but no fix yet. 1. FRONT MATTER BUG REPORTS
Thursday
▸ The update changed how both miss_var_summary() and miss_case_summary() behave
▸ They now both require the order = TRUE option to reproduce the behavior that was shown in last week’s slides
▸ Change is not documented
1. FRONT MATTER NEW RELEASE
> library(stlData)
> library(naniar)
> miss_var_summary(stlMurders, order = TRUE)
# A tibble: 11 x 4
  variable n_miss pct_miss n_miss_cumsum
  <chr>     <int>    <dbl>         <int>
1 address      15     1.08            15
2 id            0     0                0
3 fullDate      0     0                0
names whenever possible ▸ ISO 3166-1 • 2 letter country codes (US) • 3 letter country codes (USA) • 3 digit numeric codes (840) ▸ The R package countrycode has tools for working with country names, abbreviations, and ISO codes 3. REPRESENTING SPATIAL DATA REPRESENTING NAMES
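A minimal sketch of moving between these representations with the countrycode package. It assumes the package is installed from CRAN; the three country names are illustrative inputs, not course data.

```r
# Assumes the countrycode package is installed from CRAN
library(countrycode)

# Illustrative inputs (not course data)
countries <- c("United States", "Canada", "Mexico")

# Convert English country names to the three ISO 3166-1 representations
iso2 <- countrycode(countries, origin = "country.name", destination = "iso2c")
iso3 <- countrycode(countries, origin = "country.name", destination = "iso3c")
num  <- countrycode(countries, origin = "country.name", destination = "iso3n")

# Side-by-side lookup table
data.frame(country = countries, iso2 = iso2, iso3 = iso3, numeric = num)
```

Storing one of the ISO codes alongside the name makes later joins less sensitive to spelling differences in country names.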
names whenever possible
▸ FIPS aka ANSI Codes
3. REPRESENTING SPATIAL DATA REPRESENTING NAMES
Division   Label         Label
STATE      Postal Code   Missouri
STATEFP    FIPS Code     29
COUNTYFP   County FIPS   510
COUNTY     County Name   St. Louis City
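One practical point about these codes: state FIPS codes are two digits and county FIPS codes are three, so they are safest stored as zero-padded text rather than numbers. A rough sketch in base R, using the Missouri and St. Louis City values from this slide; the variable names are made up for illustration.

```r
# State and county FIPS codes for Missouri and St. Louis City
state_fips  <- 29L
county_fips <- 510L

# Pad to fixed widths (2 digits for state, 3 for county) and store as text
geoid <- paste0(sprintf("%02d", state_fips), sprintf("%03d", county_fips))
geoid

# Single-digit codes keep their leading zeros only when padded;
# the values 1 and 1 here are purely illustrative
padded <- paste0(sprintf("%02d", 1L), sprintf("%03d", 1L))
padded
```

If these codes were stored as numbers instead, "01001" would silently become 1001 and no longer match other tables.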
names whenever possible ▸ FIPS aka ANSI Codes ▸ The R package tigris (which we’ll use later this semester) has tools for working with state- and county-level FIPS codes. 3. REPRESENTING SPATIAL DATA REPRESENTING NAMES
Tracts are typically updated by the U.S. Census Bureau for each decennial census. They range in size from 1,200 to 8,000 persons with an optimal population of 4,000 individuals.
shape of Earth ▸ Earth is not a perfect ellipsoid, so we need different datums for approximating its shape in different regions ▸ WGS 1984 is a general reference datum for all of Earth ▸ NAD 1983 is a more accurate datum for North America 3. REPRESENTING SPATIAL DATA GEODETIC DATUM
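To make the datum distinction concrete, here is a small sketch with sf. It assumes the common EPSG identifiers for geographic coordinates on each datum (4326 for WGS 1984, 4269 for NAD 1983); the point is an approximate, illustrative location in St. Louis rather than course data.

```r
library(sf)

# An illustrative point (longitude, latitude) declared as WGS 1984 (EPSG:4326)
pt_wgs84 <- st_sfc(st_point(c(-90.1994, 38.6270)), crs = 4326)

# Re-express the same location using NAD 1983 (EPSG:4269)
pt_nad83 <- st_transform(pt_wgs84, crs = 4269)

# Inspect the EPSG code attached to each object
st_crs(pt_wgs84)$epsg
st_crs(pt_nad83)$epsg
```

The coordinates barely change for a single point, but layers must share one coordinate system before they are combined on a map.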
singular, as if they were one file on our computer. That is how ArcGIS sees them. Our computer sees things differently, however:
data.shp (geometry)
data.shx (shape index)
data.dbf (attributes)
data.sbn (spatial index)
data.sbx (spatial index)
data.shp.xml (metadata)
data.cpg (character encoding)
data.prj (projection)
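Because a shapefile is really a bundle of files, it can help to check that none went missing during a copy. The helper below is hypothetical (not part of sf), and its default list of required extensions is an assumption: .shp, .shx, and .dbf are mandatory parts of the format, and .prj is included because it stores the projection.

```r
# Hypothetical helper: report which core components are missing for a .shp path
missing_components <- function(shp_path,
                               required = c("shp", "shx", "dbf", "prj")) {
  base <- tools::file_path_sans_ext(shp_path)
  candidates <- paste0(base, ".", required)
  required[!file.exists(candidates)]
}

# Demonstration in a temporary folder with a deliberately incomplete shapefile
demo_dir <- tempfile("shapefile_demo")
dir.create(demo_dir)
invisible(file.create(file.path(demo_dir, c("data.shp", "data.shx", "data.dbf"))))

# Returns the extensions that were not found next to data.shp
missing <- missing_components(file.path(demo_dir, "data.shp"))
missing
```

An empty result means every extension in the required list was found.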
of shapefiles, and may contain a large number of feature classes. cityData.gdb Boundary_City Demographics_Tracts Hydro_MajorLakes Hydro_MajorRivers PublicSaftey_PoliceStations PublicSaftey_FireStations Trans_Interstates Trans_StreetCenterlines
“Who will be reading the map?” ▸ “Is the map content coordinated with written text or other graphics?” ▸ “What size and medium will be used to display the map?” ▸ “What are the time and budget constraints on map production?” ▸ “Who is the audience?” 4. MAP DESIGN BASICS
structure that combines the tabular and geometric data into a single object ▸ ggplot2 for plotting simple features • The development version from GitHub must be used - the geom_sf() geom is not available in the CRAN release! 5. BASIC MAPS IN R PACKAGES
to import; part of a broader filePath statement that should point to your data folder ▸ stringsAsFactors should always be set to FALSE so that string data are not manipulated when they are imported Available in sf Download via CRAN 5. BASIC MAPS IN R IMPORT SHAPEFILES Parameters: st_read("data/fileName.shp", stringsAsFactors = FALSE) f(x)
to import; part of a broader filePath statement that should point to your data folder ▸ stringsAsFactors should always be set to FALSE so that string data are not manipulated when they are imported 5. BASIC MAPS IN R IMPORT SHAPEFILES Parameters: st_read("data/fileName.shp", stringsAsFactors = FALSE) f(x)
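For a version of st_read() that runs without the course data, the sketch below uses the North Carolina example shapefile bundled with sf in place of a file in your data folder. The quiet = TRUE argument is an addition here that suppresses the import summary.

```r
library(sf)

# Path to the North Carolina example shapefile that ships with sf
nc_path <- system.file("shape/nc.shp", package = "sf")

# Import with strings kept as character vectors
nc <- st_read(nc_path, stringsAsFactors = FALSE, quiet = TRUE)

class(nc)        # includes "sf" and "data.frame"
class(nc$NAME)   # character, not factor
```

With stringsAsFactors = TRUE, the NAME column would instead be imported as a factor.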
= FALSE) Importing the example file STL_BOUNDARY_City.shp: > city <- st_read("data/STL_BOUNDARY_City.shp", stringsAsFactors = FALSE) sf objects can be edited just like tibbles using dplyr and other tidyverse tools
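A short sketch of what editing an sf object with dplyr can look like. The two points and their names are made up for illustration, and the example assumes both sf and dplyr are installed.

```r
library(sf)
library(dplyr)

# A small made-up sf object with two named points
places <- st_as_sf(
  data.frame(
    name = c("north", "south"),
    lon = c(-90.25, -90.20),
    lat = c(38.70, 38.60),
    stringsAsFactors = FALSE
  ),
  coords = c("lon", "lat"),
  crs = 4326
)

# filter() and mutate() work as they do on tibbles; the geometry column is kept
south <- places %>%
  filter(name == "south") %>%
  mutate(label = toupper(name))

south
```

The result is still an sf object, so it can be passed directly to mapping functions.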
= FALSE) Importing a hypothetical file MO_HYDRO_Rivers.shp: > river <- st_read("data/MO_HYDRO_Rivers.shp", stringsAsFactors = FALSE) Make sure you copy all needed files from the DataLibrary into the data directory in your project folder. Be careful not to miss a shapefile component.
1 or 2 shapefiles ▸ Name shapefiles and subfolders identically ▸ Use a clear naming system • Prefix for the geographic extent (e.g. STL, MO) • Use categories to group like files (e.g. HYDRO, TRANS) 5. BASIC MAPS IN R ORGANIZING DATA
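One way to keep file names consistent is to build them from their parts. The helper below is hypothetical and assumes the extent_CATEGORY_Name pattern seen in the example file names in these slides.

```r
# Hypothetical helper: assemble a shapefile path from the naming parts
# (geographic extent prefix, category, descriptive name)
shapefile_path <- function(extent, category, name, dir = "data") {
  file.path(dir, paste0(paste(extent, category, name, sep = "_"), ".shp"))
}

city_path  <- shapefile_path("STL", "BOUNDARY", "City")
river_path <- shapefile_path("MO", "HYDRO", "Rivers")

city_path
river_path
```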
polygons: • fill is used to define the fill color of the object • color is used to define the outline color of the object ▸ hex should be a hexadecimal color value (e.g. dark gray = #5d5d5d) Available in ggplot2 Download via GitHub (not CRAN!) 5. BASIC MAPS IN R MAP SIMPLE FEATURES Parameters: geom_sf(data = .data, fill = "hex", color = "hex") f(x)
polygons: • fill is used to define the fill color of the object • color is used to define the outline color of the object ▸ hex should be a hexadecimal color value (e.g. dark gray = #5d5d5d) 5. BASIC MAPS IN R MAP SIMPLE FEATURES Parameters: geom_sf(data = .data, fill = "hex", color = "hex") f(x)
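If hexadecimal color values are unfamiliar, base R can translate them. This short sketch decodes the dark gray from the slide into its red, green, and blue channels and then rebuilds the hex string.

```r
# Decode the dark gray used on the slide into red, green, and blue (0-255)
gray_rgb <- col2rgb("#5d5d5d")
gray_rgb

# Build a hex string from channel values; rgb() returns upper-case digits
gray_hex <- rgb(93, 93, 93, maxColorValue = 255)
gray_hex
```

Each pair of hex digits is one channel, so 5d (decimal 93) repeated three times gives equal red, green, and blue, which is a gray.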
= .data, fill = "hex", color = "hex") Using the example STL_BOUNDARY_City data stored in city: > ggplot() + geom_sf(data = city, fill = "#5d5d5d", color = "#5d5d5d") The data argument needs to be explicitly stated or current development versions of ggplot2 will return an error.
= .data, fill = "hex", color = "hex") Using the example STL_BOUNDARY_City data stored in city: > ggplot() + geom_sf(data = city, fill = "#5d5d5d", color = "#5d5d5d") If you get Error in grid.Call…polygon edge not found errors, try executing the code block again.
2 - city boundary with hydro ggplot() + geom_sf(data = city, fill = "#5d5d5d", color = "#5d5d5d") + geom_sf(data = hydro, fill = "#72bcd4", color = "#72bcd4")
3a - city boundary with hydro and HUD economic development districts ggplot() + geom_sf(data = city, fill = "#5d5d5d", color = "#5d5d5d") + geom_sf(data = hydro, fill = "#72bcd4", color = "#72bcd4") + geom_sf(data = econ, fill = "#d48a72", color = "#d48a72")
3b - city boundary with hydro and HUD economic development districts, properly layered ggplot() + geom_sf(data = city, fill = "#5d5d5d", color = "#5d5d5d") + geom_sf(data = econ, fill = "#d48a72", color = "#d48a72") + geom_sf(data = hydro, fill = "#72bcd4", color = "#72bcd4")
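The layering examples depend on draw order: ggplot2 draws layers in the order they are added, so later layers sit on top of earlier ones. Here is a self-contained sketch of that idea, using two made-up squares in place of the course shapefiles. The square() helper is hypothetical, and the example assumes a ggplot2 version that includes geom_sf().

```r
library(sf)
library(ggplot2)

# Hypothetical helper: an axis-aligned square as an sf object (no CRS set)
square <- function(xmin, ymin, size) {
  ring <- rbind(
    c(xmin, ymin),
    c(xmin + size, ymin),
    c(xmin + size, ymin + size),
    c(xmin, ymin + size),
    c(xmin, ymin)
  )
  st_sf(geometry = st_sfc(st_polygon(list(ring))))
}

base_layer <- square(0, 0, 4)   # stands in for the city boundary
top_layer  <- square(1, 1, 1)   # stands in for a smaller feature

# The small square is added last, so it is drawn over the large one
p <- ggplot() +
  geom_sf(data = base_layer, fill = "#5d5d5d", color = "#5d5d5d") +
  geom_sf(data = top_layer, fill = "#72bcd4", color = "#72bcd4")

p
```

Swapping the two geom_sf() calls would hide the small square behind the large one.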
DPI ggsave("filepath", dpi = 300) Using a hypothetical plot: > ggsave("results/leadHistogram.png", dpi = 300) Export at 300 dots per inch (dpi) for coursework.
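A slightly fuller sketch of ggsave() that names the plot and the physical size explicitly. The histogram is a stand-in built from the mtcars data that ships with R, the output goes to a temporary file rather than a results folder, and the 6 by 4 inch size is an arbitrary choice for illustration.

```r
library(ggplot2)

# Stand-in plot built from the mtcars data that ships with R
p <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(bins = 10)

# Write to a temporary file; in a project this would be a path such as
# "results/leadHistogram.png"
out_file <- tempfile(fileext = ".png")
ggsave(out_file, plot = p, width = 6, height = 4, units = "in", dpi = 300)

# Pixel dimensions are inches multiplied by dpi
px <- c(width_px = 6 * 300, height_px = 4 * 300)
px
```

Passing plot explicitly avoids relying on ggsave()'s default of saving whichever plot was displayed last.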
is better for tidying and transforming data, previewing maps, and producing standalone choropleth maps. ArcGIS is better for creating reference maps and more complex map layouts.
layer → Source → Set Data Source… Once you get to the dialogue box with your computer’s file system, navigate to where your data is stored and select the appropriate file as the data source. This happens most often because you have moved the map document. It also happens occasionally when you create new data frames. Try turning relative paths off, saving the document, copying a single layer, saving the document again, turning relative paths back on, and saving the document again(!).
and divide up tasks; a meeting report should be posted in your Slack channel by next Monday as well. Lab-03 and PS-02 from this lecture and LP-05 for next lecture are due next Monday (February 19th). Final project groups have been posted to our GitHub organization in the finalProject repository