Slide 1

Slide 1 text

Introduction to SQL
Data Carpentry @ UW Madison
Christina Koch

Slide 2

Slide 2 text

Questions
•  Look at the research question on your table.
•  Looking at surveys.csv, plots.csv and species.csv, what would you need to do to answer your question?
•  Report back in 4-5 minutes

Slide 3

Slide 3 text

Answers (sort of)
To answer our research questions, we need to:
•  select subsets of the data (rows and columns)
•  group subsets of data
•  do math and other calculations
•  combine data across spreadsheets
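
The last item, combining data across spreadsheets, is what SQL calls a join. The sketch below is an illustration only: it uses Python's built-in sqlite3 module with an in-memory database, and the column names and values are made up rather than taken from the real surveys.csv and species.csv files.

```python
import sqlite3

# In-memory database with two tiny tables.
# Column names and values are hypothetical, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE surveys (record_id INTEGER, species_id TEXT, weight REAL);
    CREATE TABLE species (species_id TEXT, genus TEXT);
    INSERT INTO surveys VALUES (1, 'DM', 40.0), (2, 'PF', 8.0);
    INSERT INTO species VALUES ('DM', 'Dipodomys'), ('PF', 'Perognathus');
""")

# Combine the two tables by matching rows on the shared species_id column.
combined = conn.execute("""
    SELECT surveys.record_id, species.genus, surveys.weight
    FROM surveys
    JOIN species ON surveys.species_id = species.species_id
    ORDER BY surveys.record_id
""").fetchall()

print(combined)
conn.close()
```

Each survey row picks up the genus from the matching species row, which is the kind of lookup that is tedious to do by hand across two spreadsheets.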

Slide 4

Slide 4 text

Goals
•  Extract and manipulate data to answer our research questions
•  Once data grows beyond ~50 rows, it is challenging to manipulate “by hand”, so we want to use tools that are:
   – scalable (grow with our data)
   – reproducible (we can repeat them)
   – accuracy-enabling (reduce human error)

Slide 5

Slide 5 text

Our Tool: Databases*
A relational database stores data in relations made up of records with fields. The relations are usually represented as tables.
*will use R tomorrow to do some of the same tasks
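
To make the vocabulary concrete, here is a minimal sketch using Python's built-in sqlite3 module (an assumption; the workshop may use a different database tool). The table name follows surveys.csv, but the three columns and the two rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A relation (table) named surveys with three fields (columns).
# The field names are hypothetical.
conn.execute("CREATE TABLE surveys (record_id INTEGER, species_id TEXT, weight REAL)")

# Each inserted row is one record.
conn.executemany(
    "INSERT INTO surveys VALUES (?, ?, ?)",
    [(1, "DM", 40.0), (2, "PF", 8.0)],
)

cursor = conn.execute("SELECT * FROM surveys ORDER BY record_id")
fields = [column[0] for column in cursor.description]  # field names
records = cursor.fetchall()                            # one tuple per record

print(fields)
print(records)
conn.close()
```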

Slide 6

Slide 6 text

And Data Queries
SQL (Structured Query Language) is a language for querying data from databases.
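
As a rough idea of what a query looks like, the sketch below asks for two columns and only the rows that meet a condition. It assumes Python's built-in sqlite3 module and a made-up surveys table; none of the column names or values come from the workshop data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and values, for illustration only.
conn.executescript("""
    CREATE TABLE surveys (record_id INTEGER, species_id TEXT, weight REAL);
    INSERT INTO surveys VALUES (1, 'DM', 40.0), (2, 'PF', 8.0), (3, 'DO', 52.0);
""")

# The query is plain text: which columns, from which table, which rows.
query = "SELECT species_id, weight FROM surveys WHERE weight > 10 ORDER BY weight"
heavy = conn.execute(query).fetchall()

print(heavy)
conn.close()
```

The query describes the result that is wanted; the database works out how to find the matching rows.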

Slide 7

Slide 7 text

A transition
From working with data directly...
...to studying it via queries (text commands)

Slide 8

Slide 8 text

To use a metaphor...
From seeing what you get to “ordering” it with a query.

Slide 9

Slide 9 text

Why SQL?
•  For medium to large datasets, SQL can be the most efficient way to store and query data
•  Good format for collaborative data gathering
•  In some disciplines, data is commonly stored in databases; useful to be able to access

Slide 10

Slide 10 text

Why SQL?
•  Demystify databases!
•  Good introduction to:
   – tabular data thinking
      •  select + filter
      •  split-apply-combine
   – using a “language” to manipulate data
   – what we learn today will be revisited in the R lesson tomorrow
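
Split-apply-combine can be sketched in SQL with GROUP BY: the rows are split into groups, a calculation is applied to each group, and the results are combined into one table. The example assumes Python's built-in sqlite3 module and an invented surveys table; the columns and values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and values, for illustration only.
conn.executescript("""
    CREATE TABLE surveys (record_id INTEGER, year INTEGER, species_id TEXT, weight REAL);
    INSERT INTO surveys VALUES
        (1, 1990, 'DM', 40.0),
        (2, 1990, 'DM', 44.0),
        (3, 1990, 'PF', 8.0),
        (4, 1991, 'PF', 7.0);
""")

summary = conn.execute("""
    SELECT species_id, COUNT(*), AVG(weight)  -- apply: one count and mean per group
    FROM surveys
    WHERE year = 1990                         -- filter: keep only the 1990 rows
    GROUP BY species_id                       -- split: one group per species
    ORDER BY species_id
""").fetchall()

print(summary)
conn.close()
```

The result has one row per species, which is the "combine" step.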

Slide 11

Slide 11 text

Like eating your vegetables...
SQL provides a healthy foundation for using other tools, and in certain circumstances is delicious on its own.