Maps, Lies and Storytelling | P06 & P07| SQL & CartoCSS

Maps, Lies & Storytelling Class 4

Welcome!

homework

Maps, Lies & Storytelling Section 6 - Learning SQL

SQL in CartoDB Can be viewed as the entry point
for complex geospatial analysis, tool building, and data manipulation

Common SQL Statements SELECT UPDATE INSERT DELETE * there are
lots of that we wont get into ** these can (and often are) mixed

Breaking down a SELECT statement

SELECT * FROM us_states The FROM statement tells us which
table we are selecting data from

SELECT * FROM us_states The “*” here means, select ALL
ﬁelds (or columns)

SELECT cartodb_id, name FROM us_states Here, I’ve opted to select
only two columns speciﬁcally

SELECT cartodb_id, name FROM us_states The result of this query
will have just the two columns I ask for, but for every row in my dataset cartodb_id name 1 new york 2 alabama etc

Here, I can limit it to just one column. SELECT
cartodb_id FROM us_states cartodb_id 1 2 etc Notice that the result contains the column name

You can create an ‘alias’ for the result. Basically, change
the name it uses for the result. SELECT cartodb_id AS my_id FROM us_states my_id 1 2 etc Notice that the results are the same, just a diﬀerent name

You can modify values on the ﬂy using SQL, for
example, simple arithmetic. SELECT cartodb_id + 4 AS id FROM us_states id 5 6 etc The result isn’t stored anywhere permanently, it is derived on the ﬂy

You can use a type of function, called an ‘Aggregate’
that measures across all rows at once. SELECT min(cartodb_id) AS id FROM us_states id 1 Now, only 1 row is returned

You can use a type of function, called an ‘Aggregate’
that measures across all rows at once. SELECT sum(cartodb_id) FROM us_states sum 1328 Now, only 1 row is returned

You can make up columns out of thin air. !
Notice that andrew is single quoted. That is the standard for strings SELECT cartodb_id, ‘andrew’ FROM us_states notice that ‘andrew’ remains constant for all rows cartodb_id unknown 1 andrew 2 andrew etc

Filtering using WHERE

Using WHERE statements will allow you to narrow in on
exactly the rows you want SELECT cartodb_id FROM us_states WHERE cartodb_id = 1 cartodb_id 1 Here, only the single matching row is returned

Using WHERE statements work on all types of data SELECT
name FROM us_states WHERE name = ‘New York’ name New York See!

When using string columns, they case sensitive SELECT name FROM
us_states WHERE name = ‘new york’ name oh no! no results

Using WHERE statements will allow you to narrow in on
exactly the rows you want SELECT cartodb_id FROM us_states WHERE cartodb_id > 24 cartodb_id 25 26 etc

You aren’t limited to using values as they exist, you
can manipulate them on the ﬂy SELECT cartodb_id FROM us_states WHERE cartodb_id*4=20 cartodb_id 5

With strings, you can do things like case insensitive partial
matching. Here, ‘%’ is a wildcard, and ILIKE says use the wild card and ignore case SELECT name FROM us_states WHERE name ILIKE ‘new%’ name New Hampshire New Jersey etc Nice!

You can check if a value is in a set
of options SELECT cartodb_id FROM us_states WHERE cartodb_id IN (1,4,99) name 1 4 Not 99 states

Always watch your quotes! ‘text’

Constrain results with ORDER BY & LIMIT

The LIMIT restricts the number or rows you ask for
SELECT cartodb_id FROM us_states WHERE cartodb_id > 24 LIMIT 1 cartodb_id 25 see!

The LIMIT restricts the number or rows you ask for
SELECT cartodb_id FROM us_states WHERE cartodb_id > 24 LIMIT 2 cartodb_id 25 26 see!

The ORDER BY allows you to change the order of
rows in your result SELECT cartodb_id FROM us_states WHERE cartodb_id > 24 ORDER BY cartodb_id cartodb_id 25 26 etc The default is determined by the order on disk. no noticeable change here

ASC means, ascending. SELECT cartodb_id FROM us_states WHERE cartodb_id >
24 ORDER BY cartodb_id ASC cartodb_id 25 26 etc Here, ascending is the same as what we saw before

DESC means, descending SELECT cartodb_id FROM us_states WHERE cartodb_id >
24 ORDER BY cartodb_id DESC cartodb_id 52 51 etc See!

It works just as well on the alphabet SELECT name
FROM us_states ORDER BY name DESC name Wyoming Wisconsin etc See!

USING GROUP BY

Let’s do something crazy here, substring() SELECT substring(name,1,3) FROM us_states
substring Ala Ala Ari etc So we are asking for the ﬁrst 3 characters of the state name

We can add a GROUP BY, which will join all
rows with the same results SELECT substring(name,1,2) FROM us_states GROUP BY substring(name,1,2) substring Ala Ari Ark etc The second ‘Ala’ disappeared from our results

We can now use an aggregate that will work within
each group result, not across the who table SELECT substring(name,1,2), count(*) FROM us_states GROUP BY substring(name,1,2) substring count Ala 2 Ari 1 Ark 1 etc Now we can see where that second ‘Ala’ went to!

Let’s find out what the most common first three letters
of state names is by including an ORDER BY SELECT substring(name,1,2), count(*) FROM us_states GROUP BY substring(name,1,2) ORDER BY count(*) DESC substring count New 4 Nor 2 Ala 2 etc So ‘New’ has 4, which we all experience from trying to quickly fill online address forms…

SQL Joins to query across tables

Someone else has done a really nice job of describing
and illustrating this, so I’m going to shamelessly reuse

http://bit.ly/1vjVAUA

I also love nested queries

Here, I use a nested queries and compare against the
results for a state name SELECT name, state FROM us_cities WHERE state IN (SELECT name FROM us_states WHERE name ILIKE ‘new y%’) name state New York New York Albany New York Buffalo New York etc

You can also nest right in the column results! SELECT
name, (SELECT population FROM us_census WHERE name = u.name) pop FROM us_cities u ORDER BY pop DESC name pop California 38000000 Texas 26000000 New York 20000000 etc

The CTE or the “with” statement

You can use a WITH statement to treat a SQL
result as a new table WITH fakestuff AS (SELECT name, state FROM us_cities WHERE name = ‘New York’) SELECT * FROM fakestuff name state New York New York Here, I create a magical new table- like object I call “fakestuﬀ” When I select ‘*’ from the fakestuﬀ table- ish thing, the result is only one row

I use them to help organize my thinking in SQL
WITH fakestuff AS (SELECT ST_Centroid(the_geom) AS the_geom, name, pop FROM us_states) SELECT * FROM fakestuff WHERE pop > 5000000 name state New York New York

DELETES & UPDATES

UPDATES work with all the same properties of SELECT but
with a few diﬀerences. UPDATE us_states SET name = ‘New Amsterdam’ WHERE name = ‘New York’ For example, you will only get a conﬁrmation result, not actual rows. (typically)

Same goes for DELETE statements DELETE FROM us_states WHERE name
= ‘New York’

An very basic introduction to spatial SQL

SELECT the_geom FROM us_states In CartoDB, your geometries are always
standardized and stored in a column called the_geom the_geom MULTIPOLYGON… MULTIPOLYGON… etc

SELECT the_geom FROM us_states ORDER BY the_geom ASC You can
manipulate, ﬁlter, order, measure and lots of other things to geometries the_geom MULTIPOLYGON… MULTIPOLYGON… etc

SELECT the_geom FROM us_states WHERE ST_Intersects(the_geom, my_geometry) CartoDB uses PostGIS
to enable most geospatial functions. These are the functions that start with ‘ST_…’ the_geom MULTIPOLYGON…

SELECT the_geom FROM us_states WHERE ST_Intersects(the_geom, CDB_(42, -80)) We also
have a couple of helper functions only in CartoDB, these start with CDB_. For example, CDB_LatLng(num, num) creates a geometry from two coordinates the_geom MULTIPOLYGON…

SELECT the_geom FROM us_states WHERE ST_Intersects(the_geom, CDB_(42, -80)) ST_Intersects returns
TRUE if the two geometries intersect one another

SELECT the_geom FROM us_states WHERE ST_DWithin(the_geom, CDB_(42, -80), 1) ST_DWithin
returns TRUE if the two geometries are within a minimum distance (supplied as the 3rd variable)

SELECT the_geom FROM us_states WHERE ST_DWithin(the_geom, CDB_(42, -80), 1) In
CartoDB, we rely on projections. Using functions where we supply a unit measurement, that unit is always in the unit of the projection. Here, 1 degree lat/lng. Somewhat meaningless.

SELECT the_geom FROM us_states WHERE ST_DWithin(the_geom::geography, CDB_LatLng(42, -80)::geography, 10000) You
can use a spherical globe and measure in meters by re-casting the geometries to geography types (on the ﬂy)

SELECT the_geom FROM us_states ORDER BY the_geom <-> CDB_LatLng(44,-80) You
can ORDER BY the distance to a speciﬁed geometry really nicely in CartoDB using <->

SELECT *, ST_Distance(the_geom, CDB_LatLng(42,-80)) AS d FROM us_states You can
also use spatial measurements to derive a new or modiﬁed column result on the ﬂy

SELECT *, ST_Distance(the_geom, CDB_LatLng(42,-80)) AS d FROM us_states ST_Distance will
give me back the distance between two geometries. Here, the geometry in every row of my table would be measured from the point 42, -80

SELECT *, ST_Distance(the_geom::geography, CDB_LatLng(42,-80)::geography) AS d FROM us_states Remember, this
returns a unit, so we might want to ensure we get meters

SELECT name, ST_MakeLine(the_geom, CDB_LatLng(42,-80)) AS the_geom FROM us_cities You can
use ST_MakeLine to create lines between points on the ﬂy. Here, I’ll turn every city point, into a line between that point and my point at 42, -80

UPDATE us_cities SET the_geom = ST_MakeLine(the_geom, CDB_LatLng(42,-80)) AS the_geom You
can use all the stuﬀ in UPDATES to store results as well. If I ran this, my cities would no longer be points, but would be lines! Careful, there is no UNDO!

SELECT name, ST_Centroid(the_geom) AS the_geom FROM us_states Likewise, we can
turn polygons or lines into points. There are multiple ways to do this. ST_Centroid is one

SELECT name, ST_Envelope(the_geom) AS the_geom FROM us_cities GROUP BY state
Or you could turn points into a polygon. ST_Envelope will create the minimum box around a group of points

the_geom versus the_geom_webmercator

the_geom the_geom_webmercator the_geom_webmercator When you create new data, it goes
into, Behind the scenes, CartoDB also translates it to, When CartoDB draws a map, it comes from,

the_geom the_geom_webmercator Measure using Latitude, Longitude with,

the_geom the_geom_webmercator Select data to “Create new table” and include,

the_geom the_geom_webmercator Select or ﬁlter data you want on the
map, include,

the_geom the_geom_webmercator CartoDB can’t render the map without

the_geom the_geom_webmercator Anytime you modify and store data on CartoDB
will update so your maps are accurate,

the_geom the_geom_webmercator If you modify on the ﬂy and want
to map results, you need to create on the ﬂy also!

SELECT cartodb_id, the_geom, the_geom_webmercator FROM us_states ORDER BY the_geom <->
CDB_LatLng(44,-80) You can do any ﬁlter you want, just make sure you include the_geom_webmercator in your column results to show it on the map

SELECT cartodb_id, ST_Buffer(the_geom, 1) FROM us_cities This wont display anything
on your map, because it lacks the_geom_webmercator

SELECT cartodb_id, ST_Buffer(the_geom, 1), the_geom_webmercator FROM us_cities This would show
the original points, but it would fail to include the transformed (buﬀered) point because the_geom hasn’t been altered on disk

SELECT cartodb_id, ST_Buffer(the_geom, 1) AS the_geom_webmercator FROM us_cities Just aliasing
to the_geom_webmercator wont work, remember the_geom is a diﬀerent projection than the_geom_webmercator

SELECT cartodb_id, ST_Transform(ST_Buffer(the_geom, 1), 3857) AS the_geom_webmercator FROM us_cities ST_Transform
to the rescue!

SELECT cartodb_id, ST_Transform(ST_Buffer(the_geom, 1), 3857) AS the_geom_webmercator FROM us_cities ST_Transform
turns projects any geometry from one projection to another. We want to go from WGS84 (the_geom) to Web Mercator (the_geom_webmercator)

SELECT cartodb_id, ST_Transform(ST_Buffer(the_geom, 1), 3857) AS the_geom_webmercator FROM us_cities In
CartoDB, webmercator has an ID, or a SRID to be exact, that we can use to tell ST_Transform that we want webmercator from the_geom. The SRID is 3857 3857 = Web Mercator

SELECT cartodb_id, ST_Transform(ST_Buffer(the_geom, 1), 3857) AS the_geom_webmercator FROM us_cities Now
our result will be on the map dynamically!

You can update the SQL of any layer in your
visualizations with the setSQL function ! layer.setSQL(“SELECT * FROM us_states WHERE name = ‘New Hampshire’”); In CartoDB.js http://bit.ly/1t8vK1i

http://bit.ly/short-sql Don’t forget, more SQL examples

Let’s look at some of this live!

Maps, Lies & Storytelling Part 7 - More on CartoCSS

Once you’ve got the things you want to render, you
will deﬁne a set of rules in CartoCSS to deﬁne how you render it

CartoCSS in CartoDB Can be viewed as the entry point
for cartography and design

a few of the basics

#table{ polygon-fill: #FF6600; polygon-opacity: 0.7; line-color: #FFF; line-width: 1; line-opacity:
1; }

#table{ polygon-fill: red; polygon-opacity: 0.7; line-color: #FFF; line-width: 1; line-opacity:
1; }

#table{ polygon-fill: rgb(0,0,256); polygon-opacity: 0.7; line-color: #FFF; line-width: 1; line-opacity:
1; }

#table{ polygon-fill: #AEFF39; polygon-opacity: 0.2; line-color: #FFF; line-width: 1; line-opacity:
1; }

#table{ polygon-fill: #E94C6F; polygon-opacity: 0.9; line-color: #542733; line-width: 4; line-opacity:
1; }

#table{ marker-fill-opacity: 0.9; marker-line-color: #FFF; marker-line-width: 1.5; marker-line-opacity: 1; marker-placement:
point; marker-type: ellipse; marker-width: 10; marker-fill: #FF6600; marker-allow-overlap: true; }

#table{ marker-fill-opacity: 0.9; marker-line-color: #FFF; marker-line-width: 1.5; marker-line-opacity: 1; marker-placement:

#table{ marker-fill-opacity: 0.9; marker-line-color: yellow; marker-line-width: 3.5; marker-line-opacity: 1; marker-placement:

lines use the same rules as the line borders of
polygons

#table{ line-color: #FF6600; line-width: 2; line-opacity: 0.7; }

#table{ line-color: #FF6600; line-width: 15; line-opacity: 0.7; }

#table{ line-color: #FF6600; line-width: 15; line-opacity: 0.7; line-smooth: .5; }

#table{ line-color: #FF6600; line-width: 15; line-opacity: 0.7; line-smooth: 5.5; }

#table{ line-color: #FF6600; line-width: 15; line-opacity: 0.7; line-smooth: 5.5; line-dasharray:
5, 4; }

all of these are subject to composite operations. we’ll see
in the hands on

Using rules

Any column in your table (or SQL result) can be
used to deﬁne a rule

#table{ marker-fill-opacity: 0.9; marker-line-color: yellow; marker-line-width: 1.5; marker-line-opacity: 1; marker-width:
22; marker-fill: #FF6600; [atype = 'Specialized']{ marker-fill: white; } } ‘atype’ is a string column in my table

#table{ marker-fill-opacity: 0.9; marker-line-color: yellow; marker-line-width: 1.5; marker-line-opacity: 1; marker-width:
22; marker-fill: #FF6600; [atype = 'Specialized']{ marker-fill: white; } [atype = 'Institutional']{ marker-fill: green; } } ordered

#table{ marker-fill-opacity: 0.9; marker-line-color: blue; marker-line-width: 1.5; marker-line-opacity: 1; marker-width:
22; marker-fill: #FF6600; [atype = 'Institutional']{ marker-fill: green; [cartodb_id<12594]{ marker-line-color: red; } } } nested

22; marker-fill: #FF6600; [is_valid = True]{ marker-fill: green; } } Boolean columns a slightly diﬀerent in notation

Zoom as a parameter

22; marker-fill: #FF6600; [zoom > 7]{ marker-line-width: 3; } } The zoom parameter

22; marker-fill: #FF6600; [zoom > 7]{ marker-line-width: 3; [zoom > 8]{ marker-line-width: 6; } } } nested

A whole world of labels and text…

http://bit.ly/1rtx6pt

Client side CartoCSS | Torque

Map { -torque-frame-count:512; -torque-animation-duration:30; -torque-time- attribute:"cartodb_id"; -torque-aggregation- function:"count(cartodb_id)"; -torque-resolution:2; -torque-data-
aggregation:linear; } The ‘Map’ object sets up the entire canvas and settings of your temporal visualization

aggregation:linear; } frame-count is how many temporal bins your data will be broken into. if you are visualizing 31 days, you wouldn’t need more than 31…

aggregation:linear; } duration is how long the animation will play for before looping back to the beginning

aggregation:linear; } time-attribute is the numerical or temporal column in your dataset that you want to order the animation by, beginning to end

aggregation:linear; } the aggregation- function is actually SQL! It is how you want data in the same spatial cell, in the same temporal bin, to be combined. here, you just get a raw value

Map { -torque-frame-count:512; -torque-animation-duration:30; -torque-time- attribute:"cartodb_id"; -torque-aggregation- function:"sum(population)"; -torque-resolution:2; -torque-data-
aggregation:linear; } But here, you would get a totally diﬀerent value. The result of this is an attribute called value that we can use to style with in our CartoCSS later

aggregation:linear; } resolution is the size of each pixel you will style

#table{ comp-op: lighter; marker-fill-opacity: 0.9; marker-line-color: #FFF; marker-line-width: 1.5; marker-line-opacity:
1; marker-type: ellipse; marker-width: 6; marker-fill: #FF9900; } Much of the CartoCSS for the rest looks the same as we saw before.

#table{ comp-op: lighter; marker-fill-opacity: 0.9; marker-line-color: #FFF; marker-line-width: 1.5; marker-line-opacity:
1; marker-type: ellipse; marker-width: 6; marker-fill: #FF9900; [value > 3]{ marker-fill-color: blue; } } We can use the value attribute from our aggregation function here to change styles though

#table[frame-offset=1] { marker-width:8; marker-fill-opacity:0.45; } We can also use frame-oﬀsets
! frame-oﬀsets allow you to draw the marker a second (or more times) after the frame it initially hits your maps.

You can update the CartoCSS of any layer in your
visualizations with the setCartoCSS function (no line breaks) ! layer.setCartoCSS(“#tablename {marker- fill: red; }”); In CartoDB.js http://bit.ly/1vpF1pi

Maps, Lies & Storytelling Technology aside - Odyssey.js

hands on

Maps, Lies & Storytelling Guest speaker

Aurelia Moser Knight News Fellow Ushahidi / Internews http://aureliamoser.com/ Next
week

Maps, Lies & Storytelling Discussion

Maps, Lies & Storytelling Assignments

Readings 5 Favorite Maps: Bill Rankin | http://bit.ly/1nrXEsV 15 Maps
that explain ISIS | http://bit.ly/1pl02wx

I want you to show me the most far out
way you can use CartoCSS and/or SQL to create a visualization. It can be a multistep process (UPDATEs, Table from Query), but be sure to document and share every step of your process. Assignment 1

Example 1

Example 2

Time to document your lie. Begin creating a comprehensive map
and story of your lie. This should be a narrative story. During the narrative, you should highlight where your lie is exposed through maps. Assignment 2

Create a map to accompany a current news article or
a current event. This should be an original map, not some improvement of a map included in the article Assignment 3

See you next week!

Maps, Lies and Storytelling | P06 & P07| SQL & CartoCSS

Maps, Lies and Storytelling | P06 & P07| SQL & CartoCSS

More Decks by Andrew W Hill

Other Decks in Education

Featured

Transcript