Slide 1

Slide 1 text

Spatial data and web mapping with Python Paul Smith @paulsmith

Slide 2

Slide 2 text

EveryBlock

Slide 3

Slide 3 text

EveryBlock

Slide 4

Slide 4 text

EveryBlock

Slide 5

Slide 5 text

No content

Slide 6

Slide 6 text

No content

Slide 7

Slide 7 text

No content

Slide 8

Slide 8 text

The big picture • Concepts • Libraries • Applications • Data sources

Slide 9

Slide 9 text

The big picture • Concepts • Libraries • Applications • Data sources

Slide 10

Slide 10 text

Concepts • Types of spatial data • Features • Geometries • Representations • Operations • Predicates • Indexes • Projections

Slide 11

Slide 11 text

Spatial data … • … are data and collections of data with multiple (n > 1) dimensions (usually 2D or 3D (maybe 4D)) • … coordinates that represent a point or points in a space (in a plane, on a globe) Concepts

Slide 12

Slide 12 text

Sidebar: vs. scalar data • Lot of literature on sorting, indexing, slicing-and-dicing 1-dimensional, non- spatial data • Representation: • It’s the machine: machine primitives, fast pointer operations Concepts … 72 101 108 108 111 44 …

Slide 13

Slide 13 text

Sidebar: vs. scalar data • Scalar • Comparisons • Ordering • Ranges • Spatial • Nope • No … • Sort of … Concepts Think of spatial data more like objects instead of values

Slide 14

Slide 14 text

Types of spatial data • Geometry — planar, i.e., Euclidean, flat Earth • 2D: (x, y) • 3D: (x, y, z) • Geography — ellipsoidal, i.e., round Earth • 2D: (lng, lat) • 3D: (lng, lat, elevation) Concepts

Slide 15

Slide 15 text

Units • Geometry — units of the plane (pixels, inches, meters, etc.) • 2D: (x, y) • 3D: (x, y, z) • Geography — degrees (decimal or minutes/seconds) • 2D: (lng, lat) • 3D: (lng, lat, elevation) Concepts

Slide 16

Slide 16 text

• Objects with attribute data and 1 or more geometries Features Concepts {              "id":  40,        "name":  "Missouri",        "censusarea":  68741.522,        "abbrev":  "MO",        "fips":  "29" }

Slide 17

Slide 17 text

Geometries • Types • Point, LineString, Polygon • Multi-* • Bounds (aka MBR, bounding box, envelope) Concepts

Slide 18

Slide 18 text

Geometries • Types • Point, LineString, Linear Ring, Polygon • Multi-* • Bounds (aka MBR, bounding box, envelope) Concepts

Slide 19

Slide 19 text

Geometries • Representations • Python! array, list, tuple, dict, object • bounds: 2*n-tuple, list of params, polygon Concepts

Slide 20

Slide 20 text

Operations • Intersection • Difference • Union • Centroid • Buffer Concepts Produce new geometries

Slide 21

Slide 21 text

Operations • Intersection • Difference • Union • Centroid • Buffer Concepts Produce new geometries sets

Slide 22

Slide 22 text

Intersection Concepts

Slide 23

Slide 23 text

Difference Concepts

Slide 24

Slide 24 text

Union Concepts

Slide 25

Slide 25 text

Centroid Concepts

Slide 26

Slide 26 text

Buffer Concepts

Slide 27

Slide 27 text

Predicates • Contains • Within • Intersects • Disjoint Concepts Answer questions about relationships

Slide 28

Slide 28 text

Indexes • R-tree • Good for all types of geometries • Quad-tree • Best with point data Concepts Dealing with collections of geometries

Slide 29

Slide 29 text

R-tree • Node’s bounds is that of all subtrees’ bounds • Split on overflow • Algorithm for splits generally defines R- tree variants • Leaf nodes contain list of objects Concepts Good index for all types of geometries

Slide 30

Slide 30 text

R-tree Concepts

Slide 31

Slide 31 text

Quadtree • Recursively divide space into quadrants • Equivalent to geohash • Space-filling curves Concepts Best with point data

Slide 32

Slide 32 text

Quadtree Concepts

Slide 33

Slide 33 text

Indexed queries • Nearest neighbor—Given point, return k features closest to it • Bounding box—Given bounds, return all features within it • Point query—Given point, return all features at that point Concepts

Slide 34

Slide 34 text

Projections • Convert the globe to a flat map, and vice versa • Geographical coordinate (i.e., lng/lat of globe) → projected surface (i.e., x/y of map) Concepts

Slide 35

Slide 35 text

Spatial reference systems • Named references to specific projections • WGS 84 - EPSG:4326 (“fake” projection— lng/lat) • Spherical Mercator - EPSG:900913 (Google Maps et al.) • Texas-centric Albers Equal Area - EPSG:3083 (“US shield”) • Various state plane (LCC) Concepts

Slide 36

Slide 36 text

WGS 84

Slide 37

Slide 37 text

Spherical Mercator

Slide 38

Slide 38 text

Texas-centric AEA

Slide 39

Slide 39 text

Formats • Vector • ESRI Shapefile (*.{shp,shx,dbf,prf}) • GeoJSON (.json) • KML (.kml) • WKT • Raster • GeoTIFF (.tiff) Concepts

Slide 40

Slide 40 text

The big picture • Concepts • Libraries • Applications • Data sources

Slide 41

Slide 41 text

Libraries • Core & common dependencies • Python

Slide 42

Slide 42 text

Core & common dependencies • GDAL / OGR • GEOS • PROJ.4 • libspatialindex Libraries

Slide 43

Slide 43 text

Python • Shapely • Rtree • Mapnik • GeoDjango • Kartograph Libraries

Slide 44

Slide 44 text

Shapely • Manipulation and analytics of geometries • Nice wrapper around GEOS Libraries

Slide 45

Slide 45 text

Shapely Libraries >>> from shapely.geometry import Point >>> point = Point(0.0, 0.0) >>> poly = point.buffer(10) >>> poly.area 313.6548490545939 >>> poly.wkt 'POLYGON ((10.0000000000000000 0.0000000000000000, 9.9518472667219697 -0.9801714032956050, 9.80785280'

Slide 46

Slide 46 text

Rtree Libraries >>> states.items()[:1] [(1, {'attributes': {'abbrev': 'ME', 'id': 1, 'name': u'Maine'}, 'geom': })] >>> idx = index.Index() >>> for id, state in states.iteritems(): ... idx.insert(id, state['geom'].bounds) >>> pt = Point(-76.6, 39.2, -76.6, 39.2) >>> res = list(idx.intersection((pt.x, pt.y, pt.x, pt.y))) >>> res [18, 78] >>> states[18] {'attributes': {'abbrev': None, 'id': 18, 'name': u'Maryland'}, 'geom': }

Slide 47

Slide 47 text

Rtree Libraries >>> found = None >>> for id in res: ... state = states[id] ... if state['geom'].contains(pt): ... found = state ... break >>> found {'attributes': {'abbrev': None, 'id': 18, 'name': u'Maryland'}, 'geom': }

Slide 48

Slide 48 text

Mapnik • Library for creating maps • C++, Python, node.js bindings • Plugins for reading PostGIS, shapefiles, etc. Libraries

Slide 49

Slide 49 text

No content

Slide 50

Slide 50 text

Mapnik Libraries def render_map(self): map = mapnik.Map(WIDTH, HEIGHT, srs='+init=epsg:3083') map.background = mapnik.Color('#9ECAE1') # Create a base state style style = mapnik.Style() rule = mapnik.Rule() rule.symbols.append(mapnik.PolygonSymbolizer(mapnik.Color('#DEEBF7'))) rule.symbols.append(mapnik.LineSymbolizer(mapnik.Color('#3182BD'), 1.0)) style.rules.append(rule) map.append_style('states', style) …

Slide 51

Slide 51 text

Mapnik Libraries … # Add base states layer layer = mapnik.Layer('states') layer.datasource = mapnik.PostGIS(table='states', **pg_conn_dict) layer.styles.append('states') map.layers.append(layer) usa = mapnik.Envelope(-670107, 6819910, 4319357, 9699636) map.zoom_to_box(usa) image = mapnik.Image(map.width, map.height) mapnik.render(map, image) return image.tostring('png')

Slide 52

Slide 52 text

GeoDjango • Bundled with Django • Spatial-enable the ORM • Nice standalone libs for GEOS, GDAL, OGR, GeoIP—ctypes wrappers Applications

Slide 53

Slide 53 text

GeoDjango Applications # models.py class State(models.Model): fips = models.CharField(max_length=2) name = models.CharField(max_length=90) abbrev = models.CharField(max_length=2) censusarea = models.FloatField() geom = models.MultiPolygonField(srid=4326) objects = models.GeoManager() >>> cd = CngrDstrct.objects.get(geom__contains=pt)

Slide 54

Slide 54 text

Kartograph • Python library and CLI • Input: shapefile(s) • Output: SVG • Style with CSS, behavior with JavaScript Libraries

Slide 55

Slide 55 text

No content

Slide 56

Slide 56 text

Kartograph Libraries { "proj": { "id": "laea", "lon0": "auto", "lat0": "auto" }, "bounds": { "mode": "bbox", "data": [-130, 20, -60, 60], "padding": 0.01 }, "layers": [{ "id": "states", "src": "states.shp" }, { "special": "graticule", "latitudes": 10, "longitudes": 10 }], "export": { "width": 800, "height": 500,

Slide 57

Slide 57 text

No content

Slide 58

Slide 58 text

The big picture • Concepts • Libraries • Applications • Data sources

Slide 59

Slide 59 text

Applications • TileStache • web map tile serving and compositing • IPython (+ qtconsole + Descartes) • interactive console visualization • QGIS • desktop GIS

Slide 60

Slide 60 text

TileStache • Serves: • Rendered Mapnik tiles • Pre-rendered tiles • Vector data • Cached — S3, disk, Memcached • WSGI Applications

Slide 61

Slide 61 text

IPython • >= 0.12 • Qt Console • HTML notebook • Shapely + Descartes from PyPi • inline pylab/matplotlib plots Applications

Slide 62

Slide 62 text

$ ipython qtconsole --pylab=inline Applications

Slide 63

Slide 63 text

QGIS Applications

Slide 64

Slide 64 text

QGIS • Open source (GPL) • Cross-platform • Supports vector (OGR), raster (GDAL) layers • Write plugins in Python! Applications

Slide 65

Slide 65 text

The big picture • Concepts • Libraries • Applications • Data sources

Slide 66

Slide 66 text

Data sources • US Census TIGER/Line • NationalAtlas.gov • OpenStreetMap • Natural Earth • State & local GIS departments Somewhat US-centric!

Slide 67

Slide 67 text

What I left out • All the non-Python stuff you’ll probably need to get this to work :-) • Spatial databases (PostGIS, Spatialite) • JavaScript browser UIs (OpenLayers, Leaflet, Modest Maps, Wax, Polymaps) • Map design applications (TileMill)

Slide 68

Slide 68 text

Challenge • Who’s my congressman? • Congressional districts and states shapefiles from TIGER/Line • Sunlight API for representatives’ names • Google Maps geocoder API or HTML5 geolocation

Slide 69

Slide 69 text

Challenge • Who’s my congressman? • Build R-tree index • Query for all CDs where lng/lat is within bounding box • Loop over result set until find one and only one that contains the lng/lat

Slide 70

Slide 70 text

AMA Paul Smith @paulsmith Thanks!

Slide 71

Slide 71 text

Credits • R-tree diagram: http://www.rulabinsky.com/cavd/text/chap03-6.html • Mapnik map: http://mapbox.com/tour/#maps • Quadtree & geohash diagram: http://blog.notdot.net/2009/11/Damn-Cool- Algorithms-Spatial-indexing-with-Quadtrees-and-Hilbert-Curves • Descartes & Shapely & IPython for example geoms: http://pypi.python.org/pypi/ descartes