Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Spatial data and web mapping with Python

Spatial data and web mapping with Python

An overview of spatial data and web mapping concepts, and of libraries and applications in Python for working with spatial data and generating maps for the web. A talk given at PyCon 2012 in Santa Clara, CA.

Paul Smith

March 23, 2012
Tweet

Other Decks in Programming

Transcript

  1. Concepts • Types of spatial data • Features • Geometries

    • Representations • Operations • Predicates • Indexes • Projections
  2. Spatial data … • … are data and collections of

    data with multiple (n > 1) dimensions (usually 2D or 3D (maybe 4D)) • … coordinates that represent a point or points in a space (in a plane, on a globe) Concepts
  3. Sidebar: vs. scalar data • Lot of literature on sorting,

    indexing, slicing-and-dicing 1-dimensional, non- spatial data • Representation: • It’s the machine: machine primitives, fast pointer operations Concepts … 72 101 108 108 111 44 …
  4. Sidebar: vs. scalar data • Scalar • Comparisons • Ordering

    • Ranges • Spatial • Nope • No … • Sort of … Concepts Think of spatial data more like objects instead of values
  5. Types of spatial data • Geometry — planar, i.e., Euclidean,

    flat Earth • 2D: (x, y) • 3D: (x, y, z) • Geography — ellipsoidal, i.e., round Earth • 2D: (lng, lat) • 3D: (lng, lat, elevation) Concepts
  6. Units • Geometry — units of the plane (pixels, inches,

    meters, etc.) • 2D: (x, y) • 3D: (x, y, z) • Geography — degrees (decimal or minutes/seconds) • 2D: (lng, lat) • 3D: (lng, lat, elevation) Concepts
  7. • Objects with attribute data and 1 or more geometries

    Features Concepts {              "id":  40,        "name":  "Missouri",        "censusarea":  68741.522,        "abbrev":  "MO",        "fips":  "29" }
  8. Geometries • Types • Point, LineString, Polygon • Multi-* •

    Bounds (aka MBR, bounding box, envelope) Concepts
  9. Geometries • Types • Point, LineString, Linear Ring, Polygon •

    Multi-* • Bounds (aka MBR, bounding box, envelope) Concepts
  10. Geometries • Representations • Python! array, list, tuple, dict, object

    • bounds: 2*n-tuple, list of params, polygon Concepts
  11. Operations • Intersection • Difference • Union • Centroid •

    Buffer Concepts Produce new geometries sets
  12. Indexes • R-tree • Good for all types of geometries

    • Quad-tree • Best with point data Concepts Dealing with collections of geometries
  13. R-tree • Node’s bounds is that of all subtrees’ bounds

    • Split on overflow • Algorithm for splits generally defines R- tree variants • Leaf nodes contain list of objects Concepts Good index for all types of geometries
  14. Quadtree • Recursively divide space into quadrants • Equivalent to

    geohash • Space-filling curves Concepts Best with point data
  15. Indexed queries • Nearest neighbor—Given point, return k features closest

    to it • Bounding box—Given bounds, return all features within it • Point query—Given point, return all features at that point Concepts
  16. Projections • Convert the globe to a flat map, and

    vice versa • Geographical coordinate (i.e., lng/lat of globe) → projected surface (i.e., x/y of map) Concepts
  17. Spatial reference systems • Named references to specific projections •

    WGS 84 - EPSG:4326 (“fake” projection— lng/lat) • Spherical Mercator - EPSG:900913 (Google Maps et al.) • Texas-centric Albers Equal Area - EPSG:3083 (“US shield”) • Various state plane (LCC) Concepts
  18. Formats • Vector • ESRI Shapefile (*.{shp,shx,dbf,prf}) • GeoJSON (.json)

    • KML (.kml) • WKT • Raster • GeoTIFF (.tiff) Concepts
  19. Core & common dependencies • GDAL / OGR • GEOS

    • PROJ.4 • libspatialindex Libraries
  20. Shapely Libraries >>> from shapely.geometry import Point >>> point =

    Point(0.0, 0.0) >>> poly = point.buffer(10) >>> poly.area 313.6548490545939 >>> poly.wkt 'POLYGON ((10.0000000000000000 0.0000000000000000, 9.9518472667219697 -0.9801714032956050, 9.80785280'
  21. Rtree Libraries >>> states.items()[:1] [(1, {'attributes': {'abbrev': 'ME', 'id': 1,

    'name': u'Maine'}, 'geom': <shapely…>})] >>> idx = index.Index() >>> for id, state in states.iteritems(): ... idx.insert(id, state['geom'].bounds) >>> pt = Point(-76.6, 39.2, -76.6, 39.2) >>> res = list(idx.intersection((pt.x, pt.y, pt.x, pt.y))) >>> res [18, 78] >>> states[18] {'attributes': {'abbrev': None, 'id': 18, 'name': u'Maryland'}, 'geom': <shapely…>}
  22. Rtree Libraries >>> found = None >>> for id in

    res: ... state = states[id] ... if state['geom'].contains(pt): ... found = state ... break >>> found {'attributes': {'abbrev': None, 'id': 18, 'name': u'Maryland'}, 'geom': <shapely…>}
  23. Mapnik • Library for creating maps • C++, Python, node.js

    bindings • Plugins for reading PostGIS, shapefiles, etc. Libraries
  24. Mapnik Libraries def render_map(self): map = mapnik.Map(WIDTH, HEIGHT, srs='+init=epsg:3083') map.background

    = mapnik.Color('#9ECAE1') # Create a base state style style = mapnik.Style() rule = mapnik.Rule() rule.symbols.append(mapnik.PolygonSymbolizer(mapnik.Color('#DEEBF7'))) rule.symbols.append(mapnik.LineSymbolizer(mapnik.Color('#3182BD'), 1.0)) style.rules.append(rule) map.append_style('states', style) …
  25. Mapnik Libraries … # Add base states layer layer =

    mapnik.Layer('states') layer.datasource = mapnik.PostGIS(table='states', **pg_conn_dict) layer.styles.append('states') map.layers.append(layer) usa = mapnik.Envelope(-670107, 6819910, 4319357, 9699636) map.zoom_to_box(usa) image = mapnik.Image(map.width, map.height) mapnik.render(map, image) return image.tostring('png')
  26. GeoDjango • Bundled with Django • Spatial-enable the ORM •

    Nice standalone libs for GEOS, GDAL, OGR, GeoIP—ctypes wrappers Applications
  27. GeoDjango Applications # models.py class State(models.Model): fips = models.CharField(max_length=2) name

    = models.CharField(max_length=90) abbrev = models.CharField(max_length=2) censusarea = models.FloatField() geom = models.MultiPolygonField(srid=4326) objects = models.GeoManager() >>> cd = CngrDstrct.objects.get(geom__contains=pt)
  28. Kartograph • Python library and CLI • Input: shapefile(s) •

    Output: SVG • Style with CSS, behavior with JavaScript Libraries
  29. Kartograph Libraries { "proj": { "id": "laea", "lon0": "auto", "lat0":

    "auto" }, "bounds": { "mode": "bbox", "data": [-130, 20, -60, 60], "padding": 0.01 }, "layers": [{ "id": "states", "src": "states.shp" }, { "special": "graticule", "latitudes": 10, "longitudes": 10 }], "export": { "width": 800, "height": 500,
  30. Applications • TileStache • web map tile serving and compositing

    • IPython (+ qtconsole + Descartes) • interactive console visualization • QGIS • desktop GIS
  31. TileStache • Serves: • Rendered Mapnik tiles • Pre-rendered tiles

    • Vector data • Cached — S3, disk, Memcached • WSGI Applications
  32. IPython • >= 0.12 • Qt Console • HTML notebook

    • Shapely + Descartes from PyPi • inline pylab/matplotlib plots Applications
  33. QGIS • Open source (GPL) • Cross-platform • Supports vector

    (OGR), raster (GDAL) layers • Write plugins in Python! Applications
  34. Data sources • US Census TIGER/Line • NationalAtlas.gov • OpenStreetMap

    • Natural Earth • State & local GIS departments Somewhat US-centric!
  35. What I left out • All the non-Python stuff you’ll

    probably need to get this to work :-) • Spatial databases (PostGIS, Spatialite) • JavaScript browser UIs (OpenLayers, Leaflet, Modest Maps, Wax, Polymaps) • Map design applications (TileMill)
  36. Challenge • Who’s my congressman? • Congressional districts and states

    shapefiles from TIGER/Line • Sunlight API for representatives’ names • Google Maps geocoder API or HTML5 geolocation
  37. Challenge • Who’s my congressman? • Build R-tree index •

    Query for all CDs where lng/lat is within bounding box • Loop over result set until find one and only one that contains the lng/lat
  38. Credits • R-tree diagram: http://www.rulabinsky.com/cavd/text/chap03-6.html • Mapnik map: http://mapbox.com/tour/#maps •

    Quadtree & geohash diagram: http://blog.notdot.net/2009/11/Damn-Cool- Algorithms-Spatial-indexing-with-Quadtrees-and-Hilbert-Curves • Descartes & Shapely & IPython for example geoms: http://pypi.python.org/pypi/ descartes