At the 2013 State of the Map conference I gave a one-hour workshop aimed at introducing OpenStreetMap's inner workings to those of us who work with GIS.
/ Management
Data Licensing
How OSM represents data in GIS Terms:
• Nodes / Ways / Relations
• Key / Value pairs
• TagInfo
Contributing: GIS Based Editors
Consuming: Tiles, Files and APIs:
• Tileservers – your own and/or others
• Files – Downloads, Databases & QGIS
• APIs – Overpass Turbo
• TileMill
• Spatialite
Wider Issues:
• Comparing OSM and Commercial Data
• What now for Commercial Mapping?
• Privacy & Data Protection
• The future of Tiles
Close:
• Contacting OSM
• Takeaways from this session
• Q & A
tile.
Pre-rendering all tiles would use around 54,000 GB of storage. The majority of tiles are never viewed: just 1.79% have ever been viewed. This is because the majority of tiles are at zoom level 18, and the vast majority of those contain nothing of interest (the sea, for example). By following an on-the-fly rendering approach we can avoid rendering these tiles unnecessarily.
The tile view count column shows how many tiles have been produced on the OSM tile server.
Tile server disk usage (6 Jan 2012): 1,272 GB used (z0 to z15: 252 GB; z16 to z18: 1,020 GB).
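The point that most tiles sit at zoom level 18 can be checked with a little arithmetic. The sketch below assumes the usual slippy-map scheme, in which zoom level z is a 2^z by 2^z grid (so 4^z tiles), and zoom levels 0 to 18; the function names are illustrative only.

```python
def tiles_at_zoom(z: int) -> int:
    """Number of tiles at zoom level z (a 2^z by 2^z grid)."""
    return 4 ** z


def total_tiles(max_zoom: int) -> int:
    """Total number of tiles across zoom levels 0..max_zoom inclusive."""
    return sum(tiles_at_zoom(z) for z in range(max_zoom + 1))


MAX_ZOOM = 18  # assumption: deepest zoom level served
total = total_tiles(MAX_ZOOM)
share_z18 = tiles_at_zoom(MAX_ZOOM) / total

print(f"tiles at z{MAX_ZOOM}: {tiles_at_zoom(MAX_ZOOM):,}")
print(f"total z0..z{MAX_ZOOM}: {total:,}")
print(f"share at z{MAX_ZOOM}: {share_z18:.1%}")
```

Under these assumptions roughly three quarters of all possible tiles are at zoom 18 alone, which is why rendering on demand saves so much storage.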
module, serves and expires tiles
• renderd: priority queuing system for rendering requests
• mapnik: does the actual rendering
• postgis: spatially enabled PostgreSQL database
• osm2pgsql: loads OSM data into the postgis database
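To give a feel for what "priority queuing for rendering requests" means, here is a toy sketch. It is not renderd's code: the class, the priority levels and the tile coordinates are all hypothetical, chosen only to show urgent requests jumping ahead of background work.

```python
import heapq
import itertools


class RenderQueue:
    """Toy priority queue: a lower priority number is served first."""

    def __init__(self):
        self._heap = []
        # Counter breaks ties so equal-priority requests stay in arrival order.
        self._order = itertools.count()

    def submit(self, tile, priority):
        heapq.heappush(self._heap, (priority, next(self._order), tile))

    def next_request(self):
        """Return the most urgent tile, or None if nothing is waiting."""
        if not self._heap:
            return None
        _, _, tile = heapq.heappop(self._heap)
        return tile


# Hypothetical priority levels, for illustration only.
INTERACTIVE, DIRTY, BULK = 0, 1, 2

queue = RenderQueue()
queue.submit((18, 130000, 87000), BULK)       # tiles are (zoom, x, y)
queue.submit((12, 2046, 1361), INTERACTIVE)
queue.submit((15, 16370, 10894), DIRTY)

served = [queue.next_request() for _ in range(3)]
print(served)
```

The tile someone is waiting for in a browser is handed to the renderer first, even though it was submitted after the bulk request.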
= 62MB, Planet is 21GB)
Would be faster on a machine with more RAM and SSD disks. Whole Planet can take days.

    osm2pgsql -s -U username -d gis -C 100 /path/to/downloaded/osm/data

-s = slim mode
-U = postgres username (owner of gis database)
-d = database name (usually gis)
-C = amount of cache memory in MB, usually 800, but my PC is a bit old
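If you want to script the import, the same command can be assembled in Python. This is a minimal sketch using only the four flags shown above; the function name and its defaults are my own, and the path and username are the placeholders from the slide.

```python
import shlex


def build_osm2pgsql_command(data_path, username, database="gis",
                            cache_mb=800, slim=True):
    """Assemble the osm2pgsql argument list using the flags from the slide."""
    args = ["osm2pgsql"]
    if slim:
        args.append("-s")                      # slim mode
    args += ["-U", username]                   # postgres user owning the database
    args += ["-d", database]                   # target database name
    args += ["-C", str(cache_mb)]              # cache size in MB
    args.append(data_path)                     # downloaded OSM data file
    return args


cmd = build_osm2pgsql_command("/path/to/downloaded/osm/data", "username",
                              cache_mb=100)
print(shlex.join(cmd))

# To actually run the import (needs osm2pgsql and a prepared PostGIS database):
# import subprocess
# subprocess.run(cmd, check=True)
```

Keeping the arguments as a list rather than one string avoids shell quoting problems when the data path contains spaces.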
a read-only API that serves up custom selected parts of the OSM map data. It acts as a database over the web: the client sends a query to the API and gets back the data set that corresponds to the query.”
wiki.openstreetmap.org/wiki/Overpass_API
overpass-turbo.eu
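As a rough illustration of "the client sends a query", the sketch below builds an Overpass query for pubs inside a bounding box and turns it into a request URL. The endpoint is the public overpass-api.de interpreter; the helper names and the example coordinates are assumptions of mine, and nothing is sent over the network here.

```python
from urllib.parse import urlencode

OVERPASS_ENDPOINT = "http://overpass-api.de/api/interpreter"


def pub_query(south, west, north, east):
    """Overpass QL for pub nodes, plus pub ways and the nodes they use."""
    bbox = f"{south},{west},{north},{east}"   # Overpass bbox order: S, W, N, E
    return (
        f"node[amenity=pub]({bbox});out skel;"
        f"(way[amenity=pub]({bbox});node(w););out skel;"
    )


def overpass_url(query):
    """URL-encode the query into the 'data' parameter of the endpoint."""
    return f"{OVERPASS_ENDPOINT}?{urlencode({'data': query})}"


query = pub_query(51.5, -0.15, 51.52, -0.10)  # example box, illustrative only
url = overpass_url(query)
print(query)
print(url)
```

Fetching `url` (for example with `urllib.request.urlopen`) would return the matching OSM data; please keep boxes small when experimenting against a shared public server.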
a web page

    // make_layer is defined elsewhere in the page (not shown on this slide)
    map.addLayers([
      make_layer(
        "http://overpass-api.de/api/interpreter?data=" +
        "node[amenity=pub](bbox);out+skel;" +
        "(way[amenity=pub](bbox);node(w););out+skel;",
        "Blue"
      )
    ]);
Some examples of the things it can currently do are:
• Generate planet dumps from a database
• Load planet dumps into a database
• Produce change sets using database history tables
• Apply change sets to a local database
• Compare two planet dump files and produce a change set
• Re-sort the data contained in planet dump files
• Extract data inside a bounding box or polygon
wiki.openstreetmap.org/wiki/Osmosis
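To make the last item concrete, here is a conceptual sketch of a bounding-box extract over OSM XML. It is not how Osmosis is implemented and it handles nodes only (a real extract also has to deal with ways and relations); the sample data and function name are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Tiny hand-written OSM XML sample: two nodes in London, one in Paris.
SAMPLE_OSM = """<osm version="0.6">
  <node id="1" lat="51.50" lon="-0.12"/>
  <node id="2" lat="48.85" lon="2.35"/>
  <node id="3" lat="51.52" lon="-0.10"/>
</osm>"""


def nodes_in_bbox(osm_xml, south, west, north, east):
    """Return the ids of nodes whose coordinates fall inside the box."""
    root = ET.fromstring(osm_xml)
    kept = []
    for node in root.iter("node"):
        lat = float(node.get("lat"))
        lon = float(node.get("lon"))
        if south <= lat <= north and west <= lon <= east:
            kept.append(int(node.get("id")))
    return kept


inside = nodes_in_bbox(SAMPLE_OSM, south=51.4, west=-0.2, north=51.6, east=0.0)
print(inside)
```

Only the two London nodes survive the filter; the Paris node falls outside the box and is dropped.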
a sqlite format usable in a GIS
spatialite_osm_raw – converts .osm to a sqlite format usable for stats
spatialite_osm_filter – generates an OSM.XML file masked by an input polygon
spatialite_osm_net – generates ‘routable’ data from roads
/ setup:
Nominatim – OSM’s Geocoder
pgRouting – PostgreSQL based routing
OSRM – Open Source Routing Machine, closely tied to OSM
Using Osmosis to keep your tileserver up to date
pgRouting.org
project-osrm.org
limited, aimed at data editors
• ‘Viral’ nature of data licence
• Focus on map data
• ‘Warm Geography’
• Aims to be of Public Interest
• Instant data updates, Sporadic
• ‘Depth’ Issues (coverage, completeness)
• Weak Metadata / Standards

Commercial GeoData Providers
• Wide source of inputs
• APIs extensive, aimed at data consumers
• Derived data issues
• Focus on services
• ‘Cold Geography’
• Aims to be Authoritative
• Moderated update process, Consistent
• ‘Width’ issues (limited featureset)
• Expensive
Area / Multipolygon handling is complex
Usage Policies – Desktop probably not a problem. Server, web different
Build Your Own – No apt-get install tileserver, yet
Tilemill – Great for designers, self hosting options exist
Shapefiles – Easy, but not complete
Postgis – Starting point for many possibilities
Data APIs – Great for snapshots in geoJSON or POI webpages
Spatialite – Spatial Database without the overhead of Postgis
Data vs Services – Is OSM commoditising data? Is that bad?