Geospatial Analysis Made Easy with meza

This talk was given at GeoPython and is about using meza for GeoJSON analysis. Code used in this talk is available at https://gist.github.com/reubano/5ba3a3b850fe4c1e5ee497f325111ba0.

Reuben Cummings

May 10, 2017

Transcript

  1. GEOSPATIAL ANALYSIS MADE EASY WITH MEZA
    GeoPython, Basel, Switzerland
    May 10, 2017, by Reuben Cummings (@reubano)
  2. WHO AM I?
    Managing Director, Nerevu Development
    Founder of Arusha Coders
    Author of several popular Python packages
  3. MEZA (GITHUB.COM/REUBANO/MEZA)
  4. MEZA OVERVIEW
    Diagram labels: input, readers, records, converters, output

  7. MEZA INPUT/OUTPUT
    Input Formats: GeoJSON, MDB, CSV/TSV, SQLITE, DBF, XLS(X), JSON, YAML, HTML
    Output Formats: Array, CSV, GeoJSON, JSON
  8. MT. KILIMANJARO (MOSHI, TANZANIA)
    Photo Credit: Reuben Cummings
  9. UHURU_PEAK.GEOJSON
    {
      "type": "FeatureCollection",
      "features": [
        {
          "type": "Feature",
          "properties": {
            "peak": "uhuru",
            "id": 10
          },
          "geometry": {
  10. UHURU_PEAK.GEOJSON (continued)
            "type": "Point",
            "coordinates": [37.350666, -3.066465]
          }
        }
      ]
    }
  11. KIBO_PEAK.GEOJSON
    {
      "type": "FeatureCollection",
      "features": [
        {
          "type": "Feature",
          "properties": {
            "peak": "kibo",
            "id": 11
          },
          "geometry": {
  12. KIBO_PEAK.GEOJSON (continued)
            "type": "Point",
            "coordinates": [37.353333, -3.075833]
          }
        }
      ]
    }
  13. MEZA DEMO

  14. MEZA DEMO (READERS)
    >>> from meza import io
    >>>
    >>> records = io.read('kibo_peak.geojson')
    >>> next(records)
    {'id': 11,
     'lat': Decimal('-3.075833'),
     'lon': Decimal('37.353333'),
     'peak': 'kibo',
     'type': 'Point'}
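To make the reader demo concrete, here is a minimal standard-library sketch of how a GeoJSON Point feature can be flattened into the kind of record shown on the slide. It is not meza's implementation; `point_records` is a hypothetical helper written for illustration, and it only handles Point geometries.

```python
import json
from decimal import Decimal

# Contents of kibo_peak.geojson as shown on the slides.
KIBO = """
{"type": "FeatureCollection",
 "features": [{"type": "Feature",
               "properties": {"peak": "kibo", "id": 11},
               "geometry": {"type": "Point",
                            "coordinates": [37.353333, -3.075833]}}]}
"""


def point_records(collection):
    """Yield one flat dict per Point feature (hypothetical helper, not meza)."""
    for feature in collection["features"]:
        geometry = feature["geometry"]
        if geometry["type"] != "Point":
            continue  # other geometry types are out of scope for this sketch
        lon, lat = geometry["coordinates"]  # GeoJSON order is [lon, lat]
        record = dict(feature["properties"])
        record.update({"lon": lon, "lat": lat, "type": geometry["type"]})
        yield record


# parse_float=Decimal keeps the coordinates exactly as written in the file.
collection = json.loads(KIBO, parse_float=Decimal)
records = point_records(collection)
first = next(records)
```

Like the `records` object in the demo, `point_records` returns a generator, so features are only processed as they are requested.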
  15. CHALLENGE #1 MERGING

  17. MEZA DEMO (MERGING)
    >>> from meza import convert as cv
    >>>
    >>> paths = (
    ...     'uhuru_peak.geojson',
    ...     'kibo_peak.geojson')
    >>>
    >>> records = io.join(*paths)
    >>> geojson = cv.records2geojson(records)
    >>> io.write('meza_peaks.geojson', geojson)
  18. MEZA_PEAKS.GEOJSON
    {
      "type": "FeatureCollection",
      "bbox": [37.350666, -3.075833, 37.353333, -3.066465],
      "features": [
        {
  19. MEZA_PEAKS.GEOJSON (continued)
          "type": "Feature",
          "id": 10,
          "geometry": {
            "type": "Point",
            "coordinates": [37.350666, -3.066465]
          },
          "properties": {
  20. MEZA_PEAKS.GEOJSON (continued)
            "id": 10,
            "peak": "uhuru"
          }
        },
        {
          "type": "Feature",
          "id": 11,
          "geometry": {
            "type": "Point",
  21. MEZA_PEAKS.GEOJSON (continued)
            "coordinates": [37.353333, -3.075833]
          },
          "properties": {
            "id": 11,
            "peak": "kibo"
          }
        }
  22. MEZA_PEAKS.GEOJSON (continued)
      ],
      "crs": {
        "type": "name",
        "properties": {
          "name": "urn:ogc:def:crs:OGC:1.3:CRS84"
        }
      }
    }
  23. MEZA DEMO (MERGING)
    >>> records = io.read('meza_peaks.geojson')
    >>> csv = cv.records2csv(records)
    >>> io.write('meza_peaks.csv', csv)

    $ pip install --user csvkit
    $ csvlook meza_peaks.csv
    | id | type  | lat     | lon     | peak  |
    | -- | ----- | ------- | ------- | ----- |
    | 10 | Point | -3.066… | 37.350… | uhuru |
    | 11 | Point | -3.075… | 37.353… | kibo  |
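For readers who want to see what the merge amounts to, the following standard-library sketch combines the two single-peak collections from the slides and derives the bounding box that appears in `meza_peaks.geojson`. It is an illustration only, not how meza does it; `merge_points` is a hypothetical helper and assumes every feature is a Point.

```python
import json

# The two input collections, as shown on the slides.
UHURU = {
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "properties": {"peak": "uhuru", "id": 10},
        "geometry": {"type": "Point", "coordinates": [37.350666, -3.066465]},
    }],
}
KIBO = {
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "properties": {"peak": "kibo", "id": 11},
        "geometry": {"type": "Point", "coordinates": [37.353333, -3.075833]},
    }],
}


def merge_points(*collections):
    """Merge Point-only collections and add a bbox (hypothetical helper)."""
    features = [f for c in collections for f in c["features"]]
    # Each coordinate pair is [lon, lat]; unzip into two tuples.
    lons, lats = zip(*(f["geometry"]["coordinates"] for f in features))
    return {
        "type": "FeatureCollection",
        # GeoJSON bbox order: [min lon, min lat, max lon, max lat]
        "bbox": [min(lons), min(lats), max(lons), max(lats)],
        "features": features,
    }


merged = merge_points(UHURU, KIBO)
print(json.dumps(merged["bbox"]))
# prints [37.350666, -3.075833, 37.353333, -3.066465]
```

The computed bounding box matches the `bbox` value in the merged file shown on the slides.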
  24. CHALLENGE #2 SPLIT BY ID

  26. MEZA DEMO (SPLIT BY ID)
    >>> from meza import process as pr
    >>>
    >>> records = io.read('meza_peaks.geojson')
    >>> groups = pr.group(records, 'id')
    >>> name = 'peak_{}.geojson'
    >>>
    >>> for _id, _records in groups:
    ...     f = cv.records2geojson(_records)
    ...     io.write(name.format(_id), f)
  27. MEZA DEMO (SPLIT BY ID)
    $ ls peak_*
    peak_10.geojson peak_11.geojson
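The split-by-id step follows a common group-then-write pattern. The sketch below shows the same idea with `itertools.groupby` from the standard library; it is an assumption-labeled illustration rather than meza's `pr.group`, and it collects the groups in a dict keyed by output file name instead of writing files.

```python
from itertools import groupby
from operator import itemgetter

# Flat records like those read from meza_peaks.geojson on the slides.
records = [
    {"id": 11, "peak": "kibo", "type": "Point"},
    {"id": 10, "peak": "uhuru", "type": "Point"},
]

by_id = itemgetter("id")
name = "peak_{}.geojson"

# groupby only groups adjacent items, so sort by the same key first.
groups = groupby(sorted(records, key=by_id), key=by_id)

# Map each target file name to the records that would be written to it.
outputs = {name.format(_id): list(_records) for _id, _records in groups}
```

Sorting before grouping matters here: without it, records with the same id that are not adjacent would end up in separate groups.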

  28. CHALLENGE #3 EXTRACT BY ID

  30. MEZA DEMO (EXTRACT BY ID)
    >>> records = io.read('peaks.geojson')
    >>> groups = pr.group(records, 'id')
    >>> group = next(
    ...     g for g in groups if g[0] == 11)
  31. MEZA DEMO (EXTRACT BY ID)
    >>> records = io.read('peaks.geojson')
    >>> groups = pr.group(records, 'id')
    >>> group = next(
    ...     g for g in groups if g[0] == 11)
    >>>
    >>> geojson = cv.records2csv(group[1])
    >>> io.write('id_11_peaks.csv', geojson)
  32. MEZA DEMO (EXTRACT BY ID)
    $ csvlook id_11_peaks.csv
    | id | type  | lat     | lon     | peak |
    | -- | ----- | ------- | ------- | ---- |
    | 11 | Point | -3.076… | 37.353… | kibo |
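The extract-by-id step boils down to taking the first match from a lazy filter and serializing it as CSV. Here is a minimal standard-library sketch of that pattern, using the two peak records from the slides; it stands in for, and does not reproduce, meza's `records2csv`, and it writes to an in-memory buffer rather than to `id_11_peaks.csv`.

```python
import csv
from decimal import Decimal
from io import StringIO

# Flat records with the same fields as the slide output.
records = [
    {"id": 10, "type": "Point", "lat": Decimal("-3.066465"),
     "lon": Decimal("37.350666"), "peak": "uhuru"},
    {"id": 11, "type": "Point", "lat": Decimal("-3.075833"),
     "lon": Decimal("37.353333"), "peak": "kibo"},
]

# next() stops at the first match; the default avoids StopIteration
# when no record has the requested id.
match = next((r for r in records if r["id"] == 11), None)

buffer = StringIO()
writer = csv.DictWriter(buffer, fieldnames=["id", "type", "lat", "lon", "peak"])
writer.writeheader()
if match is not None:
    writer.writerow(match)

lines = buffer.getvalue().splitlines()
```

Passing a default to `next` is a small design choice: the slide version raises `StopIteration` if the id is missing, while this sketch writes a header-only file instead.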
  33. BUT WAIT, THERE'S MORE! MEZA DEMO

  34. CHALLENGE #4 EXTRACT BY ID V2

  36. MEZA DEMO (EXTRACT BY ID V2)
    >>> from urllib.request import urlopen
    >>>
    >>> BASE = 'https://raw.githubusercontent.com'
    >>> REPO = 'drei01/geojson-world-cities'
    >>> path = '{}/{}/master/cities.geojson'
    >>> url = path.format(BASE, REPO)
    >>> f = urlopen(url)
    >>> records = io.read_geojson(f)
  37. MEZA DEMO (EXTRACT BY ID V2)
    >>> next(records)
    {'NAME': 'TORSHAVN',
     'id': None,
     'lat': Decimal('62.015167236328125'),
     'lon': Decimal('-6.758638858795166'),
     'pos': 0,
     'type': 'Polygon'}
  38. MEZA DEMO (EXTRACT BY ID V2)
    >>> clean = (
    ...     r for r in records if r.get('NAME'))
    >>>
    >>> splits = pr.split(
    ...     clean, 'NAME', chunksize=1024)
    >>>
    >>> b_splits = (
    ...     s for s in splits if 'BASE' in s[1])
    >>>
    >>> name = 'base_cities.csv'
  39. MEZA DEMO (EXTRACT BY ID V2)
    >>> for pos, split in enumerate(b_splits):
    ...     f = cv.records2csv(
    ...         split[0], skip_header=pos)
    ...
    ...     io.write(name, f, mode='ab+')
  40. MEZA DEMO (EXTRACT BY ID)
    $ csvstat base_cities.csv | tail -n12
      6. "NAME"
        Unique values: 4
        Most common values: BASEL (102x)
                            KABASELE-PANIA (23x)
                            MATSUBASE (17x)
                            WABBASEKA (10x)

    Row count: 152
  41. MEZA DEMO (EXTRACT BY ID)
    $ csvcut -c NAME,lon,lat base_cities.csv \
      | csvlook --max-rows 3
    | NAME  | lon    | lat     |
    | ----- | ------ | ------- |
    | BASEL | 7.549… | 47.544… |
    | BASEL | 7.544… | 47.545… |
    | BASEL | 7.539… | 47.547… |
    | ...   | ...    | ...     |
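The V2 demo streams a large file in chunks and appends each chunk to one CSV, writing the header only for the first chunk (`skip_header=pos` is falsy when `pos` is 0). The sketch below reproduces that pattern with the standard library only. The `chunks` helper and the sample rows are made up for illustration, and an in-memory buffer stands in for the file opened in append mode.

```python
import csv
from io import StringIO
from itertools import islice


def chunks(iterable, size):
    """Yield lists of at most `size` items (hypothetical helper)."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# Made-up sample rows; the real demo streams cities.geojson.
records = [
    {"NAME": "BASEL", "pos": 0},
    {"NAME": "TORSHAVN", "pos": 1},
    {"NAME": None, "pos": 2},
    {"NAME": "MATSUBASE", "pos": 3},
    {"NAME": "BASEL", "pos": 4},
]

# Generators keep the pipeline lazy: nothing runs until rows are consumed.
clean = (r for r in records if r.get("NAME"))
matches = (r for r in clean if "BASE" in r["NAME"])

buffer = StringIO()
for pos, chunk in enumerate(chunks(matches, 2)):
    writer = csv.DictWriter(buffer, fieldnames=["NAME", "pos"])
    if pos == 0:
        writer.writeheader()  # header only once, before the first chunk
    writer.writerows(chunk)

rows = buffer.getvalue().splitlines()
```

Because every stage is a generator, only one chunk of records needs to be held in memory at a time.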
  42. THANKS!
    Reuben Cummings (@reubano)