Geospatial Analysis Made Easy with meza

This talk was given at GeoPython and is about using meza for GeoJSON analysis. Code used in this talk is available at https://gist.github.com/reubano/5ba3a3b850fe4c1e5ee497f325111ba0.

Reuben Cummings

May 10, 2017

Transcript

  1. WHO AM I?

    Managing Director, Nerevu Development
    Founder of Arusha Coders
    Author of several popular Python packages
  2. MEZA (GITHUB.COM/REUBANO/MEZA)
  3. MEZA INPUT/OUTPUT

    Input Formats: GeoJSON, MDB, CSV/TSV, SQLITE, DBF, XLS(X), JSON, YAML, HTML
    Output Formats: Array, CSV, GeoJSON, JSON
  4. MT. KILIMANJARO (MOSHI, TANZANIA)

    Photo Credit: Reuben Cummings
  5. UHURU_PEAK.GEOJSON

    {
      "type": "FeatureCollection",
      "features": [
        {
          "type": "Feature",
          "properties": {
            "peak": "uhuru",
            "id": 10
          },
          "geometry": {
  6. KIBO_PEAK.GEOJSON

    {
      "type": "FeatureCollection",
      "features": [
        {
          "type": "Feature",
          "properties": {
            "peak": "kibo",
            "id": 11
          },
          "geometry": {
  7. MEZA DEMO (READERS)

    >>> from meza import io
    >>>
    >>> records = io.read('kibo_peak.geojson')
    >>> next(records)
    {'id': 11,
     'lat': Decimal('-3.075833'),
     'lon': Decimal('37.353333'),
     'peak': 'kibo',
     'type': 'Point'}
  8. MEZA DEMO (MERGING)

    >>> from meza import convert as cv
    >>>
    >>> paths = (
    ...     'uhuru_peak.geojson',
    ...     'kibo_peak.geojson')
    >>>
    >>> records = io.join(*paths)
    >>> geojson = cv.records2geojson(records)
    >>> io.write('meza_peaks.geojson', geojson)
  9. MEZA_PEAKS.GEOJSON

    "type": "Feature",
    "id": 10,
    "geometry": {
      "type": "Point",
      "coordinates": [
        37.350666,
        -3.066465
      ]
    },
    "properties": {
  10. MEZA_PEAKS.GEOJSON

        "id": 10,
        "peak": "uhuru"
      }
    },
    {
      "type": "Feature",
      "id": 11,
      "geometry": {
        "type": "Point",
  11. MEZA DEMO (MERGING)

    >>> records = io.read('meza_peaks.geojson')
    >>> csv = cv.records2csv(records)
    >>> io.write('meza_peaks.csv', csv)

    $ pip install --user csvkit
    $ csvlook meza_peaks.csv
    | id | type  | lat     | lon     | peak  |
    | -- | ----- | ------- | ------- | ----- |
    | 10 | Point | -3.066… | 37.350… | uhuru |
    | 11 | Point | -3.075… | 37.353… | kibo  |
  12. MEZA DEMO (SPLIT BY ID)

    >>> from meza import process as pr
    >>>
    >>> records = io.read('meza_peaks.geojson')
    >>> groups = pr.group(records, 'id')
    >>> name = 'peak_{}.geojson'
    >>>
    >>> for _id, _records in groups:
    ...     f = cv.records2geojson(_records)
    ...     io.write(name.format(_id), f)
  13. MEZA DEMO (EXTRACT BY ID)

    >>> records = io.read('peaks.geojson')
    >>> groups = pr.group(records, 'id')
    >>> group = next(
    ...     g for g in groups if g[0] == 11)
    >>>
  14. MEZA DEMO (EXTRACT BY ID)

    >>> records = io.read('peaks.geojson')
    >>> groups = pr.group(records, 'id')
    >>> group = next(
    ...     g for g in groups if g[0] == 11)
    >>>
    >>> geojson = cv.records2csv(group[1])
    >>> io.write('id_11_peaks.csv', geojson)
  15. MEZA DEMO (EXTRACT BY ID)

    $ csvlook id_11_peaks.csv
    | id | type  | lat     | lon     | peak |
    | -- | ----- | ------- | ------- | ---- |
    | 11 | Point | -3.076… | 37.353… | kibo |
  16. MEZA DEMO (EXTRACT BY ID V2)

    >>> from urllib.request import urlopen
    >>>
    >>> BASE = 'https://raw.githubusercontent.com'
    >>> REPO = 'drei01/geojson-world-cities'
    >>> path = '{}/{}/master/cities.geojson'
    >>> url = path.format(BASE, REPO)
    >>> f = urlopen(url)
    >>> records = io.read_geojson(f)
  17. MEZA DEMO (EXTRACT BY ID V2)

    >>> next(records)
    {'NAME': 'TORSHAVN',
     'id': None,
     'lat': Decimal('62.015167236328125'),
     'lon': Decimal('-6.758638858795166'),
     'pos': 0,
     'type': 'Polygon'}
  18. MEZA DEMO (EXTRACT BY ID V2)

    >>> clean = (
    ...     r for r in records if r.get('NAME'))
    >>>
    >>> splits = pr.split(
    ...     clean, 'NAME', chunksize=1024)
    >>>
    >>> b_splits = (
    ...     s for s in splits if 'BASE' in s[1])
    >>>
    >>> name = 'base_cities.csv'
  19. MEZA DEMO (EXTRACT BY ID V2)

    >>> for pos, split in enumerate(b_splits):
    ...     f = cv.records2csv(
    ...         split[0], skip_header=pos)
    ...
    ...     io.write(name, f, mode='ab+')
  20. MEZA DEMO (EXTRACT BY ID)

    $ csvstat base_cities.csv | tail -n12
      6. "NAME"
        Unique values: 4
        Most common values: BASEL (102x)
                            KABASELE-PANIA (23x)
                            MATSUBASE (17x)
                            WABBASEKA (10x)

    Row count: 152
  21. MEZA DEMO (EXTRACT BY ID)

    $ csvcut -c NAME,lon,lat base_cities.csv \
        | csvlook --max-rows 3
    | NAME  | lon    | lat     |
    | ----- | ------ | ------- |
    | BASEL | 7.549… | 47.544… |
    | BASEL | 7.544… | 47.545… |
    | BASEL | 7.539… | 47.547… |
    | ...   | ...    | ...     |
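
The merge, split, and extract demos all rely on one idea: each GeoJSON feature becomes a flat record (a dict with `id`, `type`, `lat`, `lon`, and the feature's properties), and everything after that is ordinary record processing. The sketch below illustrates that record model with only the Python standard library. It is not meza's implementation. The helper names `flatten_points`, `group_by`, and `to_csv` are hypothetical, the sketch handles Point geometries only, and the Kibo coordinate pair is assumed from the `lat`/`lon` values shown in the readers demo.

```python
import csv
import json
from decimal import Decimal
from io import StringIO
from itertools import groupby

# Properties and coordinates are the values shown on the slides.
UHURU = """{"type": "FeatureCollection", "features": [{"type": "Feature",
  "properties": {"peak": "uhuru", "id": 10},
  "geometry": {"type": "Point", "coordinates": [37.350666, -3.066465]}}]}"""
KIBO = """{"type": "FeatureCollection", "features": [{"type": "Feature",
  "properties": {"peak": "kibo", "id": 11},
  "geometry": {"type": "Point", "coordinates": [37.353333, -3.075833]}}]}"""


def flatten_points(text):
    """Yield one flat record per Point feature (hypothetical helper)."""
    collection = json.loads(text, parse_float=Decimal)
    for feature in collection["features"]:
        geometry = feature["geometry"]
        if geometry["type"] != "Point":
            continue  # this sketch ignores every other geometry type
        lon, lat = geometry["coordinates"][:2]  # GeoJSON stores [lon, lat]
        yield {**feature["properties"], "type": "Point", "lon": lon, "lat": lat}


def group_by(records, key):
    """Yield (value, records) pairs; sort first because groupby requires it."""
    ordered = sorted(records, key=lambda r: r[key])
    for value, members in groupby(ordered, key=lambda r: r[key]):
        yield value, list(members)


def to_csv(records, fieldnames):
    """Serialize records to CSV text with a header row."""
    buffer = StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# "Merging" is just chaining the record streams from both sources.
records = [r for text in (UHURU, KIBO) for r in flatten_points(text)]

# "Extract by id" is a lookup into the grouped records.
groups = dict(group_by(records, "id"))
print(to_csv(groups[11], ["id", "type", "lat", "lon", "peak"]))
```

Parsing floats as `Decimal` mirrors the `Decimal('...')` values in the demo output and avoids rounding the coordinates on the way to CSV.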