MEZA DEMO (MERGING)
>>> records = io.read('meza_peaks.geojson')
>>> csv = cv.records2csv(records)
>>> io.write('meza_peaks.csv', csv)
$ pip install --user csvkit
$ csvlook meza_peaks.csv
| id | type  | lat     | lon     | peak  |
| -- | ----- | ------- | ------- | ----- |
| 10 | Point | -3.066… | 37.350… | uhuru |
| 11 | Point | -3.075… | 37.353… | kibo  |
Slide 24
CHALLENGE #2 SPLIT BY ID
Slide 25
MEZA DEMO (SPLIT BY ID)
>>> from meza import process as pr
>>>
>>> records = io.read('meza_peaks.geojson')
>>> groups = pr.group(records, 'id')
>>> name = 'peak_{}.geojson'
>>>
>>> for _id, _records in groups:
...     f = cv.records2geojson(_records)
...     io.write(name.format(_id), f)
Slide 27
$ ls peak_*
peak_10.geojson peak_11.geojson
MEZA DEMO (SPLIT BY ID)
Slide 28
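Note: the split-by-id step can also be sketched with the standard library alone. This is not how meza implements `pr.group`; it is a minimal illustration of the same idea, and the sample records and the `group_by` helper are hypothetical.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample records mirroring some of the demo's columns
records = [
    {'id': 10, 'type': 'Point', 'peak': 'uhuru'},
    {'id': 11, 'type': 'Point', 'peak': 'kibo'},
]


def group_by(records, key):
    """Yield (key value, list of records) pairs, one per distinct key value."""
    keyfunc = itemgetter(key)
    # groupby only merges adjacent items, so sort by the same key first
    for value, members in groupby(sorted(records, key=keyfunc), key=keyfunc):
        yield value, list(members)


groups = dict(group_by(records, 'id'))

# One output file name per group, following the demo's naming pattern
names = ['peak_{}.geojson'.format(_id) for _id in groups]
```

Each entry in `groups` holds the records that would go into one output file.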
CHALLENGE #3 EXTRACT BY ID
Slide 29
MEZA DEMO (EXTRACT BY ID)
>>> records = io.read('peaks.geojson')
>>> groups = pr.group(records, 'id')
>>> group = next(
...     g for g in groups if g[0] == 11)
Slide 31
MEZA DEMO (EXTRACT BY ID)
>>> records = io.read('peaks.geojson')
>>> groups = pr.group(records, 'id')
>>> group = next(
...     g for g in groups if g[0] == 11)
>>>
>>> csv = cv.records2csv(group[1])
>>> io.write('id_11_peaks.csv', csv)
Slide 32
$ csvlook id_11_peaks.csv
| id | type  | lat     | lon     | peak |
| -- | ----- | ------- | ------- | ---- |
| 11 | Point | -3.076… | 37.353… | kibo |
MEZA DEMO (EXTRACT BY ID)
Slide 33
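Note: for comparison, extracting one id and serializing it to CSV can be sketched without meza. The `extract` and `to_csv` helpers and the sample records below are hypothetical, and the sketch writes to an in-memory string rather than to `id_11_peaks.csv`.

```python
import csv
from io import StringIO

# Hypothetical sample records mirroring some of the demo's columns
records = [
    {'id': 10, 'type': 'Point', 'peak': 'uhuru'},
    {'id': 11, 'type': 'Point', 'peak': 'kibo'},
]


def extract(records, key, value):
    """Return only the records whose `key` field equals `value`."""
    return [r for r in records if r[key] == value]


def to_csv(records):
    """Serialize a non-empty list of dicts to CSV text, header first."""
    buf = StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=list(records[0]), lineterminator='\n')
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


text = to_csv(extract(records, 'id', 11))
```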
BUT WAIT,
THERE'S MORE!
MEZA DEMO
Slide 34
CHALLENGE #4 EXTRACT BY ID V2
Slide 35
MEZA DEMO (EXTRACT BY ID V2)
>>> from urllib.request import urlopen
>>>
>>> BASE = 'https://raw.githubusercontent.com'
>>> REPO = 'drei01/geojson-world-cities'
>>> path = '{}/{}/master/cities.geojson'
>>> url = path.format(BASE, REPO)
>>> f = urlopen(url)
>>> records = io.read_geojson(f)
Slide 37
MEZA DEMO (EXTRACT BY ID V2)
>>> next(records)
{'NAME': 'TORSHAVN',
 'id': None,
 'lat': Decimal('62.015167236328125'),
 'lon': Decimal('-6.758638858795166'),
 'pos': 0,
 'type': 'Polygon'}
Slide 38
MEZA DEMO (EXTRACT BY ID V2)
>>> clean = (
...     r for r in records if r.get('NAME'))
>>>
>>> splits = pr.split(
...     clean, 'NAME', chunksize=1024)
>>>
>>> b_splits = (
...     s for s in splits if 'BASE' in s[1])
>>>
>>> name = 'base_cities.csv'
Slide 39
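Note: the idea on this slide is to filter lazily and work in bounded chunks so that a large file never has to sit in memory at once. A minimal standard-library sketch of that idea follows. It does not reproduce `pr.split` (which also splits on the key); the `chunked` helper and the sample records are hypothetical.

```python
from itertools import islice


def chunked(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


# Hypothetical sample records; one has no NAME, as in the real data
records = [
    {'NAME': 'BASEL'},
    {'NAME': None},
    {'NAME': 'TORSHAVN'},
    {'NAME': 'MATSUBASE'},
    {'NAME': 'BASEL'},
]

# Generators: nothing is evaluated until the chunks are consumed
clean = (r for r in records if r.get('NAME'))
matches = (r for r in clean if 'BASE' in r['NAME'])

chunks = list(chunked(matches, 2))
```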
MEZA DEMO (EXTRACT BY ID V2)
>>> for pos, split in enumerate(b_splits):
...     f = cv.records2csv(
...         split[0], skip_header=pos)
...
...     io.write(name, f, mode='ab+')
Slide 40
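Note: `skip_header=pos` is falsy only for the first chunk (`pos == 0`), so the header is written once and every later chunk is appended as data rows only. The same pattern can be sketched with the standard `csv` module; the sample chunks and the temporary output path are hypothetical.

```python
import csv
import os
import tempfile

# Hypothetical chunks of records, standing in for the filtered splits
chunks = [
    [{'NAME': 'BASEL'}, {'NAME': 'BASEL'}],
    [{'NAME': 'MATSUBASE'}],
]

# Write into a fresh temporary directory so reruns start from an empty file
path = os.path.join(tempfile.mkdtemp(), 'base_cities.csv')

for pos, chunk in enumerate(chunks):
    # Append mode: each chunk is added to the end of the same file
    with open(path, 'a', newline='') as f:
        writer = csv.DictWriter(
            f, fieldnames=['NAME'], lineterminator='\n')
        if pos == 0:
            # Only the first chunk carries the header row
            writer.writeheader()
        writer.writerows(chunk)

with open(path, newline='') as f:
    lines = f.read().splitlines()
```

Because the file is opened in append mode, rerunning against an existing file would add the header and rows a second time, which is why the sketch writes to a new temporary directory.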
$ csvstat base_cities.csv | tail -n12
  6. "NAME"
        Unique values: 4
        Most common values: BASEL (102x)
                            KABASELE-PANIA (23x)
                            MATSUBASE (17x)
                            WABBASEKA (10x)
Row count: 152
MEZA DEMO (EXTRACT BY ID V2)