Geospatial three amigos: Python, Leaflet, and ElasticSearch

Roberto Rosario

April 07, 2017

Transcript

  1. Problems with monolithic solutions

    Some say software is more social than technical. If that’s the case, then we have to deal with a social reality.
  2. Problems with monolithic solutions

    • Easy to get started, hard to maintain
    • Geospatial support in Django changes a lot
    • Packaging of geospatial libraries is well...
    • Only Postgres is feasible
    • Django ORM is too slow for real-time geospatial
    • Serializing to GeoJSON in Django is slow
    • No native support for serializing to GeoJSON
    • No REST API querying solution
    • Indexing is a nightmare
    • Django ORM is not meant to be dynamic
  3. Problems with monolithic solutions

    In conclusion: Django is not a good platform for open data geospatial applications.
  4. Problems with monolithic solutions

    • Admin interface
    • Templating
    • URL router
    • Forms
    • Migrations
    • File storage
    • Test framework
    • Validators
  5. Problems with monolithic solutions

    When it comes to open data geospatial apps, they are as useful as...
  6. 95% of use cases

    Most spatial apps are lightweight on the frontend and the backend. Most spatial apps just do a simple fetch of geometries.
  7. 95% of use cases

    Do the heavy lifting during data loading, so that retrieval and usage are fast and lightweight.
  8. Layout

    Free software is not just an idea but an ecosystem with a lot of software available. How much, you ask?
  9. Layout

    Use Docker as a packaging solution. All elements of our stack are already available as images.
  10. Layout

    Use GeoJSON. It is simpler and better supported. CartoDB is cool, and probably better, but our frontend won’t be doing projection transformations.
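Because the frontend only consumes GeoJSON, the backend response has to be reshaped into a FeatureCollection at some point. The following is a minimal sketch, assuming the response layout shown in the Datastructure slide later in this deck (a `geojson` feature stored under `_source`, with coordinates kept as strings). The function name `hits_to_feature_collection` is hypothetical, and the sketch handles only Point geometries.

```python
def hits_to_feature_collection(response):
    """Turn an Elasticsearch search response into a GeoJSON FeatureCollection."""
    features = []
    for hit in response["hits"]["hits"]:
        stored = hit["_source"]["geojson"]
        geometry = stored["geometry"]
        if geometry["type"] != "Point":
            continue  # this sketch only handles points
        features.append({
            "type": "Feature",
            "geometry": {
                "type": "Point",
                # The sample stores coordinates as strings; GeoJSON expects numbers.
                "coordinates": [float(value) for value in geometry["coordinates"]],
            },
            "properties": stored.get("properties", {}),
        })
    return {"type": "FeatureCollection", "features": features}


# Trimmed copy of the sample response from the Datastructure slide.
response = {
    "hits": {
        "total": 270966,
        "hits": [
            {
                "_index": "location-app",
                "_source": {
                    "geojson": {
                        "type": "Feature",
                        "geometry": {
                            "type": "Point",
                            "coordinates": ["-66.438255310058494", "18.354152679443299"],
                        },
                        "properties": {"Status": "Expedido"},
                    }
                },
            }
        ],
    }
}

collection = hits_to_feature_collection(response)
print(collection["features"][0]["geometry"]["coordinates"])
```

The resulting dictionary can be serialized with `json.dumps` and handed to a GeoJSON layer on the map as-is.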
  11. Layout

    Keep your frontend code and design simple; spatial is complicated enough. Calcite Maps and jQuery are your friends.
  12. Layout

    When it comes to data: fetch only what you need. Your app may be fast, but network latency and throughput still apply. If you can’t reduce the data, compress it. This can be done almost transparently.
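To give a feel for how much compression can save on GeoJSON, here is a small sketch using Python's standard `gzip` module. The 500-point sample collection is made up for illustration. In practice the web server usually applies the compression, and browsers decompress responses sent with `Content-Encoding: gzip` on their own, which is what makes it almost transparent.

```python
import gzip
import json


def compress_geojson(feature_collection):
    """Serialize a GeoJSON dict compactly and return (raw bytes, gzipped bytes)."""
    raw = json.dumps(feature_collection, separators=(",", ":")).encode("utf-8")
    return raw, gzip.compress(raw)


# Hypothetical sample data: 500 points in a row, not taken from the deck.
features = [
    {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [-66.43 + i * 0.001, 18.35]},
        "properties": {"Status": "Expedido"},
    }
    for i in range(500)
]
collection = {"type": "FeatureCollection", "features": features}

raw, packed = compress_geojson(collection)
print("raw bytes:", len(raw), "gzipped bytes:", len(packed))
```

GeoJSON repeats the same keys for every feature, so it tends to compress well; the exact ratio depends on the data.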
  13. Layout

    Resist the temptation to do data processing on the frontend; not everybody has the same portable mainframe as you.
  14. Layout

    The browser is not a compute node. It is a process with inherent OS limits; it is just a viewer.
  15. Layout - Data loading

    Extract = Get the data
    Transform = Fix the data
    Load = Put it in your datastore
  16. Layout - Database

    • Easy to get started, hard to maintain → Use Docker images
    • Geospatial support in Django changes a lot → ElasticSearch
    • Packaging of geospatial libraries is well… → ElasticSearch
    • Only Postgres is feasible → ElasticSearch
    • Django ORM is too slow for real-time geospatial → ElasticSearch
    • Serializing to GeoJSON in Django is slow → ElasticSearch
    • No native support for serializing to GeoJSON → ElasticSearch
    • No REST API querying solution → ElasticSearch
    • Indexing is a nightmare → ElasticSearch
    • Django ORM is not meant to be dynamic → ElasticSearch
  17. Layout - Database

    • Lucene based
    • Distributed
    • Multitenant
    • Full text search
    • HTTP native
    • Schema-free JSON documents
    • Supports basic geospatial searches
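As an illustration of a basic geospatial search over HTTP, the sketch below builds a `geo_shape` query body that selects documents intersecting a map bounding box. It assumes the `coordinates` field seen in this deck's sample document is mapped as a `geo_shape` field; the helper name `bbox_query`, the default `size`, and the example bounds are hypothetical.

```python
import json


def bbox_query(west, south, east, north, field="coordinates", size=1000):
    """Build a search body for documents whose shape intersects a bounding box."""
    return {
        "size": size,
        "query": {
            "bool": {
                "filter": {
                    "geo_shape": {
                        field: {
                            "shape": {
                                "type": "envelope",
                                # Envelope corners: upper-left, then lower-right,
                                # each given as [longitude, latitude].
                                "coordinates": [[west, north], [east, south]],
                            }
                        }
                    }
                }
            }
        },
    }


# Hypothetical bounds around the sample point used in this deck.
body = bbox_query(-66.5, 18.3, -66.4, 18.4)
print(json.dumps(body, indent=2))
```

The body would be sent as JSON to the `_search` endpoint of the index (for example `location-app/_search` on whatever host runs ElasticSearch), so no ORM layer is needed between the map and the data.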
  18. Layout - Frontend

    • Simple
    • Fast
    • Stable
    • Good API documentation
    • Easy to extend via plugins
  19. Datastructure

    {
      "took": 10,
      "timed_out": false,
      "_shards": {"total": 5, "successful": 5, "failed": 0},
      "hits": {
        "total": 270966,
        "max_score": 1.0,
        "hits": [
          {
            "_index": "location-app",
            "_type": "permit",
            "_id": "AVsNr7iq_25KFlR_wOFL",
            "_score": 1.0,
            "_source": {
              "geojson": {
                "geometry": {
                  "type": "Point",
                  "coordinates": ["-66.438255310058494", "18.354152679443299"]
                },
                "type": "Feature",
                "properties": {
                  "Status": "Expedido",
                  "Catastro": "108-070-011-50",
                  "Publico o Privado": "Privado",
                  "Dueño del Proyecto": "EDWIN GONZÁLEZ VEGA",
                  "Caso": "2012-008787-PCO-67440",
                  "Trámite": "Permiso Construccion Cert.",
                  "Nombre del Proyecto": "BAP Reconstrucción Edificio Comercial",
                  "Costo Estimado": "$177076.00"
                }
              },
              "coordinates": {
                "type": "point",
                "coordinates": ["-66.438255310058494", "18.354152679443299"]
              }
            }
          }
        ]
      }
    }
  24. Python loader

    class Dataset(object):
        # Class methods: register, all, get, execute

        # Instance methods
        def get_filename(self): ...
        def download_file(self): ...
        def check_cache(self): ...

        def extract(self):
            raise NotImplementedError

        def transform(self):
            raise NotImplementedError

        def process(self): ...  # extract(), transform(), load()
        def iterator(self): ...
        def load(self): ...  # ES interface
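The loader outline above can be fleshed out along these lines. This is a sketch under stated assumptions, not the code from the talk: the method signatures are simplified, downloading and caching are left out, `PermitDataset` and its inline CSV are hypothetical, and the ES interface is replaced by a `sink` callable so the example runs without a cluster.

```python
import csv
import io


class Dataset(object):
    """Base class: subclasses supply extract() and transform()."""

    _registry = {}
    name = None
    index = "location-app"  # index name taken from the sample response

    @classmethod
    def register(cls, dataset_class):
        cls._registry[dataset_class.name] = dataset_class
        return dataset_class

    @classmethod
    def get(cls, name):
        return cls._registry[name]

    def extract(self):
        raise NotImplementedError

    def transform(self, rows):
        raise NotImplementedError

    def iterator(self):
        # Yield one bulk-style action per transformed document.
        for document in self.transform(self.extract()):
            yield {"_index": self.index, "_source": document}

    def load(self, sink):
        # `sink` stands in for the ES interface: any callable taking one action.
        count = 0
        for action in self.iterator():
            sink(action)
            count += 1
        return count

    def process(self, sink):
        # extract() and transform() run lazily inside iterator(); load() drives them.
        return self.load(sink)


@Dataset.register
class PermitDataset(Dataset):
    name = "permits"
    # Hypothetical inline CSV standing in for a downloaded file.
    source = "Caso,lon,lat\n2012-008787-PCO-67440,-66.4382553,18.3541527\n"

    def extract(self):
        return csv.DictReader(io.StringIO(self.source))

    def transform(self, rows):
        for row in rows:
            point = [float(row["lon"]), float(row["lat"])]
            yield {
                "geojson": {
                    "type": "Feature",
                    "geometry": {"type": "Point", "coordinates": point},
                    "properties": {"Caso": row["Caso"]},
                },
                "coordinates": {"type": "point", "coordinates": point},
            }


actions = []
loaded = Dataset.get("permits")().process(actions.append)
print(loaded, actions[0]["_source"]["coordinates"])
```

Doing the GeoJSON conversion in `transform()` follows the earlier advice to do the heavy lifting at load time, so the stored documents are already in the shape the frontend needs.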