Geospatial three amigos: Python, Leaflet, and ElasticSearch

Roberto Rosario

April 07, 2017

Transcript

  1. Geospatial three amigos: Python, Leaflet, and ElasticSearch Roberto Rosario

  2. Guest appearance: Docker

  3. Who am I?

  4. Who am I? robertorosario.com

  5. Who am I?

  6. Who am I?

  7. Who am I?

  8. My map work

  9. My map work

  10. Learned things the hard way.

  11. Problems with monolithic solutions https://www.youtube.com/watch?v=P4qCp_js2aA

  12. Problems with monolithic solutions

  13. Problems with monolithic solutions

  14. Problems with monolithic solutions

  15. Problems with monolithic solutions

  16. Problems with monolithic solutions

  17. Problems with monolithic solutions It was cool! And free! But...

  18. Problems with monolithic solutions

  19. Problems with monolithic solutions

  20. Problems with monolithic solutions

  21. Problems with monolithic solutions

  22. Problems with monolithic solutions Some say software is more social than technical. If that’s the case, then we have to deal with a social reality.

  23. Problems with monolithic solutions Hard stuff makes people feel dumb. Flashy, easy stuff makes them feel smart.

  24. Problems with monolithic solutions 95% of your product users are of the second type.

  25. Problems with monolithic solutions Code for the 95% of use cases.

  26. Problems with monolithic solutions Forget about the stuff that is cool to you.

  27. Problems with monolithic solutions Even if it is a framework you like.

  28. Problems with monolithic solutions
    • Easy to get started, hard to maintain
    • Geospatial support in Django changes a lot
    • Packaging of geospatial libraries is well...
    • Only Postgres is feasible
    • Django ORM is too slow for real time geospatial
    • Serializing to GeoJSON in Django is slow
    • No native support for serializing to GeoJSON
    • No REST API querying solution
    • Indexing is a nightmare
    • Django ORM is not meant to be dynamic

  29. Problems with monolithic solutions In conclusion: Django is not a good platform for open data geospatial applications.

  30. Problems with monolithic solutions

  31. Problems with monolithic solutions Don’t kill me just yet… Here is one more :)

  32. Problems with monolithic solutions
    • Admin interface
    • Templating
    • URL router
    • Forms
    • Migrations
    • File storage
    • Test framework
    • Validators

  33. Problems with monolithic solutions When it comes to open data geospatial apps, they are as useful as...

  34. Problems with monolithic solutions

  35. Problems with monolithic solutions Can I have your watch when you are dead?

  36. What is the 95% of use cases?

  37. 95% of use cases Most spatial apps are lightweight on the frontend and the backend. Most spatial apps just do a simple fetch of geometries.

  38. 95% of use cases Do the heavy lifting during data loading, so that retrieval and usage are fast and lightweight.

  39. Project guidelines

  40. Layout Free software is not just an idea but an ecosystem with a lot of software available. How much, you ask?

  41. Layout

  42. Layout Use Docker as a packaging solution. All elements of our stack are already available as images.

  43. Layout Use GeoJSON. Simpler, better supported. CartoDB is cool, and probably better, but our frontend won’t be doing projection transformations.

  44. Layout Use geometries for all elements.

  45. Layout Nowadays, there is no point in using just points :)

  46. Layout

  47. Layout Keep your frontend code and design simple; spatial is complicated enough. Calcite Maps and jQuery are your friends.

  48. Layout

  49. Layout When it comes to data: fetch only what you need. Your app may be fast, but there is still network latency and throughput. If you can’t reduce the data, compress it. This can be done almost transparently.

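To make the compression point concrete, here is a minimal sketch using only Python's standard library. The feature collection is made-up sample data, and the helper names `build_sample_features` and `compress_geojson` are hypothetical. In a stack like the one in this deck, compression would presumably be switched on in the web server rather than done by hand; the sketch only shows how much a repetitive GeoJSON payload can shrink.

```python
import gzip
import json


def build_sample_features(count):
    """Build a hypothetical GeoJSON FeatureCollection with `count` points."""
    features = []
    for i in range(count):
        features.append({
            "type": "Feature",
            "geometry": {
                "type": "Point",
                "coordinates": [-66.0 - i * 0.001, 18.0 + i * 0.001],
            },
            "properties": {"Status": "Expedido", "Caso": "case-%d" % i},
        })
    return {"type": "FeatureCollection", "features": features}


def compress_geojson(collection):
    """Serialize compactly, then gzip. Returns (raw_bytes, gzipped_bytes)."""
    raw = json.dumps(collection, separators=(",", ":")).encode("utf-8")
    return raw, gzip.compress(raw)


raw, packed = compress_geojson(build_sample_features(1000))
print(len(raw), len(packed))
```

The repeated keys and property values in GeoJSON are what make it compress well, so the gzipped payload should come out several times smaller than the raw one.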
  50. Layout Resist the temptation to do data processing on the frontend; not everybody has the same portable mainframe as you.

  51. Layout The browser is not a compute node; it is a process with inherent OS limits. It is just a viewer.

  52. Layout Framework, Compiler, Transpiler, Code Translator, etc, etc, etc.

  53. Layout

  54. Layout For every library you link you also include unused code.

  55. Layout A.K.A. “Dead weight”

  56. Layout

  57. Layout

  58. Layout

  59. Layout Too many data points? Use layers and clustering.

  60. Layout

  61. Layout

  62. Layout

  63. Layout

  64. Layout Don’t over-engineer your app. It’s an app, not a platform.

  65. Layout - Data loading ETL

  66. Layout - Data loading ETL = Extract, Transform, Load

  67. Layout - Data loading
    Extract = Get the data
    Transform = Fix the data
    Load = Put it in your datastore

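The three ETL steps can be sketched as three small functions chained together. This is an illustration under stated assumptions, not the loader from the talk: the CSV sample, its column names, and the rule for dropping rows are invented, it uses the standard `csv` module rather than the pETL library named later in the deck, and a plain list stands in for the datastore.

```python
import csv
import io

# Hypothetical sample input; the second row has no Status value.
SAMPLE_CSV = (
    "Caso,Status,lat,lon\n"
    "A-1,Expedido,18.35,-66.43\n"
    "A-2,,18.30,-66.36\n"
)


def extract(text):
    """Extract: get the data (here, parse rows out of CSV text)."""
    return csv.DictReader(io.StringIO(text))


def transform(rows):
    """Transform: fix the data (drop incomplete rows, build GeoJSON features)."""
    for row in rows:
        if not row["Status"]:
            continue
        yield {
            "type": "Feature",
            "geometry": {
                "type": "Point",
                # GeoJSON positions are [longitude, latitude].
                "coordinates": [float(row["lon"]), float(row["lat"])],
            },
            "properties": {"Caso": row["Caso"], "Status": row["Status"]},
        }


def load(features, datastore):
    """Load: put it in your datastore (a list stands in for the real one)."""
    for feature in features:
        datastore.append(feature)
    return len(datastore)


datastore = []
count = load(transform(extract(SAMPLE_CSV)), datastore)
print(count)
```

Because `transform` is a generator, rows flow through one at a time, which keeps memory use flat even for large files.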
  68. Layout - Data loading

  69. Layout - Database
    • Easy to get started, hard to maintain → Use Docker images
    • Geospatial support in Django changes a lot → ElasticSearch
    • Packaging of geospatial libraries is well… → ElasticSearch
    • Only Postgres is feasible → ElasticSearch
    • Django ORM is too slow for real time geospatial → ElasticSearch
    • Serializing to GeoJSON in Django is slow → ElasticSearch
    • No native support for serializing to GeoJSON → ElasticSearch
    • No REST API querying solution → ElasticSearch
    • Indexing is a nightmare → ElasticSearch
    • Django ORM is not meant to be dynamic → ElasticSearch

  70. Layout - Database

  71. Layout - Database
    • Lucene based
    • Distributed
    • Multitenant
    • Full text search
    • HTTP native
    • Schema-free JSON documents
    • Supports basic geospatial searches

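Documents can be schema-free, but a geospatial search needs the geometry field mapped as `geo_shape`, because dynamic mapping does not detect shapes on its own. A minimal sketch of an index body that does this follows. The field name `coordinates` is taken from the query shown later in the deck; the helper name `build_index_body` is hypothetical, and the typeless layout used here is an assumption that fits newer ElasticSearch versions (older ones nest `properties` under a mapping type name).

```python
import json


def build_index_body(shape_field="coordinates"):
    """Return an index-creation body that maps one field as geo_shape.

    All other fields are left to dynamic mapping, which is what keeps
    the rest of the document schema-free.
    """
    return {
        "mappings": {
            "properties": {
                shape_field: {"type": "geo_shape"},
            }
        }
    }


body = build_index_body()
print(json.dumps(body, indent=2))
```

The resulting JSON would be sent once, when the index is created, before any documents are loaded.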
  72. Layout - Frontend

  73. Layout - Frontend
    • Simple
    • Fast
    • Stable
    • Good API documentation
    • Easy to extend via plugins

  74. Layout - Final: CSV, Shape, pETL, Elastic, Calcite, jQuery, Leaflet, Webserver (NGINX), GeoJSON

  75. Datastructure

  76. Datastructure

  77. Datastructure

  78. Datastructure

  79. Datastructure

  80. Datastructure

  81. Datastructure

  82. Datastructure

  83. Datastructure http://127.0.0.1:9200/_search/?source={"query":{"geo_shape":{"coordinates":{"shape":{"type":"circle","coordinates":[-66.36703491210938,18.302380604025146],"radius":1000}}}}}
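The query on this slide, and the handling of the response shown on the next one, can be sketched in Python with the standard library only. This is an illustration, not code from the talk: the function names are hypothetical, `SAMPLE_RESPONSE` is a trimmed stand-in with the same shape as the slide's response, and no request is sent. Whether a given ElasticSearch version accepts the `circle` shape or the `source` URL parameter is not something this sketch verifies.

```python
import json
from urllib.parse import urlencode


def circle_query(lon, lat, radius, field="coordinates"):
    """Build a geo_shape query body with the structure shown on the slide."""
    return {"query": {"geo_shape": {field: {"shape": {
        "type": "circle",
        "coordinates": [lon, lat],
        "radius": radius,
    }}}}}


def search_url(base_url, query):
    """Put the query body into the `source` URL parameter, percent-encoded."""
    body = json.dumps(query, separators=(",", ":"))
    return base_url + "/_search/?" + urlencode({"source": body})


def hits_to_feature_collection(response):
    """Collect each hit's stored GeoJSON feature into a FeatureCollection.

    The stored point coordinates are strings, while GeoJSON expects
    numbers, so they are converted to floats on a copy of the feature.
    """
    features = []
    for hit in response["hits"]["hits"]:
        feature = dict(hit["_source"]["geojson"])
        geometry = dict(feature["geometry"])
        if geometry["type"] == "Point":
            geometry["coordinates"] = [float(v) for v in geometry["coordinates"]]
        feature["geometry"] = geometry
        features.append(feature)
    return {"type": "FeatureCollection", "features": features}


# Trimmed stand-in for a search response; only the parts used above are kept.
SAMPLE_RESPONSE = {"hits": {"total": 1, "hits": [{"_source": {"geojson": {
    "type": "Feature",
    "geometry": {"type": "Point",
                 "coordinates": ["-66.438255310058494", "18.354152679443299"]},
    "properties": {"Status": "Expedido"},
}}}]}}

query = circle_query(-66.36703491210938, 18.302380604025146, 1000)
url = search_url("http://127.0.0.1:9200", query)
collection = hits_to_feature_collection(SAMPLE_RESPONSE)
print(url)
print(json.dumps(collection, indent=2))
```

The printed URL differs from the slide only in that the braces and quotes are percent-encoded, which is what a browser would do with the raw form anyway.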

  84. Datastructure
    {"took": 10, "timed_out": false,
     "_shards": {"total": 5, "successful": 5, "failed": 0},
     "hits": {"total": 270966, "max_score": 1.0, "hits": [
       {"_index": "location-app", "_type": "permit", "_id": "AVsNr7iq_25KFlR_wOFL", "_score": 1.0,
        "_source": {
          "geojson": {
            "geometry": {"type": "Point", "coordinates": ["-66.438255310058494", "18.354152679443299"]},
            "type": "Feature",
            "properties": {
              "Status": "Expedido",
              "Catastro": "108-070-011-50",
              "Publico o Privado": "Privado",
              "Dueño del Proyecto": "EDWIN GONZÁLEZ VEGA",
              "Caso": "2012-008787-PCO-67440",
              "Trámite": "Permiso Construccion Cert.",
              "Nombre del Proyecto": "BAP Reconstrucción Edificio Comercial",
              "Costo Estimado": "$177076.00"}},
          "coordinates": {"type": "point", "coordinates": ["-66.438255310058494", "18.354152679443299"]}}}]}}

  89. Python loader
    class Dataset(object):
        # Class methods (bodies not shown on the slide): register, all, get, execute

        # Instance methods (bodies not shown on the slide)
        def get_filename(self): pass
        def download_file(self): pass
        def check_cache(self): pass

        def extract(self):
            raise NotImplementedError

        def transform(self):
            raise NotImplementedError

        def process(self): pass  # extract(), transform(), load()
        def iterator(self): pass
        def load(self): pass  # ES interface

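The slide only lists method names, so the following is one possible way to fill them in, not the talk's implementation. Assumptions: `register` is used as a class decorator, `transform` takes a single record (the slide shows it without arguments), the download and cache methods are left out, and a plain list stands in for the ElasticSearch interface. The `Permits` subclass and its sample record are invented.

```python
class Dataset(object):
    """Base loader: subclasses supply extract() and transform()."""

    _registry = {}

    @classmethod
    def register(cls, dataset_class):
        """Record a subclass by name so it can be looked up later."""
        cls._registry[dataset_class.__name__] = dataset_class
        return dataset_class

    @classmethod
    def all(cls):
        return list(cls._registry.values())

    @classmethod
    def get(cls, name):
        return cls._registry[name]

    def __init__(self, store):
        self.store = store  # stand-in for the ElasticSearch interface

    def extract(self):
        raise NotImplementedError

    def transform(self, record):
        raise NotImplementedError

    def iterator(self):
        """Yield one transformed document per extracted record."""
        for record in self.extract():
            yield self.transform(record)

    def load(self):
        for document in self.iterator():
            self.store.append(document)

    def process(self):
        """Run extract, transform, and load; return the number stored."""
        self.load()
        return len(self.store)


@Dataset.register
class Permits(Dataset):
    def extract(self):
        # Hypothetical record; a real dataset would read a downloaded file.
        return [{"Caso": "A-1", "lat": "18.35", "lon": "-66.43"}]

    def transform(self, record):
        return {
            "Caso": record["Caso"],
            "coordinates": {
                "type": "point",
                "coordinates": [float(record["lon"]), float(record["lat"])],
            },
        }


store = []
count = Dataset.get("Permits")(store).process()
print(count)
```

Keeping `extract` and `transform` abstract means each new data source is a small subclass, while the registry lets a command-line runner process every dataset with one loop over `Dataset.all()`.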
  90. Python loader

  91. Python loader

  92. Python loader

  93. Python loader

  94. Python loader

  95. Python loader

  96. Python loader

  97. Python loader

  98. Python loader • Parse date and time • Inverted coordinates
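Both fixes named on this slide are typical transform-step work. The sketch below assumes the source file stores dates as `MM/DD/YYYY HH:MM` text and coordinates as latitude then longitude, which is the reverse of GeoJSON's longitude, latitude order. The date format, the field names, and the function names are all assumptions made for illustration.

```python
from datetime import datetime


def parse_timestamp(text, fmt="%m/%d/%Y %H:%M"):
    """Parse a date and time string and return it in ISO 8601 form."""
    return datetime.strptime(text, fmt).isoformat()


def to_geojson_position(lat, lon):
    """Swap a (latitude, longitude) pair into GeoJSON [longitude, latitude]."""
    return [float(lon), float(lat)]


def fix_record(record):
    """Apply both fixes to one hypothetical source record."""
    return {
        "timestamp": parse_timestamp(record["date"]),
        "coordinates": to_geojson_position(record["lat"], record["lon"]),
    }


fixed = fix_record({"date": "04/07/2017 13:30", "lat": "18.354", "lon": "-66.438"})
print(fixed)
```

Doing this once at load time means the frontend never has to guess at date formats or axis order.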

  99. Python loader Spatial data file? You need the mini three amigos.

  100. Python loader

  101. Python loader Fiona, PyProj, and Shapely

  102. Python loader

  103. Python loader

  104. Python loader

  105. Python loader

  106. GIS theory is not that hard

  107. Python loader

  108. Python loader

  109. Frontend

  110. Frontend - Template

  111. Frontend - Template

  112. Frontend - Template

  113. Frontend - Template

  114. Frontend - ElasticSearch Query client Simple wrapper of jQuery’s .getJSON()

  115. Frontend - Dataset

  116. Frontend - Dataset

  117. Frontend - Dataset

  118. Frontend - Dataset

  119. Frontend - Dataset

  120. Frontend - Dataset

  121. Frontend - Dataset

  122. Frontend - Dataset

  123. Frontend - Dataset

  124. Frontend - Dataset

  125. Frontend - Dataset

  126. Frontend - Dataset

  127. Frontend - Dataset

  128. Three Amigos finally in sync.

  130. Next, delivery.

  131. Putting the three amigos in a single mule.

  133. A.k.a. “Packaging”

  134. Packaging

  135. Packaging

  136. Packaging

  137. Packaging A bit of security to avoid unfortunate consequences.

  138. Packaging

  139. Packaging

  141. We are almost there, hold on a bit longer.

  142. Deploying

  143. Deploying

  144. Deploying

  145. Deploying

  146. Deploying

  147. Deploying Easy to deploy, backup, restore, maintain, explain, and update.

  148. Deploying Happy dance time!

  149. Deploying

  150. Live performance a.k.a. “Demo”

  152. Questions?

  153. Thank you! Roberto Rosario RobertoRosario.com