How to analyze and visualize geo-data with the Elastic Stack

Elastic Co
June 26, 2017

Talk by Thomas Neirynck and Brandon Kobel at Code PaLOUsa on June 8, 2017.

Transcript

  1. How to analyze and visualize geo-data with the Elastic Stack
    Thomas Neirynck, Brandon Kobel

  2. What is the Elastic Stack?
    • Store and search data with Elasticsearch
    • Move data into Elasticsearch with
    − Logstash
    − Beats
    • Visualize data and administer the stack with Kibana

  3. What is the Elastic Stack used for?
    • Document search
    − Support for multiple languages
    • Log analytics
    − Server logs, application usage, time-based data
    • System monitoring
    − Real-time health watches

  4. What will we do in this presentation?
    • Full round-trip
    − Ingest data into Elasticsearch with Logstash
    − Build a Kibana application to generate insights
    • Pay attention to geo-features across the stack
    • … and enrich the analytical experience with machine learning
    • Goal: build an application to analyze traffic accident data in NYC

  5. The data source
    • NYC traffic accident data
    − https://opendata.cityofnewyork.us/
    − 1,000,000+ traffic incidents since July 2012
    • Tabular format
    − Fields indicate where and when, number of injuries and fatalities, and type of vehicles involved

  6. … but Elasticsearch requires JSON documents
    • Each document must conform to a `mapping`.
    − Field values have to correspond to a datatype (date, number, text, ...)
    − The mapping determines how values are indexed at ingest time (and this affects if/how they can be searched at query time)
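
    To make the idea of "conforming to a mapping" concrete, here is a minimal Python sketch. It is only an illustration, not how Elasticsearch validates documents (Elasticsearch can also coerce some values, which this toy check does not). All field names except `coords` are hypothetical, and `nonconforming_fields` is a helper invented for this example.

```python
from datetime import datetime

# Hypothetical mapping properties; only "coords" appears in the talk itself.
PROPERTIES = {
    "timestamp": {"type": "date"},
    "borough": {"type": "keyword"},
    "persons_injured": {"type": "integer"},
    "coords": {"type": "geo_point"},
}


def _is_date(value):
    # Accept ISO 8601 strings only; the real date type supports more formats.
    try:
        datetime.fromisoformat(value)
        return True
    except (TypeError, ValueError):
        return False


def _is_geo_point(value):
    # Only the object form with lat/lon keys is checked here.
    return (
        isinstance(value, dict)
        and isinstance(value.get("lat"), (int, float))
        and isinstance(value.get("lon"), (int, float))
        and -90 <= value["lat"] <= 90
        and -180 <= value["lon"] <= 180
    )


CHECKS = {
    "date": _is_date,
    "keyword": lambda v: isinstance(v, str),
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "geo_point": _is_geo_point,
}


def nonconforming_fields(doc, properties=PROPERTIES):
    """Return the names of mapped fields whose values do not fit their type."""
    return [
        name
        for name, value in doc.items()
        if name in properties and not CHECKS[properties[name]["type"]](value)
    ]
```

    A document with a malformed date or an out-of-range latitude would be reported by field name, which is roughly the kind of problem the ingest step has to prevent.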

  7. Field datatypes for geo-data
    • geo_point
    − Several representations
    ● Numeric: object with lat/lon keys, or [lon, lat] array of two numbers
    ● String: "lat,lon" string, or geohash
    − Supported by Kibana
    • geo_shape
    − Simple Feature data model (point, line, polygon, collections), envelopes,
    circles
    − Not supported by Kibana
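
    The geo_point representations listed above can be compared side by side with a small Python sketch. `parse_geo_point` is a hypothetical helper written for this example; it normalizes the object, array, and string forms to a `(lat, lon)` tuple and deliberately leaves out geohash strings. Note the ordering difference: the array form is `[lon, lat]`, while the string form is `"lat,lon"`.

```python
def parse_geo_point(value):
    """Return (lat, lon) from an object, a [lon, lat] array, or a "lat,lon" string."""
    if isinstance(value, dict):
        # Object form: {"lat": ..., "lon": ...}
        return float(value["lat"]), float(value["lon"])
    if isinstance(value, (list, tuple)):
        # Array form follows GeoJSON ordering: longitude first.
        lon, lat = value
        return float(lat), float(lon)
    if isinstance(value, str) and "," in value:
        # String form: latitude first.
        lat, lon = value.split(",")
        return float(lat), float(lon)
    raise ValueError(f"unsupported geo_point representation: {value!r}")


as_object = parse_geo_point({"lat": 40.753, "lon": -73.825516})
as_array = parse_geo_point([-73.825516, 40.753])
as_string = parse_geo_point("40.753,-73.825516")
```

    All three calls describe the same location, so they yield the same tuple.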

  8. Using Logstash for ingestion
    • What:
    − Transform data (e.g. tabular format → JSON document)
    − Ensure field values conform to the mapping
    − Store documents in Elasticsearch
    • `Pipeline`
    − Data source is a stream of events
    − Series of steps to transform these events (filters)
    − Configuration of this pipeline is programmable

  9. Logstash filter configuration (logstash.conf):

    ...
    filter {
      csv {
        columns => ["date","time","borough","zip_code","latitude","longitude",
                    ...]
      }
      ...
      # If the event contains both latitude and longitude
      if [latitude] and [longitude] {
        mutate { convert => {"latitude" => "float"} }
        mutate { convert => {"longitude" => "float"} }
        mutate { rename => {"latitude" => "[coords][lat]"} }
        mutate { rename => {"longitude" => "[coords][lon]"} }
      }
      ...
    }

    Mapping (excerpt):

    "properties" : {
      "coords" : {
        "type" : "geo_point"
      },
      ...
    }

    Resulting document (excerpt):

    "_source": {
      ...
      "coords": {
        "lon": -73.825516,
        "lat": 40.753
      },
      ...
    }

    Running the pipeline:

    > cat sourcedata.csv | /path/to/logstash/bin/logstash -f logstash.conf
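
    For readers unfamiliar with Logstash, the same transformation can be sketched in plain Python. This is only an analogy for what the csv and mutate filters do, not a replacement for the pipeline. The column list is limited to the six names visible on the slide (the rest are elided there), and the sample row is made up for illustration.

```python
import csv
import io

# Only the columns visible on the slide; the original list continues.
COLUMNS = ["date", "time", "borough", "zip_code", "latitude", "longitude"]


def to_document(row):
    """Turn one CSV row into a JSON-like document with a geo_point field."""
    doc = dict(zip(COLUMNS, row))
    # Mirror the conditional: only convert when both values are present.
    if doc.get("latitude") and doc.get("longitude"):
        doc["coords"] = {
            "lat": float(doc.pop("latitude")),
            "lon": float(doc.pop("longitude")),
        }
    return doc


def documents(stream):
    """Treat the CSV stream as a stream of events, one document per row."""
    for row in csv.reader(stream):
        yield to_document(row)


# Hypothetical sample rows, the second without coordinates.
sample = io.StringIO(
    "06/08/2017,8:15,QUEENS,11355,40.753,-73.825516\n"
    "06/08/2017,9:00,BRONX,,,\n"
)
docs = list(documents(sample))
```

    The first document ends up with a nested `coords` object, while the second keeps its empty latitude and longitude fields untouched.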

  10. Kibana is the window into the Elastic Stack
    • Index Patterns
    − Point Kibana to one or more indices in Elasticsearch that share the same mappings
    − Manage
    ● Time-based values
    ● Formatting of values for display
    ● Scripted fields for calculating values at query time (as opposed to Logstash transformation at ingest time)

  11. Kibana Visualizations
    • Use the Elasticsearch _search API
    − REST API with a JSON-based query language
    − Can aggregate results (similar to “group by” in SQL)
    • Display the results of aggregations, not the values of individual documents
    − This scales better
    − Different datatypes have different types of roll-up, e.g.
    ● seconds, minutes, hours for date values
    ● ranges for number values
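
    The roll-up idea can be illustrated locally, without a cluster. The Python sketch below imitates what a date-based and a range-based aggregation return: bucket keys with document counts rather than individual documents. The documents and field names are hypothetical, and the two functions are stand-ins written for this example, not Elasticsearch APIs.

```python
from collections import Counter
from datetime import datetime

# Hypothetical documents; field names are assumptions for illustration.
DOCS = [
    {"timestamp": "2017-06-08T08:15:00", "persons_injured": 0},
    {"timestamp": "2017-06-08T08:40:00", "persons_injured": 2},
    {"timestamp": "2017-06-08T17:05:00", "persons_injured": 1},
    {"timestamp": "2017-06-09T08:30:00", "persons_injured": 5},
]


def date_histogram(docs, field, bucket_format):
    """Count documents per time bucket; the format string sets the bucket size."""
    return Counter(
        datetime.fromisoformat(doc[field]).strftime(bucket_format) for doc in docs
    )


def range_buckets(docs, field, ranges):
    """Count documents per half-open numeric range [low, high)."""
    return {
        (low, high): sum(1 for doc in docs if low <= doc[field] < high)
        for low, high in ranges
    }


hourly = date_histogram(DOCS, "timestamp", "%Y-%m-%d %H:00")
daily = date_histogram(DOCS, "timestamp", "%Y-%m-%d")
by_injuries = range_buckets(DOCS, "persons_injured", [(0, 1), (1, 3), (3, 100)])
```

    Switching from `hourly` to `daily` changes only the bucket size, which is the same trade-off a visualization makes when it picks an interval.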

  12. Coordinate Map Visualization
    • Uses the “geohash” grid aggregation
    − String hash of a location, with a notion of precision/scale
    − Corresponds to a grid-cell area on the earth
    ● dng18: ±1.5 mile error
    ● dng18e8w: ±50 ft error
    • Uses geo-centroid positioning
    − Weighted center of the locations of all the results in the geohash grid cell
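
    How a geohash relates to precision, and how points roll up into grid cells with a centroid, can be sketched in Python. This uses the standard public geohash encoding (interleaved longitude/latitude bits, base-32 alphabet). `grid_centroids` is a hypothetical local stand-in for the grid-plus-centroid roll-up and uses a plain average of the points in each cell, which is an assumption made for illustration.

```python
from collections import defaultdict

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision):
    """Encode a location as a geohash string of the given length."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits, value = 0, 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        rng, coord = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if coord >= mid:
            value = value * 2 + 1
            rng[0] = mid  # keep the upper half
        else:
            value = value * 2
            rng[1] = mid  # keep the lower half
        use_lon = not use_lon
        bits += 1
        if bits == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[value])
            bits, value = 0, 0
    return "".join(chars)


def grid_centroids(points, precision):
    """Group (lat, lon) points by geohash cell and average them per cell."""
    cells = defaultdict(list)
    for lat, lon in points:
        cells[geohash_encode(lat, lon, precision)].append((lat, lon))
    return {
        cell: {
            "doc_count": len(members),
            "lat": sum(p[0] for p in members) / len(members),
            "lon": sum(p[1] for p in members) / len(members),
        }
        for cell, members in cells.items()
    }


coarse = geohash_encode(40.753, -73.825516, 5)
fine = geohash_encode(40.753, -73.825516, 8)
cells = grid_centroids([(40.0, -74.0), (41.0, -73.0)], precision=1)
```

    A longer geohash always starts with the shorter one for the same location, which is why precision maps directly to grid-cell size.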

  13. Detour: the Elastic Geo Service and X-Pack
    • Default map service and data used by Kibana
    − Road map image service
    − Example boundary data (world countries and US states)
    • Requires X-Pack install for access to all zoom levels
    • Link outside services to Kibana
    − Image services
    ● TMS: http://my.map.service/{z}/{x}/{y}.png
    ● OGC-WMS
    − GeoJSON boundary data
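
    The `{z}/{x}/{y}` placeholders in the tile URL are filled in per tile by the map client. A minimal Python sketch of that substitution follows. It assumes the common "slippy map" (XYZ) tile numbering in Web Mercator, where y counts down from the north; a strict TMS service numbers y from the south instead, so treat the y value as an assumption. `tile_url` is a hypothetical helper, not part of Kibana.

```python
import math


def tile_url(template, lat, lon, zoom):
    """Fill a {z}/{x}/{y} URL template for the tile containing a location."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so that edge coordinates (e.g. lon == 180) stay inside the grid.
    x = min(max(x, 0), n - 1)
    y = min(max(y, 0), n - 1)
    return template.format(z=zoom, x=x, y=y)


TEMPLATE = "http://my.map.service/{z}/{x}/{y}.png"
world = tile_url(TEMPLATE, 40.753, -73.825516, 0)
quadrant = tile_url(TEMPLATE, 40.753, -73.825516, 1)
```

    At zoom 0 the whole world is a single tile; each additional zoom level splits every tile into four.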

  14. Region Map Visualization
    • Create choropleth maps
    • Inner join of the results of a “terms” aggregation with reference shape data
    • Link a custom data service in config/kibana.yml (requires CORS support)

    regionmap:
      layers:
        - name: "NYC Boroughs (self-hosted)"
          url: "http://localhost/region_map/data/nyc_boroughs.json"
          fields:
            - name: "name"
              description: "Borough Name"
        - name: "NYC Council districts (self-hosted)"
          url: "http://localhost/region_map/data/nyc_councildistricts.json"
          fields:
            - name: "CounsilDist"
              description: "District #"
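
    The inner join behind a region map can be sketched in Python. The bucket shape (`key`, `doc_count`) follows what a terms aggregation returns; the GeoJSON content, the counts, and the `join_buckets_to_features` helper are made up for illustration. Because it is an inner join, buckets without a matching shape and shapes without a matching bucket are both dropped, and the match on the join field is exact.

```python
def join_buckets_to_features(buckets, geojson, join_field):
    """Inner-join terms-aggregation buckets with GeoJSON features on one property."""
    counts = {bucket["key"]: bucket["doc_count"] for bucket in buckets}
    joined = []
    for feature in geojson["features"]:
        key = feature["properties"].get(join_field)
        if key in counts:  # keep only features that have a matching bucket
            joined.append(
                {"key": key, "value": counts[key], "geometry": feature["geometry"]}
            )
    return joined


# Hypothetical aggregation result and boundary data.
buckets = [
    {"key": "Queens", "doc_count": 120},
    {"key": "Bronx", "doc_count": 80},
    {"key": "Unknown", "doc_count": 5},
]
boroughs = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"name": "Queens"}, "geometry": None},
        {"type": "Feature", "properties": {"name": "Bronx"}, "geometry": None},
        {"type": "Feature", "properties": {"name": "Brooklyn"}, "geometry": None},
    ],
}
shaded = join_buckets_to_features(buckets, boroughs, "name")
```

    In this example only Queens and Bronx survive the join, so those are the only regions that would be shaded.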

  15. Kibana - Dashboard Demo

  16. Elastic Cloud: hosted Elasticsearch and Kibana
    • Latest versions of Elasticsearch and Kibana
    • One-click scaling and upgrading; no downtime
    • Built-in security (auth, encryption, role-access)
    • Option for dedicated SLA support and X-Pack
    • The only offering created and managed by Elastic
    • Free Kibana and backups every 30 minutes

  17. Questions?
    @elastic
    www.elastic.co
