Searching OpenStreetMap with Elasticsearch

Christoph Lingg

November 24, 2015

Transcript

  1. Searching OpenStreetMap
    with Elasticsearch
    photon.komoot.de
    Christoph Lingg – [email protected]

  2. OpenStreetMap
    ● free
    ● editable map
    ● worldwide
    ● built by volunteers
    ● open-content licence

  3. OpenStreetMap

  4. OpenStreetMap

  5. Geocoding
    „Böcklerpark“
    Coordinate(10.1 50.2)
    Park
    Kreuzberg, Berlin, Germany
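
A geocoder turns a free-text query such as „Böcklerpark“ into a coordinate, a place type, and an address context. As a rough sketch, a request against the public Photon API could be assembled as below. The host is taken from the slides; the `/api` path and the `q`, `lang`, and `limit` parameters are assumptions about the public service, and `build_photon_url` is a hypothetical helper. No request is sent.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Host from the deck; the "/api" path is an assumption, not shown on the slides.
PHOTON_API = "https://photon.komoot.de/api"


def build_photon_url(query, lang="en", limit=5):
    """Hypothetical helper: build a geocoding request URL for a text query."""
    params = {"q": query, "lang": lang, "limit": limit}
    return PHOTON_API + "?" + urlencode(params)


url = build_photon_url("Böcklerpark", lang="de", limit=1)
print(url)

# Round trip: decoding the query string gives back the original search text.
decoded = parse_qs(urlsplit(url).query)
print(decoded["q"][0])
```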

  6. Nominatim
    ● started many years ago
    ● standard solution for OSM
    ● based on Postgres/Postgis, SQL, PHP, C
    ● powerful + mature
    ● drawbacks
    ○ slow
    ○ no typo tolerance
    ○ no partial match / search-as-you-type

  7. Closing the Gap
    ● suitable for mobile usage
    ● “built on the shoulders of a giant”
    → elasticsearch
    ● open source
    ● free public API
    ● enhance the OSM ecosystem

  8. OSM Entry
    {
      "name": "VIASKO",
      "housenumber": "33a",
      "amenity": "restaurant",
      "coordinate": [10, 52]
    }
    Photon Document
    {
      "name": "VIASKO",
      "housenumber": "33a",
      "amenity": "restaurant",
      "coordinate": [10, 52],
      "street": "Bahnhofstraße",
      "postcode": "10403",
      "city": "Berlin",
      "state": "Berlin",
      "country": "Germany",
      "context": "Kreuzberg",
      "importance": 0.1
    }

  9. Photon Importer
    Nominatim DB (Postgis) → Photon Importer → Elasticsearch Index
    Example: VIASKO, Bahnhofstraße, Postcode 10403, City Berlin, State Berlin, Country Germany
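
The importer step can be pictured as merging a bare OSM entry with address fields taken from the Nominatim database. A minimal sketch with plain dictionaries follows; `to_photon_document` and the shape of `address` are hypothetical and only mirror the field names of the example document, not the actual importer code.

```python
def to_photon_document(osm_entry, address, importance):
    """Hypothetical helper: enrich an OSM entry with address fields."""
    doc = dict(osm_entry)           # copy so the input stays untouched
    doc.update(address)             # street, postcode, city, state, country, context
    doc["importance"] = importance  # ranking value used later for scoring
    return doc


osm_entry = {
    "name": "VIASKO",
    "housenumber": "33a",
    "amenity": "restaurant",
    "coordinate": [10, 52],
}
address = {
    "street": "Bahnhofstraße",
    "postcode": "10403",
    "city": "Berlin",
    "state": "Berlin",
    "country": "Germany",
    "context": "Kreuzberg",
}

photon_doc = to_photon_document(osm_entry, address, importance=0.1)
print(photon_doc["name"], photon_doc["street"], photon_doc["city"])
```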

  10. Type-ahead search
    "filter": {
      "photonngram": {
        "min_gram": "1",
        "type": "edgeNGram",
        "max_gram": "100"
      },...
    }
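
An edge n-gram filter indexes every prefix of a token, so a partially typed query like "ber" matches an indexed term exactly. The sketch below approximates what such a filter emits for one token. It is an illustration in plain Python, not Elasticsearch's implementation, and `edge_ngrams` is a hypothetical name.

```python
def edge_ngrams(token, min_gram=1, max_gram=100):
    """Return the prefixes of `token` with lengths min_gram..max_gram."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


print(edge_ngrams("berlin"))
# ['b', 'be', 'ber', 'berl', 'berli', 'berlin']
```

With `min_gram` set to 1, even a single typed character already has a matching indexed term, which is what makes search-as-you-type possible.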

  11. Typo Tolerance
    "query": {
      "bool": {
        "must": {
          "match": {
            "collector.default": {
              "fuzziness": 1,
              "query": "${query}",
              "analyzer": "search_ngram",
              "prefix_length": 2
            }
          }
        },
        "should": {
          "match": {
            "collector.${lang}.raw": {
              "query": "${query}",
              "boost": 100,
              "analyzer": "search_raw"
            }
          }
        }
      }
    }
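
The effect of `"fuzziness": 1` combined with `"prefix_length": 2` can be sketched as: the first two characters must match exactly, and the whole term may differ by at most one edit. The sketch uses plain Levenshtein distance; Elasticsearch's own matching may additionally count a transposition as a single edit. `levenshtein` and `fuzzy_match` are hypothetical helpers.

```python
def levenshtein(a, b):
    """Edit distance counting insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def fuzzy_match(query, term, fuzziness=1, prefix_length=2):
    """True if the prefix matches exactly and the edit distance is small enough."""
    if query[:prefix_length] != term[:prefix_length]:
        return False
    return levenshtein(query, term) <= fuzziness


print(fuzzy_match("berlim", "berlin"))  # one substitution, same prefix
print(fuzzy_match("merlin", "berlin"))  # typo inside the protected prefix
```

Requiring an exact prefix keeps the number of candidate terms small, at the cost of not tolerating typos in the first two characters.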

  12. Location biased
    "score_mode": "multiply",
    "functions": [
      {
        "script_score": {
          "script": "1 + doc['importance'].value * 100"
        }
      },
      {
        "script_score": {
          "script": "
            dist = doc['coordinate'].distanceInKm(lat, lon);
            0.5 + ( 1.5 / (1.0 + dist/40.0) )",
          "params": {
            "lat": "${lat}",
            "lon": "${lon}"
          }
        }
      }
    ]
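
The two script functions can be traced by hand: one factor grows with a document's importance, the other decays with distance from the user's location, and `"score_mode": "multiply"` multiplies them. The sketch below reproduces that arithmetic in Python. The haversine formula stands in for `distanceInKm` as an assumption about how the distance is measured, and all function names are hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km (assumed stand-in for distanceInKm)."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))


def importance_factor(importance):
    """First script: 1 + importance * 100."""
    return 1 + importance * 100


def distance_factor(dist_km):
    """Second script: 2.0 at the user's location, approaching 0.5 far away."""
    return 0.5 + 1.5 / (1.0 + dist_km / 40.0)


def score_multiplier(importance, dist_km):
    """score_mode "multiply": the product of both function values."""
    return importance_factor(importance) * distance_factor(dist_km)


print(distance_factor(0.0))   # 2.0
print(distance_factor(40.0))  # 1.25
```

A result 40 km away therefore keeps 1.25 of its 2.0 maximum distance factor, and very distant results never drop below 0.5, so important places far away can still rank.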

  13. Photon Project
    ● 100 million documents
    ● 60 GB search index
    ● easy install
    ● photon.komoot.de
    ● github.com/komoot/photon
    ● want to be informed about the next sprint?
    [email protected]

  14. komoot.de/hack15
