
Day1-1410-Challenges in geonames and address extraction

sotm2017
September 01, 2017

Transcript

  1. Challenges in geonames
    and address extraction
    Prof. Stefan Keller

    Geometa Lab HSR

    University of Applied Sciences 

    Rapperswil (Switzerland)


  2. Agenda
    • Motivation

    • Geonames

    • Addresses

    • Issues


  3. Geoname search:
    Where is Aizu-Wakamatsu?

    Address geocoding:
    Tokyo Central Post Office
    5-3, Yaesu 1-Chome
    Chuo-ku, Tokyo 100-8994


  4. Search components
    • Data



    • Data pre-processing software


    • Search engine software


  5. Geonames with
    containment hierarchy


  6. Geoname Ex.: Aizu


  7. Nominatim.osm.org


  8. OSMNames.org


  9. Enriched Geonames Data
    "name": "会津若松市",
    "alternative_names": "Aizuwakamatsu, Айдзувакамацу",
    "street": "",
    "county": "Fukushima",
    "city": "",
    "state": "Fukushima",
    "country": "Japan",
    "boundingbox": [139.8389,37.3229,140.1133,37.5831],
    "osm_id": "4174424",
    "type": "administrative",
    "importance": 0.4,

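As a minimal sketch of how an enriched record like the one on slide 9 might be consumed, the Python snippet below tests whether a coordinate lies inside the record's bounding box and builds a short label from the hierarchy fields. The function names `bbox_contains` and `hierarchy_label` are hypothetical, and the bounding box order (west, south, east, north) is an assumption read off the values on the slide.

```python
# Record copied from the slide, wrapped in a dict so that it is valid Python.
record = {
    "name": "会津若松市",
    "alternative_names": "Aizuwakamatsu, Айдзувакамацу",
    "street": "",
    "county": "Fukushima",
    "city": "",
    "state": "Fukushima",
    "country": "Japan",
    "boundingbox": [139.8389, 37.3229, 140.1133, 37.5831],
    "osm_id": "4174424",
    "type": "administrative",
    "importance": 0.4,
}


def bbox_contains(bbox, lon, lat):
    """Return True if (lon, lat) lies inside bbox.

    Assumption: bbox is ordered [west, south, east, north].
    """
    west, south, east, north = bbox
    return west <= lon <= east and south <= lat <= north


def hierarchy_label(rec):
    """Join the non-empty hierarchy fields, skipping repeated values."""
    parts = []
    for key in ("name", "street", "city", "county", "state", "country"):
        value = rec.get(key, "")
        if value and value not in parts:
            parts.append(value)
    return ", ".join(parts)
```

Calling `bbox_contains(record["boundingbox"], 139.93, 37.49)` with an approximate coordinate for the city centre gives a cheap containment pre-check; an exact test would need the polygon itself.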

  10. The power of hierarchy!
    1. country, national level (possibly the mainland only!)

    2. state, subnational level

    3. city

    4. county/town

    5. village / suburb / neighborhood
    All administrative divisions are polygons (well: almost…)


  11. What’s a name anyway?
    • Toponymy: endonym, exonym. Example:

    • name = 会津若松城

    • name:jp = 会津若松城 (Aizu-Wakamatsu-jō)

    • name:en = Aizu-Wakamatsu Castle

    • name:de = Burg Aizu-Wakamatsu

    • alt_name:en = Tsuruga Castle

    • alt_name:jp = 鶴ヶ城 (Tsuru-ga-jō)


  12. Issues in tagging geonames
    • Tag name: "Name is the name only": names are often misused to
    describe all kinds of things

    • Ranking of geonames!

    • Tag admin_level: There is no unified tagging yet in OSM for town
    parts and village parts

    • Issues in assigning hierarchy of city/town/village/suburb/
    neighborhood

    • How to deal with objects of larger areal extent? Currently they
    are often captured as a node: which node to choose? What is the
    extent? What is the bbox?


  13. (Postal Building) Addresses
    • Given: the list of street geonames as processed before,
    including hierarchy

    • Select all OSM objects (node, way, relation) with key
    "addr:housenumber" (Karlsruhe Schema)

    • Goal: Generate list of addresses pointing to a street
    (osm_id)
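The selection step above can be sketched in a few lines of Python. This is an illustration only: it assumes the OSM objects are already loaded as dictionaries with "type", "id" and "tags" keys (the shape used by Overpass API JSON output), and the function name `select_address_objects` is hypothetical.

```python
def select_address_objects(elements):
    """Keep only the OSM objects that carry an addr:housenumber tag.

    Assumption: each element is a dict with "type" (node/way/relation),
    "id" and an optional "tags" dict.
    """
    selected = []
    for element in elements:
        tags = element.get("tags", {})
        if "addr:housenumber" not in tags:
            continue
        selected.append({
            "osm_type": element["type"],
            "osm_id": element["id"],
            "housenumber": tags["addr:housenumber"],
            # The street name may be missing; it is resolved in a later step.
            "street_name": tags.get("addr:street") or tags.get("addr:place"),
        })
    return selected
```

The result is a flat list of address candidates that still has to be linked to a street osm_id.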


  14. Karlsruhe Schema
    Addresses can be tagged with key addr:housenumber and
    other addr keys on…

    • a node, isolated or otherwise (e.g. shops)

    • a node on top of a building boundary with tag entrance=yes

    • a node on a polygon with key building

    • a node on a polygon representing the perimeter of a site

    • a relation with type=associatedStreet

    • an invisible line (way) with key addr:interpolation


  15. Options to relate a house
    number to a street
    1. Does a relation that contains the house number exist?



    2. Do addr:street or addr:place exist and match directly?



    3. Apply fuzzy string/text search if street/place do not
    match.



    4. Apply street proximity search if there is no street/place.
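The four options can be read as a cascade. The sketch below is one possible Python rendering, not the speaker's implementation: the function name `relate_to_street`, the input dictionaries, the `street_by_member` lookup, and both thresholds are assumptions. Fuzzy matching uses `difflib.get_close_matches` from the standard library, and proximity is a crude distance in degrees between representative points, where real data would call for a projected or geodesic distance to the street geometry.

```python
import difflib
import math


def relate_to_street(address, streets, street_by_member=None,
                     fuzzy_cutoff=0.8, max_distance=0.002):
    """Return (street osm_id, step name), or (None, None) if nothing fits.

    address: dict with "osm_id", optional "addr:street"/"addr:place",
             and "lon"/"lat" (needed only for the proximity step).
    streets: list of dicts with "osm_id", "name", "lon", "lat".
    street_by_member: hypothetical lookup from an address osm_id to the
             street osm_id taken from a relation.
    """
    # 1. The house number is a member of a relation that names the street.
    if street_by_member and address["osm_id"] in street_by_member:
        return street_by_member[address["osm_id"]], "relation"

    # Note: streets sharing a name collapse to one entry in this sketch.
    by_name = {street["name"]: street["osm_id"] for street in streets}
    wanted = address.get("addr:street") or address.get("addr:place")
    if wanted:
        # 2. addr:street or addr:place matches a street name directly.
        if wanted in by_name:
            return by_name[wanted], "exact"
        # 3. Fuzzy string search when the name does not match directly.
        close = difflib.get_close_matches(wanted, list(by_name), n=1,
                                          cutoff=fuzzy_cutoff)
        if close:
            return by_name[close[0]], "fuzzy"
        return None, None

    # 4. No street/place tag at all: fall back to the nearest street.
    if not streets:
        return None, None

    def distance(street):
        return math.hypot(street["lon"] - address["lon"],
                          street["lat"] - address["lat"])

    nearest = min(streets, key=distance)
    if distance(nearest) <= max_distance:
        return nearest["osm_id"], "proximity"
    return None, None
```

Returning the step name alongside the osm_id makes it easy to count how many addresses were resolved by each option.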



  16. Not covered here
    • Ways with addr:interpolation tag

    • Nodes with associated_street tag

    • POIs without addr:* that get addr:* from buildings with addr:*

    • Zip codes

    • …


  17. Issues in tagging addresses
    • Misused name tags which introduce ambiguity

    • Sharing/deduplicating addr tags among objects inside a
    building polygon

    • addr nodes with entrance=yes sitting on top of a building
    (way): they have to give their addr to the building (and the
    building would have to give its addr to everything inside)

    • Special treatment of nodes with tag
    addr:housenumber=1;3;5 with values separated by
    semicolon
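The semicolon case from the last bullet could be handled by expanding one tagged object into one address per house number. The following Python sketch shows the idea; the function names `split_housenumbers` and `expand_address` are hypothetical, and dropping empty parts is an assumption about how stray semicolons should be treated.

```python
def split_housenumbers(value):
    """Split an addr:housenumber value such as "1;3;5" into single numbers.

    Surrounding whitespace is stripped and empty parts are dropped.
    """
    return [part.strip() for part in value.split(";") if part.strip()]


def expand_address(tags):
    """Return one copy of the tags per house number found in them."""
    numbers = split_housenumbers(tags.get("addr:housenumber", ""))
    return [{**tags, "addr:housenumber": number} for number in numbers]
```

For example, `expand_address({"addr:street": "Seestrasse", "addr:housenumber": "1;3;5"})` yields three address dictionaries that share the same street.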


  18. Message to go
    • Addresses are an important asset of OSM

    • Geonames too

    • Wishlist:

    • A more consistent assignment of admin levels is needed

    • More consistent name content

    • Discussion on how to map geonames of larger areal extent

    • …
