
Day1-1410-Challenges in geonames and address extraction

sotm2017
September 01, 2017


  1. Challenges in geonames and address extraction
     Prof. Stefan Keller
     Lab HSR, University of Applied Sciences Rapperswil (Switzerland)
  2. Agenda
     • Motivation
     • Geonames
     • Addresses
     • Issues

  3. Geoname search: Where is Aizu-Wakamatsu?
     Address geocoding: Tokyo Central Post Office, 5-3, Yaesu 1-Chome, Chuo-ku, Tokyo 100-8994
  4. Search components
     • Data
     • Data pre-processing software
     • Search engine software
  5. Geonames with containment hierarchy

  6. Geoname Ex.: Aizu

  7. Nominatim.osm.org

  8. OSMNames.org

  9. Enriched Geonames Data
     "name": "会津若松市",
     "alternative_names": "Aizuwakamatsu, Айдзувакамацу",
     "street": "",
     "county": "Fukushima",
     "city": "",
     "state": "Fukushima",
     "country": "Japan",
     "boundingbox": [139.8389,37.3229,140.1133,37.5831],
     "osm_id": "4174424",
     "type": "administrative",
     "importance": 0.4,
     …
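A bounding box like the one in the record above can be used for a cheap containment pre-check before any polygon test. The following is a minimal sketch, not the method used by OSMNames or Nominatim. It assumes the four values are ordered west, south, east, north, which is inferred only from the magnitudes in the example; `bbox_contains` is a hypothetical helper name.

```python
def bbox_contains(bbox, lon, lat):
    """Return True if the point (lon, lat) lies inside or on the edge of bbox.

    Assumption: bbox is [west, south, east, north] in degrees, as the
    values in the example record suggest.
    """
    west, south, east, north = bbox
    return west <= lon <= east and south <= lat <= north


# Bounding box copied from the example record.
aizu_bbox = [139.8389, 37.3229, 140.1133, 37.5831]

inside = bbox_contains(aizu_bbox, 139.93, 37.49)   # a point within the box
outside = bbox_contains(aizu_bbox, 139.69, 35.68)  # a point far to the south
```

A box test only narrows down candidates; since administrative divisions are polygons, a real containment check would still need the polygon geometry.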
  10. The power of hierarchy!
     1. country, national level (possibly mainland only!)
     2. state, subnational level
     3. city
     4. county/town
     5. village / suburb / neighborhood
     All administrative divisions are polygons (well: almost…)
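One thing the hierarchy makes possible is a readable, disambiguated label for each search hit. The sketch below walks the levels from most specific to most general, in the order listed on the slide, and skips empty or repeated values. The field names follow the example record shown earlier, except `village`, which is an assumed field; `display_name` is a hypothetical helper, not part of any existing tool.

```python
# Levels from most specific to most general, following the slide's order.
# "village" is an assumed field name; the others appear in the example record.
LEVELS_SPECIFIC_TO_GENERAL = ["village", "county", "city", "state", "country"]


def display_name(record):
    """Build a label such as 'name, county, state, country' from a record."""
    parts = [record.get("name", "")]
    for key in LEVELS_SPECIFIC_TO_GENERAL:
        value = record.get(key, "")
        # Skip empty levels and values that merely repeat the previous part.
        if value and value != parts[-1]:
            parts.append(value)
    return ", ".join(part for part in parts if part)


record = {
    "name": "会津若松市",
    "street": "",
    "county": "Fukushima",
    "city": "",
    "state": "Fukushima",
    "country": "Japan",
}
label = display_name(record)
```

Dropping repeated values is a design choice for this sketch: in the example, county and state carry the same string, and showing it twice adds nothing.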
  11. What’s a name anyway?
     • Toponymy: endonym, exonym. Example:
     • name = 会津若松城
     • name:jp = 会津若松城 (Aizu-Wakamatsu-jō)
     • name:en = Aizu-Wakamatsu Castle
     • name:de = Burg Aizu-Wakamatsu
     • alt_name:en = Tsuruga Castle
     • alt_name:jp = 鶴ヶ城 (Tsuru-ga-jō)
  12. Issues in tagging geonames
     • Tag name: "Name is the name only". Names are often misused to describe all kinds of things
     • Ranking of geonames!
     • Tag admin_level: There is no unified tagging yet in OSM for town parts and village parts
     • Issues in assigning the hierarchy of city/town/village/suburb/neighborhood
     • How to deal with objects of larger areal extent? Currently they are often captured as a node: which one to choose? What is the extent? What is the bbox?
  13. Addresses

  14. (Postal Building) Addresses
     • Given: list of street geonames as processed before, including hierarchy
     • Select all OSM objects (node, way, relation) with key "addr:housenumber" (Karlsruhe Schema)
     • Goal: Generate a list of addresses pointing to a street (osm_id)
  15. Karlsruhe Schema
     Addresses can be tagged with key addr:housenumber and other addr keys on…
     • a node, isolated or other (e.g. shops)
     • a node on top of a building boundary with tag entrance=yes
     • a node on a polygon with key building
     • a node on a polygon representing the perimeter of a site
     • a relation with key associatedStreet
     • an invisible line (way) with key addr:interpolation
  16. Options to relate a house number to a street
     1. Is the house number part of a relation?
     2. Do addr:street or addr:place exist and match directly?
     3. Apply fuzzy string/text search if street/place do not match.
     4. Apply street proximity search if there is no street/place.
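The four options above can be read as a fallback cascade. The following is a minimal sketch of that reading, not the speaker's implementation. The dictionary layout, the planar `x`/`y` coordinates, the 0.8 similarity cutoff, the use of `difflib` for the fuzzy step, and the name `resolve_street` are all assumptions made for illustration.

```python
from difflib import get_close_matches
from math import hypot


def resolve_street(address, streets, relation_members=None, cutoff=0.8):
    """Return (street_osm_id, method) for an address, or (None, 'unresolved').

    address: dict with 'osm_id', 'x', 'y' and optional addr:street / addr:place.
    streets: list of dicts with 'osm_id', 'name', 'x', 'y' (one point per street).
    relation_members: maps an address osm_id to the street osm_id of its relation.
    """
    relation_members = relation_members or {}

    # 1. The house number is a member of a relation that names its street.
    if address["osm_id"] in relation_members:
        return relation_members[address["osm_id"]], "relation"

    by_name = {street["name"]: street["osm_id"] for street in streets}
    wanted = address.get("addr:street") or address.get("addr:place")

    if wanted:
        # 2. addr:street or addr:place matches a street name directly.
        if wanted in by_name:
            return by_name[wanted], "exact"
        # 3. Fuzzy match, only tried when the direct match failed.
        close = get_close_matches(wanted, list(by_name), n=1, cutoff=cutoff)
        if close:
            return by_name[close[0]], "fuzzy"
        return None, "unresolved"

    # 4. No street/place tag at all: fall back to the nearest street.
    if streets:
        nearest = min(
            streets,
            key=lambda s: hypot(s["x"] - address["x"], s["y"] - address["y"]),
        )
        return nearest["osm_id"], "proximity"
    return None, "unresolved"
```

Returning the method alongside the street id makes it possible to audit how many addresses were resolved by each step. A real pipeline would measure distance to the street geometry rather than to a single point, and would work in projected or geodesic coordinates.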

  17. Not covered here
     • Ways with addr:interpolation tag
     • Nodes with associated_street tag
     • POIs without addr:* tags that get addr:* from buildings that have them
     • Zip codes
     • …
  18. Issues in tagging addresses
     • Misused name tags which introduce ambiguity
     • Sharing/deduplicating addr tags among objects inside a building polygon
     • addr nodes with entrance=yes sitting on top of a building (way): they have to give their addr to the building (and the building would have to give the addr to everything inside)
     • Special treatment of nodes with tag addr:housenumber=1;3;5, with values separated by semicolons
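The semicolon case can be handled by expanding one tagged object into one address per house number. This is a minimal sketch under the assumption that tags arrive as a plain dictionary; `split_housenumbers` is a hypothetical helper name.

```python
def split_housenumbers(tags):
    """Expand a semicolon-separated addr:housenumber into one tag dict per number."""
    raw = tags.get("addr:housenumber", "")
    # Drop surrounding whitespace and empty pieces such as a trailing ';'.
    numbers = [part.strip() for part in raw.split(";") if part.strip()]
    # Every resulting address keeps all other tags of the original object.
    return [{**tags, "addr:housenumber": number} for number in numbers]


node_tags = {"addr:housenumber": "1;3;5", "addr:street": "Main Street"}
addresses = split_housenumbers(node_tags)
```

Each expanded address still points at the same OSM object, so a geocoder would return the same location for all of them.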
  19. Message to go
     • Addresses are an important asset of OSM
     • Geonames too
     • Wishlist:
       • A more consistent assignment of admin levels is needed
       • More consistent name content
       • Discussion on how to map geonames of larger areal extent
       • …