
Day1-1410-Challenges in geonames and address extraction

sotm2017
September 01, 2017


Transcript

  1. Challenges in geonames and address extraction. Prof. Stefan Keller, Geometa Lab HSR, University of Applied Sciences Rapperswil (Switzerland)
  2. Agenda • Motivation • Geonames • Addresses • Issues

  3. Geoname search: Where is Aizu-Wakamatsu? Address geocoding: Tokyo Central Post Office, 5-3, Yaesu 1-Chome, Chuo-ku, Tokyo 100-8994
  4. Search components • Data • Data pre-processing software • Search engine software
  5. Geonames with containment hierarchy

  6. Geoname Ex.: Aizu

  7. Nominatim.osm.org

  8. OSMNames.org

  9. Enriched Geonames Data: "name": "会津若松市", "alternative_names": "Aizuwakamatsu, Айдзувакамацу", "street": "", "county": "Fukushima", "city": "", "state": "Fukushima", "country": "Japan", "boundingbox": [139.8389,37.3229,140.1133,37.5831], "osm_id": "4174424", "type": "administrative", "importance": 0.4, …
  10. The power of hierarchy! 1. country, national level (possibly the mainland!) 2. state, subnational level 3. city 4. county/town 5. village / suburb / neighborhood. All administrative divisions are polygons (well: almost…)
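
The enriched record from slide 9 and the containment hierarchy can be explored with a few lines of Python. This is only a sketch: it assumes the bounding box is ordered [west, south, east, north] (the values fit that reading for Aizu-Wakamatsu, but the slides do not state it), and the helper names `bbox_contains` and `hierarchy_path` are hypothetical, not part of OSMNames or Nominatim.

```python
import json

# Record copied from slide 9 (the trailing "…" fields are left out).
RECORD = json.loads("""
{
  "name": "会津若松市",
  "alternative_names": "Aizuwakamatsu, Айдзувакамацу",
  "street": "",
  "county": "Fukushima",
  "city": "",
  "state": "Fukushima",
  "country": "Japan",
  "boundingbox": [139.8389, 37.3229, 140.1133, 37.5831],
  "osm_id": "4174424",
  "type": "administrative",
  "importance": 0.4
}
""")

# Hierarchy fields ordered from most specific to least specific (assumed order).
HIERARCHY_FIELDS = ["street", "city", "county", "state", "country"]


def bbox_contains(bbox, lon, lat):
    """Return True if (lon, lat) lies inside bbox.

    Assumption: bbox is [west, south, east, north] in degrees.
    """
    west, south, east, north = bbox
    return west <= lon <= east and south <= lat <= north


def hierarchy_path(record):
    """List the non-empty hierarchy levels of a record, most specific first."""
    return [record[f] for f in HIERARCHY_FIELDS if record.get(f)]
```

For the record above, `hierarchy_path(RECORD)` skips the empty `street` and `city` fields and returns the county, state, and country.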
  11. What’s a name anyway? • Toponymy: endonym, exonym. Example: • name = 会津若松城 • name:jp = 会津若松城 (Aizu-Wakamatsu-jō) • name:en = Aizu-Wakamatsu Castle • name:de = Burg Aizu-Wakamatsu • alt_name:en = Tsuruga Castle • alt_name:jp = 鶴ヶ城 (Tsuru-ga-jō)
  12. Issues in tagging geonames • Tag name: "Name is the name only": names are often misused to describe all kinds of things • Ranking of geonames! • Tag admin_level: there is no unified tagging yet in OSM for town parts and village parts • Issues in assigning the hierarchy of city/town/village/suburb/neighborhood • How to deal with objects of larger areal extent? Currently these are often captured as a node: which node to choose? What is the extent? What is the bbox?
  13. Addresses

  14. (Postal Building) Addresses • Given: list of street geonames as processed before, including hierarchy • Select all OSM objects (node, way, relation) with key "addr:housenumber" (Karlsruhe Schema) • Goal: generate a list of addresses, each pointing to a street (osm_id)
  15. Karlsruhe Schema: Addresses can be tagged with key addr:housenumber and other addr keys on… • a node, isolated or other (e.g. shops) • a node on top of a building boundary with tag entrance=yes • a node on a polygon with key building • a node on a polygon representing the perimeter of a site • a relation with type=associatedStreet • an invisible line (way) with key addr:interpolation
  16. Options to relate a house number to a street 1. Does the house number exist as part of a relation? 2. Do addr:street or addr:place exist and match directly? 3. Apply fuzzy string/text search if street/place do not match. 4. Apply street proximity search if there is no street/place.
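
The four options above can be read as a fallback cascade. The following is a minimal sketch under stated assumptions, not the talk's implementation: the input shapes (`relation_street_id`, `tags`, `point`), the function name `relate_to_street`, the fuzzy cutoff of 0.8, and the planar distance on lon/lat are all illustrative choices. What happens when the fuzzy search also fails is not stated on the slide; here the address is simply reported as unmatched.

```python
import difflib
import math


def relate_to_street(address, streets, cutoff=0.8):
    """Return (street_osm_id, method) for one address.

    address: dict with optional "relation_street_id", a "tags" dict and a
             "point" (lon, lat). These keys are hypothetical.
    streets: list of dicts with "name", "osm_id" and "point" (lon, lat).
    """
    # 1. House number is a member of a relation that names the street.
    if address.get("relation_street_id") is not None:
        return address["relation_street_id"], "relation"

    tags = address.get("tags", {})
    wanted = tags.get("addr:street") or tags.get("addr:place")
    by_name = {s["name"]: s["osm_id"] for s in streets}

    if wanted:
        # 2. addr:street / addr:place matches a street name directly.
        if wanted in by_name:
            return by_name[wanted], "exact"
        # 3. Fuzzy string search when there is no direct match.
        close = difflib.get_close_matches(wanted, list(by_name), n=1, cutoff=cutoff)
        if close:
            return by_name[close[0]], "fuzzy"
        # Assumption: no further fallback once a street/place tag exists.
        return None, "unmatched"

    # 4. No street/place tag at all: pick the nearest street.
    #    Planar distance on lon/lat is a crude stand-in for a real
    #    geodesic or projected distance to the street geometry.
    if not streets:
        return None, "unmatched"
    nearest = min(streets, key=lambda s: math.dist(address["point"], s["point"]))
    return nearest["osm_id"], "proximity"
```

Returning the method alongside the osm_id makes it easy to audit how many addresses were resolved by each option.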

  17. Not covered here • Ways with addr:interpolation tag • Nodes with associated_street tag • POIs without addr:* get addr:* from buildings with addr:* • Zip codes • …
  18. Issues in tagging addresses • Misused name tags which introduce ambiguity • Sharing/deduplicating addr tags among objects inside a building polygon • addr nodes with entrance=yes sitting on top of a building (way): have to give addr to the building (and the building would have to give addr to everything inside) • Special treatment of nodes with tag addr:housenumber=1;3;5, with values separated by semicolons
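
The semicolon-separated house numbers mentioned above could be handled by expanding one tagged object into one address per number. This is a sketch only; the function names `split_housenumbers` and `expand_address` are hypothetical, and range values or interpolation are deliberately not handled.

```python
def split_housenumbers(value):
    """Split a value such as "1;3;5" into ["1", "3", "5"].

    Surrounding whitespace and empty parts are dropped.
    """
    return [part.strip() for part in value.split(";") if part.strip()]


def expand_address(tags):
    """Return one copy of the tag dict per house number.

    An object without addr:housenumber yields an empty list.
    """
    numbers = split_housenumbers(tags.get("addr:housenumber", ""))
    return [{**tags, "addr:housenumber": number} for number in numbers]
```

Each expanded dict keeps the other addr:* tags unchanged, so the result can be related to a street in the same way as a single-number address.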
  19. Message to go • Addresses are an important asset of OSM • Geonames too • Wishlist: • A more consistent assignment of admin levels is needed • More consistent name content • Discussion on how to map geonames of larger areal extent • …