How Open Data is changing the Geo Landscape.

Presentation to the EUROGI imaGIne conference, March 8, 2013

Richard Cantwell

March 07, 2013

Transcript

  1. Good morning, my name is Richard and I work for GAMMA, a GIS consultancy based here in Dublin. At GAMMA we have been working with and watching the impact of Open Data, and Open GeoData in particular, quite closely, and I’m here today to talk about how we think it is changing the GI landscape.
  2. Remember these? Is it an instrument of torture? Back in the day, if you survived a month or two using these you were cut out for a career in GIS. Used for data capture from paper, they aren't even made anymore, as we have transitioned away from paper-based sources. The default for people who are creating data has been reset.
  3. Now our source data is digital, so we use heads-up techniques like this. A revolution from what went before. This shift happened years ago, of course.
  4. But there is another shift going on right now. Ed Parsons recently made the point that those who really get their fingers dirty with GIS (hardhats) are now a small minority. Most users of GeoData (hipsters) have never heard of GIS, but they're happy doing things with Google Earth etc. This too is well known, and has been going on since 2005 or so.
  5. Not only are there lots of new users of GeoData (and new types of user), but there are lots of new sources of GeoData enabling new things like this: geocoded Flickr photos showing where tourists and locals take pictures. Many of those new to GeoData don’t come from traditional backgrounds such as geography, and the directions they are taking, the things they are doing and the methods they are using are new to many of us, often refreshingly so.
  6. A significant contributor to this flood of data: open data. It’s not the sole reason, of course, but it is an increasingly important one. Here’s the standard definition of Open Data: you can do what you like with Open Data, as long as you say where you got the data and don’t wrap up your work in a new license.
  7. How Open Data relates to Govt data (Public Sector Information, PSI). Geo is a key part of each of these. Governments are major publishers of data, of course, so there has been much focus on their role, via the PSI directive and so forth, but it is important to recognise that there are other players too, from the public to private business.
  8. Those who support open data, if such a disparate group could be said to have a unifying principle, believe that our society, our environment and our lives can be improved by greater access to, and crucially, understanding of, information.
  9. There are different opens, with different requirements, aims and outcomes. For example, transparency in Govt (or business) shines a light and is focussed on improving public services or trust in an organisation. This means certain kinds of data: govt spending, health outcomes, sustainability targets etc. For-profit use often needs guarantees around data (formats, updates, utility) but also has a different focus: data that is pervasive, granular, regularly updated, useful and close to real time. Transport is the classic case. I see the needs of the Geo sector as more closely aligned with the for-profit model.
  10. Open GeoData is popular and widely applicable. We can see this in repositories of open data, where spatial is often the most commonly accessed data type. Various studies by consultancies like Deloitte and Capgemini have shown that Geo is key: the most broadly applicable data type in the UK, the most popular domain for companies in Spain, and there are many other examples like these. There is broad recognition of the value of geospatial analysis and the data that powers it.
  11. This is resulting in an explosion of data availability and also of tools and techniques. Addressing these is a key challenge for those working in GI.
  12. So that's where we are: we've got more data, it’s popular, and open data is driving a lot of this. How is this changing things?
  13. There has been an amount of research into the emerging business models based on Open Data; depending on whose report you read, there are somewhere between 5 and 15 of them. This one is fairly simple, but there are more detailed visions and discussion set out on the URLs below. There is a copy of this presentation available online on the gamma.ie website.
      - Data Suppliers: not just govt data.
      - Aggregators: perhaps the largest group currently, with lots of small and medium businesses operating.
      - Developers: lots of activity; this is the archetype we tend to hear most about.
      - Enrichers: possibly the most potential impact in the medium to long term.
      - Enablers: essential, under-the-hood work is being done here.
  14. But can we apply these archetypes to the Geo sector? Here's an attempt, setting out some opportunities and threats.
      - Data Suppliers: NMAs and base data producers are facing ever-increasing competition from open sources like OSM. The opportunity here seems to be a shift from simple data supply to a service-based model. Being ‘authoritative’, while expensive, will also be essential.
      - Aggregators: I think there is a real business opportunity here. The data marketplace is currently fractured and likely to remain so for some time.
      - Developers: again lots of activity, particularly in Open Source.
      - Enrichers: another big business opportunity, educating and enabling current clients and GIS users to refine their business processes, uncover new insights and develop new models which describe their business problem.
      - Enablers: there is an amount of activity here, with a focus on platforms and technologies, things like running geoservices in the cloud, which we do a lot of at GAMMA. Again, I can only see this expanding quickly in the near term.
  15. If we look at Open and Closed GeoData there are a number of key differentiators. We've seen that to be called 'open', data must meet conditions (reuse, redistribute, share alike etc.), while closed data is more restrictive because its owners have expensively gathered intellectual capital to protect. The reality is more complex: there are various shades of ‘Open’, and all sorts of different licenses in place over commercial data.
  16. I'm going to use OSM as an exemplar Open GeoData project (a wonderful resource, not without its issues) and contrast it with Closed data. Key differentiators include depth/width, licensing, warm/cold (as in aimed at being of public interest vs authoritative) and visible/invisible.
  17. It's also worth noting that OSM does things very differently from top-down initiatives like INSPIRE. OSM's philosophy since day one has been 'build it and get out of the way': don't make contributors jump through hoops to add data. As such, consensus on terminologies, tagging etc. has been built by the user community over time. For example, adding buildings to the OSM DB is fairly simple; if you want more detail than is offered by the default editor then there are a number of wiki pages (documentation is not a strength of OSM, however). By comparison, INSPIRE's Data Specification on Buildings, published last month, is a 309-page document! No OSM contributor is going to read something like that before adding a building. That’s both a strength and a weakness. The database schema for OSM may seem anarchic, but it’s easy to add data. INSPIRE schemas make it more complex to add compliant data, but when they are followed the data is more ‘authoritative’.
  18. But gathering GeoData is expensive - look at the acquisition costs of these two. At time of acquisition, Navteq's data gathering costs were about €400m per annum. We know Google has 1100 staff and another 6000 contractors (many based over the river from here) working on maps; how much does that cost?
  19. OSM's numbers are 3 or 4 orders of magnitude less, but there is a huge investment of 'free' time, donated bandwidth etc. It’s also focussed on the kind of GeoData that *can* be crowdsourced, of course.
  20. So harness the crowd - Google are, TomTom are, and there are many others, like 4sq for example, who are building large-scale operations based on crowdsourced GeoData.
  21. Not everybody agrees - one of the leading lights of the OSM movement here, making a serious point in a humorous way.
  22. In any case, we deal with all sorts of GeoData that can't be crowdsourced - things like administrative boundaries or geology. So crowdsourcing is important, but not the be-all and end-all.
  23. Licensing is complex - 4sq are, rightly, 'terrified' of the ODbL - they'll use the likes of OSM as far as they can, but no further if it threatens their bottom line. That bottom line is a current valuation of about €500m. Not bad for a medium-quality POI database with some nice bells and whistles on top.
  24. There is a range of licenses out there - PSI, ODbL, CC-BY-SA and so on. As GeoData users we are not lawyers, but we might need some help in future to wade through the implications of the license wrappers around the data we use.
  25. Why? Because GIS people could be considered the DJs of data - we do remixes - combining and recombining in search of a better model.
  26. I have mentioned ‘authoritative’ a number of times already. I see this as a key concern. What is 'good enough'? Open GeoData can meet many requirements, but it won’t ever meet them all. There is lots of commercially sensitive information out there that won’t ever be made open. We've seen already that crowdsourced data has some limitations. The opportunity exists for data producers to be recognised as the 'authoritative source' for data, but we've seen that this is expensive. But how accurate does the data you're working with need to be? All data has errors, but is it 'good enough'? My view is that the range of business processes for which open GeoData is suited is increasing by the day, but still has a long way to go.
  27. The new data landscape? Variety, Volume, Velocity – these are commonly used in relation to Big Data. In GIS we have always dealt with huge volumes of data, and we have all seen this increase. The data becoming available to us changes rapidly and fast responses are required: geocoding for Facebook, for example, which is being done by Pitney Bowes, has a 20 millisecond response built into the spec. We’re also seeing more forms of data than ever before: different sources, different formats and so on. But as I’ve just argued, Veracity is key. Open data is affecting all of these: it's resulting in more data, it's changing what is available and when, and it’s impacting on how to get it and how quickly.
  28. So, to conclude: Open GeoData is improving the data which we work with, it’s broadening the user base for GIS as a whole and it’s opening up all sorts of new possibilities. We are seeing organisations embrace the principle ‘Open by Default’, and I think this will become much more common, and that is a very good thing indeed.
  29. Thanks for your attention. A copy of this presentation will be on the gamma.ie website shortly, and if you have any specific questions you can get in touch with me by email or, in a more open manner, via Twitter.