How Open Data is changing the Geo Landscape.

Presentation to the Eurogi imaGIne conference, March 8, 2013

Richard Cantwell

March 07, 2013

Transcript

  1. 1
    Good Morning, my name is Richard and I work for
    GAMMA, a GIS consultancy based here in Dublin. At
    GAMMA we have been working with and watching
    the impact of Open Data, and Open GeoData in
    particular, quite closely and I’m here today to talk
    about how we think it is changing the GI Landscape.

  2. 2
    Remember these? Is it an instrument of torture?
    Back in the day if you survived a month or two
    using these you were cut out for a career in GIS.
    Used for data capture from paper, they aren't even
    made anymore, as we have transitioned away from
    paper-based sources. The default for people
    who are creating data has been reset.

  3. 3
    Now our source data is digital, so we use heads-up
    techniques like this. A revolution from what went
    before. This shift happened years ago, of course.

  4. 4
    But there is another shift going on right now. Ed
    Parsons recently made the point that those who
    really get their fingers dirty with GIS (hardhats)
    are now a small minority. Most users of GeoData
    (hipsters) have never heard of GIS, but they're
    happy doing things with Google Earth etc.
    This too is well known, and has been going on since
    2005 or so.

  5. 5
    Not only are there lots of new users of Geodata (and
    new types of user) but there are lots of new
    sources of Geodata enabling new things like this:
    geocoded Flickr photos showing where tourists and
    locals take pictures.
    Many of those new to GeoData don’t come from
    traditional backgrounds – geography etc – and the
    directions they are taking, things they are doing
    and methods they are using are new to many of
    us, often refreshingly so.

  6. 6
    A significant contributor to this flood of data: open
    data. It’s not the sole reason, of course, but it is an
    increasingly important one. Here’s the standard
    definition of Open Data:
    You can do what you like with Open Data, as long as
    you say where you got the data and don’t wrap up
    your work in a new license.

  7. 7
    This is how Open Data relates to Govt data (Public
    Sector Information, PSI). Geo is a key part of each
    of these. Governments are major publishers of data,
    of course, so there has been much focus on their
    role, via the PSI directive and so forth, but it is
    important to recognise that there are other players
    too, from the public to private business.

  8. 8
    Those who support open data, if such a disparate
    group could be said to have a unifying principle...
    believe that our society, our environment, our lives
    can be improved by greater access to, and
    crucially, understanding of, information.

  9. 9
    There are different kinds of 'open', with different
    requirements, aims and outcomes.
    For example, transparency in Govt (or business)
    shines a light and is focussed on improving public
    services or trust in an organisation. This means
    certain kinds of data: govt spending, health
    outcomes, sustainability targets etc.
    For-profit use often needs guarantees around data
    (formats, updates, utility) but also has a different
    focus: data that is pervasive, granular, regularly
    updated, useful and close to real time. Transport
    is the classic case.
    I see the needs of the Geo Sector as more closely
    aligned with the for-profit model.

  10. 10
    Open Geodata is popular & widely applicable. We
    can see this in repositories of open data, where
    spatial is often the most commonly accessed data type.
    Various studies by consultancies like Deloitte and
    Capgemini have shown that Geo is key: the most
    broadly applicable data type in the UK, the most
    popular domain for companies in Spain, and there
    are many other examples like these.
    There is broad recognition of the value of Geospatial
    analysis and the data that powers it.

  11. 11
    This is resulting in an explosion of data availability
    and also of tools & techniques. Addressing these is a
    key challenge for those working in GI.

  12. 12
    So that's where we are: we've got more data, it’s
    popular and open data is driving a lot of this. How
    is this changing things?

  13. 13
    There has been a fair amount of research into the emerging
    business models based on Open Data. Depending on whose
    report you read, there are somewhere between 5 and 15 of
    them. This one is fairly simple, but there are more
    detailed visions and discussion set out at the URLs
    below. There is a copy of this presentation available
    online on the gamma.ie website.
    Data Suppliers – not just govt data
    Aggregators – perhaps the largest currently, lots of small
    and medium businesses operating
    Developers – lots of activity, this is the archetype we tend
    to hear most about
    Enrichers – Possibly with the most potential impact in the
    medium/long term
    Enablers – essential, under the hood work being done here

  14. 14
    But can we apply these archetypes to the Geo sector? Here's an
    attempt, setting out some opportunities and threats.
    Data Suppliers – NMAs and base data producers are facing ever
    increasing competition from open sources like OSM. The
    opportunity here seems to be a shift from simple data supply to
    a service-based model. Being 'authoritative', while expensive,
    will also be essential.
    Aggregators – I think there is a real business opportunity here, as
    the data marketplace is currently fractured and likely to remain
    so for some time.
    Developers – again lots of activity, particularly in Open Source.
    Enrichers – Another big business opportunity: educating and
    enabling current clients and GIS users to refine their business
    processes, uncover new insights and develop new models which
    describe their business problems.
    Enablers – There is a fair amount of activity here, with a focus on
    platforms and technologies, things like running geoservices in
    the cloud, which we do a lot of at GAMMA. Again, I can only see
    this expanding quickly in the near term.

  15. 15
    If we look at Open and Closed GeoData there are a
    number of key differentiators. We've seen that to be
    called 'open', data must meet conditions (reuse,
    redistribute, share alike etc.), while closed data is
    more restrictive, because its owners have expensively
    gathered intellectual capital to protect.
    The reality is more complex – there are various
    shades of ‘Open’, and all sorts of different licenses in
    place over commercial data.

  16. 16
    I'm going to use OSM as an exemplar Open GeoData
    project (a wonderful resource, though not without its
    issues) and compare it with Closed data. Key
    differentiators include depth/width, licensing,
    warm/cold (as in aimed at being of public interest vs
    authoritative) and visible/invisible.

  17. 17
    It's also worth noting that OSM does things very differently from
    top-down initiatives like Inspire. OSM's philosophy since Day 1
    has been 'build it and get out of the way': don't make
    contributors jump through hoops to add data. As such, consensus
    on terminologies, tagging etc. has been built by the user
    community over time.
    For example, adding buildings to the OSM DB is fairly simple. If you
    want more detail than is offered by the default editor, there
    are a number of Wiki pages (though documentation is not a strength
    of OSM). By comparison, Inspire's Data Specification on
    Buildings, published last month, is a 309 page document! No OSM
    contributor is going to read something like that before adding a
    building.
    That’s both a strength and a weakness. The database schema for
    OSM may seem anarchic, but it’s easy to add data. Inspire schemas
    make it more complex to add compliant data, but when followed
    the data is more 'authoritative'.

  18. 18
    But gathering GeoData is expensive - look at the
    acquisition costs of these two. At time of
    acquisition Navteq's data gathering costs were
    about €400m per annum. We know Google has
    1100 staff and another 6000 contractors (many
    based over the river from here) working on maps.
    How much does that cost?

  19. 19
    OSM's numbers are 3 or 4 orders of magnitude less,
    but they rest on a huge investment of 'free' time,
    donated bandwidth etc. It’s also focussed on the kind
    of GeoData that *can* be crowdsourced, of course.

  20. 20
    So harness the crowd - Google are, TomTom are,
    and there are many others, like 4sq for example,
    who are building large scale operations based on
    crowdsourced GeoData.

  21. 21
    Not everybody agrees - one of the leading lights of
    the OSM movement here, making a serious point in
    a humorous way.

  22. 22
    In any case, we deal with all sorts of GeoData that
    can't be crowdsourced - things like administrative
    boundaries or geology. So crowdsourcing is
    important, but not the be-all and end-all.

  23. 23
    Licensing is complex - 4sq are, rightly, 'terrified' of
    the ODbL - they'll use the likes of OSM as far as
    they can, but no further if it threatens their bottom
    line. That bottom line is a current valuation of
    about €500m. Not bad for a medium quality POI
    database with some nice bells and whistles on top.

  24. 24
    There is a range of licenses out there - PSI, ODbL,
    CC-BY-SA and so on.
    As GeoData users we are not lawyers but we might
    need some help in future to wade through the
    implications of the license wrappers around the data
    we use.

  25. 25
    Why? Because GIS people could be considered the
    DJs of data - we do remixes - combining and
    recombining in search of a better model.

  26. 26
    I have mentioned ‘Authoritative’ a number of times
    already. I see this as a key concern.
    What is 'Good Enough'? Open Geodata can meet many
    requirements, but it won’t ever meet them all. There is
    lots of commercially sensitive information out there that
    won’t ever be made open. We've seen already that
    Crowdsourced data has some limitations. The
    opportunity exists for data producers to be recognised
    as the 'authoritative source' for data, but we've seen that
    this is expensive.
    But how accurate does the data you're working with
    need to be? All data has errors, but is it 'good enough'?
    My view is that the range of business processes for
    which open geodata are suited is increasing by the day,
    but still has a long way to go.

  27. 27
    The new data landscape? Variety, Volume, Velocity – these
    are commonly used in relation to Big Data.
    In GIS we have always dealt with huge volumes of data,
    and we have all seen this increase. The data becoming
    available to us changes rapidly and fast responses are
    required: geocoding for Facebook, for example, which is
    being done by Pitney Bowes, has a 20 millisecond
    response built into the spec. We’re also seeing more
    forms of data than ever before: different sources,
    different formats and so on. But as I’ve just argued,
    Veracity is key.
    Open data is affecting all of these: it's resulting in more data,
    it's changing what is available and when, and it’s impacting
    on how to get it and how quickly.

  28. 28
    So, to conclude...
    Open GeoData is improving the data which we work
    with, it’s broadening the user base for GIS as a
    whole and it’s opening up all sorts of new
    possibilities. We are seeing organisations embrace
    the principle ‘Open by Default’ and I think this will
    become much more common and that is a very
    good thing indeed.

  29. 29
    Thanks for your attention.
    A copy of this presentation will be on the gamma.ie
    website shortly and if you have any specific
    questions you can get in touch with me by email or,
    in a more open manner, via Twitter.
