Presentation to the Eurogi imaGIne conference, March 8, 2013
Good Morning, my name is Richard and I work for
GAMMA, a GIS consultancy based here in Dublin. At
GAMMA we have been working with and watching
the impact of Open Data, and Open GeoData in
particular, quite closely and I’m here today to talk
about how we think it is changing the GI Landscape.
Remember these? Is it an instrument of torture?
Back in the day if you survived a month or two
using these you were cut out for a career in GIS.
Used for data capture from paper, they don't even
make them anymore as we have transitioned away
from paper based sources. The default for people
who are creating data has been reset.
Now our source data is digital, so we use heads-up
techniques like this. A revolution from what went
before. This shift happened years ago, of course.
But there is another shift going on right now. Ed
Parsons recently made the point that those who
really get their fingers dirty with GIS (hardhats)
are now a small minority. Most users of GeoData
(hipsters) have never heard of GIS, but they're
happy doing things with Google Earth etc.
This too is well known, and has been going on since
2005 or so.
Not only are there lots of new users of Geodata (and
new types of user) but there are lots of new
sources of Geodata enabling new things like this:
geocoded flickr photos showing where tourists and
locals take pictures.
Many of those new to GeoData don’t come from
traditional backgrounds – geography etc – and the
directions they are taking, things they are doing
and methods they are using are new to many of
us, often refreshingly so.
A significant contributor to this flood of data: open
data. It’s not the sole reason, of course, but it is an
increasingly important one. Here’s the standard
definition of Open Data:
You can do what you like with Open Data, as long as
you say where you got the data and don’t wrap up
your work in a new license.
How Open Data relates to Govt data (Public Sector
Information, PSI): Geo is a key part of each of
these. Governments are major publishers of data,
of course, so there has been much focus on their
role – via the PSI directive and so forth, but it is
important to recognise that there are other players
too – from the public to private business.
Those who support open data, if such a disparate
group could be said to have a unifying principle...
believe that our society, our environment, our lives
can be improved by greater access to, and
crucially, understanding of, information.
There are different opens, with different
requirements, aims and outcomes.
For example, transparency in Govt (or business)
shines a light and is focussed on improving public
services or trust in an organisation. This means
certain kinds of data – govt spending, health
outcomes, sustainability targets etc.
For-profit use often needs guarantees around data
(formats, updates, utility) but also has a different
focus: data that is pervasive, granular, regularly
updated, useful and close to real time. Transport
is the classic case.
I see the needs of the Geo Sector more closely
aligned with the for profit model.
Open Geodata is popular & widely applicable. We
can see this in repositories of open data - spatial is
often the most commonly accessed data type.
Various studies by consultancies like Deloitte and
Capgemini have shown that Geo is key – the most
broadly applicable data type in the UK, the most
popular domain for companies in Spain, and there
are many other examples like these.
There is broad recognition of the value of Geospatial
analysis and the data that powers it.
This is resulting in an explosion of data availability
and also of tools & techniques. Addressing these is a
key challenge for those working in GI.
So that's where we are: we've got more data, it’s
popular and open data is driving a lot of this. How
is this changing things?
There has been an amount of research into the emerging
business models based on Open Data. Depending on
whose report you read, there are somewhere between 5
and 15 of them. This one is fairly simple, but there are
more detailed visions and discussion set out on the
URLs below. There is a
copy of this presentation available online on the
Data Suppliers – not just govt data
Aggregators – perhaps the largest currently, lots of small
and medium businesses operating
Developers – lots of activity, this is the archetype we tend
to hear most about
Enrichers – Possibly with most potential impact on the
Enablers – essential, under the hood work being done here
But can we apply these archetypes to the Geo sector? Here's an
attempt, setting out some Opportunities and threats.
Data Suppliers – NMAs and Base Data producers are facing ever
increasing competition from open sources like OSM. The
opportunity here seems to be a shift from simple data supply to
a service based model. Also being ‘Authoritative’, while expensive,
will be essential.
Aggregators – I think there is a real business opportunity here: the
data marketplace is currently fractured and likely to remain so
for some time.
Developers – again lots of activity, particularly in Open Source.
Enrichers – Another big business opportunity – educating and
enabling current clients and GIS users to refine their business
processes, uncover new insights and develop new models which
describe their business problem.
Enablers – There is an amount of activity here, with a focus on
Platforms and technologies, things like running geoservices in
the cloud which we do a lot of at GAMMA. Again I can only see
this expanding quickly in the near term.
If we look at Open and Closed GeoData there are a
number of key differentiators. We've seen that to be
called 'open', data must meet conditions (reuse,
redistribute, share alike etc.), while closed data is
more restrictive because its owners have expensively
gathered intellectual capital to protect.
The reality is more complex – there are various
shades of ‘Open’, and all sorts of different licenses in
place over commercial data.
I'm going to use OSM as an exemplar Open GeoData
project (a wonderful resource, not without its issues)
to compare against Closed data. Key differentiators
include depth/width, licensing, warm/cold (as in aimed
at being of public interest vs authoritative) and
visible/invisible.
It's also worth noting that OSM does things very differently from
top-down initiatives like Inspire. OSM's philosophy
since Day 1 has been 'build it and get out of the way' - don't make
contributors jump through hoops to add data. As such, consensus
on terminologies, tagging etc. has been built by the user
community over time.
For example, adding buildings to the OSM DB is fairly simple. If you
want to get more detail than is offered by the default editor then
there are a number of Wiki pages (documentation is not a strength
of OSM, however). By comparison, Inspire's Data Specification on
Buildings, published last month, is a 309 page document! No OSM
contributor is going to read something like that before adding a
That’s both a strength and a weakness. The database schema for
OSM may seem anarchic, but it’s easy to add data. Inspire schemas
make it more complex to add compliant data, but when followed
the data is more ‘authoritative’.
But gathering GeoData is expensive - look at the
acquisition costs of these two. At time of
acquisition Navteq's data gathering costs were
about €400m per annum. We know Google has
1100 staff and another 6000 contractors (many
based over the river from here) working on maps.
How much does that cost?
OSM's numbers are 3 or 4 orders of magnitude less,
but there is a huge investment of 'free' time and
donated bandwidth etc. It’s also focussed on the kind
of GeoData that *can* be crowdsourced, of course.
So harness the crowd - Google are, TomTom are,
and there are many others, like 4sq for example,
who are building large scale operations based on
Not everybody agrees - one of the leading lights of
the OSM movement here, making a serious point in
a humorous way.
In any case, we deal with all sorts of GeoData that
can't be crowdsourced - things like administrative
boundaries or geology. So crowdsourcing is
important, but not the be-all and end-all.
Licensing is complex - 4sq are, rightly, 'terrified' of
the ODbL - they'll use the likes of OSM as far as
they can, but no further if it threatens their bottom
line. That bottom line is a current valuation of
about €500m. Not bad for a medium quality POI
database with some nice bells and whistles on top.
There are a range of licenses out there - PSI, ODbL,
CC-BY-SA and so on.
As GeoData users we are not lawyers but we might
need some help in future to wade through the
implications of the license wrappers around the data.
Why? Because GIS people could be considered the
DJs of data - we do remixes - combining and
recombining in search of a better model.
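To make the remix point concrete, here is a minimal sketch in Python of how the obligations attached to each source layer might be carried through to a combined product. The licence summaries are deliberately simplified assumptions (attribution and share-alike only), and the names Layer, remix_obligations and needs_legal_review are hypothetical, invented purely for illustration. It is a way of thinking about the problem, not a substitute for reading the licences.

```python
from dataclasses import dataclass

# Simplified obligations per licence (an assumption for illustration:
# the real licence texts carry more conditions than these two flags).
OBLIGATIONS = {
    "CC0": set(),
    "CC-BY": {"attribution"},
    "CC-BY-SA": {"attribution", "share-alike"},
    "ODbL": {"attribution", "share-alike"},
}


@dataclass(frozen=True)
class Layer:
    """A hypothetical source layer in a remix."""
    name: str
    licence: str


def remix_obligations(layers):
    """Return the union of the obligations carried by every source layer."""
    combined = set()
    for layer in layers:
        combined |= OBLIGATIONS[layer.licence]
    return combined


def needs_legal_review(layers):
    """Flag remixes that mix two or more different share-alike licences."""
    share_alike = {
        layer.licence
        for layer in layers
        if "share-alike" in OBLIGATIONS[layer.licence]
    }
    return len(share_alike) > 1


if __name__ == "__main__":
    remix = [Layer("roads", "ODbL"), Layer("photos", "CC-BY-SA")]
    print(sorted(remix_obligations(remix)))  # ['attribution', 'share-alike']
    print(needs_legal_review(remix))         # True
```

The second function is the interesting one: a single share-alike source is easy to honour, but two different share-alike licences in one remix is exactly the point at which we might need that help from the lawyers.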
I have mentioned ‘Authoritative’ a number of times
already. I see this as a key concern.
What is 'Good Enough'? Open Geodata can meet many
requirements, but it won’t ever meet them all. There is
lots of commercially sensitive information out there that
won’t ever be made open. We've seen already that
Crowdsourced data has some limitations. The
opportunity exists for data producers to be recognised
as the 'authoritative source' for data, but we've seen that
this is expensive.
But how accurate does the data you're working with
need to be? All data has errors but is it 'good enough'?
My view is that the range of business processes for
which open geodata are suited is increasing by the day,
but still has a long way to go.
The new data landscape? Variety, Volume, Velocity – these
are commonly used in relation to Big Data.
In GIS we have always dealt with huge volumes of data,
and we have all seen this increase. The data becoming
available to us changes rapidly and fast responses are
required – geocoding for Facebook, for example, which is
being done by Pitney Bowes, has a 20 millisecond
response time built into the spec. We’re also seeing more
forms of data than ever before, different sources,
different formats and so on. But as I’ve just argued,
Veracity is key.
Open data is affecting all of these: it's resulting in more data,
it's changing what is available and when, and it’s impacting
on how to get it and how quickly.
So, to conclude...
Open GeoData is improving the data which we work
with, it’s broadening the user base for GIS as a
whole and it’s opening up all sorts of new
possibilities. We are seeing organisations embrace
the principle ‘Open by Default’ and I think this will
become much more common and that is a very
good thing indeed.
Thanks for your attention.
A copy of this presentation will be on the gamma.ie
website shortly and if you have any specific
questions you can get in touch with me by email or,
in a more open manner, via Twitter.