Slide 1

Slide 1 text

Richard Cantwell GAMMA [email protected] www.gamma.ie Twitter: @GeoGraphicIE - Geo Related @ManAboutCouch - Personal What’s Special about Spatial? Hello. My name is Richard and I make maps. I work for GAMMA, one of Ireland’s leading GIS and Location Intelligence companies. If you don’t know what those terms mean, don’t worry, I’ll explain them over the next half hour or so. What I do is help clients understand the market they’re operating in, and to do this I draw on a wide range of data from all sorts of places: central and local government, semi-states, the private sector, crowdsourced data and of course client-provided data. But the data I work with is special – it’s 'spatial' or 'geographical', that is, it is information that can be located in space. I can combine data from all of these sources because they all have a locational element, meaning I can put them on a map. Once I have the data in my mapping system (or GIS) I can combine them and perform all sorts of operations on them to gain insight. In this presentation I’m going to talk about what it is that makes spatial special, and about the unique challenges and opportunities of working with this data.

Slide 2

Slide 2 text

Mercator, World Map, 1569 Maps from cartography’s ‘Golden Age’ are works of art; owned and seen by the few, made by the even fewer. This particular map is very important – it is over 400 years old and elements of it are still in use today, notably projections, more of which anon.

Slide 3

Slide 3 text

Griffith Valuation, 1850-1858 As time marched on, maps became more ubiquitous and part of the machinery of the state, such as this ‘Griffith Valuation’ map, which was used for tax gathering in Ireland 150 years ago. At the time Ireland was almost certainly the best mapped territory on the planet. The British applied the lessons learned while mapping Ireland between the 1820s and 1840s to their efforts at home and elsewhere.

Slide 4

Slide 4 text

www.nuim.ie/staff/dpringle/qsms.shtml Early Computer mapping – Belfast 1972 Being able to store, manage, analyse and produce data in digital form has changed the face of many, or even most, disciplines, and geography is no different. This very early example of computer mapping looks a lot different from the old maps we've already seen, and probably needed just as much work to make.

Slide 5

Slide 5 text

Computer Mapping – 2012 Version But fast forward to now. Give me a few minutes and I can make a map like this. The backdrop mapping comes from OSM (more about that from Dermot later) and can be updated every minute. I can drop data from other sources on top, do some analysis and output to a dizzying range of formats (most of which will be obsolete long before paper is – but there you go).

Slide 6

Slide 6 text

County       Population 2011
Dublin       1273069
Antrim       616384
Cork         519032
Down         489588
Galway       250653
Derry        211669
Kildare      210312
Limerick     191809
Meath        184135
Donegal      161137
Tipperary    158754
Tyrone       158460
Kerry        145502
Wexford      145320
Wicklow      136640
Mayo         130638
Armagh       126803
Louth        122897
Clare        117196
Waterford    113795
Kilkenny     95419
Westmeath    86164
Laois        80559
Offaly       76687
Cavan        73183
Sligo        65393
Roscommon    64065
Monaghan     60483
Fermanagh    57527
Carlow       54612
Longford     39000
Leitrim      31796

But why map things at all? Here’s a table – some patterns can be discerned, but what about distribution – the ‘Where’ question? To answer that I need to map this data.

Slide 7

Slide 7 text

Voila! I can join that table of information to a spatial dataset of county boundaries which I have to hand with a couple of clicks. A couple more clicks and I've colour coded the data.
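For anyone curious what those clicks amount to, here is a rough sketch of a table join in Python. It is only an illustration of the idea, not what any particular GIS package does internally. The population figures come from the county table on slide 6; the `boundaries` dictionary, its placeholder geometry strings and the `join_by_name` helper are all hypothetical.

```python
# Attribute table: county name -> 2011 population (figures from the slide 6 table).
population = {"Dublin": 1273069, "Cork": 519032, "Leitrim": 31796}

# Hypothetical spatial dataset: county name -> boundary geometry.
# The geometry strings are placeholders, not real county boundaries.
boundaries = {
    "Dublin": "POLYGON(...)",
    "Cork": "POLYGON(...)",
    "Leitrim": "POLYGON(...)",
}


def join_by_name(geometries, attributes):
    """Attach a population value to every geometry that shares its key."""
    joined = {}
    for name, geometry in geometries.items():
        # .get() leaves the value as None when the table has no matching row.
        joined[name] = {"geometry": geometry, "population": attributes.get(name)}
    return joined


counties = join_by_name(boundaries, population)
```

The join key here is the county name, which is why inconsistent spellings between the table and the boundary dataset are the usual reason a join like this comes back with gaps.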

Slide 8

Slide 8 text

Let’s look a little deeper at this – maybe raw population isn’t the full story. Some counties, like Cork, are very big and have a large population, while Dublin is small but has a huge population. Mapping the density might be a better option. To do this I need to take the area of the county into account. GIS software can do this at the push of a button: even if the original table doesn’t contain the area of the county, the software can work it out from the boundary. Another couple of clicks and my map has changed.
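As a rough sketch of what happens behind that button, the area of a simple polygon can be computed from its vertices with the shoelace formula, and density is then population divided by area. The two regions and their populations below are made-up rectangles in kilometre coordinates, not real county outlines or census figures, and I am assuming planar (already projected) coordinates.

```python
def polygon_area(ring):
    """Area of a simple polygon from its (x, y) vertices (shoelace formula)."""
    total = 0.0
    # Pair each vertex with the next one, wrapping around to the first.
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def density(population, ring):
    """People per unit of area for a region outlined by ring."""
    return population / polygon_area(ring)


# Hypothetical regions, coordinates in km.
big_region = [(0, 0), (100, 0), (100, 50), (0, 50)]    # 5,000 sq km
small_region = [(0, 0), (30, 0), (30, 30), (0, 30)]    # 900 sq km

big_density = density(500_000, big_region)         # 100 people per sq km
small_density = density(1_350_000, small_region)   # 1,500 people per sq km
```

The big region has the second-largest raw count in this toy example but a far lower density, which is the pattern the density map brings out.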

Slide 9

Slide 9 text

We can even distort the geography to convey a different message – this is a cartogram.

Slide 10

Slide 10 text

Maybe we’re getting ahead of ourselves a bit. So we know one reason why we map things – easy visualisation, but that’s far from all there is to it. The 'where' question is asked all of the time, whether it's 'routing' or more complicated questions – and we've been asking complicated questions for a long time.

Slide 11

Slide 11 text

John Snow's 'Cholera Map' from 1854 www.TheGhostMap.com Now, this is probably the most overused example in any presentation about 'Geo' or 'Spatial': the famous John Snow map from 1854. It illustrates the power of combining different datasets into the one map. In this case it's water pumps and deaths from cholera.

Slide 12

Slide 12 text

Henry Drury Harness: 1837 Flow Map and Population Density Map indiemaps.com/blog/2009/11/the-first-thematic-maps To make up for that overused example, here are two earlier and underused examples – the first flow map and one of the first ‘statistical maps’, both produced by the same man, Henry Drury Harness, in a report for the Irish Railway Commissioners in 1837. The quality of the available images isn't great, but here we can see examples of maps being used to convey more than just things that exist on the ground. Whether it's 'how many trains per hour are needed on this corridor' or 'where are large concentrations of people', we've moved beyond the simple recording of location and onto a more abstract representation.

Slide 13

Slide 13 text

In the modern era we’re well used to being able to get our hands on statistical data and map it – here I'm drawing from CSO data, and I can combine this with customer locations or any other dataset and hopefully gain some insight into what's happening.

Slide 14

Slide 14 text

Layering Data to gain insights. The key thing is overlaying different data – mash-ups if you like – and there is a huge range of questions you can answer with this approach.

Slide 15

Slide 15 text

What’s Special about Spatial? www.blanz.net/geo13.html But at the core of this is the data – what is spatial data? Firstly, in this day and age it's digital – or if not, then it's data that has somehow been digitised, whether by scanning paper maps or tracing over them with specialised hardware.

Slide 16

Slide 16 text

Spatial data – data that is 'located in space' A definition – some kind of referencing system – lat long, ING, X, Y etc. If you can locate your data in space, it is spatial.

Slide 17

Slide 17 text

Represent the curved earth on a flat screen or paper: Projections The earth isn’t flat, but our screens are, and so is most of the data we work with. So far, 3D is very cool but isn’t required for much of the analysis we do. Indeed you could consider the 'attribute' data to be the 3rd dimension. Height, or 'z', is just another attribute. (Yes, I'm oversimplifying.)

Slide 18

Slide 18 text

Projections http://xkcd.com/977 Lots of different projections – Web Mercator is commonly used on the web. Irish Grid also.
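To make "projection" a bit more concrete, here is a minimal sketch of the forward Web Mercator calculation: longitude and latitude in degrees go in, planar x and y in metres come out. It uses the standard spherical formula with the WGS 84 semi-major axis as the radius. The function name and the example coordinates are my own choices for illustration, and no attempt is made to handle the poles, where the formula breaks down.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS 84 semi-major axis, the radius Web Mercator uses


def to_web_mercator(lon_deg, lat_deg):
    """Project longitude/latitude in degrees to Web Mercator x/y in metres."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


# A point roughly in central Dublin (approximate coordinates, for illustration).
dublin_x, dublin_y = to_web_mercator(-6.26, 53.35)
```

Note how y grows faster and faster as latitude increases. That stretching is what makes high-latitude places look bigger than they are on web maps.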

Slide 19

Slide 19 text

Projections Here’s IG and Lat Long (WGS 84). A complicating factor: ITM.

Slide 20

Slide 20 text

Vector Data: Point But what does the data look like? Two main forms: Vector and raster. Vector data stores individual features as objects, while raster can be likened to a photo – an area is covered in a regular grid, with each cell (or pixel) having a value. Let's look at Vector data first – here's the simplest type – point data. Just an X and Y coordinate, and maybe some attributes – town names in this case. If I had more attributes, population say, I could change the size of the points to reflect the populations of the towns.

Slide 21

Slide 21 text

Vector Data: Line Line data – a series of connected points. Stored as a single object, again with attribute data – road classification in this case.

Slide 22

Slide 22 text

Vector Data: Region Finally, region or area data, a series of connected lines enclosing an area, also stored as a single object. Modern software can shift data from type to type as required.
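Here is a small sketch of how the three vector types might be held in code: every feature is a list of coordinates plus a dictionary of attributes, and the type follows from the shape of the coordinate list. The `Feature` class, the `geometry_type` helper and all the coordinates and attribute values are hypothetical; real GIS formats are a good deal richer than this.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    """One vector object: its coordinates plus any attribute data."""
    coords: list                                  # list of (x, y) tuples
    attributes: dict = field(default_factory=dict)


def geometry_type(feature):
    """Classify a feature as point, line or region from its coordinates."""
    pts = feature.coords
    if len(pts) == 1:
        return "point"
    # A region is a ring: at least three distinct corners, ending where it started.
    if len(pts) >= 4 and pts[0] == pts[-1]:
        return "region"
    return "line"


# Hypothetical features with made-up coordinates and attributes.
town = Feature([(10, 20)], {"name": "Example Town"})
road = Feature([(0, 0), (5, 5), (10, 20)], {"class": "regional"})
county = Feature([(0, 0), (40, 0), (40, 30), (0, 30), (0, 0)], {"name": "Example County"})
```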

Slide 23

Slide 23 text

Raster data. Raster data – height – NASA. Usually seen with a weather forecaster in front of it – in the public domain, so freely downloadable and reusable. Aerial photos – usually from satellites. Can be very expensive for recent and cloud-free scenes, but Gmaps, Bing etc. pay to use them and make them available under licence.

Slide 24

Slide 24 text

Tile Data: .../{z}/{x}/{y}.png mike.teczno.com Tile data – a special case. Gaining a lot of attention due to its use by Gmaps, OSM, Bing etc. Typically vector and raster data is assembled by these orgs and rendered into tiles, either on the fly or in advance.
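The {z}/{x}/{y} in that URL pattern is just arithmetic. Here is a minimal sketch of the usual "slippy map" tile numbering as used by OSM-style tile servers: at zoom z the world is a 2^z by 2^z grid in Web Mercator, x counts from the west edge and y from the north. The function names are mine, and the sketch does not clamp points sitting exactly on the eastern or polar edges.

```python
import math


def lonlat_to_tile(lon_deg, lat_deg, zoom):
    """Return the (x, y) tile containing a longitude/latitude at a zoom level."""
    n = 2 ** zoom  # tiles along each side of the world at this zoom
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat_deg)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y


def tile_url(base, zoom, x, y):
    """Fill in the .../{z}/{x}/{y}.png pattern for a given tile."""
    return f"{base}/{zoom}/{x}/{y}.png"


# A point roughly in central Dublin (approximate coordinates) at zoom 10.
x, y = lonlat_to_tile(-6.26, 53.35, 10)          # (494, 331)
url = tile_url("tile.openstreetmap.org", 10, x, y)
```

At zoom 0 the whole world is the single tile 0/0/0, and each extra zoom level splits every tile into four.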

Slide 25

Slide 25 text

Tile Data Pyramid tile.openstreetmap.org/0/0/0.png tile.openstreetmap.org/10/494/331.png tile.openstreetmap.org/18/126521/84975.png Lots and lots of tiles – some providers pre-render (Gmaps), others render on demand (OSM). OSM's tilestore is 1.2TB of .png files for the whole planet: the top 16 zoom levels add up to about 200GB of that, and the bottom 3 add the other 1TB.
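A quick back-of-envelope sketch shows why the pyramid is so bottom-heavy. Each zoom level has four times as many tiles as the one above, so a full pyramid of 19 levels (zoom 0 to 18, matching the 16 + 3 split above) has nearly all of its tiles in the last three levels. This counts possible tiles only. It says nothing about file sizes, and a render-on-demand store will not hold every tile, so treat it as an intuition for the storage figures rather than an explanation of them.

```python
def tiles_at_zoom(zoom):
    """Number of tiles covering the world at one zoom level (2^z by 2^z)."""
    return 4 ** zoom


def tiles_up_to(max_zoom):
    """Total tiles in a full pyramid from zoom 0 to max_zoom inclusive."""
    return sum(tiles_at_zoom(z) for z in range(max_zoom + 1))


total = tiles_up_to(18)
bottom_three = sum(tiles_at_zoom(z) for z in (16, 17, 18))
bottom_share = bottom_three / total  # a little over 0.98
```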

Slide 26

Slide 26 text

maps.stamen.com/watercolor Some amazing things being done with tiles – this is actually a tileset, from a San Fran company called Stamen who do lots of cool things with OSM data.

Slide 27

Slide 27 text

I can use these tiles in my GIS software, as a backdrop.

Slide 28

Slide 28 text

izeize.com/openmaps Or even see them on my phone. This particular tileset is a bit of a victim of its own popularity. Quite a lot of compute power (Amazon EC2, I understand) is required to generate it, and that is done pretty much on the fly, so if you visit when the internet is awake some tiles might not render.

Slide 29

Slide 29 text

www.mapbox.com/tilemill Bake your own You can make your own tiles too. Server-based GIS software has this capability, but there is a new breed of desktop apps, like TileMill, built from open source software, which can do this too.

Slide 30

Slide 30 text

GeoCoded Address data www.autoaddress.ie There is also address data. We've already heard about postcodes today, but it is now possible to convert Irish addresses to points on a map, and this is crucial to unlocking the 'locational' element of most organisations' data holdings – anywhere in the organisation that uses address data: CRM, asset registers and so on.

Slide 31

Slide 31 text

GeoCoded Address data www.autoaddress.ie This is what it might look like – you start off with addresses and end up with points on a map ready for analysis. More about this in a few minutes.

Slide 32

Slide 32 text

Unintentional errors, Intentional errors, Projection issues, Data Classification, Scale, Generalisation, Interpretation, Modifiable Areal Unit Problem (MAUP). We've looked at the nature of spatial data, and seen some of the things you can do with it, but spatial data has its own set of unique challenges. For some reason about 90% of the examples in this classic book are from the planning trade; draw from that what you will.

Slide 33

Slide 33 text

An Unintentional Error Here's an example of an unintentional error – Google's map of Letterkenny from a couple of years ago. They took about 6 months to fix this. That's what the HSE have been spending all that money on – buying Letterkenny. Donegal people must get very sick.

Slide 34

Slide 34 text

Common data creation errors Overshoot Undershoot Pseudonode Dangling Node Missing Label Point Missing Node Polygon Label Point If you're creating vector data from scratch then there are a range of errors that can creep in. The Letterkenny example was probably an undershoot – leading to a region that wasn't properly closed – but as you can see there are plenty of other types of error. Modern GIS software has tools built in for detecting and fixing these, of course, and most casual users just work with available data rather than making their own.
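As a hedged sketch of the simplest such check: a region boundary should end exactly where it starts, so you can measure the gap between the first and last vertex and snap it shut if it is within a tolerance. The function names, the tolerance values and the example ring are all hypothetical, and real GIS cleaning tools deal with far more cases (overshoots, dangling nodes and so on) than this.

```python
import math


def closure_gap(ring):
    """Distance between the first and last vertex of a digitised boundary."""
    (x1, y1), (x2, y2) = ring[0], ring[-1]
    return math.hypot(x2 - x1, y2 - y1)


def snap_closed(ring, tolerance):
    """Close a ring whose ends are within tolerance; otherwise raise an error."""
    gap = closure_gap(ring)
    if gap == 0:
        return ring  # already closed
    if gap <= tolerance:
        # Replace the stray last vertex with a copy of the first one.
        return ring[:-1] + [ring[0]]
    raise ValueError(f"ring is open by {gap:.2f} units")


# Hypothetical boundary whose last vertex stops 0.3 units short of the start.
sloppy = [(0, 0), (10, 0), (10, 10), (0, 10), (0, 0.3)]
fixed = snap_closed(sloppy, tolerance=0.5)
```

Choosing the tolerance is the judgment call: too small and real gaps stay open, too large and you start gluing together things that were meant to be separate.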

Slide 35

Slide 35 text

The Map. The reality. An Intentional Error Don't copy things unless you have a licence. Big case against the AA by OSGB about 10 years ago. The issue was the precision of co-ordinates in the data: the same to 10 decimal places, or the atoms were in the same place.

Slide 36

Slide 36 text

Projection issues: Offset Imagery Sometimes things get projected incorrectly, or imagery rectification goes awry. In this case, from OSM, the streets are very accurate – multiple verified GPS readings – but the backdrop imagery is offset, due to a production error. This is common enough with global data sources like this. On the other hand OSi ortho data is super accurate – an example of 'you get what you pay for', perhaps.

Slide 37

Slide 37 text

Classifying Data: here be dragons.. colorbrewer2.org Can open, worms everywhere. Classifying data is a bog standard function. You may have a continuous variable, but showing each value in its own colour isn't practical – the human eye has difficulty with more than 7 or 8 classes. But this isn't without issues...

Slide 38

Slide 38 text

Consider this map. I've classified the areas according to the proportion of vacant houses in last year's census. I chose to split the data into 5 classes and pretty much let my GIS software come up with the ranges. You can see there is quite a lot of variation. Also note that I have included the number of areas in each class – just for illustration purposes.

Slide 39

Slide 39 text

Same data, different classification, very different message – this map seems to be saying – ok, there is a vacant housing problem in some areas, typically where there are a lot of holiday homes, but mostly we're ok.
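To show how much the choice of class breaks matters, here is a small sketch comparing two common schemes on the same numbers: equal interval (split the value range into equal-width bands) and quantile (put roughly the same number of areas in each class). The ten vacancy percentages are invented for illustration and are not census figures, and I am not claiming these are the schemes used for the maps on these slides.

```python
import bisect


def equal_interval_breaks(values, k):
    """Upper bounds that cut the value range into k equal-width classes."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / k
    return [lo + step * i for i in range(1, k)]


def quantile_breaks(values, k):
    """Breaks that put roughly the same number of values in each of k classes."""
    ordered = sorted(values)
    return [ordered[len(ordered) * i // k] for i in range(1, k)]


def class_counts(values, breaks):
    """How many values fall into each class defined by the breaks."""
    counts = [0] * (len(breaks) + 1)
    for v in values:
        counts[bisect.bisect_right(breaks, v)] += 1
    return counts


# Hypothetical vacancy rates (%) for ten areas, with one extreme outlier.
vacancy_pct = [4, 5, 6, 7, 8, 9, 10, 12, 15, 44]

equal_counts = class_counts(vacancy_pct, equal_interval_breaks(vacancy_pct, 5))
quantile_counts = class_counts(vacancy_pct, quantile_breaks(vacancy_pct, 5))
# equal_counts    -> [7, 2, 0, 0, 1]   most areas land in the lowest class
# quantile_counts -> [2, 2, 2, 2, 2]   every class, including the darkest, is full
```

With equal intervals the single outlier stretches the scale and the map says "mostly fine"; with quantiles a fifth of the areas always end up in the top class, whatever the actual values are.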

Slide 40

Slide 40 text

Again, same data, very different message – this map is saying we're all fecked, time to emigrate if you don't live near a big city.

Slide 41

Slide 41 text

Why are these bastards lying to me So, when you see a map with classified data, think – what is this map trying to tell me, why were those classes picked?

Slide 42

Slide 42 text

Scale Issues Spatial data is typically captured at a certain scale, and modern software makes it very easy to use data captured at one scale at another scale, which can cause a lot of problems. These two maps were generated at different scales, but are being shown at the same scale. If I want to find my way around Newcastle West, the one on the left is useless. On the other hand if I'm doing some sort of national level project the one on the right is far too detailed to be practical / legible. If I wanted to capture roads, say, it would take forever.

Slide 43

Slide 43 text

Generalisation / Selection Similarly, we often work with very large datasets – like this roads dataset from NavTeq. If I'm working at this scale do I really need every boreen in the country – does it add to my model?

Slide 44

Slide 44 text

Continuous Variance: Surface Often we work with data that continuously varies – like soil types for example. The tendency is to draw lines between different types, but the reality on the ground is that there are no hard lines – one soil type gradually merges into the next.

Slide 45

Slide 45 text

Racial origin in a US City www.flickr.com/photos/walkingsf/4982034696/ We don't see hard boundaries like this very often, especially when dealing with humans.

Slide 46

Slide 46 text

Even in regimented areas like this (in Mexico City – this is actually a photo, I thought it was a render) there is continuous variance.

Slide 47

Slide 47 text

The Modifiable Areal Unit Problem Which brings us to MAUP – different results depending on how you slice and dice your data.
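A toy sketch makes the point. Take the same row of eight houses (1 = vacant, 0 = occupied) and group them into zones two different ways; the per-zone vacancy rates change completely even though not a single house has changed. The data and the `zone_rates` helper are invented purely for illustration.

```python
def zone_rates(flags, zone_size):
    """Vacancy rate per zone when consecutive houses are grouped into zones."""
    rates = []
    for start in range(0, len(flags), zone_size):
        zone = flags[start:start + zone_size]
        rates.append(sum(zone) / len(zone))
    return rates


# Hypothetical street: 1 = vacant house, 0 = occupied house.
houses = [1, 1, 1, 0, 0, 0, 1, 0]

two_zones = zone_rates(houses, 4)    # [0.75, 0.25]
four_zones = zone_rates(houses, 2)   # [1.0, 0.5, 0.0, 0.5]
```

With two big zones the worst area is 75% vacant; with four small ones there is a zone that is 100% vacant right beside one with no vacancy at all. Same houses, different story.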

Slide 48

Slide 48 text

Blanchardstown – Blakestown: Pop 32,305 (2006) Moving from 3,400 DEDs to approx 19,000 SAs – much more detailed and granular data analysis possible.

Slide 49

Slide 49 text

Tobler’s First Law of Geography: "Everything is related to everything else, but near things are more related than distant things." Concludes section on errors / issues. Most important thing about geography: the 1st law. Models, data and processes all need to take this into account.

Slide 50

Slide 50 text

Tobler’s First Law of Geography: “These are small, those are far away." Or to put it in Irish terms

Slide 51

Slide 51 text

GeoDirectory: Ireland’s National Address Database Talking of Fr Ted – he's official, his gaff is in GeoDir. What is GeoDir? The official address db, over 2M recs, including co-ords for every building in the country.

Slide 52

Slide 52 text

GeoDirectory Structure http://www.geodirectory.ie It is a complex database.

Slide 53

Slide 53 text

Empty Ireland We can use it for simple analyses like this – empty Ireland – places where there are no residential houses.
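One simple way to approach an "empty places" analysis, sketched here under my own assumptions rather than as a description of how this map was made, is to lay a grid over the area, drop each building point into its grid cell, and list the cells that end up with nothing in them. The building coordinates, cell size and grid dimensions below are all made up.

```python
def empty_cells(points, cell_size, cols, rows):
    """Return the (col, row) grid cells that contain no points."""
    # Work out which cell each point falls into.
    occupied = {(int(x // cell_size), int(y // cell_size)) for x, y in points}
    return [
        (col, row)
        for row in range(rows)
        for col in range(cols)
        if (col, row) not in occupied
    ]


# Hypothetical residential building locations on a 2 x 2 grid of 1 km cells.
buildings = [(0.5, 0.5), (1.2, 0.1), (1.9, 1.8)]
empty = empty_cells(buildings, cell_size=1.0, cols=2, rows=2)   # [(0, 1)]
```

The cell size plays the same role here as the zone boundaries did in the MAUP example: make the cells small enough and almost everywhere looks empty.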

Slide 54

Slide 54 text

Empty Ireland Matches quite well the roads map we looked at earlier.

Slide 55

Slide 55 text

The upshot is that we can use this to geocode address data, as we saw earlier, and do things like drivetime analysis.

Slide 56

Slide 56 text

Or combine it with things like measures of disposable income.

Slide 57

Slide 57 text

Autoaddress.ie We use a service called autoaddress to do this (this is a sales pitch btw).

Slide 58

Slide 58 text

Widely used, market leader, gaining customers (/end sales pitch)

Slide 59

Slide 59 text

Vs creativecommons.org Very interesting battle going on between traditional copyright-controlled data sources and a new breed of openly licensed data. Dermot is going to talk about this in more detail shortly.

Slide 60

Slide 60 text

OpenStreetMap Here's OSM – richly detailed

Slide 61

Slide 61 text

Frequently updated

Slide 62

Slide 62 text

Map contains data Copyright NavTeq Comparable to commercial data in many areas, and catching up quickly.

Slide 63

Slide 63 text

New sources of GeoData www.flickr.com/photos/walkingsf/4672160490 Other changes include the ever increasing range of source data – social media, geotagged photos, etc.

Slide 64

Slide 64 text

New Geographies http://www.casa.ucl.ac.uk/urbantick/maps/london_ncl_100628.html

Slide 65

Slide 65 text

Social Networks – a rich source

Slide 66

Slide 66 text

Augmented Reality Browser www.walkspace.org/namaland/

Slide 67

Slide 67 text

Open Data http://www.oobrien.com/vis/bikes/

Slide 68

Slide 68 text

Joint Efforts being established www.dublinked.com

Slide 69

Slide 69 text

Web Mapping

Slide 70

Slide 70 text

Pre-rendered tiles – range of data sources; AJAX (Asynchronous JavaScript and XML); Open APIs – mashups; rapid development – releases every 7-10 days; Google Earth. Google Maps - Key Innovations http://berglondon.com/projects/hat/

Slide 71

Slide 71 text

Web mapping: Gaining Functionality map.project-osrm.org

Slide 72

Slide 72 text

Google Earth – Data Sharing, Visualisation & (limited) Analysis www.avex-asso.org

Slide 73

Slide 73 text

Google Fusion Tables www.google.com/fusiontables/public/tour/index.html

Slide 74

Slide 74 text

Google Fusion Tables: Customised Maps submarinecablemap.com

Slide 75

Slide 75 text

www.gamma.ie/CSO/Population2006-2011.html Google Fusion Tables: Customised Maps

Slide 78

Slide 78 text

Richard Cantwell GAMMA [email protected] www.gamma.ie Twitter: @GeoGraphicIE - Geo Related @ManAboutCouch - Personal Thank You.