Slide 1

Slide 1 text

Advances in Geographic Data Science: Open Data and Systems. Alex Singleton, Reader in Geographic Information Science, Department of Geography and Planning. www.alex-singleton.com @alexsingleton

Slide 2

Slide 2 text

Overview • Open “Big” Data • Geographic Data Science (Systems) • CO2 Emissions and the School Commute • 2011 Census Open Atlas • Future Prospects

Slide 3

Slide 3 text

Geographic Data Science • Intersection of Geographic Information Science, Spatial Analysis, Applied Geocomputation and visualisation. • Couples burgeoning new and dynamic data sources with advanced quantitative and computational methodology to advance debates around problems of global social and economic importance.

Slide 4

Slide 4 text

Open Data Open data is information that is available for anyone to use, for any purpose, at no cost. http://theodi.org/guides/what-open-data http://www.flickr.com/photos/mayhem/3899818862/

Slide 5

Slide 5 text

No content

Slide 6

Slide 6 text

No content

Slide 7

Slide 7 text

No content

Slide 8

Slide 8 text

School Data University Data SECURE SANITISED Singleton, A. (2010). The Geodemographics of Educational Progression and their Implications for Widening Participation in Higher Education. Environment and Planning A, 42(11):2560–2580.

Slide 9

Slide 9 text

BUT (c) Stan Openshaw

Slide 10

Slide 10 text

BUT (c) Stan Openshaw

Slide 11

Slide 11 text

Open Data • Open data has to have a licence that says it is open data. Without a licence, the data can’t be reused. The licence might also say: • that people who use the data must credit whoever is publishing it (this is called attribution) • that people who mix the data with other data have to also release the results as open data (this is called share-alike) http://theodi.org/guides/what-open-data

Slide 12

Slide 12 text

BUT (c) Stan Openshaw

Slide 13

Slide 13 text

No content

Slide 14

Slide 14 text

Free Data is not Open Data

Slide 15

Slide 15 text

No content

Slide 16

Slide 16 text

http://www.flickr.com/photos/x-ray_delta_one/8184264475/ http://www.flickr.com/photos/neoporcupine/1866929252/

Slide 17

Slide 17 text

http://www.flickr.com/photos/x-ray_delta_one/8184264475/ http://www.flickr.com/photos/neoporcupine/1866929252/ Fantasy Reality

Slide 18

Slide 18 text

No content

Slide 19

Slide 19 text

No content

Slide 20

Slide 20 text

http://streetbump.org/

Slide 21

Slide 21 text

Eric Fischer (Twitter Map)

Slide 22

Slide 22 text

http://upload.wikimedia.org/wikipedia/commons/2/23/Noahs_Ark.jpg

Slide 23

Slide 23 text

http://upload.wikimedia.org/wikipedia/commons/2/23/Noahs_Ark.jpg

Slide 24

Slide 24 text

Mike Batty, UCL: “Anything that won’t fit on a spreadsheet”

Slide 25

Slide 25 text

No content

Slide 26

Slide 26 text

Areas Attributes PCA “Pre” clustering - contiguity constraint

Slide 27

Slide 27 text

Areas Attributes PCA “Pre” clustering - contiguity constraint

Slide 28

Slide 28 text

Shevky and Bell (1949) Social Area Analysis

Slide 29

Slide 29 text

Census 2001 Census 2011 Release 2003 Admin. Data Census 2022? Social Media Data Aggregate Decades Individual Seconds Open Data Closed Data Business Data ? Data

Slide 30

Slide 30 text

Grand Challenges • Of what are these new data representative? • What should be captured (or not) and how? • What are the ethical considerations - privacy / surveillance? • What new problems can be explored through imaginative use of data and software?

Slide 31

Slide 31 text

http://www.flickr.com/photos/epsos/5591761716/sizes/o/in/photostream/ CO2 Emissions and the School Commute

Slide 32

Slide 32 text

CO2 Emissions • ~7.5 million school trips • 2007-2012 - Usual Travel Mode • Data Department for Education; Department for Transport (DVLA) • Suite of open source software Singleton, A. (2013) A GIS Approach to Modelling CO2 Emissions Associated with the Pupil-School Commute. International Journal of Geographical Information Science, 28(2):256–273.

Slide 33

Slide 33 text

CO2 Emissions • ~7.5 million school trips • 2007-2012 - Usual Travel Mode • Data Department for Education; Department for Transport (DVLA) • Suite of open source software Singleton, A. (2013) A GIS Approach to Modelling CO2 Emissions Associated with the Pupil-School Commute. International Journal of Geographical Information Science, 28(2):256–273.

Slide 34

Slide 34 text

CO2 Emissions • d distance • p pupil • i pupil home postcode • j school postcode • e CO2g/km • t transport mode • g location

$$k_p = 2\left( d(i_p, j_p, t_p)\, e(t_p, g_p)\, w(t_p) \right)$$
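As a rough illustration of the per-pupil calculation on this slide, the sketch below multiplies a distance, an emission factor and a weight, then doubles the result. The function and parameter names, the placeholder postcodes and the example values are hypothetical. The slide does not define w, so treating it as a mode-dependent weight is an assumption, as is reading the factor of 2 as the return journey.

```python
from typing import Callable


def pupil_emissions(
    home: str,
    school: str,
    mode: str,
    location: str,
    distance_km: Callable[[str, str, str], float],
    factor_g_per_km: Callable[[str, str], float],
    weight: Callable[[str], float],
) -> float:
    """Return k_p in grams of CO2: 2 * d(i, j, t) * e(t, g) * w(t)."""
    d = distance_km(home, school, mode)   # d(i_p, j_p, t_p): distance by mode
    e = factor_g_per_km(mode, location)   # e(t_p, g_p): g CO2 per km
    w = weight(mode)                      # w(t_p): assumed mode-dependent weight
    return 2.0 * d * e * w


# Hypothetical inputs: a 3.2 km taxi trip with a weight of 1.
k = pupil_emissions(
    "HOME_POSTCODE", "SCHOOL_POSTCODE", "taxi", "LOCATION_A",
    distance_km=lambda i, j, t: 3.2,
    factor_g_per_km=lambda t, g: 150.3,
    weight=lambda t: 1.0,
)
print(round(k, 2))
```

Passing the distance, factor and weight as callables keeps the arithmetic separate from however the routing and lookup steps are actually implemented.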

Slide 35

Slide 35 text

Transport Mode               Average CO2g / km
Taxi                         150.3
Bus (London)                 85.7
Bus (Non London)             184.3
Coach                        30.0
Light Rail - Average         71
  London (DLR)               68.3
  Birmingham / Midlands      70.5
  Newcastle                  103.0
  Croydon                    44.3
  Manchester                 39.5
  Nottingham                 #
  Sheffield                  96.8
National Rail                53.4
London Underground           73.1
Cycling                      8.3
Walking                      11.4
(Coley 2002, DEFRA 2011, Tranter 2012)

Slide 36

Slide 36 text

Data Processing OpenStreetMap Meridian2 Roads / Paths .osm XML Light Rail / Tube railway=light_rail railway=subway Routino QGIS Railways Cleaning 1) Single lines 2) Nodes join 3) Nodes at stations QGIS
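The light rail and tube extraction named on this slide can be sketched as a filter over .osm XML that keeps ways tagged railway=light_rail or railway=subway. This is only an illustration using Python's standard library: the sample document and the function name are made up, and the talk's actual workflow used Routino and QGIS rather than this code.

```python
import xml.etree.ElementTree as ET
from typing import List

# Tiny, made-up .osm fragment: two rail ways and one road.
SAMPLE_OSM = """<osm version="0.6">
  <way id="1"><nd ref="10"/><nd ref="11"/><tag k="railway" v="light_rail"/></way>
  <way id="2"><nd ref="12"/><nd ref="13"/><tag k="highway" v="residential"/></way>
  <way id="3"><nd ref="14"/><nd ref="15"/><tag k="railway" v="subway"/></way>
</osm>"""

RAIL_VALUES = {"light_rail", "subway"}


def rail_way_ids(osm_xml: str) -> List[str]:
    """Return ids of ways tagged railway=light_rail or railway=subway."""
    root = ET.fromstring(osm_xml)
    ids = []
    for way in root.iter("way"):
        for tag in way.findall("tag"):
            if tag.get("k") == "railway" and tag.get("v") in RAIL_VALUES:
                ids.append(way.get("id"))
                break  # one matching tag is enough for this way
    return ids


print(rail_way_ids(SAMPLE_OSM))
```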

Slide 37

Slide 37 text

Software Infrastructure

Slide 38

Slide 38 text

Software Infrastructure Query Pupil (origin, destination, mode)

Slide 39

Slide 39 text

Software Infrastructure

Slide 40

Slide 40 text

Software Infrastructure Mode? Tube / Light Rail Road

Slide 41

Slide 41 text

Software Infrastructure

Slide 42

Slide 42 text

Software Infrastructure Car based?

Slide 43

Slide 43 text

Software Infrastructure Car based? Yes

Slide 44

Slide 44 text

Software Infrastructure Car based? Return LSOA average CO2g/km Yes

Slide 45

Slide 45 text

Software Infrastructure Car based?

Slide 46

Slide 46 text

Software Infrastructure Car based? No Use national averages

Slide 47

Slide 47 text

Software Infrastructure Car based?
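The "Car based?" branch in the slides above can be sketched as a small lookup: car-based trips return an LSOA average, and everything else falls back to the national averages from the emission-factor slide. The mode names, the set of car-based modes, the LSOA key and the 160.0 figure are all hypothetical; light rail is left out for brevity.

```python
from typing import Dict

# National averages in g CO2 per km, copied from the emission-factor slide.
NATIONAL_G_PER_KM: Dict[str, float] = {
    "taxi": 150.3,
    "bus_london": 85.7,
    "bus_non_london": 184.3,
    "coach": 30.0,
    "national_rail": 53.4,
    "london_underground": 73.1,
    "cycling": 8.3,
    "walking": 11.4,
}

# Assumption: which recorded travel modes count as "car based".
CAR_BASED_MODES = {"car", "car_share"}


def emission_factor(mode: str, lsoa: str, lsoa_car_average: Dict[str, float]) -> float:
    """Pick the emission factor (g CO2 per km) for one pupil's trip."""
    if mode in CAR_BASED_MODES:
        # Car based? Yes: return the LSOA average CO2 g/km.
        return lsoa_car_average[lsoa]
    # Car based? No: use national averages.
    return NATIONAL_G_PER_KM[mode]


# Hypothetical LSOA table with a made-up average.
lsoa_table = {"LSOA_A": 160.0}
print(emission_factor("car", "LSOA_A", lsoa_table))
print(emission_factor("walking", "LSOA_A", lsoa_table))
```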

Slide 48

Slide 48 text

[Figure: Percentage (y-axis, 0.00 to 0.75) by Mode (BUS, CAR, NON, TRA) against Distance (x-axis, 0.5 km bands from 0−0.5km to 19.5−20km, plus > 20km)]

Slide 49

Slide 49 text

Results Versus a simple model (straight line, vehicle national averages) +ve = simple model overestimating
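A minimal sketch of the comparison described on this slide: the simple model uses straight-line distance and a national average factor, and the difference is taken as simple minus modelled, so a positive value means the simple model overestimates. The haversine formula, the function names and the doubling for a return trip are assumptions for illustration; the slide does not say how the straight-line distance was measured.

```python
import math
from typing import Tuple

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle ("straight line") distance between two points in km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def simple_model_g(home: Tuple[float, float], school: Tuple[float, float],
                   national_g_per_km: float) -> float:
    """Simple model: return-trip straight-line distance times a national average."""
    return 2.0 * haversine_km(home[0], home[1], school[0], school[1]) * national_g_per_km


def difference_g(simple_g: float, modelled_g: float) -> float:
    """Positive result means the simple model overestimates."""
    return simple_g - modelled_g
```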

Slide 50

Slide 50 text

2010 Census of Japan Open Atlas. Alex Singleton (www.alex-singleton.com), Chris Brunsdon, Tomoki Nakaya, Keiji Yano. Version 1.0
2011 Census Open Atlas. Alex Singleton (www.alex-singleton.com). Version 2.0

Slide 51

Slide 51 text

No content

Slide 52

Slide 52 text

1) Download and process all OA data for England and Wales (EW) 2) Download OA and Ward boundaries 3) Render maps and legends for each LAD 4) Write a LaTeX file. Tools: pdfcrop, PDFTK
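The last step, writing a LaTeX file, can be sketched as plain string generation: one figure per rendered map, collected into a document for a district. The function name, file paths, captions and page layout below are hypothetical, and the sketch does not call pdfcrop or PDFTK.

```python
from typing import List, Tuple


def build_atlas_tex(district: str, maps: List[Tuple[str, str]]) -> str:
    """Build LaTeX source for one district from (pdf_path, caption) pairs."""
    lines = [
        r"\documentclass{article}",
        r"\usepackage{graphicx}",
        r"\begin{document}",
        r"\title{2011 Census Open Atlas: " + district + "}",
        r"\maketitle",
        r"\listoffigures",
        r"\clearpage",
    ]
    for pdf_path, caption in maps:
        lines += [
            r"\begin{figure}[p]",
            r"\centering",
            r"\includegraphics[width=\textwidth]{" + pdf_path + "}",
            r"\caption{" + caption + "}",
            r"\end{figure}",
            r"\clearpage",
        ]
    lines.append(r"\end{document}")
    return "\n".join(lines) + "\n"


# Hypothetical district and map files.
tex = build_atlas_tex("Liverpool", [("maps/age.pdf", "Age structure"),
                                    ("maps/tenure.pdf", "Housing tenure")])
print(tex)
```

Generating the source from a list of maps means the same function can be run once per district, which is what makes an atlas of this size feasible without manual layout.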

Slide 53

Slide 53 text

Version 2 • Scale bars • Legend labels • List of Figures • Place label suppression • Wales-only variables

Slide 54

Slide 54 text

No content

Slide 55

Slide 55 text

Version Control

Slide 56

Slide 56 text

No content

Slide 57

Slide 57 text

No content

Slide 58

Slide 58 text

No content

Slide 59

Slide 59 text

No content

Slide 60

Slide 60 text

No content

Slide 61

Slide 61 text

James Reid - Northern Ireland Atlas (http://ukbdev.edina.ac.uk/Census2011/)

Slide 62

Slide 62 text

Savings • A manual map might typically take 5 minutes to create • 134,567 maps • 467.2 days (no breaks!) • At a 35-hour working week for 46 weeks of the year (6 weeks holiday): 6.9 years • At the median wage of a GIS Technician (£20,030), the “cost” of all these maps would be 6.9 × £20,030 = £138,207.
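The arithmetic behind these figures can be checked in a few lines. The variable names are illustrative; all inputs come from the slide. Note that the exact quotient is about 6.97 working years, which the slide quotes as 6.9, and the cost uses the slide's 6.9.

```python
maps = 134_567
minutes_per_map = 5

total_hours = maps * minutes_per_map / 60     # total mapping time in hours
days_nonstop = total_hours / 24               # working around the clock, no breaks

hours_per_working_year = 35 * 46              # 35-hour week, 46 weeks a year
working_years = total_hours / hours_per_working_year

median_wage_gbp = 20_030
cost_gbp = 6.9 * median_wage_gbp              # the slide's 6.9 years

print(round(days_nonstop, 1), round(working_years, 2), round(cost_gbp))
```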

Slide 63

Slide 63 text

Some Speculation… • Big Data is not a new phenomenon: there has always been a disjunction between available data and the ability to process it • New methodology will emerge • Great opportunity: we should begin with a problem to solve, not a technology or infrastructure Data

Slide 64

Slide 64 text

Some speculation… http://www.theguardian.com/technology/2014/jan/29/uk-government-plans-switch-to-open-source-from-microsoft-office-suite https://www.whatdotheyknow.com/request/133909/response/323829/attach/3/RESPONSE%2028509371.pdf • Open source GIS • Reduced market share for commercial desktop GIS • Commercial GIS refocuses on cloud services Market

Slide 65

Slide 65 text

Some speculation… http://www.theguardian.com/technology/2014/jan/29/uk-government-plans-switch-to-open-source-from-microsoft-office-suite https://www.whatdotheyknow.com/request/133909/response/323829/attach/3/RESPONSE%2028509371.pdf • Open source GIS • Reduced market share for commercial desktop GIS • Commercial GIS refocuses on cloud services Market

Slide 66

Slide 66 text

Some Speculation… Sweave (.Rnw) Markdown (.Rmd) RPubs Reproducibility

Slide 67

Slide 67 text

Some Speculation… Learning to code. Alex Singleton is a lecturer in geographic information science at the University of Liverpool. Point of View, Geographical (magazine of the Royal Geographical Society (with IBG)), January 2014, p. 77. www.geographical.co.uk

In my opinion, a geography curriculum should require students to learn how to code, ensuring that they’re equipped for a changed job market that’s increasingly detached from geographic information systems (GIS) as they were originally conceived.

The ability to code relates to basic programming and database skills that enable students to manipulate large and small geographic data sets, and to analyse them in automated and transparent ways. Although it might seem odd for a geographer to want to learn programming languages, we only have to look at geography curriculums from the 1980s to realise that these skills used to be taught. For example, it wouldn’t have been unusual for an undergraduate geographer to learn how to programme a basic statistical model (for example, regression) from base principles in Fortran (a programming language popular at the time) as part of a methods course.

But during the 1990s, the popularisation of graphical user interfaces in software design enabled many statistical, spatial analysis and mapping operations to be wrapped up within visual and menu-driven interfaces, which were designed to lower the barriers of entry for users of these techniques. Gradually, much GIS teaching has transformed into learning how these software systems operate, albeit within a framework of geographic information science (GISc) concerned with the social and ethical considerations of building representations from geographic data. Some Masters degrees in GISc still require students to code, but few undergraduate courses do so.

The good news is that it’s never been more exciting to be a geographer. Huge volumes of spatial data about how the world looks and functions are being collected and disseminated. However, translating such data safely into useful information is a complex task. During the past ten years, there has been an explosion in new platforms through which geographic data can be processed and visualised. For example, the advent of services such as Google Maps has made it easier for people to create geographical representations online. However, both the analysis of large volumes of data and the use of these new methods of representation or analysis do require some level of basic programming ability.

Furthermore, many of these developments haven’t been led by geographers, and there’s a real danger that our skill set will be seen as superfluous to these activities in the future without some level of intervention. Indeed, it’s a sobering experience to look through the pages of job advertisements for GIS-type roles in the UK and internationally. Whereas these might once have required knowledge of a particular software package, they increasingly look like advertisements for computer scientists, with expected skills and experience that wouldn’t traditionally be part of an undergraduate geography curriculum.

Many of the problems that GIS set out to address can now be addressed with mainstream software or shared online services that are, as such, much easier to use. If I want to determine the most efficient route between two locations, a simple website query can give a response within seconds, accounting for live traffic-volume data. If I want to view the distribution of a census attribute over a given area, there are multiple free services that offer street-level mapping. Such tasks used to be far more complex, involving specialist software and technical skills. There are now far fewer job advertisements for GIS technicians than there were ten years ago. Much traditional GIS-type analysis is now sufficiently non-technical that it requires little specialist skill, or has been automated through software services, with a subscription replacing the employment of a technician. The market has moved on.

Geographers shouldn’t become computer scientists; however, we need to reassert our role in the development and critique of existing and new GIS. For example, we need to ask questions such as which type of geographic representation might be most appropriate for a given dataset. Today’s geographers may be able to talk in general terms about such a question, but they need to be able to provide a more effective answer that encapsulates the technologies that are used for display. Understanding what is and isn’t possible in technical terms is as important as understanding the underlying cartographic principles. Such insights will be more available to a geographer who has learnt how to code.

Within the area of GIS, technological change has accelerated at an alarming rate in the past decade and geography curriculums need to ensure that they embrace these developments. This does, however, come with challenges. Academics must ensure that they are up to date with market developments and also that there’s sufficient capacity within the system to make up-skilling possible. Prospective geography undergraduates should also consider how the university curriculums have adapted to modern market conditions and whether they offer the opportunity to learn how to code.

Education

Slide 68

Slide 68 text

Many Thanks Any questions?