Reproducible Research: Open Methods and Data

alexsingleton
October 29, 2015

Regional Studies Association, 29/10/15 Sheffield

Transcript

  1. Reproducible Research: Open Methods and Data. Alex Singleton, Professor of Geographic Information Science, University of Liverpool. Consumer Data Research Centre: An ESRC Data Investment. www.cdrc.ac.uk, www.alex-singleton.com, geographicdatascience.com, @alexsingleton
  2. A long time ago in a galaxy far, far away…

  3. • Versioning • Which data files? • Models / Graphs / Maps • How were these made? • Which data? • Sharing Data / Results? • Revisions • Returning to work after review…
  4. • Makes the case for pro-austerity policies http://scholar.harvard.edu/files/rogoff/files/growth_in_time_debt_aer.pdf

  5. 2009 Email Hack - “Climategate”

  6. Reproducible Research: Data, Methods, Results, Findings / Conclusions

  7. Reproducible Research • Help mitigate potentially erroneous conclusions • Give public greater assurance • Publicly funded should mean public • It is happening already…
  8. Reproducible Research • Number of initiatives to test reproducible research: http://validation.scienceexchange.com

  9. http://www.nature.com/news/first-results-from-psychology-s-largest-reproducibility-test-1.17433

  10. Data

  11. None
  12. None
  13. http://adrn.ac.uk/

  14. None
  15. http://meredithmmyers.com/ratmap/#/

  16. None
  17. https://www.openstreetmap.org/#map=19/53.38631/-2.91964

  18. https://vimeo.com/9182869

  19. None
  20. https://data.cdrc.ac.uk/

  21. Methods

  22. https://www.r-project.org/

  23. R as a GIS

  24. R as a GIS: breaks <- classIntervals(variable_to_map, n = 6, style = "fisher")

  25. R as a GIS: my_colours <- c("#FFFFB2", "#FED976", "#FEB24C", "#FD8D3C", "#F03B20", "#BD0026")

  26. R as a GIS: my_colours[findInterval(variable_to_map, breaks)]

  27. R as a GIS: plot(LSOA, col = my_colours[findInterval(variable_to_map, breaks)], axes = FALSE, border = NA)
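
The four "R as a GIS" slides build up a choropleth one step at a time: class breaks, a colour ramp, a colour lookup, then the plot. A minimal sketch of those steps together might look like the following. It assumes the classInt package is installed, uses synthetic values in place of the census variable (the data here are hypothetical), and takes the numeric breakpoints from the `brks` element of the object that `classIntervals()` returns. The `all.inside = TRUE` argument is an addition not shown on the slides; it keeps the maximum value in the sixth class.

```r
# Sketch only: synthetic values stand in for the census variable on the slides.
library(classInt)  # provides classIntervals()

set.seed(42)
variable_to_map <- runif(200, min = 0, max = 100)  # hypothetical data

# Fisher breaks for six classes; $brks holds the seven numeric breakpoints
breaks <- classIntervals(variable_to_map, n = 6, style = "fisher")$brks

my_colours <- c("#FFFFB2", "#FED976", "#FEB24C",
                "#FD8D3C", "#F03B20", "#BD0026")

# all.inside = TRUE (an assumption) keeps the maximum value in class 6
class_index <- findInterval(variable_to_map, breaks, all.inside = TRUE)
fill <- my_colours[class_index]

# With a polygon layer (LSOA on the slides) the final step would be:
# plot(LSOA, col = fill, axes = FALSE, border = NA)
print(table(class_index))
```

Because every step is a line of code rather than a menu choice, the same script can be rerun unchanged for any other variable or area.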
  28. http://www.alex-singleton.com/r/2014/02/05/2011-census-open-atlas-project-version-two/

  29. 134,567 maps; 6.9 years; £138,207

  30. 2010 Census of Japan Open Atlas, Version 1.0. Alex Singleton [www.alex-singleton.com], Chris Brunsdon, Tomoki Nakaya, Keiji Yano.

    2011 Census Open Atlas, Version 2.0. Alex Singleton (www.alex-singleton.com).

    Geographical: Magazine of the Royal Geographical Society (with IBG). January 2014, UK £4.50, www.geographical.co.uk. Cover lines: Net loss: How industrial fishing is emptying the seas around Thailand. Deep disposal: Can carbon capture and storage save the world? Manchester is my orchard: Turning Moss Side's unwanted fruit into a thriving cider business. PLUS Ascension Island • Nepal • Aurel Stein.

    Point of View: Learning to code (Geographical, January 2014, p. 77). Alex Singleton is a lecturer in geographic information science at the University of Liverpool.

    In my opinion, a geography curriculum should require students to learn how to code, ensuring that they’re equipped for a changed job market that’s increasingly detached from geographic information systems (GIS) as they were originally conceived.

    The ability to code relates to basic programming and database skills that enable students to manipulate large and small geographic data sets, and to analyse them in automated and transparent ways. Although it might seem odd for a geographer to want to learn programming languages, we only have to look at geography curriculums from the 1980s to realise that these skills used to be taught. For example, it wouldn’t have been unusual for an undergraduate geographer to learn how to programme a basic statistical model (for example, regression) from base principles in Fortran (a programming language popular at the time) as part of a methods course.

    But during the 1990s, the popularisation of graphical user interfaces in software design enabled many statistical, spatial analysis and mapping operations to be wrapped up within visual and menu-driven interfaces, which were designed to lower the barriers of entry for users of these techniques. Gradually, much GIS teaching has transformed into learning how these software systems operate, albeit within a framework of geographic information science (GISc) concerned with the social and ethical considerations of building representations from geographic data. Some Masters degrees in GISc still require students to code, but few undergraduate courses do so.

    The good news is that it’s never been more exciting to be a geographer. Huge volumes of spatial data about how the world looks and functions are being collected and disseminated. However, translating such data safely into useful information is a complex task. During the past ten years, there has been an explosion in new platforms through which geographic data can be processed and visualised. For example, the advent of services such as Google Maps has made it easier for people to create geographical representations online. However, both the analysis of large volumes of data and the use of these new methods of representation or analysis do require some level of basic programming ability. Furthermore, many of these developments haven’t been led by geographers, and there’s a real danger that our skill set will be seen as superfluous to these activities in the future without some level of intervention.

    Indeed, it’s a sobering experience to look through the pages of job advertisements for GIS-type roles in the UK and internationally. Whereas these might once have required knowledge of a particular software package, they increasingly look like advertisements for computer scientists, with expected skills and experience that wouldn’t traditionally be part of an undergraduate geography curriculum.

    Many of the problems that GIS set out to address can now be addressed with mainstream software or shared online services that are, as such, much easier to use. If I want to determine the most efficient route between two locations, a simple website query can give a response within seconds, accounting for live traffic-volume data. If I want to view the distribution of a census attribute over a given area, there are multiple free services that offer street-level mapping. Such tasks used to be far more complex, involving specialist software and technical skills.

    There are now far fewer job advertisements for GIS technicians than there were ten years ago. Much traditional GIS-type analysis is now sufficiently non-technical that it requires little specialist skill, or has been automated through software services, with a subscription replacing the employment of a technician. The market has moved on.

    Geographers shouldn’t become computer scientists; however, we need to reassert our role in the development and critique of existing and new GIS. For example, we need to ask questions such as which type of geographic representation might be most appropriate for a given dataset. Today’s geographers may be able to talk in general terms about such a question, but they need to be able to provide a more effective answer that encapsulates the technologies that are used for display. Understanding what is and isn’t possible in technical terms is as important as understanding the underlying cartographic principles. Such insights will be more available to a geographer who has learnt how to code.

    Within the area of GIS, technological change has accelerated at an alarming rate in the past decade and geography curriculums need to ensure that they embrace these developments. This does, however, come with challenges. Academics must ensure that they are up to date with market developments and also that there’s sufficient capacity within the system to make up-skilling possible. Prospective geography undergraduates should also consider how the university curriculums have adapted to modern market conditions and whether they offer the opportunity to learn how to code.
  31. Version Control

  32. None
  33. None
  34. James Reid - Northern Ireland Atlas (http://ukbdev.edina.ac.uk/Census2011/)

  35. Results

  36. https://www.rstudio.com/

  37. http://www.tandfonline.com/loi/rsrs20

  38. None
  39. Example

  40. http://www.google.co.uk/intl/en_uk/earth/ 52: POORER FAMILIES, MANY CHILDREN, TERRACED HOUSING; 51: YOUNG PEOPLE IN SMALL, LOW COST TERRACES; 59: DEPRIVED AREAS AND HIGH-RISE FLATS; 11: SETTLED SUBURBIA, OLDER PEOPLE. Urban Adversity; Affluent Achievers
  41. 1 − Rural Residents 2 − Cosmopolitans 3 − Ethnicity Central 4 − Multicultural Metropolitans 5 − Urbanites 6 − Suburbanites 7 − Constrained City Dwellers 8 − Hard−Pressed Living
  42. maps.cdrc.ac.uk

  43. https://data.cdrc.ac.uk

  44. 1 − Rural Residents 2 − Cosmopolitans 3 − Ethnicity Central 4 − Multicultural Metropolitans 5 − Urbanites 6 − Suburbanites 7 − Constrained City Dwellers 8 − Hard−Pressed Living 1−Family Terraces 2−Students and University 3−Constrained and Aging 4−Central Diversity 5−Affluent Suburbs 6−Struggling Families 7−City and Central
  45. National Classification (rows) against Liverpool Classification (columns). Values are percentages; each row sums to approximately 100.

    Liverpool Classification: 1-Family Terraces; 2-Students and University; 3-Constrained and Aging; 4-Central Diversity; 5-Affluent Suburbs; 6-Struggling Families; 7-City and Central

    National Classification              1     2     3     4     5     6     7
    2-Cosmopolitans                    7.2  34.3   1.8   1.8     0     0  54.8
    3-Ethnicity Central                0.9   3.7     0  80.4     0     0    15
    4-Multicultural Metropolitans       15   8.8     0  69.9   1.8   4.4     0
    5-Urbanites                       41.7   0.5   5.9     0  51.5     0   0.5
    6-Suburbanites                       0   0.5     0     0  99.5     0     0
    7-Constrained City Dwellers       15.8     0  67.7   7.6   0.3   8.2   0.3
    8-Hard-Pressed Living             23.5     0   9.3     0   6.6  60.6     0

    • Central core areas split in Liverpool: Professionals / Students • Although some areas are less affluent, for Liverpool these are not the most distinctive features
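
The comparison on this slide reads as a cross-tabulation of the national classification against the Liverpool one, expressed as row percentages. A minimal base R sketch of that calculation follows. The four area assignments are invented purely for illustration and are not the data behind the slide; the data frame and column names are likewise hypothetical.

```r
# Hypothetical area-level assignments, invented purely for illustration
areas <- data.frame(
  national  = c("5-Urbanites", "5-Urbanites", "5-Urbanites", "6-Suburbanites"),
  liverpool = c("1-Family Terraces", "5-Affluent Suburbs",
                "5-Affluent Suburbs", "5-Affluent Suburbs")
)

# Count areas in each national / Liverpool combination
counts <- table(National = areas$national, Liverpool = areas$liverpool)

# margin = 1 divides each row by its total, giving row percentages
row_pct <- prop.table(counts, margin = 1) * 100
print(round(row_pct, 1))
```

Reading along a row then shows how the areas in one national group are distributed across the local groups.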
  46. Family Terraces (L13 2AY, Colwyn Road): Within these predominantly terraced areas, there are many families with young children; however, there are fewer ethnic minorities than the Liverpool average. Most property is owner occupied or rented from the private sector.

    Affluent Suburbs (L12 3HB, Blackmoor Drive): These affluent suburban areas feature larger detached and semi-detached houses, many of which are owner occupied. Residents are typically well qualified and in the latter stages of successful careers in the public sector, finance or education. Where families have had children, these are typically old enough to no longer be dependent.
  47. None
  48. Regions and Cities Series

  49. Many thanks…