Enhancing Geospatial Data Collection and Visualisation via Custom Toolkits, Consumer Devices and Mass Participation

Steven Gray
December 04, 2013

PhD Upgrade Seminar - Steven Gray - 04/12/2013

Transcript

  1. Enhancing Geospatial Data Collection and Visualisation via Custom Toolkits, Consumer

    Devices and Mass Participation Steven Gray, Research Associate steven.gray@ucl.ac.uk UCL Centre for Advanced Spatial Analysis
  2. About Me Research Associate (UCL CASA) September 2009 -- Present

    Research Associate (University of Glasgow - GIST Dept Computing Science) January 2008 -- September 2009 Part-time PhD - Started January 2011 Projects worked on at CASA: National e-Infrastructure for Social Science (NeISS), JISC (GEMMA), Talisman
  3. Research Question Main Question: Can targeted data collection and aggregation enhance data

    visualisation? Sub Questions: Can mining data from multiple sources derive meaningful patterns in social behaviour? Can we mine large data sets in realtime for specific insights to reduce the problem set before building visualisations?
  4. Background Data Collection Utilising the Cloud Analysing Data New Methods

  5. Background Data Collection Utilising the Cloud Analysing Data New Methods

  6. Collecting data from the crowd.

  7. Enter the Social Revolution Rise of Social Media

  8. None
  9. Technological Traditional Approach: Build an application to collect specific data

    Mappiness EpiCollect Nature Locator Collecting Apps
  10. Turning APIs into meaning is challenging

  11. New Insights for Social Science http://www.cosmosproject.net

  12. Background Data Collection Utilising the Cloud Analysing Data New Methods

  13. None
  14. Talisman Project Goals • Develop and extend state of the

    art geospatial methods in the form of new data analysis techniques and new simulation models. • Build new methods of data acquisition and visualisation that will help illuminate and address key policy challenges at local, national and global levels. • Improve the uptake and dissemination of skills in geospatial analysis through a comprehensive suite of training and capacity-building activities. • Contribute to the success of the NCRM programme and participate fully in its activities. See our Past Events and Upcoming Events pages for some examples.
  15. None
  16. Custom Endpoint Carling Cup Internet of Schools iPad Video Wall

    ESRA2013 Tweet-o-Meter CityDashboard Internet of Me QRator UKSnow Maps Physical TOM New City Landscapes AV Referendum GEMMA Textal Analogies Olympic Collection AWS 200 server collection Usage of Toolkit Mobile Websites Twitter EE Project DataSift Data Comparison 10 Cities Collection Collections SurveyMapper #5Acts London Mayor Park Survey BBC Old Age Scottish Water Services Grant Petrie Popup Brands
  17. BigDataToolkit Aims • A single toolkit for collecting and analysing

    data • Easy to setup, run and collect data • Leverage Cloud Computing to power advanced analytics • Create a toolkit for the public to collect and process data • Analysing unstructured and unlinked data • Feed data into models, large processing platforms for further analysis Open Source Data Collection Platform which is platform agnostic and easy to use
  18. None
  19. None
  20. None
  21. None
  22. None
  23. SurveyMapper.com World/Nation/City/Borough/Ward/Street Survey Anything Realtime Mapping/Data Download Used

    by: The Mayor of London; BBC (5000 responses in 1 hour); Scottish Water; The Public
  24. BBC Look East Survey - Broadband Speed Test

  25. 3 hours - Search: Walkman 7,948 tweets Monday 25th 2010

    16:00 - Monday 25th 2010 18:00 Raw Data - Tweets per minute Search API vs Streaming API running mean Search API Streaming API
  26. 3 hours - Search: Walkman 7,948 tweets Monday 25th 2010

    15:00 - Monday 25th 2010 18:00 Mean Results - Tweets per minute Search API vs Streaming API running mean Search API Streaming API
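The two slides above compare tweets per minute from the Search API and the Streaming API, first as raw data and then as a running mean. A minimal sketch of that calculation follows, assuming each collected tweet carries a timestamp. The function names and the trailing-window definition of the running mean are illustrative assumptions, not the toolkit's actual code.

```python
from collections import Counter
from datetime import datetime, timedelta


def tweets_per_minute(timestamps):
    """Count tweets in each one-minute bucket, including empty minutes."""
    if not timestamps:
        return []
    buckets = Counter(ts.replace(second=0, microsecond=0) for ts in timestamps)
    minute, end = min(buckets), max(buckets)
    counts = []
    while minute <= end:
        counts.append(buckets.get(minute, 0))
        minute += timedelta(minutes=1)
    return counts


def running_mean(values, window):
    """Trailing mean over the last `window` values (fewer at the start)."""
    means = []
    total = 0
    for i, value in enumerate(values):
        total += value
        if i >= window:
            total -= values[i - window]
        means.append(total / min(i + 1, window))
    return means


# Arbitrary example timestamps, not data from the slides.
sample = [
    datetime(2000, 1, 1, 15, 0, 10),
    datetime(2000, 1, 1, 15, 0, 40),
    datetime(2000, 1, 1, 15, 2, 5),
]
per_minute = tweets_per_minute(sample)
smoothed = running_mean(per_minute, window=2)
```

Running the same two functions over the timestamps returned by each API would give two series that can be plotted against each other.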
  27. None
  28. In more detail

  29. None
  30. None
  31. None
  32. None
  33. Stats Service (Node JS) Web Interface Desktop App Wrapper Local

    Database Twitter Facebook Google+ Foursquare Collector Modules Process Proxy (Node JS) Local Server (Node JS) What is the BigDataToolkit Collection of tools to mine data from APIs
  34. Stats Service (Node JS) Web Interface Desktop App Wrapper Local

    Database Twitter Facebook Google+ Foursquare Collector Modules Process Proxy (Node JS) Local Server (Node JS) Twitter Collector PID: 6198 Facebook Collector PID: 5390 What is the BigDataToolkit
  35. Background Why are we doing this Utilising the Cloud Analysing

    Data New Methods
  36. None
  37. Collecting on Local Cloud Our Setup - Inside the Virtual

    Machine Manager
  38. Collecting on the Local Cloud EE Collection 32 Collectors on

    8 servers Olympic Collection 24 Collectors on 6 servers 9,647,651 records 1,497,696 records
  39. None
  40. None
  41. None
  42. None
  43. None
  44. 4 collectors per machine - 200 machines on Amazon EC2

  45. Stats Service (Node JS) Local Database Twitter Facebook Google+ Foursquare

    Collector Modules Process Proxy (Node JS) Local Server (Node JS) Twitter Collector PID: 6198 Facebook Collector PID: 5390 BDTK Host Proxy BDTK Job Server Web Interface Desktop App Wrapper BigData Toolkit in the Cloud
  46. Background Why are we doing this Utilising the Cloud Analysing

    Data New Methods
  47. None
  48. What is Textal? • iPhone App for Text Analysis •

    Explore the relationships between words in the text • Tool for the Public (non experts) • Launched July 2013 http://www.textal.org
  49. • Create Word Clouds from Text • Websites • Twitter

    + Social Media • Books • Own text (Emails, Documents, etc.) What is Textal?
  50. What is Textal? • More than just a Word Cloud

    • Interactive and Dynamic • Generates Stats for Each Word • Collocations • Common Pairs • Scrabble Scores • Frequency Counts
  51. What is Textal? • More than just a Word Cloud

    • Interactive and Dynamic • Generates Stats for Each Word • Collocations • Common Pairs • Scrabble Scores • Frequency Counts
  52. What is Textal? • More than just a Word Cloud

    • Interactive and Dynamic • Generates Stats for Each Word • Collocations • Common Pairs • Scrabble Scores • Frequency Counts
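The per-word statistics listed above (frequency counts, common pairs, Scrabble scores) can be illustrated with a short sketch. This is not Textal's implementation. The tokeniser is a simplification, adjacent word pairs stand in for "common pairs", and the letter values are the standard English Scrabble tile values.

```python
import re
from collections import Counter

# Standard English Scrabble letter values.
SCRABBLE_POINTS = {
    **dict.fromkeys("aeilnorstu", 1),
    **dict.fromkeys("dg", 2),
    **dict.fromkeys("bcmp", 3),
    **dict.fromkeys("fhvwy", 4),
    "k": 5,
    **dict.fromkeys("jx", 8),
    **dict.fromkeys("qz", 10),
}


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(words):
    """Frequency count for each word."""
    return Counter(words)


def common_pairs(words):
    """Count adjacent word pairs, a simple proxy for collocations."""
    return Counter(zip(words, words[1:]))


def scrabble_score(word):
    """Sum the tile values of the letters; other characters score 0."""
    return sum(SCRABBLE_POINTS.get(ch, 0) for ch in word.lower())


words = tokenize("The cat sat on the mat. The cat!")
frequencies = word_frequencies(words)
pairs = common_pairs(words)
```

The frequencies would drive the size of each word in the cloud, while the pair counts and scores supply the per-word detail shown when a word is selected.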
  53. None
  54. None
  55. So what’s the connection?

  56. None
  57. None
  58. None
  59. Introducing Smart Collectors Data Feedback Loop collect data process data

    from each area collected alert user to changes in collection
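The feedback loop on this slide (collect data, process data from each area, alert the user to changes) could be sketched as follows. The class name, the threshold rule and the area label in the usage example are all hypothetical. The slide does not say how a change is detected, so a simple comparison against the mean of earlier batches is assumed here.

```python
from collections import Counter, defaultdict


class SmartCollector:
    """Hypothetical sketch: flag areas whose record volume jumps."""

    def __init__(self, threshold=2.0, min_history=3):
        self.threshold = threshold      # alert when volume >= threshold * baseline
        self.min_history = min_history  # batches needed before alerting
        self.history = defaultdict(list)  # area -> counts from earlier batches

    def process_batch(self, records):
        """Take (area, payload) records from one collection cycle, return alerts."""
        counts = Counter(area for area, _ in records)
        alerts = []
        for area in set(counts) | set(self.history):
            current = counts.get(area, 0)
            past = self.history[area]
            if len(past) >= self.min_history:
                baseline = sum(past) / len(past)
                if baseline > 0 and current >= self.threshold * baseline:
                    alerts.append(
                        f"{area}: volume rose to {current} (baseline {baseline:.1f})"
                    )
            past.append(current)
        return sorted(alerts)


collector = SmartCollector(threshold=2.0, min_history=3)
quiet = [collector.process_batch([("Camden", "tweet")] * 5) for _ in range(3)]
busy = collector.process_batch([("Camden", "tweet")] * 12)
```

In a real deployment the alert would feed back into the collection itself, for example by adding collectors or search terms for the area that changed.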
  60. None
  61. Background Why are we doing this Utilising the Cloud Analysing

    Data New Methods
  62. New Methods for Data Collection

  63. SurveyMapperLive

  64. SurveyMapperLive

  65. None
  66. None
  67. Why are Smart Collectors Important?

  68. Realtime Processing of Large Datasets Stats Service (Node JS) Local

    Database Twitter Facebook Google+ Foursquare Collector Modules Process Proxy (Node JS) Local Server (Node JS) Twitter Collector PID: 6198 Facebook Collector PID: 5390 BDTK Host Proxy BDTK Job Server Web Interface Desktop App Wrapper
  69. Realtime Processing of Large Datasets Mixer 0 Mixer 1 Mixer 1 Leaf Leaf Leaf

    Leaf Distributed Storage 10 GB / s SELECT collector_message, m_id COUNT (id) GROUP BY collector_message WHERE timestamp > CUTOFF ORDER BY timestamp DESC COUNT(m_id) GROUP BY collector_message COUNT(m_id) GROUP BY collector_message Spanner: a Globally-Distributed Database James C. Corbett, Jeffrey Dean, et al. Published in the Proceedings of OSDI'12: Tenth Symposium on Operating System Design and Implementation, October, 2012
  70. Realtime Processing of Large Datasets Google confidential | Do not distribute Mixer 0 Mixer 1 Mixer 1 Leaf Leaf Leaf

    Leaf Distributed Storage 10 GB / s SELECT collector_message, m_id COUNT (id) GROUP BY collector_message WHERE timestamp > CUTOFF ORDER BY timestamp DESC COUNT(m_id) GROUP BY collector_message COUNT(m_id) GROUP BY collector_message Spanner: a Globally-Distributed Database James C. Corbett, Jeffrey Dean, et al. Published in the Proceedings of OSDI'12: Tenth Symposium on Operating System Design and Implementation, October, 2012
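The query on the slide counts records per collector_message after a cutoff time. A small, runnable version of that aggregate is sketched below, with Python's built-in sqlite3 module standing in for the distributed query engine in the diagram. The table name `records`, the integer timestamps and the sample rows are assumptions made for illustration, and the clauses are placed in standard SQL order (WHERE before GROUP BY).

```python
import sqlite3


def message_counts(conn, cutoff):
    """Count records per collector_message newer than `cutoff`."""
    query = """
        SELECT collector_message, COUNT(id) AS n
        FROM records
        WHERE timestamp > ?
        GROUP BY collector_message
        ORDER BY n DESC, collector_message
    """
    return conn.execute(query, (cutoff,)).fetchall()


def demo_connection():
    """Build an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE records ("
        "id INTEGER PRIMARY KEY, collector_message TEXT, timestamp INTEGER)"
    )
    conn.executemany(
        "INSERT INTO records (collector_message, timestamp) VALUES (?, ?)",
        [("started", 100), ("rate_limited", 200), ("rate_limited", 300), ("stopped", 50)],
    )
    return conn


counts = message_counts(demo_connection(), cutoff=90)
```

In the tree shown on the slide, each Leaf would compute partial counts over its share of the storage and the Mixer nodes would sum them. The single-node query above returns the same totals for a small table.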
  71. Ethical Issues

  72. Evaluation of Success

  73. Chapter Outline 1. Introduction 1.1 A brief history of Geospatial systems

    1.2 Where do they come from 1.3 Visualisations, Infographics and the web 1.4 Rise of Open Data and Crowd Sourcing 1.5 Open Data movement and effects on the Geo community 1.6 The advent of the API 1.7 Restrictions of collecting data via APIs 1.8 Need for Open Data 2. Public Participation and Data Collection 2.1 The problems associated with Data collection 2.2 Affecting policy change with participation 2.3 Growth of the Web 2.4 Traditional Data collection to the Social Network 2.5 The rise of the social network 2.6 Commercial Services for collecting social data 2.7 The problems associated with collection 2.8 Data Pervasiveness and Automated Participation 2.9 History of Big Data
  74. Chapter Outline 3. Data Collection, Data Analysis, and Mining 3.1

    Introduction to Applications 3.2 Application Type Overview 3.3 Automating Collection 3.4 Linking to Public Participation 3.5 Updating the Ladder of Participation 3.6 Introducing the Geography Engine 4. Geography Engine 4.1 Building a Geography Engine 4.2 Data behind the Geography Engine 4.3 How the system was built 5. Applications of the Engine 5.1 Tweet-o-Meter 5.2 SurveyMapper 5.3 How the system was built 5.4 SurveyMapper Live 5.5 SurveyMapper Mobile 5.6 Social Media Collection Suite 5.7 Gemma - Geospatial Engine for Mass Mapping Applications
  75. Chapter Outline 6. Leveraging Big Data and the Cloud 6.1

    Distributed Systems 6.2 Feedback Loop 6.3 Pulling together the Engine and Cloud Computing 6.4 Communication Methods between Servers 6.5 Real-time analysis of live data to influence collection 7. Humanities and the Engine 7.1 QRator 7.2 Textal 7.3 Feedback loop into the Engine 8. Real-time Data and Exhibition Visualisation 8.1 CityDashboard 8.2 iPad Video Wall 8.3 Tweet-o-Meter wall 8.4 Real-time Video to Policy change
  76. Chapter Outline 9. Making Sense of Data and the System

    9.1 Impact of Policy 9.2 Impact of Data 9.3 Impact of Applications 10. Conclusions and Future Work
  77. Publications • Exploring the Geography of Communities in Social Networks A

    Comber, M Batty, C Brunsdon, A Hudson-Smith, F Neuhaus, S Gray • Calibration of a spatial simulation model with volunteered geographical information M Birkin, N Malleson, A Hudson-Smith, S Gray, R Milton International Journal of Geographical Information Science 25 (8), 1221-1239 • Geographic Analysis of Social Network Data M Batty, A Hudson-Smith, F Neuhaus, S Gray Proceedings of the Agile 2012 International Conference on Geographic Information Science, 2012 • A data and analysis resource for an experiment in text mining a collection of micro-blogs on a political topic. A Comber, M Batty, C Brunsdon, A Hudson-Smith, F Neuhaus, S Gray • Text mining with Textal S Gray, M Terras National Centre for Research Methods • GEMMA–Making Maps Even Easier O O’Brien, S Gray, A Hudson-Smith 1st European State of the Map
  78. Publications • Enhancing Museum Narratives: Tales of Things and UCL's Grant

    Museum C Ross, M Carnall, A Hudson-Smith, C Warwick, M Terras, S Gray Routledge (Book Chapter) • Engaging the Museum Space: Mobilising Visitor Engagement with Digital Content Creation C Ross, S Gray, C Warwick, A Hudson-Smith, M Terras 24th Joint International Conference of the Association for Literacy and Linguistic Computing and the Association for Computers and the Humanities - Digital Humanities 2012 • Experiments with the internet of things in museum space: QRator A Hudson-Smith, S Gray, C Ross, R Barthel, M de Jode, C Warwick, M Terras Proceedings of the 2012 ACM Conference on Ubiquitous Computing, 1183-1184 • Enhancing Museum Narratives with the QRator Project: a Tasmanian devil, a Platypus and a Dead Man in a Box S Gray, C Ross, A Hudson-Smith, M Terras, C Warwick Museums and the Web 2012 • The QRator Project: Promoting Personal Meaning Making in Museums S Gray, C Ross, A Hudson-Smith, M Terras, C Warwick Dimensions May-June 2013
  79. Thank you steven.gray@ucl.ac.uk Twitter: @frogo Google+: +StevenGray http://www.stevenjamesgray.com Any Questions?