Enhancing Geospatial Data Collection and Visualisation via Custom Toolkits, Consumer Devices and Mass Participation

Steven Gray
December 04, 2013

PhD Upgrade Seminar - Steven Gray - 04/12/2013

Transcript

  1. Enhancing Geospatial Data Collection and Visualisation via Custom Toolkits, Consumer Devices and Mass Participation

    Steven Gray, Research Associate
    [email protected]
    UCL Centre for Advanced Spatial Analysis
  2. About Me

    • Research Associate (UCL CASA), September 2009 -- Present
    • Research Associate (University of Glasgow - GIST, Dept of Computing Science), January 2008 -- September 2009
    • Part-time PhD, started January 2011
    • Projects worked on at CASA: National e-Infrastructure for Social Science (NeISS), JISC (GEMMA), Talisman
  3. Research Question

    Main Question
    • Can targeted data collection and aggregation enhance data visualisation?

    Sub Questions
    • Can mining data from multiple sources derive meaningful patterns in social behaviour?
    • Can we mine large data sets in realtime for specific insights to reduce the problem set before building visualisations?
  4. Talisman Project Goals

    • Develop and extend state-of-the-art geospatial methods in the form of new data analysis techniques and new simulation models.
    • Build new methods of data acquisition and visualisation that will help illuminate and address key policy challenges at local, national and global levels.
    • Improve the uptake and dissemination of skills in geospatial analysis through a comprehensive suite of training and capacity-building activities.
    • Contribute to the success of the NCRM programme and participate fully in its activities. See our Past Events and Upcoming Events pages for some examples.
  5. Custom Endpoint Carling Cup Internet of Schools iPad Video Wall

    ESRA2013 Tweet-o-Meter CityDashboard Internet of Me QRator UKSnow Maps Physical TOM New City Landscapes AV Referendum GEMMA Textal Analogies Olympic Collection AWS 200 server collection Usage of Toolkit Mobile Websites Twitter EE Project DataSift Data Comparison 10 Cities Collection Collections SurveyMapper #5Acts London Mayor Park Survey BBC Old Age Scottish Water Services Grant Petrie Popup Brands
  6. BigDataToolkit Aims

    • A single toolkit for collecting and analysing data
    • Easy to set up, run and collect data
    • Leverage Cloud Computing to power advanced analytics
    • Create a toolkit for the public to collect and process data
    • Analysing unstructured and unlinked data
    • Feed data into models and large processing platforms for further analysis

    An Open Source Data Collection Platform which is platform agnostic and easy to use
  7. SurveyMapper.com

    • World/Nation/City/Borough/Ward/Street
    • Survey Anything
    • Realtime Mapping/Data Download

    Used by:
    • The Mayor of London
    • BBC - 5000 responses in 1 hour
    • Scottish Water
    • The Public...
  8. 3 hours - Search: Walkman

    7,948 tweets, Monday 25th 2010 16:00 - Monday 25th 2010 18:00
    Chart: Raw Data - Tweets per minute, Search API vs Streaming API running mean (series: Search API, Streaming API)
  9. 3 hours - Search: Walkman

    7,948 tweets, Monday 25th 2010 15:00 - Monday 25th 2010 18:00
    Chart: Mean Results - Tweets per minute, Search API vs Streaming API running mean (series: Search API, Streaming API)
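
Slides 8 and 9 compare tweets per minute from the Search API and the Streaming API, first raw and then as a running mean. The following is a minimal sketch of that smoothing step, assuming each tweet carries a timestamp available as a Python `datetime`. The function names, the window size and the sample date are illustrative assumptions and are not taken from the toolkit.

```python
from collections import Counter
from datetime import datetime, timedelta


def tweets_per_minute(timestamps, start, end):
    """Count tweets in each one-minute bin from start (inclusive) to end (exclusive)."""
    counts = Counter(ts.replace(second=0, microsecond=0) for ts in timestamps)
    bins = []
    minute = start
    while minute < end:
        bins.append(counts.get(minute, 0))
        minute += timedelta(minutes=1)
    return bins


def running_mean(values, window=5):
    """Trailing running mean; early points average over the values seen so far."""
    means = []
    total = 0.0
    for i, value in enumerate(values):
        total += value
        if i >= window:
            total -= values[i - window]  # drop the value that left the window
        means.append(total / min(i + 1, window))
    return means


# Hypothetical sample: four tweets in a four-minute collection period.
start = datetime(2010, 1, 1, 15, 0)  # made-up date, for illustration only
sample = [start + timedelta(seconds=s) for s in (5, 30, 70, 200)]
per_minute = tweets_per_minute(sample, start, start + timedelta(minutes=4))
smoothed = running_mean(per_minute, window=2)
```

Running the same two functions over the Search API and Streaming API timestamps separately would give the two series that the charts overlay.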
  10. What is the BigDataToolkit

    Collection of tools to mine data from APIs

    Diagram components:
    • Web Interface
    • Desktop App Wrapper
    • Local Server (Node JS)
    • Process Proxy (Node JS)
    • Stats Service (Node JS)
    • Local Database
    • Collector Modules: Twitter, Facebook, Google+, Foursquare
  11. What is the BigDataToolkit

    Diagram components:
    • Web Interface
    • Desktop App Wrapper
    • Local Server (Node JS)
    • Process Proxy (Node JS)
    • Stats Service (Node JS)
    • Local Database
    • Collector Modules: Twitter, Facebook, Google+, Foursquare
    • Running collectors: Twitter Collector (PID: 6198), Facebook Collector (PID: 5390)
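
The diagram on slides 10 and 11 shows a Process Proxy in front of collector modules that run as separate processes, each with its own PID. The slides name Node JS for that component; the sketch below uses Python instead, purely to illustrate the idea of starting, listing and stopping collector processes. The class name `ProcessProxy`, its methods and the placeholder command are hypothetical and do not describe the toolkit's real interface.

```python
import subprocess
import sys


class ProcessProxy:
    """Hypothetical stand-in for a process proxy that manages collector processes."""

    def __init__(self):
        self._collectors = {}  # collector name -> subprocess.Popen

    def start(self, name, command):
        """Launch a collector as a child process and return its PID."""
        running = self._collectors.get(name)
        if running is not None and running.poll() is None:
            raise RuntimeError(f"{name} collector is already running")
        process = subprocess.Popen(command)
        self._collectors[name] = process
        return process.pid

    def status(self):
        """Map each collector name to its PID, or None once it has exited."""
        return {
            name: (process.pid if process.poll() is None else None)
            for name, process in self._collectors.items()
        }

    def stop(self, name):
        """Terminate a collector and forget about it."""
        process = self._collectors.pop(name)
        process.terminate()
        process.wait()


# Placeholder "collector": a child Python process that just sleeps.
placeholder_command = [sys.executable, "-c", "import time; time.sleep(60)"]

proxy = ProcessProxy()
twitter_pid = proxy.start("twitter", placeholder_command)
snapshot = proxy.status()  # e.g. {"twitter": <pid>}
proxy.stop("twitter")
```

Keeping each collector in its own process means one failing collector does not take the others down, which is one plausible reason for the per-collector PIDs shown on the slide.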
  12. Collecting on the Local Cloud

    • EE Collection: 32 Collectors on 8 servers
    • Olympic Collection: 24 Collectors on 6 servers
    • Record counts shown: 9,647,651 records and 1,497,696 records
  13. BigData Toolkit in the Cloud

    Diagram components:
    • Web Interface
    • Desktop App Wrapper
    • BDTK Host Proxy
    • BDTK Job Server
    • Local Server (Node JS)
    • Process Proxy (Node JS)
    • Stats Service (Node JS)
    • Local Database
    • Collector Modules: Twitter, Facebook, Google+, Foursquare
    • Running collectors: Twitter Collector (PID: 6198), Facebook Collector (PID: 5390)
  14. What is Textal?

    • iPhone App for Text Analysis
    • Explore the relationships between words in the text
    • Tool for the Public (non-experts)
    • Launched July 2013

    http://www.textal.org
  15. What is Textal?

    • Create Word Clouds from Text
      • Websites
      • Twitter + Social Media
      • Books
      • Own text (Emails, Documents, etc.)
  16. What is Textal?

    • More than just a Word Cloud
    • Interactive and Dynamic
    • Generates Stats for Each Word
      • Collocations
      • Common Pairs
      • Scrabble Scores
      • Frequency Counts
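
The per-word statistics listed for Textal (frequency counts, common pairs and Scrabble scores) can be illustrated with a short sketch. This is not Textal's implementation: the tokeniser, the function names and the choice of adjacent word pairs as "common pairs" are assumptions made for illustration. The letter values are the standard English Scrabble ones.

```python
import re
from collections import Counter

# Standard English Scrabble letter values.
SCRABBLE_POINTS = {
    **dict.fromkeys("aeilnorstu", 1),
    **dict.fromkeys("dg", 2),
    **dict.fromkeys("bcmp", 3),
    **dict.fromkeys("fhvwy", 4),
    "k": 5,
    **dict.fromkeys("jx", 8),
    **dict.fromkeys("qz", 10),
}


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def scrabble_score(word):
    """Sum the letter values; characters without a value score zero."""
    return sum(SCRABBLE_POINTS.get(ch, 0) for ch in word.lower())


def word_stats(text, top_pairs=3):
    """Return frequency, Scrabble score and most common adjacent pairs per word."""
    words = tokenize(text)
    frequency = Counter(words)
    pairs = Counter(zip(words, words[1:]))  # adjacent word pairs (bigrams)
    ranked_pairs = [pair for pair, _count in pairs.most_common()]
    return {
        word: {
            "frequency": count,
            "scrabble_score": scrabble_score(word),
            "common_pairs": [p for p in ranked_pairs if word in p][:top_pairs],
        }
        for word, count in frequency.items()
    }


stats = word_stats("The cat sat on the mat")
```

A word cloud would typically size each word by its `frequency`, with the other fields shown when a word is selected.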
  19. Introducing Smart Collectors

    Data Feedback Loop:
    • collect data
    • process data from each area collected
    • alert user to changes in collection
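
The feedback loop on slide 19 (collect, process per area, alert on change) could look roughly like the sketch below. It assumes that "changes in collection" means a per-area record count moving well away from its recent average; the class name `SmartCollector`, the threshold rule and the area names are hypothetical and not taken from the toolkit.

```python
from collections import defaultdict, deque
from statistics import mean


class SmartCollector:
    """Hypothetical feedback loop: count records per area and flag large changes."""

    def __init__(self, history=5, threshold=2.0):
        self.threshold = threshold
        # Recent per-batch counts for each area, oldest dropped automatically.
        self._history = defaultdict(lambda: deque(maxlen=history))

    def process_batch(self, records):
        """records: iterable of (area, payload). Returns a list of alerts."""
        counts = Counter_like(records)
        alerts = []
        # Include areas seen before so that a sudden drop to zero is noticed too.
        for area in sorted(set(counts) | set(self._history)):
            count = counts.get(area, 0)
            past = self._history[area]
            if past:
                baseline = mean(past)
                too_high = count > baseline * self.threshold
                too_low = count < baseline / self.threshold
                if baseline > 0 and (too_high or too_low):
                    alerts.append((area, count, baseline))
            past.append(count)
        return alerts


def Counter_like(records):
    """Count records per area (first element of each record)."""
    counts = defaultdict(int)
    for area, _payload in records:
        counts[area] += 1
    return dict(counts)


collector = SmartCollector(history=3, threshold=2.0)
first = collector.process_batch([("Camden", "tweet")] * 10)   # no history yet
second = collector.process_batch([("Camden", "tweet")] * 11)  # close to baseline
third = collector.process_batch([("Camden", "tweet")] * 30)   # sharp rise
fourth = collector.process_batch([])                          # collection dried up
```

Each alert is an `(area, count, baseline)` tuple, which a user interface could turn into a notification or use to adjust what is collected next.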
  20. Realtime Processing of Large Datasets

    Diagram components:
    • Web Interface
    • Desktop App Wrapper
    • BDTK Host Proxy
    • BDTK Job Server
    • Local Server (Node JS)
    • Process Proxy (Node JS)
    • Stats Service (Node JS)
    • Local Database
    • Collector Modules: Twitter, Facebook, Google+, Foursquare
    • Running collectors: Twitter Collector (PID: 6198), Facebook Collector (PID: 5390)
  21. Realtime Processing of Large Datasets

    Diagram: a query tree with Mixer 0 at the root, Mixer 1 nodes below it, and Leaf nodes reading from Distributed Storage at 10 GB / s.
    Query fragments shown on the diagram:
    • SELECT collector_message, m_id
    • COUNT (id) GROUP BY collector_message
    • WHERE timestamp > CUTOFF
    • ORDER BY timestamp DESC
    • COUNT(m_id) GROUP BY collector_message (shown twice)

    Source: Spanner: Google's Globally-Distributed Database. James C. Corbett, Jeffrey Dean, et al. Published in the Proceedings of OSDI'12: Tenth Symposium on Operating System Design and Implementation, October 2012
  22. Realtime Processing of Large Datasets

    Diagram (footer: Google confidential | Do not distribute): a query tree with Mixer 0 at the root, Mixer 1 nodes below it, and Leaf nodes reading from Distributed Storage at 10 GB / s.
    Query fragments shown on the diagram:
    • SELECT collector_message, m_id
    • COUNT (id) GROUP BY collector_message
    • WHERE timestamp > CUTOFF
    • ORDER BY timestamp DESC
    • COUNT(m_id) GROUP BY collector_message (shown twice)

    Source: Spanner: Google's Globally-Distributed Database. James C. Corbett, Jeffrey Dean, et al. Published in the Proceedings of OSDI'12: Tenth Symposium on Operating System Design and Implementation, October 2012
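
The query fragments on slides 21 and 22 describe counting records per `collector_message` after a cutoff time. As a small-scale illustration, the sketch below runs that kind of aggregation with Python's built-in `sqlite3` module against an in-memory table. The table name `records`, its schema, the sample rows and the ordering by count are assumptions; the slides do not give the full query or the storage engine's dialect.

```python
import sqlite3


def counts_since(conn, cutoff):
    """Count records per collector_message with a timestamp after the cutoff."""
    query = """
        SELECT collector_message, COUNT(id) AS n
        FROM records
        WHERE timestamp > ?
        GROUP BY collector_message
        ORDER BY n DESC, collector_message
    """
    return conn.execute(query, (cutoff,)).fetchall()


# Hypothetical in-memory table standing in for the collected data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records ("
    "id INTEGER PRIMARY KEY, collector_message TEXT, timestamp INTEGER)"
)
conn.executemany(
    "INSERT INTO records (collector_message, timestamp) VALUES (?, ?)",
    [
        ("olympics", 100),
        ("olympics", 200),
        ("ee", 150),
        ("ee", 50),
        ("olympics", 300),
    ],
)
recent = counts_since(conn, cutoff=100)
```

In the tree shown on the slides, each leaf would compute partial counts over its share of the data and the mixers would sum them; a single SQLite connection collapses that into one step.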
  23. Chapter Outline

    1. Introduction
      1.1 A brief history of Geospatial systems
      1.2 Where do they come from
      1.3 Visualisations, Infographics and the web
      1.4 Rise of Open Data and Crowd Sourcing
      1.5 Open Data movement and effects on the Geo community
      1.6 The advent of the API
      1.7 Restrictions of collecting data via APIs
      1.8 Need for Open Data
    2. Public Participation and Data Collection
      2.1 The problems associated with Data collection
      2.2 Affecting policy change with participation
      2.3 Growth of the Web
      2.4 Traditional Data collection to the Social Network
      2.5 The rise of the social network
      2.6 Commercial Services for collecting social data
      2.7 The problems associated with collection
      2.8 Data Pervasiveness and Automated Participation
      2.9 History of Big Data
  24. Chapter Outline

    3. Data Collection, Data Analysis, and Mining
      3.1 Introduction to Applications
      3.2 Application Type Overview
      3.3 Automating Collection
      3.4 Linking to Public Participation
      3.5 Updating the Ladder of Participation
      3.6 Introducing the Geography Engine
    4. Geography Engine
      4.1 Building a Geography Engine
      4.2 Data behind the Geography Engine
      4.3 How the system was built
    5. Applications of the Engine
      5.1 Tweet-o-Meter
      5.2 SurveyMapper
      5.3 How the system was built
      5.4 SurveyMapper Live
      5.5 SurveyMapper Mobile
      5.6 Social Media Collection Suite
      5.7 Gemma - Geospatial Engine for Mass Mapping Applications
  25. Chapter Outline

    6. Leveraging Big Data and the Cloud
      6.1 Distributed Systems
      6.2 Feedback Loop
      6.3 Pulling together the Engine and Cloud Computing
      6.4 Communication Methods between Servers
      6.5 Real-time analysis of live data to influence collection
    7. Humanities and the Engine
      7.1 QRator
      7.2 Textal
      7.3 Feedback loop into the Engine
    8. Real-time Data and Exhibition Visualisation
      8.1 CityDashboard
      8.2 iPad Video Wall
      8.3 Tweet-o-Meter wall
      8.4 Real-time Video to Policy change
  26. Chapter Outline

    9. Making Sense of Data and the System
      9.1 Impact of Policy
      9.2 Impact of Data
      9.3 Impact of Applications
    10. Conclusions and Future Work
  27. Publications

    • Exploring the Geography of Communities in Social Networks. A Comber, M Batty, C Brunsdon, A Hudson-Smith, F Neuhaus, S Gray
    • Calibration of a spatial simulation model with volunteered geographical information. M Birkin, N Malleson, A Hudson-Smith, S Gray, R Milton. International Journal of Geographical Information Science 25 (8), 1221-1239
    • Geographic Analysis of Social Network Data. M Batty, A Hudson-Smith, F Neuhaus, S Gray. Proceedings of the Agile 2012 International Conference on Geographic Information Science, 2012
    • A data and analysis resource for an experiment in text mining a collection of micro-blogs on a political topic. A Comber, M Batty, C Brunsdon, A Hudson-Smith, F Neuhaus, S Gray
    • Text mining with Textal. S Gray, M Terras. National Centre for Research Methods
    • GEMMA–Making Maps Even Easier. O O’Brien, S Gray, A Hudson-Smith. 1st European State of the Map
  28. Publications

    • Enhancing Museum Narratives: Tales of Things and UCL's Grant Museum. C Ross, M Carnall, A Hudson-Smith, C Warwick, M Terras, S Gray. Routledge (Book Chapter)
    • Engaging the Museum Space: Mobilising Visitor Engagement with Digital Content Creation. C Ross, S Gray, C Warwick, A Hudson-Smith, M Terras. 24th Joint International Conference of the Association for Literary and Linguistic Computing and the Association for Computers and the Humanities - Digital Humanities 2012
    • Experiments with the internet of things in museum space: QRator. A Hudson-Smith, S Gray, C Ross, R Barthel, M de Jode, C Warwick, M Terras. Proceedings of the 2012 ACM Conference on Ubiquitous Computing, 1183-1184
    • Enhancing Museum Narratives with the QRator Project: a Tasmanian devil, a Platypus and a Dead Man in a Box. S Gray, C Ross, A Hudson-Smith, M Terras, C Warwick. Museums and the Web 2012
    • The QRator Project: Promoting Personal Meaning Making in Museums. S Gray, C Ross, A Hudson-Smith, M Terras, C Warwick. Dimensions, May-June 2013