Mining and Mapping Social Big Data

A workshop I gave at DH2015 on collecting and mapping Tweets quickly and easily using existing web tools.


Steven Gray

June 29, 2015




Mining and Mapping Social Big Data, DH2015 Workshop. Steven Gray. Monday 29th June 2015, 1:30pm - 4:30pm
  12. Let's start coding!

  13. Install NodeJS + NPM: Mac
      -- Install Xcode Command Line Tools for your Mac OS version
      -- https://developer.apple.com/downloads/
      -- Install Homebrew (http://brew.sh/)
      -- Run in a terminal: ruby -e "$(curl -fsSL https://raw.github.com/mxcl/homebrew/go)"
      -- Follow http://madebyhoundstooth.com/blog/install-node-with-homebrew-on-os-x/
      -- brew install node
      NPM will be installed at the same time as node!
  14. Install NodeJS + NPM: Linux (Ubuntu) and Windows
      Linux (Ubuntu)
      -- Follow https://github.com/joyent/node/wiki/Installing-Node.js-via-package-manager
      Windows
      -- Download from http://nodejs.org/download/ and follow the instructions
      Open Node.js from Start Menu - Programs - Node.js - Node.js Command Prompt
      Or search for "Node.js Command Prompt" on your system (Windows 8, 8.1)
  15. Collecting Social Media Data (Twitter) - A Recipe
      What you will need:
      1 x Twitter Account
      1 x Developer Key from https://dev.twitter.com/
      Sprinkle with some patience and time, and in time you'll be collecting data from Twitter.
      Instructions coming up soon after a small explanation.
  16. Understanding what's happening
      http://dev.twitter.com/doc
      Diagram labels: User Profiles, Tweet Database

  17. Making Requests to Twitter
      HTTP GET Request
      Results (JSON or Atom):
      {"results":[
        {"text":"@twitterapi is great!",
         "to_user_id":396524,
         "to_user":"TwitterAPI",
         "from_user":"jkoum",
         "metadata": { "result_type":"popular", "recent_retweets": 100 },
         "id":1478555574,
         "from_user_id":1833773,
         "iso_language_code":"nl", ....}
      ]}
      Returns:
      • tweets that match a specified query
      • tweets located in a Lat/Lon box
      • historical tweets
      • user profile information
      • the public timeline
      Service is rate limited
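
As a rough illustration of working with a response like the one on this slide, the sketch below parses the JSON text and pulls out a few fields. It uses only the fields visible on the slide (the elided part is left out rather than guessed), and summariseResults is a hypothetical helper name, not part of any Twitter library. Responses from the current Twitter API may be shaped differently.

```javascript
// Sample response shaped like the one on the slide. The fields the slide
// elides ("....") are left out here rather than guessed.
const body = JSON.stringify({
  results: [
    {
      text: "@twitterapi is great!",
      to_user_id: 396524,
      to_user: "TwitterAPI",
      from_user: "jkoum",
      metadata: { result_type: "popular", recent_retweets: 100 },
      id: 1478555574,
      from_user_id: 1833773,
      iso_language_code: "nl",
    },
  ],
});

// Hypothetical helper: turn the raw JSON text into a short list of summaries.
function summariseResults(jsonText) {
  const parsed = JSON.parse(jsonText);
  // Guard against a response that has no "results" array.
  const results = Array.isArray(parsed.results) ? parsed.results : [];
  return results.map((tweet) => ({
    id: tweet.id,
    user: tweet.from_user,
    text: tweet.text,
    language: tweet.iso_language_code,
  }));
}

const summaries = summariseResults(body);
for (const s of summaries) {
  console.log(`${s.user} (${s.language}): ${s.text}`);
}
```
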
  18. Authentication
      2 methods of authentication
      -- Basic HTTP Authentication
      -- OAuth (more complex)
      Most modern web services use OAuth (including Twitter and Google services).
      It keeps information safe from the wrong people, but also allows providers to ban people if they are using too many resources.
  19. So how does OAuth work? (Application and Server)
      -- Request a session token
      -- Pass an ID and a secret
      -- Asks for login with the ID
      -- Server gives a PIN number for the session
  20. So how does OAuth work? (Application and Server)
      -- Login is successful :)
      -- Sends app tokens
      These need to be saved!
  21. So how does OAuth work? (Application and Server)
      -- Request using tokens
      -- Server checks if it's valid
      -- Data returned :)
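
The order of the steps on the last three slides can be sketched with a small in-memory simulation. Everything here is hypothetical: MockServer and its method names are made up for illustration and only mimic the sequence (session token, PIN, app token, request). Real OAuth also signs each request cryptographically, which this sketch leaves out.

```javascript
const crypto = require("crypto");

// Hypothetical in-memory stand-in for the provider (e.g. Twitter).
class MockServer {
  constructor(consumerKey, consumerSecret) {
    this.consumerKey = consumerKey;
    this.consumerSecret = consumerSecret;
    this.sessions = new Map(); // session token -> PIN
    this.accessTokens = new Set(); // tokens issued after a successful login
  }

  // Step 1: the application passes its ID and secret and gets a session token.
  requestSessionToken(key, secret) {
    if (key !== this.consumerKey || secret !== this.consumerSecret) {
      throw new Error("unknown application");
    }
    const sessionToken = crypto.randomBytes(8).toString("hex");
    const pinNumber = crypto.randomBytes(4).readUInt32BE(0) % 1000000;
    this.sessions.set(sessionToken, String(pinNumber).padStart(6, "0"));
    return sessionToken;
  }

  // Step 2: the user logs in on the server's site and is shown the PIN.
  loginAndShowPin(sessionToken) {
    return this.sessions.get(sessionToken);
  }

  // Step 3: the application exchanges session token + PIN for an app token.
  exchangePin(sessionToken, pin) {
    if (!this.sessions.has(sessionToken) || this.sessions.get(sessionToken) !== pin) {
      throw new Error("login failed");
    }
    this.sessions.delete(sessionToken);
    const accessToken = crypto.randomBytes(16).toString("hex");
    this.accessTokens.add(accessToken);
    return accessToken;
  }

  // Step 4: later requests carry the token and the server checks it.
  getData(accessToken) {
    if (!this.accessTokens.has(accessToken)) {
      throw new Error("invalid token");
    }
    return { results: [] };
  }
}

const server = new MockServer("my-consumer-key", "my-consumer-secret");
const sessionToken = server.requestSessionToken("my-consumer-key", "my-consumer-secret");
const pin = server.loginAndShowPin(sessionToken); // the user would type this in
const accessToken = server.exchangePin(sessionToken, pin); // this is what needs saving
const data = server.getData(accessToken);
console.log("Data returned:", data);
```

Saving the app token matters because the session token and PIN are used only once; in this sketch the session is deleted as soon as the exchange succeeds.
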
  22. Collecting Social Media Data (Twitter): Creating your Twitter Application Keys
      Go to https://dev.twitter.com/
      Log in using your current Twitter account
      Click your profile picture (top right corner) and click My Applications
      Click Create New Application
      Fill in your details for the app
      You'll be presented with your Consumer Key and Consumer Secret
      Save these keys for the next step
  23. Let's Collect some data
      Download the Collector: https://gist.github.com/sjg/6712144

  24. Install Modules via NPM (All Systems)
      Use NPM to install the following modules used by the collector.
      Check that you have npm installed by running "npm -v" (it should report 1.3.*)
      Run the following to install:
      npm install oauth colors
  25. Let's Collect some data
      After downloading the script, put your key and secret into the script between the quotation marks and save the file.
      Screenshot labels: Consumer Key, Consumer Secret
  26. Let's Collect some data
      Copy the script into a folder that you can run it from and navigate to that folder in your system terminal.
      You can edit the location of the collector by changing these values in the file:
      "SearchTag", "Latitude", "Longitude", "Radius" - NB: "" will return all tweets in that area
      To run the collector type:
      node twitter_collect.js
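
To make the four values on this slide more concrete, here is a minimal sketch of how a search tag, latitude, longitude and radius could be combined into query parameters. It assumes the "latitude,longitude,radius" form of the geocode parameter used by Twitter's search API, with the radius given in km or mi. The settings object, the example coordinates and buildSearchParams are hypothetical, and the collector script itself may do this differently.

```javascript
// Hypothetical settings mirroring the four values named on the slide.
const settings = {
  searchTag: "",      // "" asks for all tweets in the area
  latitude: 51.5074,  // example coordinates (central London), chosen for illustration
  longitude: -0.1278,
  radius: "10km",     // assumed unit suffix: "km" or "mi"
};

// Hypothetical helper: validate the values and build the query parameters.
function buildSearchParams({ searchTag, latitude, longitude, radius }) {
  if (!Number.isFinite(latitude) || Math.abs(latitude) > 90) {
    throw new RangeError("latitude must be between -90 and 90");
  }
  if (!Number.isFinite(longitude) || Math.abs(longitude) > 180) {
    throw new RangeError("longitude must be between -180 and 180");
  }
  if (!/^\d+(\.\d+)?(km|mi)$/.test(radius)) {
    throw new RangeError("radius must look like '10km' or '5mi'");
  }
  const params = new URLSearchParams();
  params.set("q", searchTag);
  params.set("geocode", `${latitude},${longitude},${radius}`);
  return params;
}

const params = buildSearchParams(settings);
console.log(params.toString());
```
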
  27. Let's Collect some data
      If all goes well and everything is working you should see the following screen.
      Go to the link in a browser, log in to Twitter and type the PIN number in. Press return and tweets will start collecting in a file called "tweets.csv" in the same folder.
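
Tweet text often contains commas and quotation marks, so each value needs quoting before it goes into a CSV file. The sketch below shows one common way to do that. The column order (id, lat, lng, text) and the helper names are assumptions for illustration; the collector's actual output columns may differ.

```javascript
// Quote a value for CSV: wrap it in double quotes and double any embedded
// quotes whenever it contains a comma, a quote, or a line break.
function csvField(value) {
  const text = String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Hypothetical row layout: id, latitude, longitude, tweet text.
function tweetToCsvRow(tweet) {
  return [tweet.id, tweet.lat, tweet.lng, tweet.text].map(csvField).join(",");
}

const row = tweetToCsvRow({
  id: 1478555574,
  lat: 51.5074,
  lng: -0.1278,
  text: 'Hello, "DH2015"!',
});
console.log(row);
// prints: 1478555574,51.5074,-0.1278,"Hello, ""DH2015""!"
```

A row built this way could then be appended to tweets.csv, for example with Node's fs.appendFileSync.
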
  28. Visualising the output with Google Maps Engine

  29. Creating your first Visualisation with the Data Collected
      Go to https://mapsengine.google.com
      Click on New Map
      Name your map by clicking on "Untitled Map"
      Click "Import" and upload your CSV file
      Click Base Map to change the map style
  30. Creating your first Visualisation with the Data Collected
      Once the file has uploaded, choose the correct Latitude and Longitude columns and then click Continue. Click the ID field (left over) as the marker title and press Finish.
      The updated output file will, hopefully, auto-detect the lat/lng fields.
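
Before uploading, it can help to check that every row really has usable coordinates, since rows without them cannot be placed on the map. The sketch below is one way to do that on rows that have already been split into fields. The accepted column names and both helper functions are assumptions for illustration, not something the mapping tools prescribe.

```javascript
// Column names a mapping tool is likely to recognise; this list is an assumption.
const LAT_NAMES = ["lat", "latitude"];
const LNG_NAMES = ["lng", "lon", "long", "longitude"];

// Find the positions of the latitude and longitude columns in the header row.
function findCoordinateColumns(header) {
  const names = header.map((h) => h.trim().toLowerCase());
  return {
    lat: names.findIndex((n) => LAT_NAMES.includes(n)),
    lng: names.findIndex((n) => LNG_NAMES.includes(n)),
  };
}

// Return the rows whose coordinates are missing, non-numeric, or out of range.
function invalidRows(header, rows) {
  const { lat, lng } = findCoordinateColumns(header);
  if (lat === -1 || lng === -1) {
    throw new Error("no latitude/longitude column found");
  }
  return rows.filter((row) => {
    if (row[lat] === "" || row[lng] === "") return true; // empty cell
    const la = Number(row[lat]);
    const lo = Number(row[lng]);
    return (
      !Number.isFinite(la) || !Number.isFinite(lo) ||
      Math.abs(la) > 90 || Math.abs(lo) > 180
    );
  });
}

const header = ["id", "lat", "lng", "text"];
const rows = [
  ["1", "51.5074", "-0.1278", "ok"],
  ["2", "", "", "no location"],
  ["3", "123.0", "10.0", "latitude out of range"],
];
const bad = invalidRows(header, rows);
console.log("Rows to fix, by id:", bad.map((r) => r[0]));
```
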
  31. Creating your first Visualisation with the Data Collected
      Style your markers by pressing the paint bucket (highlighted in red) and select the circle shape and colour. Experiment with styling your map.
  32. Creating your first Visualisation with the Data Collected
      If all went well your map will look something like this. Why not share it with the world!
  33. Creating your first Visualisation with the Data Collected
      And you can click the markers and see the tweets
  34. Visualising the output with Google Maps Engine

  35. Visualising the output with CartoDB

  36. Creating your first Visualisation with the Data Collected
      Go to https://cartodb.com/
      Click on Login or Sign Up
      Click on New Map
      Drag your CSV file onto the Map to upload your data file
      Click Base Map to change the map style
  37. Creating your first Visualisation with the Data Collected
      Once the file has uploaded, your data should automatically be geocoded and displayed on the map. Now you can edit the metadata, view the data contained within the tweets and play with the visualisations.
  38. Creating your first Visualisation with the Data Collected
      Style your markers and change the visualisations. Experiment with the different options for styling your map. You can also publish your map at this stage.
  39. Different Sources you can use to Visualise your data
      Google Fusion Tables - http://www.google.com/drive/apps.html#fusiontables
      Google Maps Engine - https://mapsengine.google.com
      Google Earth - http://www.google.com/earth/index.html
      MapTube - http://www.maptube.org
      CartoDB - http://www.cartodb.com
  40. http://www.bigdatatoolkit.org


Thanks. Steven Gray, steven.gray@ucl.ac.uk, @frogo, +StevenGray