Mining and Mapping Social Big Data

A workshop I gave at DH2015 on collecting and mapping Tweets quickly and easily using existing web tools.

Steven Gray

June 29, 2015

Transcript

  1. Mining and Mapping Social Big Data
    DH2015 Workshop
    Steven Gray
    Monday 29th June 2015, 1:30pm - 4:30pm

  2. Let’s start coding!

  3. Install NodeJS + NPM
    Mac
    -- Install Xcode Command Line Tools for your Mac OS version
    -- https://developer.apple.com/downloads/
    -- Install Homebrew (http://brew.sh/)
    -- Run in the terminal:
    ruby -e "$(curl -fsSL https://raw.github.com/mxcl/homebrew/go)"
    -- Follow http://madebyhoundstooth.com/blog/install-node-with-homebrew-on-os-x/
    -- brew install node
    NPM will be installed at the same time as node!

  4. Install NodeJS + NPM
    Linux (Ubuntu)
    -- Follow https://github.com/joyent/node/wiki/Installing-Node.js-via-package-manager
    Windows
    -- Download from http://nodejs.org/download/ and follow the instructions
    -- Open Node.js from Start Menu - Programs - Node.js - Node.js Command Prompt
    -- Or search for “Node.js Command Prompt” on your system (Windows 8, 8.1)

  5. Collecting Social Media Data (Twitter) - A Recipe
    What you will need:
    1 x Twitter Account
    1 x Developer Key from https://dev.twitter.com/
    Sprinkle with some patience and time, and you’ll soon be collecting data from Twitter.
    Instructions coming up soon after a small explanation.

  6. Understanding what’s happening
    http://dev.twitter.com/doc
    Diagram labels: User Profiles, Tweet Database

  7. Making Requests to Twitter
    HTTP GET Request
    Results
    {"results":[
    {"text":"@twitterapi is great!",
    "to_user_id":396524,
    "to_user":"TwitterAPI",
    "from_user":"jkoum",
    "metadata":
    {
    "result_type":"popular",
    "recent_retweets": 100
    },
    "id":1478555574,
    "from_user_id":1833773,
    "iso_language_code":"nl",
    ....}
    ]}
    Returns tweets that:
    • match a specified query
    • are located in a Lat/Lon box
    • Historical Tweets
    • User Profile Information
    • Return Public Timeline
    Service is rate limited
    Response formats: JSON or Atom
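
To make the shape of the results concrete, here is a minimal sketch of reading a response like the one on the slide in Node.js. The sample body copies a few fields from the slide and leaves out the elided ones; the helper name `summariseResults` is made up for illustration and is not part of the collector script.

```javascript
// Sample response body modelled on the slide's example.
// Fields elided on the slide ("....") are simply left out here.
const sampleBody = JSON.stringify({
  results: [
    {
      text: "@twitterapi is great!",
      to_user: "TwitterAPI",
      from_user: "jkoum",
      metadata: { result_type: "popular", recent_retweets: 100 },
      id: 1478555574,
      iso_language_code: "nl",
    },
  ],
});

// Hypothetical helper: pull a few fields out of each tweet in a response body.
function summariseResults(jsonText) {
  const parsed = JSON.parse(jsonText);
  // Treat a missing "results" array as an empty result set.
  const results = Array.isArray(parsed.results) ? parsed.results : [];
  return results.map((tweet) => ({
    id: tweet.id,
    from: tweet.from_user,
    text: tweet.text,
    retweets: tweet.metadata ? tweet.metadata.recent_retweets : 0,
  }));
}

const summaries = summariseResults(sampleBody);
console.log(summaries);
```

The field names are taken from the slide; a current Twitter response may use different ones, so treat them as placeholders.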

  8. Authentication
    2 Methods of Authentication
    -- Basic HTTP Authentication
    -- OAuth (More Complex)
    Most modern web services use OAuth (including Twitter and Google services)
    Keeps information safe from the wrong people but also allows providers to ban people
    if they are using too many resources or making.

  9. So how does OAuth work?
    Server
    Application
    Request a Session Token
    Pass an ID and a Secret
    Asks for login with ID
    Server gives a PIN for the session

  10. So how does OAuth work?
    Server
    Application
    Login is successful
    :)
    Sends App Tokens
    These need to be saved!

  11. So how does OAuth work?
    Server
    Application
    Request using tokens
    Server checks if it’s valid
    Data Returned
    :)
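
The three OAuth slides can be read as one sequence: get a session token, log in and receive a PIN, swap the PIN for app tokens, then send those tokens with every request. The sketch below is a toy in-memory model of that sequence, not Twitter's implementation and not the collector's code. Every class, method and value in it is hypothetical; a real client would use a library (the workshop installs the `oauth` npm module) and signed HTTPS requests.

```javascript
// Toy model of the PIN-based flow on the slides. All names are hypothetical.
class ToyServer {
  constructor(appId, appSecret) {
    this.appId = appId;
    this.appSecret = appSecret;
    this.sessions = new Map(); // session token -> PIN (null until the user logs in)
    this.appTokens = new Set(); // tokens issued after a successful login
    this.counter = 0;
  }

  // Step 1: the application passes its ID and secret and gets a session token.
  requestSessionToken(appId, appSecret) {
    if (appId !== this.appId || appSecret !== this.appSecret) {
      throw new Error("unknown application");
    }
    this.counter += 1;
    const sessionToken = `session-${this.counter}`;
    this.sessions.set(sessionToken, null);
    return sessionToken;
  }

  // Step 2: the user logs in on the server's site, which shows them a PIN.
  userLogsIn(sessionToken) {
    if (!this.sessions.has(sessionToken)) throw new Error("unknown session");
    const pin = String(1000 + this.counter); // predictable toy PIN, not random
    this.sessions.set(sessionToken, pin);
    return pin;
  }

  // Step 3: the application swaps session token + PIN for an app token to save.
  exchangePin(sessionToken, pin) {
    const expected = this.sessions.get(sessionToken);
    if (!expected || expected !== pin) throw new Error("wrong PIN or session");
    this.sessions.delete(sessionToken); // a session can only be exchanged once
    this.counter += 1;
    const appToken = `app-token-${this.counter}`;
    this.appTokens.add(appToken);
    return appToken;
  }

  // Step 4: each data request carries the saved token, which the server checks.
  getData(appToken) {
    if (!this.appTokens.has(appToken)) throw new Error("invalid token");
    return { results: [] };
  }
}

const server = new ToyServer("my-id", "my-secret");
const session = server.requestSessionToken("my-id", "my-secret");
const pin = server.userLogsIn(session);
const appToken = server.exchangePin(session, pin);
const data = server.getData(appToken);
console.log(data);
```

The model skips request signing entirely; it only shows why the app tokens from the second step need to be saved, since every later request depends on them.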

  12. Collecting Social Media Data (Twitter)
    Creating your Twitter Application Keys
    Go to https://dev.twitter.com/
    Log in using your current Twitter account
    Click your profile picture (top right corner) and click My Applications
    Click Create New Application
    Fill in your details for the app
    You’ll be presented with your Consumer Key and Consumer Secret
    Save these keys for the next step

  13. Let’s collect some data
    Download the Collector
    https://gist.github.com/sjg/6712144

  14. Install Modules via NPM (All Systems)
    Use NPM to install the following modules used by the collector.
    Check that you have npm installed by running “npm -v” (it should report 1.3.*)
    Run the following to install
    npm install oauth colors

  15. Let’s collect some data
    After downloading the script, paste your key and secret between the
    quotation marks in the script and save the file
    Consumer Key
    Consumer Secret

  16. Let’s collect some data
    Copy the script into a folder that you can run it from and navigate to that
    folder in your system terminal.
    You can edit the location of the collector by changing these values in the file:
    “SearchTag”, “Latitude”, “Longitude”, “Radius”
     - NB: “” will return all tweets in that area
    To run the collector type:
    node twitter_collect.js
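
As a rough illustration of how the four values could end up in a request, the sketch below builds a search URL from them. It assumes Twitter's v1.1 search endpoint with its `q` and `geocode` parameters; the collector gist may build its request differently. The function name `buildSearchUrl`, the settings object and the example coordinates are all made up for this sketch.

```javascript
// Hypothetical helper: turn the slide's four settings into a search URL.
// Assumes the v1.1 search endpoint; the real collector may differ.
function buildSearchUrl({ searchTag, latitude, longitude, radius }) {
  const params = new URLSearchParams();
  // Assumption: an empty tag means "no text filter", so q is left out.
  if (searchTag !== "") params.set("q", searchTag);
  // geocode takes "latitude,longitude,radius", with the radius in mi or km.
  params.set("geocode", `${latitude},${longitude},${radius}`);
  return `https://api.twitter.com/1.1/search/tweets.json?${params.toString()}`;
}

// Example values only: a hashtag within 5 km of a point in central London.
const exampleUrl = buildSearchUrl({
  searchTag: "#dh2015",
  latitude: 51.5074,
  longitude: -0.1278,
  radius: "5km",
});
console.log(exampleUrl);
```

`URLSearchParams` percent-encodes the `#` and the commas, so the values can be written naturally in the settings.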

  17. Let’s collect some data
    If all goes well and everything is working you should see the following screen
    Go to the link in a browser, log in to Twitter and type the PIN in. Press return
    and tweets will start collecting in a file called “tweets.csv” in the same folder.
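
Tweet text often contains commas, quotes and line breaks, so each field needs escaping before it goes into a CSV file. The sketch below shows one way to do that. The column names and the sample tweet are assumptions for illustration; the actual layout of “tweets.csv” is whatever the collector script writes.

```javascript
// Quote a value if it contains a comma, a double quote or a line break,
// doubling any embedded double quotes.
function csvEscape(value) {
  const text = String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Join one row of values into a single CSV line.
function toCsvRow(values) {
  return values.map(csvEscape).join(",");
}

// Hypothetical column layout and sample tweet.
const header = ["id", "user", "text", "latitude", "longitude"];
const sampleTweet = {
  id: 1478555574,
  user: "jkoum",
  text: 'Hello, "world"',
  latitude: 51.5074,
  longitude: -0.1278,
};

const csvLines = [
  toCsvRow(header),
  toCsvRow(header.map((column) => sampleTweet[column])),
];
console.log(csvLines.join("\n"));
```

A collector could append each row to a file with Node's `fs.appendFileSync`; that step is left out here so the sketch has no side effects.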

  18. Visualising the output with Google Maps Engine

  19. Creating your first Visualisation with the Data Collected
    Go to https://mapsengine.google.com
    Click on New Map
    Name your map by clicking on “Untitled Map”
    Click “Import” and upload your CSV file
    Click Base Map to change the map style

  20. Creating your first Visualisation with the Data Collected
    Once the file has uploaded, choose the correct Latitude and Longitude columns and
    then click Continue. Choose the ID field (left over) as the marker title and press Finish.
    The updated output file will, hopefully, auto detect the lat/lng fields.
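
Rows without usable coordinates cannot be placed on a map, so it can help to filter them out before uploading. This is a minimal sketch of such a check, assuming rows have already been read into objects with `latitude` and `longitude` fields; those field names and the sample rows are hypothetical.

```javascript
// True when a row has numeric coordinates inside the valid ranges
// (latitude -90..90, longitude -180..180).
function hasValidCoordinates(row) {
  const lat = Number.parseFloat(row.latitude);
  const lon = Number.parseFloat(row.longitude);
  return (
    Number.isFinite(lat) &&
    Number.isFinite(lon) &&
    Math.abs(lat) <= 90 &&
    Math.abs(lon) <= 180
  );
}

// Hypothetical rows: one good, one with empty coordinates, one out of range.
const sampleRows = [
  { id: "1", latitude: "51.5074", longitude: "-0.1278" },
  { id: "2", latitude: "", longitude: "" },
  { id: "3", latitude: "123.4", longitude: "10" },
];

const mappableRows = sampleRows.filter(hasValidCoordinates);
console.log(mappableRows.map((row) => row.id));
```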

  21. Creating your first Visualisation with the Data Collected
    Style your markers by pressing the paint bucket (highlighted in red) and select the
    circle shape and colour. Experiment with styling your map.

  22. Creating your first Visualisation with the Data Collected
    If all went well your map will look something like this. Why not share it with the
    world!

  23. Creating your first Visualisation with the Data Collected
    And you can click the markers and see the tweets

  24. Visualising the output with Google Maps Engine

  25. Visualising the output with CartoDB

  26. Creating your first Visualisation with the Data Collected
    Go to https://cartodb.com/
    Click on Login or Sign Up
    Click on New Map
    Drag your CSV file onto the Map to upload your data file
    Click Base Map to change the map style

  27. Creating your first Visualisation with the Data Collected
    Once the file has uploaded, your data should automatically be geocoded and displayed on
    the map. Now you can edit the metadata, view the data contained within the tweets and play
    with the visualisations.

  28. Creating your first Visualisation with the Data Collected
    Style your markers and change the visualisations. Experiment with the different options for
    styling your map. You can also publish your map at this stage.

  29. Different Sources you can use to Visualise your data
    Google Fusion Tables
    - http://www.google.com/drive/apps.html#fusiontables
    Google Maps Engine
    - https://mapsengine.google.com
    Google Earth
    - http://www.google.com/earth/index.html
    MapTube
    - http://www.maptube.org
    CartoDB
    - http://www.cartodb.com

  30. http://www.bigdatatoolkit.org

  31. Thanks
    Steven Gray
    [email protected]
    @frogo +StevenGray
