Slide 1
Slide 1 text
Taming Social Media with MongoDB
Danny Holloway
[email protected]
June 26, 2012
Slide 2
Slide 2 text
Overview
• Introduction
• Social Media Challenges
• MongoDB Setup
• Collecting Tweets
• Querying Tweets
• Accessing the Data
• Finding Most Active Tweeter
• Lessons Learned
• Building an Interface
• Demo
Slide 3
Slide 3 text
Introduction
• Built a tool to collect tweets over Australia and interact with them on a map
• Working at HumanGeo
  – Building tools and services for geospatial analysis of Big Data
  – Using MongoDB for horizontally scalable storage and geospatial analysis
Slide 4
Slide 4 text
Social Media Challenges
• No control over data
  – “Consumers of Tweets should tolerate the addition of new fields and variance in ordering of fields with ease.” - Twitter
• High Volume
  – ~17k tweets in a day or 6.2M per year with exact coordinates in Australia
  – Record high of >25k tweets per second or >788B per year around the world - Twitter
Slide 5
Slide 5 text
MongoDB Setup
• Create database
• Create capped collections
• Create indexes
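The code shown on this slide did not survive in the transcript. As a rough sketch of the three steps, assuming a pymongo `Database` handle is passed in: the collection name `tweets`, the size cap, the indexed fields, and the `created_at_dt` field are all hypothetical choices, not taken from the talk.

```python
# Hypothetical size cap; a capped collection discards its oldest documents
# once this many bytes are used.
CAPPED_SIZE_BYTES = 512 * 1024 * 1024

# Index key lists (assumed fields). "2d" is MongoDB's legacy planar geo index.
INDEX_SPECS = [
    [("coordinates.coordinates", "2d")],  # [longitude, latitude] pairs
    [("user.screen_name", 1)],            # per-user lookups
    [("created_at_dt", -1)],              # newest-first scans
]


def setup_tweet_storage(db, name="tweets", size=CAPPED_SIZE_BYTES):
    """Create a capped collection and its indexes on a pymongo Database."""
    collection = db.create_collection(name, capped=True, size=size)
    for keys in INDEX_SPECS:
        collection.create_index(keys)
    return collection
```

With pymongo installed, this could be called as `setup_tweet_storage(MongoClient()["social"])`, where the database name `social` is again only an example. MongoDB creates the database itself lazily on first write.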
Slide 6
Slide 6 text
Collecting Tweets
• Using tweetstream to collect tweets over Australia from the statuses/filter endpoint
• Insert results into collections
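The collection loop itself is not in the transcript. The sketch below shows one way it might look, assuming `stream` is any iterable of decoded tweet dicts (for example, what a streaming client yields) and `collection` is a pymongo collection. The bounding box values are an illustrative approximation of Australia, not numbers from the talk, and the tweetstream call is deliberately left out.

```python
# Approximate bounding box as (sw_lon, sw_lat, ne_lon, ne_lat).
# Illustrative assumption only.
AUSTRALIA_BBOX = (112.0, -44.0, 154.0, -10.0)


def locations_param(bbox):
    """Format a box for the statuses/filter `locations` parameter.

    The endpoint expects longitude before latitude, south-west corner first.
    """
    return ",".join(str(value) for value in bbox)


def store_geotagged(stream, collection):
    """Insert every tweet that carries exact coordinates; return the count."""
    stored = 0
    for tweet in stream:
        # "coordinates" is null when a tweet has no exact point attached.
        if tweet.get("coordinates"):
            collection.insert_one(tweet)
            stored += 1
    return stored
```

Filtering on `coordinates` keeps only tweets with an exact point, which matches the "exact coordinates" figure quoted earlier in the deck.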
Slide 7
Slide 7 text
Collecting Tweets (cont)
• Augment results for better queries
  – Twitter provides date strings like "Wed Jun 13 23:17:58 +0000 2012"
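Date strings in this format do not sort or range-query chronologically, so one plausible augmentation is to store a parsed datetime next to the original string. A minimal sketch using only the standard library follows; the field name `created_at_dt` is a hypothetical choice.

```python
from datetime import datetime, timezone

# Matches strings such as "Wed Jun 13 23:17:58 +0000 2012".
TWITTER_DATE_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def parse_created_at(value):
    """Parse Twitter's created_at string into a UTC-aware datetime."""
    return datetime.strptime(value, TWITTER_DATE_FORMAT).astimezone(timezone.utc)


def augment(tweet):
    """Return a copy of the tweet with a sortable datetime field added."""
    enriched = dict(tweet)
    enriched["created_at_dt"] = parse_created_at(tweet["created_at"])
    return enriched
```

Calling `augment(tweet)` before inserting leaves the original `created_at` string untouched, so nothing from Twitter's payload is lost.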
Slide 8
Slide 8 text
Querying Tweets
• Get all of the latest tweets
• Get all the tweets from a user
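The queries shown on this slide are missing from the transcript. As a hedged sketch, the two lookups can be expressed as keyword arguments for pymongo's `Collection.find`. The `created_at_dt` sort field is an assumed, parsed-datetime field; `user.screen_name` is the screen name nested in Twitter's tweet payload.

```python
def latest_tweets_args(limit=100):
    """Arguments for find(): newest tweets first, capped at `limit`."""
    return {"filter": {}, "sort": [("created_at_dt", -1)], "limit": limit}


def user_tweets_args(screen_name, limit=100):
    """Arguments for find(): one user's tweets, newest first."""
    return {
        "filter": {"user.screen_name": screen_name},
        "sort": [("created_at_dt", -1)],
        "limit": limit,
    }
```

Given a pymongo collection named `tweets`, usage would be along the lines of `tweets.find(**latest_tweets_args(20))`.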
Slide 9
Slide 9 text
Querying Tweets (cont)
• Get tweets near a point
• Get tweets within a bounding box
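The geospatial queries are also missing from the transcript. The sketch below builds the two filter documents, assuming a `2d` index on the tweet's `coordinates.coordinates` array. It uses `$geoWithin`, the current operator name; a 2012 deployment would most likely have used the older `$within`.

```python
# Assumed location of the [longitude, latitude] pair inside a tweet document.
GEO_FIELD = "coordinates.coordinates"


def near_point_filter(lon, lat):
    """Filter for tweets ordered by distance from a point (nearest first)."""
    return {GEO_FIELD: {"$near": [lon, lat]}}


def bounding_box_filter(sw_lon, sw_lat, ne_lon, ne_lat):
    """Filter for tweets inside a box given by its SW and NE corners."""
    box = [[sw_lon, sw_lat], [ne_lon, ne_lat]]
    return {GEO_FIELD: {"$geoWithin": {"$box": box}}}
```

Either result can be passed straight to `find`, for example `tweets.find(near_point_filter(151.2, -33.9)).limit(50)`. Note that every coordinate pair is longitude first.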
Slide 10
Slide 10 text
Accessing the Data
• Using Bottle to create a RESTful API
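The API code from this slide is not in the transcript. A minimal sketch of what a Bottle wrapper around the collection might look like is given below; the route paths and the 100-document limit are hypothetical. Bottle is imported inside the factory so the serialization helper stays usable without it.

```python
import json


def to_json(documents):
    """Serialize MongoDB documents; ObjectId and datetime fall back to str()."""
    return json.dumps(list(documents), default=str)


def make_app(tweets):
    """Build a Bottle app around a pymongo collection (requires bottle)."""
    from bottle import Bottle, response  # third-party, imported lazily

    app = Bottle()

    @app.get("/tweets/latest")
    def latest():
        response.content_type = "application/json"
        # $natural order is insertion order, so -1 returns the newest first.
        return to_json(tweets.find().sort("$natural", -1).limit(100))

    @app.get("/tweets/user/<screen_name>")
    def by_user(screen_name):
        response.content_type = "application/json"
        return to_json(tweets.find({"user.screen_name": screen_name}))

    return app
```

The app could then be served locally with `make_app(collection).run(host="localhost", port=8080)`.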
Slide 11
Slide 11 text
Finding Most Active Tweeter
• Calculate tweet count for each user and return tweets for that user
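The transcript does not show how the count was computed, and it may well have been done inside MongoDB. As an in-memory stand-in for the same logic, the sketch below counts tweets per `user.screen_name` with `collections.Counter` and returns the top user's tweets.

```python
from collections import Counter


def most_active_tweeter(tweets):
    """Return (screen_name, that user's tweets) for the most frequent poster."""
    tweets = list(tweets)  # allow cursors and generators to be reused below
    counts = Counter(t["user"]["screen_name"] for t in tweets)
    if not counts:
        return None, []
    top_user, _ = counts.most_common(1)[0]
    return top_user, [t for t in tweets if t["user"]["screen_name"] == top_user]
```

This reads every document into memory, so it only suits modest collections; a server-side aggregation would be the natural choice at larger volumes.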
Slide 12
Slide 12 text
Lessons Learned
• Use Longitude, Latitude ordering for coordinates
• Default index value range is exclusive of upper bound
• Twitter has bugs too
• Making your own maps isn’t hard (it can take some time)
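The first two lessons can be turned into a small guard that runs before inserting. The sketch below assumes the default `2d` index range of -180 to 180 with the upper bound excluded, as the slide states, and that points are stored as `[longitude, latitude]`. The function name is hypothetical.

```python
def is_indexable_point(point, lower=-180.0, upper=180.0):
    """Check a [longitude, latitude] pair against the assumed index range."""
    lon, lat = point
    # Longitude must fall in [lower, upper): the upper bound is excluded.
    # The latitude check also catches pairs stored in the wrong order.
    return lower <= lon < upper and -90.0 <= lat <= 90.0
```

For example, `is_indexable_point([151.2, -33.9])` passes, while a longitude of exactly 180.0 or a swapped pair such as `[-33.9, 151.2]` is rejected.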
Slide 13
Slide 13 text
Building an Interface
• Dust JavaScript templating library
• Leaflet JavaScript interactive map library
• jQuery JavaScript library
• TileStream map tile server