Machine learning: boldly going where Twitter's APIs don't
Slide 2
Twitter
● Lots of data!
● Limited resources: 180 calls per 15 minutes on /statuses/show/:id
● We can get more resources by using our users' API allowance, but that still isn't enough
● The API doesn't give us everything we want :( (favourites)
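To put the rate limit in perspective, here is a rough back-of-the-envelope calculation. The 180 calls per 15 minutes figure comes from the slide; the count of 50 user tokens is a made-up value for illustration only.

```python
# Rough polling budget under the rate limit quoted on the slide.
CALLS_PER_WINDOW = 180   # from the slide: 180 calls per 15 minutes
WINDOW_MINUTES = 15


def polls_per_day(tokens: int = 1) -> int:
    """Single-tweet lookups available per day with `tokens` access tokens."""
    windows_per_day = (24 * 60) // WINDOW_MINUTES  # 96 windows in a day
    return CALLS_PER_WINDOW * windows_per_day * tokens


print(polls_per_day())    # 17280 with a single token
print(polls_per_day(50))  # 864000 with 50 (hypothetical) user tokens
```

Even with borrowed allowance, each poll covers one tweet at one moment, so the budget runs out quickly against the full volume of tweets.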
Slide 3
Favourites
● The Streaming API doesn't give favourite counts :(
● Polling individual tweets does
● We just want the totals per tweet
● So we need to pick the most active tweets
● Most tweets have no favourites on them
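Picking the most active tweets under a fixed polling budget can be sketched as a top-k selection. This is an illustration only: the `scores` dictionary and the function name are hypothetical, and the slides do not say how activity was scored at this point.

```python
import heapq
from typing import Dict, List


def pick_tweets_to_poll(scores: Dict[str, float], budget: int) -> List[str]:
    """Return up to `budget` tweet IDs with the highest activity score."""
    return heapq.nlargest(budget, scores, key=scores.get)


# Hypothetical scores: higher means "more likely to collect favourites".
scores = {"t1": 0.1, "t2": 0.9, "t3": 0.0, "t4": 0.4}
chosen = pick_tweets_to_poll(scores, budget=2)
print(chosen)  # ['t2', 't4']
```

Since most tweets have no favourites, spending the budget on the highest-scoring tweets wastes fewer calls than polling everything equally.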
Slide 4
Metric gathering
● We need metrics!
● Sometimes you aren’t sure what the metrics are (limited resources)
● Random sampling for a baseline
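A random-sampling baseline might look like the sketch below. It assumes the favourite counts for the sampled tweets have already been polled into a dictionary; the function and parameter names are made up for illustration.

```python
import random
from typing import Dict, List


def sample_baseline(tweet_ids: List[str],
                    favourite_counts: Dict[str, int],
                    sample_size: int,
                    seed: int = 0) -> float:
    """Fraction of a random sample of tweets that has at least one favourite."""
    rng = random.Random(seed)  # seeded so the sample is reproducible
    sample = rng.sample(tweet_ids, min(sample_size, len(tweet_ids)))
    if not sample:
        return 0.0
    hits = sum(1 for t in sample if favourite_counts.get(t, 0) > 0)
    return hits / len(sample)
```

The resulting hit rate is the number any smarter tweet picker has to beat: if a picker finds favourited tweets no more often than random sampling does, it is not adding anything.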
Slide 5
1st Attempt
● Hand-crafted rules
● Requires good domain knowledge
● Difficult to get all the rules in place (required quite a few tweaks based on looking at results)
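A hand-crafted scorer could look something like the sketch below. The slides do not list the actual rules, so every feature, threshold and weight here is a hypothetical stand-in chosen only to show the shape of the approach.

```python
from dataclasses import dataclass


@dataclass
class Tweet:
    followers: int   # author's follower count (hypothetical feature)
    retweets: int    # retweets seen so far (hypothetical feature)
    has_link: bool
    is_reply: bool


def rule_score(t: Tweet) -> float:
    """Score a tweet with hand-picked rules; all weights are illustrative guesses."""
    score = 0.0
    if t.followers > 10_000:
        score += 2.0                      # big accounts tend to get more attention
    if t.retweets > 0:
        score += 1.0 + min(t.retweets, 10) * 0.1
    if t.has_link:
        score += 0.5
    if t.is_reply:
        score -= 1.0                      # replies are mostly seen by one person
    return score


popular = Tweet(followers=50_000, retweets=3, has_link=True, is_reply=False)
quiet_reply = Tweet(followers=10, retweets=0, has_link=False, is_reply=True)
print(rule_score(popular) > rule_score(quiet_reply))  # True
```

Each constant in a scorer like this is a guess that needs tuning against results, which is the tweaking cost the slide describes.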
Slide 7
Machine Learning
● Works well in a big-data domain with random elements (Twitter...)
● Can learn trends you would not think of, given an appropriate data set
● Neural Network + Backpropagation
● After picking the data set and the right algorithm, it does all the work for you!
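For readers unfamiliar with the technique, below is a minimal sketch of a neural network trained by backpropagation: one hidden layer, sigmoid activations, squared-error loss, plain stochastic gradient descent. It is not the implementation from the talk; the class name, the layer sizes, the learning rate and the toy data set are all assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyNet:
    """One hidden layer, sigmoid activations, trained by backpropagation."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # One row of weights per hidden unit; the last entry of each row is the bias.
        self.w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]

    def forward(self, x):
        xb = list(x) + [1.0]  # append constant input for the bias
        hidden = [sigmoid(sum(w * v for w, v in zip(row, xb)))
                  for row in self.w_hidden]
        hb = hidden + [1.0]
        out = sigmoid(sum(w * v for w, v in zip(self.w_out, hb)))
        return hidden, out

    def train_step(self, x, target, lr=0.5):
        hidden, out = self.forward(x)
        xb = list(x) + [1.0]
        hb = hidden + [1.0]
        # Output delta for loss = 0.5 * (out - target)^2 with a sigmoid output.
        delta_out = (out - target) * out * (1.0 - out)
        # Hidden deltas: push the output delta back through the old output weights.
        delta_hidden = [delta_out * self.w_out[j] * hidden[j] * (1.0 - hidden[j])
                        for j in range(len(hidden))]
        # Gradient descent update on both layers.
        self.w_out = [w - lr * delta_out * v for w, v in zip(self.w_out, hb)]
        self.w_hidden = [[w - lr * d * v for w, v in zip(row, xb)]
                         for row, d in zip(self.w_hidden, delta_hidden)]
        return 0.5 * (out - target) ** 2


def mean_loss(net, data):
    return sum(0.5 * (net.forward(x)[1] - y) ** 2 for x, y in data) / len(data)


# Made-up toy data: [has_retweets, big_account] -> got favourites?
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
net = TinyNet(n_in=2, n_hidden=3)
before = mean_loss(net, data)
for _ in range(2000):
    for x, y in data:
        net.train_step(x, y)
after = mean_loss(net, data)
print(before, after)  # the loss after training should be lower than before
```

The network is never told which feature matters; the weights that encode this are found by the training loop, which is the sense in which the algorithm does the work once the data set is chosen.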
Slide 8
Results
Slide 9
Get some.
● https://github.com/mbst/neural-network
● https://engage.metabroadcast.com