Slide 1

Slide 1 text

Machine learning: boldly going where Twitter's APIs don't

Slide 2

Slide 2 text

Twitter
● Lots of data!
● Limited resources: 180 calls per 15 minutes on /statuses/show/:id
● We can get more by using our users' API allowances, but that still isn't enough
● The API doesn't give us everything we want :( (favourites)
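To put the limit in numbers, here is a rough sketch of the daily polling budget. It assumes the 180-calls-per-15-minutes figure applies per access token and that each call fetches one tweet; the function name and the `tokens` parameter are made up for illustration.

```python
WINDOW_SECONDS = 15 * 60   # rate-limit window quoted on the slide
CALLS_PER_WINDOW = 180     # allowance per window (assumed to be per token)


def polls_per_day(tokens: int = 1) -> int:
    """Upper bound on single-tweet lookups per day across `tokens` access tokens."""
    windows_per_day = (24 * 60 * 60) // WINDOW_SECONDS
    return tokens * CALLS_PER_WINDOW * windows_per_day


if __name__ == "__main__":
    print(polls_per_day())      # one token
    print(polls_per_day(10))    # ten users' tokens (hypothetical count)
```

Even with a handful of user tokens, the budget is tiny next to the volume of tweets arriving, which is why choosing what to poll matters.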

Slide 3

Slide 3 text

Favourites
● The streaming API doesn't give a favourite count :(
● Polling individual tweets does
● We just want the totals per tweet
● So we need to pick the most active tweets to poll
● Most tweets have no favourites on them
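Picking the most active tweets under a fixed call budget can be sketched as a top-k selection. This assumes each tweet already has some activity score attached (how that score is produced is a separate question); the `(tweet_id, score)` pair shape and the function name are illustrative, not taken from the talk.

```python
import heapq


def pick_tweets_to_poll(scored_tweets, budget):
    """Return the ids of the `budget` highest-scoring tweets, best first.

    `scored_tweets` is an iterable of (tweet_id, score) pairs.
    """
    top = heapq.nlargest(budget, scored_tweets, key=lambda pair: pair[1])
    return [tweet_id for tweet_id, _score in top]
```

Since most tweets never get a favourite, spending the budget on the top of this ranking wastes far fewer calls than polling tweets in arrival order.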

Slide 4

Slide 4 text

Metric gathering
● We need metrics!
● Sometimes you aren’t sure what the metrics are (limited resources)
● Random sampling for a baseline
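One way to read the random-sampling baseline: poll a uniformly random sample of tweets and measure what fraction had any favourites, so that any smarter selector has a number to beat. In the sketch below, `favourite_counts` stands in for the polled results, and the hit-rate metric itself is an assumption rather than something stated on the slide.

```python
import random


def baseline_hit_rate(favourite_counts, sample_size, seed=0):
    """Fraction of a uniform random sample that has at least one favourite."""
    rng = random.Random(seed)  # fixed seed so the baseline is reproducible
    sample = rng.sample(favourite_counts, sample_size)
    return sum(1 for count in sample if count > 0) / sample_size
```

A selector is only worth keeping if its hit rate on the tweets it picks is clearly above this figure.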

Slide 5

Slide 5 text

1st Attempt
● Hand-crafted
● Requires good domain knowledge
● Difficult to get all the rules in place (it took quite a few tweaks based on looking at the results)
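For a flavour of what a hand-crafted scorer looks like, here is a minimal sketch. The slides don't list the actual rules, so the features (`followers`, `has_media`, `is_reply`), thresholds, and weights below are all invented for illustration.

```python
def hand_crafted_score(tweet):
    """Score a tweet with fixed, hand-picked rules (all hypothetical)."""
    score = 0.0
    if tweet["followers"] > 10_000:   # big accounts tend to get more attention
        score += 2.0
    if tweet["has_media"]:            # assumed: media attracts favourites
        score += 1.0
    if tweet["is_reply"]:             # assumed: replies are seen by fewer people
        score -= 1.0
    return score
```

Every threshold and weight here is a guess that has to be re-tuned by eye, which is the maintenance problem the slide describes.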

Slide 6

Slide 6 text

No content

Slide 7

Slide 7 text

Machine Learning
● Works well in a big data domain with random elements (Twitter...)
● Can learn trends you would not think of, given an appropriate data set
● Neural network + backpropagation
● After picking the data set and the right algorithm, it does all the work for you!
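As a rough illustration of "neural network + backpropagation", here is a tiny one-hidden-layer network in plain Python. It is a generic textbook sketch, not the code from the linked repository; the class name, layer sizes, learning rate, and toy data are all assumptions.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyNetwork:
    """One hidden layer, sigmoid activations, trained by SGD backpropagation."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # One weight row per hidden unit; the last entry of each row is the bias.
        self.w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]

    def forward(self, x):
        inputs = list(x) + [1.0]  # append constant bias input
        hidden = [sigmoid(sum(w * v for w, v in zip(row, inputs)))
                  for row in self.w_hidden]
        out = sigmoid(sum(w * v for w, v in zip(self.w_out, hidden + [1.0])))
        return hidden, out

    def train_step(self, x, target, lr=0.5):
        hidden, out = self.forward(x)
        # Output delta: gradient of squared error through the output sigmoid.
        delta_out = (out - target) * out * (1.0 - out)
        # Hidden deltas: push the output delta back through the output weights.
        delta_hidden = [delta_out * self.w_out[j] * h * (1.0 - h)
                        for j, h in enumerate(hidden)]
        for j, v in enumerate(hidden + [1.0]):
            self.w_out[j] -= lr * delta_out * v
        inputs = list(x) + [1.0]
        for j, row in enumerate(self.w_hidden):
            for i, v in enumerate(inputs):
                row[i] -= lr * delta_hidden[j] * v

    def loss(self, data):
        """Mean squared error over (inputs, target) pairs."""
        return sum((self.forward(x)[1] - t) ** 2 for x, t in data) / len(data)


def train(net, data, epochs=2000):
    for _ in range(epochs):
        for x, target in data:
            net.train_step(x, target)
```

In the setting of the talk, the inputs would be numeric features of a tweet and the target whether it went on to collect favourites; the network's output then serves as the score used to decide which tweets to poll.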

Slide 8

Slide 8 text

Results

Slide 9

Slide 9 text

Get some.
● https://github.com/mbst/neural-network
● https://engage.metabroadcast.com