Sentimental Headlines

Luiz
October 13, 2013

These are the slides for my Devbootcamp final project. Repo is here: https://github.com/kelmerp/headline_sentiment_rating

Transcript

  1. THE BLACKLIST
    Diagram labels: Previous day, Current day, Blacklist, Blacklist Match?
    Blacklist: [“home”, “log in”, “Sports section”, “t.co/blah/junk”, ...]
    Headline list (shown twice): [“Headline”, “Headline”, “Weather”, “Headline”, “Headline”, ... ]
    Highlighted entry: “Weather”
    Sunday, October 13, 13
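The blacklist step on this slide can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: the method name `real_headlines`, the case-insensitive matching, and the inclusion of "weather" in the list are assumptions made for the example (the slide's own blacklist is elided with "...").

```ruby
# Hypothetical blacklist of scraped link texts that are not real headlines.
# "weather" is included here only because the slide highlights it as a match.
BLACKLIST = ["home", "log in", "sports section", "weather"].freeze

# Returns only the link texts that do not appear in the blacklist.
# Matching ignores case and surrounding whitespace (an assumption).
def real_headlines(link_texts, blacklist = BLACKLIST)
  link_texts.reject { |text| blacklist.include?(text.strip.downcase) }
end

scraped = ["Headline A", "Headline B", "Weather", "Headline C"]
puts real_headlines(scraped).inspect
```

Normalizing each entry before the lookup keeps variants such as "Log In" or " weather " from slipping through as headlines.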
  2. One day’s headlines (CNN)
    Content: 1. Headline 2. Headline 3. Headline ... 20. Headline
    Alchemy Score (-1..1): -.23 0.12 -.45 0.0
    Average: -.142
    PLOTTING: axis from -1 to 1
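The daily average on this slide is presumably a plain arithmetic mean of the per-headline scores. A minimal sketch, assuming that is the case; the method name `average_score` is hypothetical, and only the four scores visible on the slide are used, so the result (about -0.14) differs slightly from the slide's -.142, which covers the whole day's headlines.

```ruby
# Arithmetic mean of sentiment scores, each expected to lie in -1..1.
# Returns nil for an empty list so a day without headlines is not plotted as 0.
def average_score(scores)
  return nil if scores.empty?
  scores.sum / scores.size.to_f
end

# The four scores visible on the slide (the full day has 20 headlines).
visible_scores = [-0.23, 0.12, -0.45, 0.0]
puts average_score(visible_scores).round(3)
```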
  3. One day’s headlines / Next day’s headlines
    One day’s headlines (CNN):
      Content: 1. Headline 2. Headline 3. Headline ... 20. Headline
      Alchemy Score (-1..1): -.23 0.12 -.45 0.0
      Average: -.142
    Next day’s headlines (CNN):
      Content: 1. Headline 2. Headline 3. Headline ... 20. Headline
      Alchemy Score (-1..1): -.1 0.32 -.05 0.0
      Average: 0.12
    PLOTTING: axis from -1 to 1
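Repeating the averaging for each day yields one point per day, which is what gets plotted. The sketch below shows one way this could look; the method name `daily_points`, the hash layout, and the dates are all hypothetical, and the score lists contain only the values visible on the slides rather than full days of headlines.

```ruby
require "date"

# Turns { Date => [scores] } into a date-sorted list of plot points,
# one per day, skipping days that have no scored headlines.
def daily_points(scores_by_day)
  scores_by_day
    .reject { |_day, scores| scores.empty? }
    .map do |day, scores|
      { date: day.iso8601, average: (scores.sum / scores.size.to_f).round(3) }
    end
    .sort_by { |point| point[:date] }
end

# Placeholder dates; the scores are the ones visible on the slides.
scores_by_day = {
  Date.new(2013, 1, 2) => [-0.1, 0.32, -0.05, 0.0],
  Date.new(2013, 1, 1) => [-0.23, 0.12, -0.45, 0.0]
}

daily_points(scores_by_day).each { |point| puts point.inspect }
```

A list of `{ date, average }` hashes like this serializes directly to JSON, which is a convenient shape for a D3 scatter plot.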
  4. PLOTTING
    [Plot: score axis from -1 to 1; month axis J F M A M J J A S]
  5. THE PROCESS
    3 Hour sprints
    Feature branches
    Version commits
    Days 1-3: Set-up database; Built Script; Scraped first 50,000 headlines
    Days 4-6: Fed headlines through Alchemy; Graphed D3 scatterplots; Scraped next 50,000 headlines
    Days 7-9: Accomplished calendar view; Decreased load time to < 3 secs; Polished front-end
  6. UNIQUE CHALLENGES
    Big data = cumbersome
    D.B. optimization matters
    Statistical analysis is HARD
    others...
  7. Intro/Idea (30 seconds)
    The tools (1 min): wayback machine, Alchemy API
    The script (2 min): wayback gem, nokogiri, logic/key points
    The scatter plot (2 min): math/logic behind it, D3, show calendar..? or wait till demo
    Demo (1 min): have key dates picked out
    The Process (1-2 min): 3 hour sprints, feature branches, commits
    Unique challenges (1-2 mins): Big data is cumbersome, DB optimization