
three-hurricanes

FC
December 06, 2018


Transcript

  1. 2017

    HURRICANE   START   END     CATEGORY   COSTS (NOAA)
    Harvey      8/17    9/2     4          $125 billion
    Irma        8/30    9/13    5          $50 billion
    Maria       9/16    10/2    5          $90 billion
  2. DID WE FORGET ABOUT MARIA?

    6 months afterward, the government of P.R. reported 65 - 500 deaths.
    1 year later, a GWU study reported that Maria caused 2,975 deaths.
  3. In 2017, sepsis deaths among P.R. residents were 55% above the average for the same months in 2015 and 2016.

    The large increase could be explained by delayed medical treatment in homes and hospitals caused by Hurricane Maria damage.
  4. INTERSECTION OF MEDIA FATIGUE, RACE, AND CLASS

    Was the general U.S. public tired of hearing about hurricanes?
    To what degree did the U.S. public lack empathy or care for Puerto Rico because they didn’t view the territory as part of the U.S. (and thus viewed it as foreign news, which many studies have found holds public attention only briefly)?
    Trump called P.R. beggars and stealers.
    A territory with already poor infrastructure.
    To what extent did racism, class, and a sense of otherness contribute to the lack of empathy for the situation?
  5. LESS COVERAGE, LESS HELP?

    How did the three hurricanes differ in volume of NYT press coverage?
    Was coverage proportional to the catastrophic impacts?
    What was the sentiment during and 1 month following the hurricane? 1 year later?
  6. INITIAL QUESTIONS

    How did sentiment change over time?
    Was there more press coverage as the situation got worse in P.R.?
    Was there more press coverage after more people died in P.R. over time?
    How did sentiment change over the course of a year after the hurricane ended?
  7. GETTING DATA

    Query the NYT API for what it considers news
    Time windows: during the start of the hurricane, one month after, 1 year later
    Some articles mention more than one hurricane (concurrent overlap)
    Results can come from different sections: sports, podcast, travel
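The querying step above can be sketched as follows. This is a minimal sketch, not the deck's actual code: it assumes the NYT Article Search endpoint with its `q`, `begin_date`, `end_date`, `page` and `api-key` parameters, and the 30-day window length, the offsets, the search phrase and the helper names `query_windows` and `build_url` are all assumptions made for illustration. The start dates come from the first slide. The sketch only builds request URLs; fetching them (for example with requests) is left out.

```python
from datetime import date, timedelta
from urllib.parse import urlencode

# NYT Article Search endpoint. "YOUR_KEY" below is a placeholder, not a real key.
BASE_URL = "https://api.nytimes.com/svc/search/v2/articlesearch.json"

# Start dates from the first slide (month/day in 2017).
HURRICANE_STARTS = {
    "Harvey": date(2017, 8, 17),
    "Irma": date(2017, 8, 30),
    "Maria": date(2017, 9, 16),
}


def query_windows(start, window_days=30):
    """Return (label, begin, end) for the three periods named on the slide.

    The offsets and the window length are assumptions; the slide only says
    "start of hurricane, one month after, 1 year later".
    """
    offsets = {"start": 0, "one_month_after": 30, "one_year_later": 365}
    windows = []
    for label, offset in offsets.items():
        begin = start + timedelta(days=offset)
        end = begin + timedelta(days=window_days - 1)
        windows.append((label, begin, end))
    return windows


def build_url(name, begin, end, api_key="YOUR_KEY", page=0):
    """Build one Article Search request URL for a hurricane and date window."""
    params = {
        "q": f"Hurricane {name}",
        "begin_date": begin.strftime("%Y%m%d"),
        "end_date": end.strftime("%Y%m%d"),
        "page": page,
        "api-key": api_key,
    }
    return f"{BASE_URL}?{urlencode(params)}"


if __name__ == "__main__":
    for name, start in HURRICANE_STARTS.items():
        for label, begin, end in query_windows(start):
            print(name, label, build_url(name, begin, end))
```

Keeping the windows in one function makes it easy to try different definitions of "one month after" without touching the request code.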
  8. LIMITATIONS

    Would the results be different if I could read many articles from different sources?
    Death toll numbers are vastly disputed
    Query results varied
    Needs more manual deletion (e.g. NYT crossword)
  9. DATA PIPELINE

    NYT API (free, max 999 queries / day)
    Query, ingest, serialize, tokenize
    Useful Python modules: BeautifulSoup4, requests, html
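The ingest, serialize and tokenize steps might look roughly like the sketch below. To stay dependency-free it swaps in the standard library's `html.parser` for BeautifulSoup4, so it is not the deck's own code. The function names, the record layout and the tokenizer regex are assumptions.

```python
import json
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect the text nodes of an HTML document and drop the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def ingest(raw_html):
    """Strip markup from an article body and collapse whitespace."""
    parser = _TextExtractor()
    parser.feed(raw_html)
    parser.close()  # flush any text still buffered by the parser
    return re.sub(r"\s+", " ", " ".join(parser.chunks)).strip()


def serialize(record):
    """Serialize one article record to a JSON string for storage."""
    return json.dumps(record, ensure_ascii=False, sort_keys=True)


def tokenize(text):
    """Lowercase the text and split it into word tokens (a crude regex tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


if __name__ == "__main__":
    # Hypothetical article snippet, for illustration only.
    text = ingest("<p>Maria hit <b>Puerto Rico</b> &amp; left damage.</p>")
    print(serialize({"hurricane": "Maria", "text": text}))
    print(tokenize(text))
```

Storing the cleaned text before tokenizing means the tokenizer can be changed later without querying the API again, which matters with a daily query limit.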
  10. TRUMPED BY LEXICON CHOICE

    LEXICON                                  SENTIMENT
    NRC (Saif Mohammad, Peter Turney)        SURPRISE
    BING (Bing Liu et al.)                   POSITIVE
    AFINN (Finn Årup Nielsen)                NA
    LOUGHRAN (Tim Loughran, Bill McDonald)   NA
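One reading of the table above is that it records how each lexicon tags the token "trump"; the slide does not say so explicitly, so treat that as an assumption. Under that reading, the sketch below shows how a bag-of-words lookup that lowercases tokens cannot tell the surname from the common word. The toy lexicons hold only the entry from the table, and the helper names are hypothetical.

```python
from collections import Counter

# Toy lexicons holding only the entry from the table above (assumed to be
# the word "trump"). An empty dict stands for NA, i.e. no entry.
LEXICONS = {
    "nrc": {"trump": "surprise"},
    "bing": {"trump": "positive"},
    "afinn": {},
    "loughran": {},
}


def tag_tokens(tokens, lexicon):
    """Bag-of-words lookup: each token is matched on its own, ignoring context."""
    return [(tok, lexicon.get(tok.lower())) for tok in tokens]


def sentiment_counts(tokens, lexicon):
    """Count the sentiment labels found in a token list."""
    return Counter(label for _, label in tag_tokens(tokens, lexicon) if label)


if __name__ == "__main__":
    # Hypothetical headline, for illustration only.
    headline = "Trump visits Puerto Rico".split()
    for name, lexicon in LEXICONS.items():
        print(name, dict(sentiment_counts(headline, lexicon)))
```

In a corpus where the president's name appears in most articles, a single lexicon entry like this can dominate the aggregate sentiment.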
  11. 4. VADER LEXICON

    Since “Trump” played such a large part in the outcome of the sentiment EDA, what if I tried a non-bag-of-words approach?
  12. VADER (Valence Aware Dictionary and sEntiment Reasoner)

    A lexicon and rule-based sentiment analysis tool.
    Attuned to social media text and ideal for tweets, it is based on lexicons of sentiment-related words.
    In this approach, each word in the lexicon is rated as positive or negative and, in many cases, how positive or negative.
  13. WORD       SENTIMENT RATING
      tragedy    -3.4
      rejoiced   2.0
      insane     -1.7
      disaster   -3.1
      great      3.1

    More positive words have higher positive ratings, and more negative words have lower negative ratings.
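To make the difference from a plain bag-of-words lookup concrete, here is a heavily simplified sketch of a rated lexicon plus one rule. The ratings are the ones listed above. The negation scalar of -0.74 is the constant used in the vaderSentiment package as far as I know; the negation word list and the rule of looking only at the immediately preceding word are simplifying assumptions, so this is not VADER's actual algorithm.

```python
# Ratings copied from the slide above.
RATINGS = {
    "tragedy": -3.4,
    "rejoiced": 2.0,
    "insane": -1.7,
    "disaster": -3.1,
    "great": 3.1,
}

# Illustrative subset of negation words (an assumption, not VADER's full list).
NEGATIONS = {"not", "never", "no"}

# Negation flips and dampens a rating; -0.74 is assumed from vaderSentiment.
NEGATION_SCALAR = -0.74


def word_valences(tokens):
    """Return (word, valence) for every rated word, applying a one-word negation rule."""
    valences = []
    for i, tok in enumerate(tokens):
        rating = RATINGS.get(tok.lower())
        if rating is None:
            continue  # word is not in the lexicon
        if i > 0 and tokens[i - 1].lower() in NEGATIONS:
            rating *= NEGATION_SCALAR
        valences.append((tok.lower(), rating))
    return valences


if __name__ == "__main__":
    print(word_valences("a great tragedy".split()))
    print(word_valences("not great".split()))
```

A pure bag-of-words lookup would score "not great" the same as "great"; the rule changes the sign because it looks at the neighbouring word.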
  14. SENTIMENT METRIC   VALUE
      Positive           0.45
      Neutral            0.55
      Negative           0.00
      Compound           0.69

    VADER produces four sentiment metrics from these word ratings. The first three (positive, neutral and negative) represent the proportion of the text that falls into each category. The final metric, the compound score, is the sum of all of the lexicon ratings (1.9 and 1.8 in this case), standardised to range between -1 and 1. In this case, our example sentence has a compound score of 0.69, which is pretty strongly positive.
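The standardisation step can be checked with a few lines of Python. The slide does not give the formula; the sketch assumes the normalisation used in the vaderSentiment package, x / sqrt(x² + alpha) with alpha = 15, which reproduces the 0.69 reported above for ratings of 1.9 and 1.8.

```python
import math


def compound_score(ratings, alpha=15):
    """Sum the word ratings and squash the total into the range [-1, 1].

    alpha=15 is assumed from the vaderSentiment package; the slide does not state it.
    """
    total = sum(ratings)
    score = total / math.sqrt(total * total + alpha)
    # Clamp to guard against floating-point drift at the extremes.
    return max(-1.0, min(1.0, score))


if __name__ == "__main__":
    # The two word ratings quoted on the slide.
    print(round(compound_score([1.9, 1.8]), 2))
```

Because the sum is divided by a quantity that grows with it, long texts with many rated words approach -1 or 1 but never exceed them.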
  15. 5. NAMED ENTITY RECOGNITION

    Who are the entities in articles relating to the three hurricanes?
    Helps us identify and explore the main characters in the stories about the hurricanes.
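Once entities have been extracted, finding the "main characters" comes down to counting. The sketch below assumes the extraction has already been done by an NER tool (spaCy appears in the tools list) and that its output is a list of (text, label) pairs using labels such as PERSON, GPE and ORG. The sample pairs are hypothetical and are not results from the deck.

```python
from collections import Counter


def top_entities(entities, label="PERSON", n=3):
    """Return the n most frequent entity strings carrying the given label."""
    counts = Counter(text for text, ent_label in entities if ent_label == label)
    return counts.most_common(n)


if __name__ == "__main__":
    # Hypothetical extractor output, for illustration only.
    sample = [
        ("Trump", "PERSON"),
        ("Puerto Rico", "GPE"),
        ("Trump", "PERSON"),
        ("FEMA", "ORG"),
        ("Lin-Manuel Miranda", "PERSON"),
    ]
    print(top_entities(sample))
    print(top_entities(sample, label="GPE"))
```

Running the same count per hurricane would show whether different people and organisations dominate each story.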
  16. LESSONS LEARNED

    Think more critically about the limitations of each sentiment lexicon type
    The analysis demonstrated the limitations of bag-of-words methods
  17. ANALYSIS TOOLS

    Web parsing     BeautifulSoup4, requests
    Munging         pandas, dplyr
    NLP             NLTK, TextBlob, VADER, tidytext, spaCy
    Visualization   ggplot2, matplotlib, seaborn
  18. REFERENCES

    • Hutto, C.J. & Gilbert, E.E. (2014). VADER: A Parsimonious Rule-based Model for Sentiment Analysis of Social Media Text. Eighth International Conference on Weblogs and Social Media (ICWSM-14). Ann Arbor, MI, June 2014.
    • Silge, Julia & Robinson, David (2017). Tidy Text Mining with R.
    • Boydstun, Amber E. (2013). Making the News: Politics, the Media, and Agenda Setting. Chicago: University of Chicago Press.
  19. IMAGES

    # help-pr-flag
    http://www.capalino.com/wp-content/uploads/2017/09/Puerto-Rico-CSR.jpg
    # trump-in-pr
    https://img.thedailybeast.com/image/upload/c_crop,d_placeholder_euli9k,h_1440,w_2560,x_0,y_0/dpr_2.0/c_limit,w_740/fl_lossy,q_auto/v1507940585/171013-lund-puerto-rico-lede_pki0e8
    # lmm-pr-flag
    https://d1oc2d5bw2auvq.cloudfront.net/static-assets-prod/lin-manuel-miranda-og-image-E4F8-opt
    https://www.google.com/url?sa=i&rct=j&q=&esrc=s&source=images&cd=&cad=rja&uact=8&ved=2ahUKEwjY4J_6povfAhWRrVkKHdVvCSkQjRx6BAgBEAU&url=https%3A%2F%2Fwww.wsj.com%2Farticles%2Fpuerto-rico-braces-for-more-flooding-as-maria-dumps-more-rain-1505998033&psig=AOvVaw0GPf5T3eIkvyQtMIAIKH-R&ust=1544189127382370
    # last slide
    https://www.google.com/url?