Mo'Mentum

Thursday update of the Mo'Mentum demo slides

Peter Winslow

February 09, 2017

Transcript

  1. Many resources for sharing petitions! Little help for writing petitions... Mo’Mentum: Just what I was looking for!
  2. Many resources for sharing petitions! Little help for writing petitions... Mo’Mentum: Just what I was looking for! My Petition.
  3. Many resources for sharing petitions! Little help for writing petitions... Mo’Mentum: Just what I was looking for! My Petition: probability of success, time scale to reach signature goal.
  4. Data Collection. Change.org sitemap: petition URLs, petition IDs. Change.org API: petition text and metadata. Over ~40,000 petitions.
  5. Feature Engineering. Features: text data (stopwords, lemmatization; sentiment, POS tagging, word/sentence counts, ...). Metadata: success/failure, signature accumulation rate.
  6. Feature Engineering. Features: text data (stopwords, lemmatization; sentiment, POS tagging, word/sentence counts, ...). Targets (metadata): success/failure, signature accumulation rate.
  7. Predicts signature accumulation rate. Gradient Boosting Regressor: least-squares loss function. Train-test-evaluation split with 5-fold CV. RMSE = 1.6; 79.7% within 1; 93.1% within 2.
  8. Peter Winslow. The Professional: PhD + 1 postdoc in theoretical high-energy physics and cosmology (origin of matter in the Universe). New father! Kiana Winslow, born Nov. 29th 2016.
  9. Backup Slides. Algorithms: Classification. Random Forest Classifier (scikit-learn): predict success/failure of petition. Reasons for choosing: • Lots of complexity yet resistant to overfitting. Challenges: • Class imbalance in the data. Validation: train-test-evaluation split with 5-fold CV.
  10. Backup Slides. Algorithms: Regression. GradientBoostingRegressor (scikit-learn): predict signature accumulation rate. Reasons for choosing: • Many features, highly non-linear, can return predicted “quantiles”. Challenges: • The right evaluation metric? Validation: train-test-evaluation split with 5-fold CV.
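
The regression slides report RMSE and the share of predictions "within 1" and "within 2", and ask which evaluation metric is the right one. The deck does not define these terms, so the following is only a minimal sketch under assumptions: "within k" is taken to mean an absolute error of at most k in the target's own units, and the pinball (quantile) loss is shown as one common way to score predicted quantiles. The function names and the toy numbers are hypothetical and are not taken from the slides.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def fraction_within(y_true, y_pred, tolerance):
    """Share of predictions whose absolute error is <= tolerance.

    Assumption: this is what "within 1" / "within 2" means on the slide.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(t - p) <= tolerance)
    return hits / len(y_true)


def pinball_loss(y_true, y_pred, quantile):
    """Mean pinball loss for predictions of the given quantile (0 < q < 1).

    Under-predictions are weighted by q, over-predictions by (1 - q).
    """
    if not 0.0 < quantile < 1.0:
        raise ValueError("quantile must lie strictly between 0 and 1")
    total = 0.0
    for t, p in zip(y_true, y_pred):
        diff = t - p
        total += quantile * diff if diff >= 0 else (quantile - 1.0) * diff
    return total / len(y_true)


# Toy values for illustration only (not data from the deck).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 5.0, 1.0]

print("RMSE:", rmse(y_true, y_pred))
print("within 1:", fraction_within(y_true, y_pred, 1.0))
print("within 2:", fraction_within(y_true, y_pred, 2.0))
print("pinball (q=0.5):", pinball_loss(y_true, y_pred, 0.5))
```

RMSE penalizes a few large misses heavily, while the "within k" fractions describe how often a prediction is close enough to be useful, so reporting both gives a fuller picture than either alone.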