favorite topics I’m stuck. How do I better engage my readers while still writing in my favorite genre? •world’s largest community of readers and writers •readers can vote on stories
messy) •1200 documents across 23 genres •each document is a vector of topic weights •correlating these topic vectors and ordering by distance should recapitulate genres (and show super-‐ and sub-‐ genre structures)
1.2k stories, with tags and summaries, across 23 genres • back end: R, stm, Python, pandas to do heavy lifting; SQLalchemy for interfacing with SQL (postgres). • front end: Flask, Bootstrap, deployed on AWS Back to Front
2015) • combined LDA (latent Dirichlet allocation) with regression model, where covariates were genre and votes • allowed topic rank assignment for a given story (document) to vary by the degree to which a topic predicts “likes,” by genre- • uses Latent Dirichlet Allocation; KL divergence to choose number of topics • output document-topic and term-topic matrices to pandas and postgresql db