Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
pycon_delhi_lightening
Search
Devashish Deshpande
September 24, 2016
Technology
0
1.5k
pycon_delhi_lightening
Lightening talk delivered at PyCon India 2016
Devashish Deshpande
September 24, 2016
Tweet
Share
Other Decks in Technology
See All in Technology
リンクアンドモチベーション ソフトウェアエンジニア向け紹介資料 / Introduction to Link and Motivation for Software Engineers
lmi
4
300k
複雑なState管理からの脱却
sansantech
PRO
1
140
個人でもIAM Identity Centerを使おう!(アクセス管理編)
ryder472
3
200
TypeScript、上達の瞬間
sadnessojisan
46
13k
Incident Response Practices: Waroom's Features and Future Challenges
rrreeeyyy
0
160
障害対応指揮の意思決定と情報共有における価値観 / Waroom Meetup #2
arthur1
5
470
Exadata Database Service on Dedicated Infrastructure(ExaDB-D) UI スクリーン・キャプチャ集
oracle4engineer
PRO
2
3.2k
【Pycon mini 東海 2024】Google Colaboratoryで試すVLM
kazuhitotakahashi
2
500
AWS Media Services 最新サービスアップデート 2024
eijikominami
0
200
【Startup CTO of the Year 2024 / Audience Award】アセンド取締役CTO 丹羽健
niwatakeru
0
980
OCI Network Firewall 概要
oracle4engineer
PRO
0
4.1k
Terraform CI/CD パイプラインにおける AWS CodeCommit の代替手段
hiyanger
1
240
Featured
See All Featured
The Web Performance Landscape in 2024 [PerfNow 2024]
tammyeverts
0
89
Building Flexible Design Systems
yeseniaperezcruz
327
38k
Large-scale JavaScript Application Architecture
addyosmani
510
110k
Optimising Largest Contentful Paint
csswizardry
33
2.9k
Music & Morning Musume
bryan
46
6.2k
For a Future-Friendly Web
brad_frost
175
9.4k
GraphQLの誤解/rethinking-graphql
sonatard
67
10k
Put a Button on it: Removing Barriers to Going Fast.
kastner
59
3.5k
Making the Leap to Tech Lead
cromwellryan
133
8.9k
KATA
mclloyd
29
14k
Designing on Purpose - Digital PM Summit 2013
jponch
115
7k
Imperfection Machines: The Place of Print at Facebook
scottboms
265
13k
Transcript
News classification with Gensim Devashish Deshpande Undergraduate student RaRe Technologies
Incubator Program Github: dsquareindia Blogs: https://rare-technologies.com/blog/
Gensim: Topic modeling in python
Problem of News (mis)classification
Screenshots from play newsstand
Topic-word coloring with LDA Image taken from LDA paper by
David Blei
What is a good LDA model? • Come up with
good topics • Infer topic distribution (United topic): mourinho, red_devils, old_trafford, bad_team... (Arsenal topic): wenger, henry, invincibles,.... (City topic): aguero, etihad, england, premier_league (Chelsea topic): blues, football, roman, bridge,... Football LDA model
Evaluating topic models • Manually – Look at the topics.
See if they are interpretable. – Comparing different topic models Qualititative
None
Topic Coherence • Quantitave
Topic Coherence • Assign a number to the human interpretability!
Comparing topic models becomes much easier
Topic Coherence • Better LDA -> Better topics -> Better
classification Topics from topic modeling tutorial on Lee corpus
Join the community! • Pick up issues from: https://github.com/RaRe-Technologies/gensim •
Come for the sprint!