pycon_delhi_lightening
Devashish Deshpande
September 24, 2016
Lightning talk delivered at PyCon India 2016
Transcript
News classification with Gensim
Devashish Deshpande
Undergraduate student, RaRe Technologies Incubator Program
GitHub: dsquareindia
Blogs: https://rare-technologies.com/blog/

Gensim: Topic modeling in Python

Problem of news (mis)classification
Screenshots from Play Newsstand

Topic-word coloring with LDA
Image taken from the LDA paper by David Blei

What is a good LDA model?
• Come up with good topics
• Infer topic distribution

Football LDA model
(United topic): mourinho, red_devils, old_trafford, bad_team...
(Arsenal topic): wenger, henry, invincibles,....
(City topic): aguero, etihad, england, premier_league
(Chelsea topic): blues, football, roman, bridge,...

Evaluating topic models
• Manually (qualitative)
– Look at the topics. See if they are interpretable.
– Compare different topic models

Topic Coherence
• Quantitative

Topic Coherence
• Assigns a number to human interpretability!
• Comparing topic models becomes much easier

Topic Coherence
• Better LDA -> Better topics -> Better classification
Topics from the topic modeling tutorial on the Lee corpus
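
To make "assign a number to interpretability" concrete, here is a minimal sketch of one common coherence measure, a UMass-style score computed from document co-occurrence counts. It is not Gensim's implementation and not necessarily the measure used in the talk: the function name `umass_coherence`, the add-one smoothing, the averaging over word pairs, and the four toy documents are all assumptions made for illustration. Only the football words come from the slides.

```python
import math


def umass_coherence(topic_words, documents):
    """UMass-style coherence (illustrative sketch): average over ordered word
    pairs of log((D(w_i, w_j) + 1) / D(w_j)), where D counts the documents
    containing all the given words. Higher (closer to zero or above) is better."""
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain every word in `words`.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    scores = []
    for i in range(1, len(topic_words)):
        for j in range(i):
            w_i, w_j = topic_words[i], topic_words[j]
            d_j = doc_freq(w_j)
            if d_j == 0:
                # Skip words that never occur, to avoid dividing by zero.
                continue
            scores.append(math.log((doc_freq(w_i, w_j) + 1) / d_j))
    return sum(scores) / len(scores) if scores else float("nan")


# Hypothetical toy corpus: each document is a list of tokens.
docs = [
    ["wenger", "henry", "invincibles"],
    ["wenger", "henry", "arsenal"],
    ["aguero", "etihad", "premier_league"],
    ["mourinho", "old_trafford", "red_devils"],
]

# A topic whose words co-occur, and one that mixes words from different clubs.
arsenal_topic = ["wenger", "henry", "invincibles"]
mixed_topic = ["wenger", "aguero", "mourinho"]

arsenal_score = umass_coherence(arsenal_topic, docs)
mixed_score = umass_coherence(mixed_topic, docs)
print("Arsenal topic:", arsenal_score)
print("Mixed topic:", mixed_score)
```

On this toy corpus the topic whose words appear together scores higher than the mixed one, which is the property that lets two topic models be compared by a single number instead of by reading their topics.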

Join the community!
• Pick up issues from: https://github.com/RaRe-Technologies/gensim
• Come for the sprint!