$30 off During Our Annual Pro Sale. View Details »
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
pycon_delhi_lightening
Search
Devashish Deshpande
September 24, 2016
Technology
0
1.5k
pycon_delhi_lightening
Lightening talk delivered at PyCon India 2016
Devashish Deshpande
September 24, 2016
Tweet
Share
Other Decks in Technology
See All in Technology
まだ間に合う! Agentic AI on AWSの現在地をやさしく一挙おさらい
minorun365
12
720
IAMユーザーゼロの運用は果たして可能なのか
yama3133
2
490
ExpoのインダストリーブースでみたAWSが見せる製造業の未来
hamadakoji
0
150
AWS re:Invent 2025~初参加の成果と学び~
kubomasataka
0
130
AlmaLinux + KVM + Cockpit で始めるお手軽仮想化基盤 ~ 開発環境などでの利用を想定して ~
koedoyoshida
0
120
AWSを使う上で最低限知っておきたいセキュリティ研修を社内で実施した話 ~みんなでやるセキュリティ~
maimyyym
2
1.8k
ChatGPTで論⽂は読めるのか
spatial_ai_network
11
29k
CARTAのAI CoE が挑む「事業を進化させる AI エンジニアリング」 / carta ai coe evolution business ai engineering
carta_engineering
0
2k
WordPress は終わったのか ~今のWordPress の制作手法ってなにがあんねん?~ / Is WordPress Over? How We Build with WordPress Today
tbshiki
2
830
Oracle Cloud Infrastructure IaaS 新機能アップデート 2025/09 - 2025/11
oracle4engineer
PRO
0
170
Kiro を用いたペアプロのススメ
taikis
1
320
たまに起きる外部サービスの障害に備えたり備えなかったりする話
egmc
0
290
Featured
See All Featured
Self-Hosted WebAssembly Runtime for Runtime-Neutral Checkpoint/Restore in Edge–Cloud Continuum
chikuwait
0
21
Practical Tips for Bootstrapping Information Extraction Pipelines
honnibal
25
1.6k
[RailsConf 2023 Opening Keynote] The Magic of Rails
eileencodes
31
9.8k
DevOps and Value Stream Thinking: Enabling flow, efficiency and business value
helenjbeal
1
63
Future Trends and Review - Lecture 12 - Web Technologies (1019888BNR)
signer
PRO
0
3.1k
Why You Should Never Use an ORM
jnunemaker
PRO
61
9.6k
Being A Developer After 40
akosma
91
590k
Into the Great Unknown - MozCon
thekraken
40
2.2k
HU Berlin: Industrial-Strength Natural Language Processing with spaCy and Prodigy
inesmontani
PRO
0
80
Deep Space Network (abreviated)
tonyrice
0
16
Scaling GitHub
holman
464
140k
We Are The Robots
honzajavorek
0
110
Transcript
News classification with Gensim Devashish Deshpande Undergraduate student RaRe Technologies
Incubator Program Github: dsquareindia Blogs: https://rare-technologies.com/blog/
Gensim: Topic modeling in python
Problem of News (mis)classification
Screenshots from play newsstand
Topic-word coloring with LDA Image taken from LDA paper by
David Blei
What is a good LDA model? • Come up with
good topics • Infer topic distribution (United topic): mourinho, red_devils, old_trafford, bad_team... (Arsenal topic): wenger, henry, invincibles,.... (City topic): aguero, etihad, england, premier_league (Chelsea topic): blues, football, roman, bridge,... Football LDA model
Evaluating topic models • Manually – Look at the topics.
See if they are interpretable. – Comparing different topic models Qualititative
None
Topic Coherence • Quantitave
Topic Coherence • Assign a number to the human interpretability!
Comparing topic models becomes much easier
Topic Coherence • Better LDA -> Better topics -> Better
classification Topics from topic modeling tutorial on Lee corpus
Join the community! • Pick up issues from: https://github.com/RaRe-Technologies/gensim •
Come for the sprint!