pycon_delhi_lightening
Devashish Deshpande
September 24, 2016
Lightning talk delivered at PyCon India 2016
Transcript
News classification with Gensim. Devashish Deshpande, undergraduate student, RaRe Technologies Incubator Program. GitHub: dsquareindia. Blogs: https://rare-technologies.com/blog/
Gensim: Topic modeling in Python
Problem of News (mis)classification
Screenshots from Play Newsstand
Topic-word coloring with LDA (image taken from the LDA paper by David Blei)
What is a good LDA model?
• Come up with good topics
• Infer topic distribution
Football LDA model:
(United topic): mourinho, red_devils, old_trafford, bad_team...
(Arsenal topic): wenger, henry, invincibles,....
(City topic): aguero, etihad, england, premier_league
(Chelsea topic): blues, football, roman, bridge,...
Evaluating topic models
• Manually
– Look at the topics. See if they are interpretable.
– Comparing different topic models
Qualitative
Topic Coherence
• Quantitative
Topic Coherence
• Assign a number to human interpretability!
Comparing topic models becomes much easier
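To make "assign a number" concrete, here is a small pure-Python sketch of one well-known coherence measure, UMass, which scores a topic by how often its top words appear in the same documents. The talk does not say which measure was used, so treat this choice, the function name `umass_coherence`, and the toy documents as assumptions for illustration.

```python
import math


def umass_coherence(top_words, documents):
    """UMass coherence of one topic.

    Sums log((D(w_m, w_l) + 1) / D(w_l)) over every pair where w_l is
    ranked above w_m, with D counting the documents that contain the
    given word(s). Every top word must occur in at least one document.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents containing all of the given words.
        return sum(all(w in doc for w in words) for doc in doc_sets)

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            joint = doc_freq(top_words[m], top_words[l])
            score += math.log((joint + 1) / doc_freq(top_words[l]))
    return score


# Toy, made-up documents (hypothetical data, not from the talk).
documents = [
    ["wenger", "henry", "arsenal"],
    ["wenger", "arsenal", "invincibles"],
    ["aguero", "etihad", "city"],
    ["aguero", "city", "premier_league"],
]

coherent_topic = ["wenger", "arsenal"]  # words that co-occur
mixed_topic = ["wenger", "aguero"]      # words that never co-occur

coherent_score = umass_coherence(coherent_topic, documents)
mixed_score = umass_coherence(mixed_topic, documents)
print(coherent_score, mixed_score)
```

A higher (less negative) score means the top words of the topic tend to show up together, which is the property a human reader would call interpretable.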
Topic Coherence
• Better LDA -> Better topics -> Better classification
(Topics from the topic modeling tutorial on the Lee corpus)
Join the community!
• Pick up issues from: https://github.com/RaRe-Technologies/gensim
• Come for the sprint!