Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
pycon_delhi_lightening
Search
Devashish Deshpande
September 24, 2016
Technology
0
1.5k
pycon_delhi_lightening
Lightening talk delivered at PyCon India 2016
Devashish Deshpande
September 24, 2016
Tweet
Share
Other Decks in Technology
See All in Technology
5分でカオスエンジニアリングを分かった気になろう
pandayumi
0
260
機械学習を扱うプラットフォーム開発と運用事例
lycorptech_jp
PRO
0
670
新規プロダクトでプロトタイプから正式リリースまでNext.jsで開発したリアル
kawanoriku0
1
220
Autonomous Database - Dedicated 技術詳細 / adb-d_technical_detail_jp
oracle4engineer
PRO
4
10k
品質視点から考える組織デザイン/Organizational Design from Quality
mii3king
0
210
エンジニアリングマネージャーの成長の道筋とキャリア / Developers Summit 2025 KANSAI
daiksy
3
1.1k
企業の生成AIガバナンスにおけるエージェントとセキュリティ
lycorptech_jp
PRO
3
200
まずはマネコンでちゃちゃっと作ってから、それをCDKにしてみよか。
yamada_r
2
120
Snowflake Intelligenceにはこうやって立ち向かう!クラシルが考えるAI Readyなデータ基盤と活用のためのDataOps
gappy50
0
280
いま注目のAIエージェントを作ってみよう
supermarimobros
0
360
Evolución del razonamiento matemático de GPT-4.1 a GPT-5 - Data Aventura Summit 2025 & VSCode DevDays
lauchacarro
0
210
LLM時代のパフォーマンスチューニング:MongoDB運用で試したコンテキスト活用の工夫
ishikawa_pro
0
170
Featured
See All Featured
Embracing the Ebb and Flow
colly
87
4.8k
The Success of Rails: Ensuring Growth for the Next 100 Years
eileencodes
46
7.6k
Documentation Writing (for coders)
carmenintech
74
5k
Why Our Code Smells
bkeepers
PRO
339
57k
The Invisible Side of Design
smashingmag
301
51k
A better future with KSS
kneath
239
17k
Design and Strategy: How to Deal with People Who Don’t "Get" Design
morganepeng
131
19k
The Language of Interfaces
destraynor
161
25k
Navigating Team Friction
lara
189
15k
Product Roadmaps are Hard
iamctodd
PRO
54
11k
Site-Speed That Sticks
csswizardry
10
820
Distributed Sagas: A Protocol for Coordinating Microservices
caitiem20
333
22k
Transcript
News classification with Gensim Devashish Deshpande Undergraduate student RaRe Technologies
Incubator Program Github: dsquareindia Blogs: https://rare-technologies.com/blog/
Gensim: Topic modeling in python
Problem of News (mis)classification
Screenshots from play newsstand
Topic-word coloring with LDA Image taken from LDA paper by
David Blei
What is a good LDA model? • Come up with
good topics • Infer topic distribution (United topic): mourinho, red_devils, old_trafford, bad_team... (Arsenal topic): wenger, henry, invincibles,.... (City topic): aguero, etihad, england, premier_league (Chelsea topic): blues, football, roman, bridge,... Football LDA model
Evaluating topic models • Manually – Look at the topics.
See if they are interpretable. – Comparing different topic models Qualititative
None
Topic Coherence • Quantitave
Topic Coherence • Assign a number to the human interpretability!
Comparing topic models becomes much easier
Topic Coherence • Better LDA -> Better topics -> Better
classification Topics from topic modeling tutorial on Lee corpus
Join the community! • Pick up issues from: https://github.com/RaRe-Technologies/gensim •
Come for the sprint!