pycon_delhi_lightening
Devashish Deshpande
September 24, 2016
Lightning talk delivered at PyCon India 2016
Transcript
News classification with Gensim
Devashish Deshpande
Undergraduate student
RaRe Technologies Incubator Program
GitHub: dsquareindia
Blogs: https://rare-technologies.com/blog/
Gensim: Topic modeling in Python
Problem of News (mis)classification
Screenshots from Play Newsstand
Topic-word coloring with LDA
Image taken from the LDA paper by David Blei
What is a good LDA model?
• Come up with good topics
• Infer topic distribution
Football LDA model:
(United topic): mourinho, red_devils, old_trafford, bad_team...
(Arsenal topic): wenger, henry, invincibles,....
(City topic): aguero, etihad, england, premier_league
(Chelsea topic): blues, football, roman, bridge,...
Evaluating topic models: qualitative
• Manually
– Look at the topics. See if they are interpretable.
– Compare different topic models
Topic Coherence
• Quantitative
Topic Coherence
• Assign a number to the human interpretability!
Comparing topic models becomes much easier
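To make "assign a number" concrete, here is a minimal pure-Python sketch of a UMass-style coherence score, which rewards topics whose top words co-occur in the same documents. It is a simplified illustration, not Gensim's implementation (Gensim provides a `CoherenceModel` class for this). The function name `umass_coherence` and the toy documents are hypothetical.

```python
import math
from itertools import combinations


def umass_coherence(topic_words, documents):
    """Simplified UMass-style coherence for one topic.

    Assumes every topic word occurs in at least one document,
    otherwise the denominator below would be zero.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain all of the given words.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    # combinations() preserves list order, so `earlier` is ranked above `later`.
    for earlier, later in combinations(topic_words, 2):
        score += math.log((doc_freq(earlier, later) + 1) / doc_freq(earlier))
    return score


# Toy corpus, made up for illustration.
documents = [
    ["mourinho", "old_trafford", "red_devils"],
    ["mourinho", "red_devils", "football"],
    ["wenger", "henry", "invincibles"],
    ["wenger", "invincibles", "football"],
]

coherent_topic = ["mourinho", "red_devils", "old_trafford"]
mixed_topic = ["mourinho", "wenger", "henry"]

good = umass_coherence(coherent_topic, documents)
bad = umass_coherence(mixed_topic, documents)
print(good, bad)
```

A higher score means the topic's words appear together more often, so two candidate topics (or whole models, by averaging over their topics) can be ranked without reading every topic by hand.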
Topic Coherence
• Better LDA -> Better topics -> Better classification
Topics from topic modeling tutorial on Lee corpus
Join the community!
• Pick up issues from: https://github.com/RaRe-Technologies/gensim
• Come for the sprint!