pycon_delhi_lightening
Devashish Deshpande
September 24, 2016
Lightning talk delivered at PyCon India 2016
Transcript
News classification with Gensim
Devashish Deshpande
Undergraduate student
RaRe Technologies Incubator Program
GitHub: dsquareindia
Blogs: https://rare-technologies.com/blog/
Gensim: Topic modeling in Python
Problem of News (mis)classification
Screenshots from Google Play Newsstand
Topic-word coloring with LDA
Image taken from the LDA paper by David Blei
What is a good LDA model?
• Come up with good topics
• Infer topic distribution
Football LDA model:
(United topic): mourinho, red_devils, old_trafford, bad_team...
(Arsenal topic): wenger, henry, invincibles, ...
(City topic): aguero, etihad, england, premier_league
(Chelsea topic): blues, football, roman, bridge, ...
Evaluating topic models
• Manually
– Look at the topics. See if they are interpretable.
– Compare different topic models
Qualitative
Topic Coherence
• Quantitative
Topic Coherence
• Assign a number to human interpretability!
Comparing topic models becomes much easier
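To make "assign a number" concrete, here is a rough pure-Python sketch of one well-known coherence measure, the UMass-style score, which sums log((D(w_i, w_j) + 1) / D(w_j)) over word pairs in a topic, where D counts documents containing the words. The slides do not say which measure was used, so the choice of UMass, the helper name `umass_coherence`, and the toy documents are assumptions for illustration.

```python
import math

def umass_coherence(topic_words, docs, eps=1.0):
    """UMass-style coherence of one topic over tokenized documents.

    Sums log((D(w_i, w_j) + eps) / D(w_j)) over pairs with j < i,
    where D(...) is the number of documents containing all given words.
    Assumes every topic word occurs in at least one document.
    """
    doc_sets = [set(doc) for doc in docs]

    def doc_freq(*words):
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    for i in range(1, len(topic_words)):
        for j in range(i):
            w_i, w_j = topic_words[i], topic_words[j]
            score += math.log((doc_freq(w_i, w_j) + eps) / doc_freq(w_j))
    return score

# Toy corpus (made up for illustration).
docs = [
    ["mourinho", "red_devils", "old_trafford", "united"],
    ["wenger", "henry", "invincibles", "arsenal"],
    ["mourinho", "old_trafford", "united", "premier_league"],
    ["wenger", "arsenal", "henry", "premier_league"],
]

good_topic = ["mourinho", "united", "old_trafford"]   # words that co-occur
mixed_topic = ["mourinho", "wenger", "henry"]         # words from two clubs

good = umass_coherence(good_topic, docs)
mixed = umass_coherence(mixed_topic, docs)
print(good > mixed)  # the interpretable topic scores higher
```

With a single number per topic, two models can be compared by their scores instead of by reading every topic by hand.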
Topic Coherence
• Better LDA -> Better topics -> Better classification
Topics from topic modeling tutorial on Lee corpus
Join the community!
• Pick up issues from: https://github.com/RaRe-Technologies/gensim
• Come for the sprint!