pycon_delhi_lightening
Devashish Deshpande
September 24, 2016
Lightning talk delivered at PyCon India 2016
Transcript
News classification with Gensim
Devashish Deshpande
Undergraduate student, RaRe Technologies Incubator Program
GitHub: dsquareindia
Blogs: https://rare-technologies.com/blog/
Gensim: Topic modeling in Python
Problem of News (mis)classification
Screenshots from Google Play Newsstand
Topic-word coloring with LDA
Image taken from the LDA paper by David Blei
What is a good LDA model?
• Come up with good topics
• Infer topic distribution
Football LDA model:
(United topic): mourinho, red_devils, old_trafford, bad_team...
(Arsenal topic): wenger, henry, invincibles, ...
(City topic): aguero, etihad, england, premier_league
(Chelsea topic): blues, football, roman, bridge, ...
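To make "infer topic distribution" concrete, here is a minimal sketch that turns a tokenized headline into proportions over the four football topics. It is not LDA inference: it only counts keyword hits and normalizes them, as a crude stand-in for what a trained model would return. The `TOPICS` dictionary reuses the words from the slide, while the function name `topic_proportions` and the sample headline are hypothetical.

```python
from collections import Counter

# Topic keywords copied from the slide. A real LDA model would learn a
# weighted distribution over the whole vocabulary, not a short keyword list.
TOPICS = {
    "United": ["mourinho", "red_devils", "old_trafford", "bad_team"],
    "Arsenal": ["wenger", "henry", "invincibles"],
    "City": ["aguero", "etihad", "england", "premier_league"],
    "Chelsea": ["blues", "football", "roman", "bridge"],
}


def topic_proportions(tokens):
    """Return the share of topic-keyword hits per topic (hypothetical helper)."""
    counts = Counter()
    for name, words in TOPICS.items():
        counts[name] = sum(1 for token in tokens if token in words)
    total = sum(counts.values())
    if total == 0:
        # No keyword matched, so there is no evidence for any topic.
        return {name: 0.0 for name in TOPICS}
    return {name: counts[name] / total for name in TOPICS}


# Hypothetical tokenized headline, for illustration only.
headline = ["wenger", "praises", "henry", "at", "etihad"]
distribution = topic_proportions(headline)
best_topic = max(distribution, key=distribution.get)
```

A classifier built on this idea would label the headline with `best_topic`. With a real LDA model the proportions come from the model's inference step instead of raw keyword counts.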
Evaluating topic models (qualitative)
• Manually
  – Look at the topics. See if they are interpretable.
  – Compare different topic models
Topic Coherence
• Quantitative
Topic Coherence
• Assign a number to human interpretability! Comparing topic models becomes much easier.
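As a rough illustration of how a number can be assigned to a topic, the sketch below computes a simplified UMass-style coherence score from document co-occurrence counts. It is a plain-Python approximation written for this note, not the talk's code and not Gensim's implementation (Gensim provides a `CoherenceModel` class for this purpose). The function name `umass_coherence` and the toy corpus are hypothetical.

```python
import math


def umass_coherence(topic_words, documents):
    """Simplified UMass-style coherence for one topic (hypothetical helper).

    Sums log((D(w_m, w_l) + 1) / D(w_l)) over ordered word pairs, where D
    counts the documents containing all the given words. Higher is better.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        return sum(1 for doc in doc_sets if all(w in doc for w in words))

    score = 0.0
    for m in range(1, len(topic_words)):
        for l in range(m):
            denominator = doc_freq(topic_words[l])
            if denominator == 0:
                continue  # word never occurs in the corpus; skip the pair
            joint = doc_freq(topic_words[m], topic_words[l])
            score += math.log((joint + 1) / denominator)
    return score


# Toy corpus, invented for illustration only.
corpus = [
    ["wenger", "henry", "invincibles"],
    ["wenger", "henry", "arsenal"],
    ["aguero", "etihad", "city"],
    ["aguero", "etihad", "premier_league"],
]

coherent_score = umass_coherence(["wenger", "henry"], corpus)
mixed_score = umass_coherence(["wenger", "aguero"], corpus)
```

Words that appear together in documents score higher than words that never co-occur, so `coherent_score` exceeds `mixed_score`. Comparing two topic models then reduces to comparing such scores across their topics.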
Topic Coherence
• Better LDA -> Better topics -> Better classification
Topics from the topic modeling tutorial on the Lee corpus
Join the community!
• Pick up issues from: https://github.com/RaRe-Technologies/gensim
• Come for the sprint!