pycon_delhi_lightening
Devashish Deshpande
September 24, 2016
Lightning talk delivered at PyCon India 2016
Transcript
News classification with Gensim
Devashish Deshpande
Undergraduate student
RaRe Technologies Incubator Program
GitHub: dsquareindia
Blogs: https://rare-technologies.com/blog/
Gensim: Topic modeling in Python
Problem of News (mis)classification
Screenshots from Google Play Newsstand
Topic-word coloring with LDA
Image taken from the LDA paper by David Blei
What is a good LDA model?
• Come up with good topics
• Infer topic distribution
Football LDA model
(United topic): mourinho, red_devils, old_trafford, bad_team...
(Arsenal topic): wenger, henry, invincibles,....
(City topic): aguero, etihad, england, premier_league
(Chelsea topic): blues, football, roman, bridge,...
Evaluating topic models (qualitative)
• Manually
– Look at the topics. See if they are interpretable.
– Compare different topic models.
Topic Coherence
• Quantitative
Topic Coherence
• Assign a number to human interpretability!
Comparing topic models becomes much easier
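To make "assign a number" concrete, here is a small pure-Python sketch of one well-known coherence measure, the UMass-style score of Mimno et al., which sums log((D(wi, wj) + 1) / D(wi)) over ordered word pairs, where D counts documents containing the words. The slides do not say which measure the talk used, so treat this choice as an assumption. The function name `umass_coherence` and the toy documents are hypothetical, and Gensim's own coherence implementation differs in details such as smoothing and averaging.

```python
import math
from itertools import combinations


def umass_coherence(topic_words, documents):
    """UMass-style coherence of one topic over a list of tokenized documents.

    Assumes every topic word occurs in at least one document, so the
    denominator is never zero.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain all of the given words.
        return sum(1 for doc in doc_sets if all(w in doc for w in words))

    score = 0.0
    # combinations() preserves list order, so `earlier` is ranked above `later`.
    for earlier, later in combinations(topic_words, 2):
        score += math.log((doc_freq(earlier, later) + 1) / doc_freq(earlier))
    return score


# Invented toy corpus, for illustration only.
documents = [
    ["wenger", "henry", "invincibles", "arsenal"],
    ["wenger", "arsenal", "henry"],
    ["aguero", "etihad", "city"],
    ["aguero", "city", "premier_league"],
    ["mourinho", "old_trafford", "united"],
]

good_topic = ["wenger", "henry", "arsenal"]      # words that co-occur
mixed_topic = ["wenger", "aguero", "mourinho"]   # words that never co-occur

good_score = umass_coherence(good_topic, documents)
mixed_score = umass_coherence(mixed_topic, documents)
print(good_score, mixed_score)
```

The topic whose words appear together in documents gets the higher score, so two topic models can be ranked by comparing numbers instead of reading every topic by hand.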
Topic Coherence
• Better LDA -> Better topics -> Better classification
Topics from the topic modeling tutorial on the Lee corpus
Join the community!
• Pick up issues from: https://github.com/RaRe-Technologies/gensim
• Come for the sprint!