Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
pycon_delhi_lightening
Search
Devashish Deshpande
September 24, 2016
Technology
0
1.5k
pycon_delhi_lightening
Lightening talk delivered at PyCon India 2016
Devashish Deshpande
September 24, 2016
Tweet
Share
Other Decks in Technology
See All in Technology
Nx × AI によるモノレポ活用 〜コードジェネレーター編〜
puku0x
0
560
10年以上続くプロダクトで今取り組んでること、取り組もうとしていること
sansantech
PRO
2
110
事業特性から逆算したインフラ設計
upsider_tech
0
110
AIに頼りすぎない新人育成術
cuebic9bic
3
300
ZOZOTOWNの大規模マーケティングメール配信を支えるアーキテクチャ
zozotech
PRO
0
310
UDDのススメ - 拡張版 -
maguroalternative
1
520
20250807 Applied Engineer Open House
sakana_ai
PRO
2
380
データモデリング通り #2オンライン勉強会 ~方法論の話をしよう~
datayokocho
0
160
「Roblox」の開発環境とその効率化 ~DAU9700万人超の巨大プラットフォームの開発 事始め~
keitatanji
0
120
風が吹けばWHOISが使えなくなる~なぜWHOIS・RDAPはサーバー証明書のメール認証に使えなくなったのか~
orangemorishita
15
5.8k
GMOペパボのデータ基盤とデータ活用の現在地 / Current State of GMO Pepabo's Data Infrastructure and Data Utilization
zaimy
3
220
マルチプロダクト×マルチテナントを支えるモジュラモノリスを中心としたアソビューのアーキテクチャ
disc99
1
530
Featured
See All Featured
実際に使うSQLの書き方 徹底解説 / pgcon21j-tutorial
soudai
PRO
183
54k
The Myth of the Modular Monolith - Day 2 Keynote - Rails World 2024
eileencodes
26
3k
The World Runs on Bad Software
bkeepers
PRO
70
11k
Evolution of real-time – Irina Nazarova, EuRuKo, 2024
irinanazarova
8
880
Helping Users Find Their Own Way: Creating Modern Search Experiences
danielanewman
29
2.8k
Documentation Writing (for coders)
carmenintech
73
5k
A better future with KSS
kneath
239
17k
Rebuilding a faster, lazier Slack
samanthasiow
83
9.1k
Typedesign – Prime Four
hannesfritz
42
2.7k
Measuring & Analyzing Core Web Vitals
bluesmoon
8
550
The Art of Delivering Value - GDevCon NA Keynote
reverentgeek
15
1.6k
The Psychology of Web Performance [Beyond Tellerrand 2023]
tammyeverts
49
3k
Transcript
News classification with Gensim Devashish Deshpande Undergraduate student RaRe Technologies
Incubator Program Github: dsquareindia Blogs: https://rare-technologies.com/blog/
Gensim: Topic modeling in python
Problem of News (mis)classification
Screenshots from play newsstand
Topic-word coloring with LDA Image taken from LDA paper by
David Blei
What is a good LDA model? • Come up with
good topics • Infer topic distribution (United topic): mourinho, red_devils, old_trafford, bad_team... (Arsenal topic): wenger, henry, invincibles,.... (City topic): aguero, etihad, england, premier_league (Chelsea topic): blues, football, roman, bridge,... Football LDA model
Evaluating topic models • Manually – Look at the topics.
See if they are interpretable. – Comparing different topic models Qualititative
None
Topic Coherence • Quantitave
Topic Coherence • Assign a number to the human interpretability!
Comparing topic models becomes much easier
Topic Coherence • Better LDA -> Better topics -> Better
classification Topics from topic modeling tutorial on Lee corpus
Join the community! • Pick up issues from: https://github.com/RaRe-Technologies/gensim •
Come for the sprint!