pycon_delhi_lightening
Devashish Deshpande
September 24, 2016
Lightning talk delivered at PyCon India 2016
Transcript
News classification with Gensim
Devashish Deshpande
Undergraduate student, RaRe Technologies Incubator Program
Github: dsquareindia
Blogs: https://rare-technologies.com/blog/
Gensim: Topic modeling in Python
Problem of News (mis)classification
Screenshots from Play Newsstand
Topic-word coloring with LDA
Image taken from LDA paper by David Blei
What is a good LDA model?
• Come up with good topics
• Infer topic distribution
Football LDA model
(United topic): mourinho, red_devils, old_trafford, bad_team...
(Arsenal topic): wenger, henry, invincibles,....
(City topic): aguero, etihad, england, premier_league
(Chelsea topic): blues, football, roman, bridge,...
Evaluating topic models
• Manually
– Look at the topics. See if they are interpretable.
– Comparing different topic models
Qualitative
Topic Coherence
• Quantitative
Topic Coherence
• Assign a number to human interpretability!
Comparing topic models becomes much easier
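To make "assign a number" concrete, here is a simplified sketch of one well-known coherence measure, UMass, written in plain Python. It is not Gensim's implementation: the function name `umass_coherence`, the averaging over word pairs, and the toy documents are all assumptions made for illustration. The idea is that a topic's top words should tend to appear in the same documents.

```python
import math
from itertools import combinations


def umass_coherence(top_words, documents):
    """Simplified UMass-style coherence (illustrative, not Gensim's code).

    top_words: a topic's words, ordered from most to least probable.
    documents: iterable of token lists.
    Returns the mean over word pairs of log((D(w_i, w_j) + 1) / D(w_i)),
    where D counts documents containing all the given words and w_i is the
    higher-ranked word of the pair.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        return sum(1 for doc in doc_sets if all(w in doc for w in words))

    scores = []
    # combinations() preserves order, so `higher` always outranks `lower`.
    for higher, lower in combinations(top_words, 2):
        d_higher = doc_freq(higher)
        if d_higher == 0:
            continue  # skip words that never occur in the documents
        scores.append(math.log((doc_freq(higher, lower) + 1) / d_higher))

    if not scores:
        raise ValueError("need at least two words that occur in the documents")
    return sum(scores) / len(scores)


# Toy, made-up documents (assumption: not from the talk).
docs = [
    ["wenger", "henry", "arsenal"],
    ["wenger", "arsenal", "invincibles"],
    ["aguero", "etihad", "city"],
    ["aguero", "city", "etihad"],
]

coherent = umass_coherence(["wenger", "arsenal"], docs)    # words co-occur
incoherent = umass_coherence(["wenger", "aguero"], docs)   # words never co-occur
print(coherent > incoherent)
```

A topic whose words share documents scores higher than one that mixes unrelated words, which is what makes two models comparable by a single number.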
Topic Coherence
• Better LDA -> Better topics -> Better classification
Topics from topic modeling tutorial on Lee corpus
Join the community!
• Pick up issues from: https://github.com/RaRe-Technologies/gensim
• Come for the sprint!