Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
pycon_delhi_lightening
Search
Sponsored
·
Your Podcast. Everywhere. Effortlessly.
Share. Educate. Inspire. Entertain. You do you. We'll handle the rest.
→
Devashish Deshpande
September 24, 2016
Technology
1.6k
0
Share
pycon_delhi_lightening
Lightening talk delivered at PyCon India 2016
Devashish Deshpande
September 24, 2016
Other Decks in Technology
See All in Technology
Java正規表現エンジン(NFA)の仕組みと パフォーマンスを維持するための最適化手法
takeuchi_132917
0
160
Claude Codeを組織で使いこなす— サーバサイドAIエージェント運用の実践知
techtekt
PRO
0
140
GitHub Copilot CLIでWebアクセシビリティを改善した話
tomokusaba
0
140
AIガバナンス実践 - 生成AIコネクタのデータ漏洩リスクと実務対策
knishioka
0
150
Platform engineering for developers, architects & the rest of us (AI agents)
danielbryantuk
0
160
GoとSIMDとWasmの今。
askua
1
270
関西に縁あるMicrosoft MVPsが語るCopilotの未来
kasada
0
820
自称宇宙最速で不合格となったAIP-C01にリベンジを果たすべくAIで問題集アプリを作ってみた。
yama3133
0
250
Platform Engineering as a Product: Criteria for Improvement and Multi-Tenant Design
kumorn5s
0
430
AI駆動開発でなんでもハンズオン環境をつくってみた
yoshimi0227
0
180
オンコールの負荷軽減のためのBits Assistant 活用方法 / How to Use Bits Assistant to Reduce the Workload on On-Call Staff
sms_tech
1
360
ルールやカスタム機能、どう使う?理想の出力を引き出すために今知りたいIBM Bob 5つの機能
muehara
0
160
Featured
See All Featured
Have SEOs Ruined the Internet? - User Awareness of SEO in 2025
akashhashmi
0
350
The State of eCommerce SEO: How to Win in Today's Products SERPs - #SEOweek
aleyda
2
11k
Introduction to Domain-Driven Design and Collaborative software design
baasie
1
810
Dealing with People You Can't Stand - Big Design 2015
cassininazir
367
27k
Building Adaptive Systems
keathley
44
3k
Test your architecture with Archunit
thirion
1
2.3k
How to Align SEO within the Product Triangle To Get Buy-In & Support - #RIMC
aleyda
2
1.5k
Practical Tips for Bootstrapping Information Extraction Pipelines
honnibal
25
1.9k
Put a Button on it: Removing Barriers to Going Fast.
kastner
60
4.3k
Practical Orchestrator
shlominoach
191
11k
Bash Introduction
62gerente
615
210k
How GitHub (no longer) Works
holman
316
150k
Transcript
News classification with Gensim Devashish Deshpande Undergraduate student RaRe Technologies
Incubator Program Github: dsquareindia Blogs: https://rare-technologies.com/blog/
Gensim: Topic modeling in python
Problem of News (mis)classification
Screenshots from play newsstand
Topic-word coloring with LDA Image taken from LDA paper by
David Blei
What is a good LDA model? • Come up with
good topics • Infer topic distribution (United topic): mourinho, red_devils, old_trafford, bad_team... (Arsenal topic): wenger, henry, invincibles,.... (City topic): aguero, etihad, england, premier_league (Chelsea topic): blues, football, roman, bridge,... Football LDA model
Evaluating topic models • Manually – Look at the topics.
See if they are interpretable. – Comparing different topic models Qualititative
None
Topic Coherence • Quantitave
Topic Coherence • Assign a number to the human interpretability!
Comparing topic models becomes much easier
Topic Coherence • Better LDA -> Better topics -> Better
classification Topics from topic modeling tutorial on Lee corpus
Join the community! • Pick up issues from: https://github.com/RaRe-Technologies/gensim •
Come for the sprint!