Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
pycon_delhi_lightening
Search
Devashish Deshpande
September 24, 2016
Technology
0
1.5k
pycon_delhi_lightening
Lightening talk delivered at PyCon India 2016
Devashish Deshpande
September 24, 2016
Tweet
Share
Other Decks in Technology
See All in Technology
Azure AI Foundryでマルチエージェントワークフロー
seosoft
0
140
Workflows から Agents へ ~ 生成 AI アプリの成長過程とアプローチ~
belongadmin
3
170
AIのAIによるAIのための出力評価と改善
chocoyama
0
410
キャディでのApache Iceberg, Trino採用事例 -Apache Iceberg and Trino Usecase in CADDi--
caddi_eng
0
170
Observability infrastructure behind the trillion-messages scale Kafka platform
lycorptech_jp
PRO
0
120
Amazon ECS & AWS Fargate 運用アーキテクチャ2025 / Amazon ECS and AWS Fargate Ops Architecture 2025
iselegant
13
3.8k
データプラットフォーム技術におけるメダリオンアーキテクチャという考え方/DataPlatformWithMedallionArchitecture
smdmts
4
490
2年でここまで成長!AWSで育てたAI Slack botの軌跡
iwamot
PRO
2
120
比起獨自升級 我更喜歡 DevOps 文化 <3
line_developers_tw
PRO
0
1.1k
Prox Industries株式会社 会社紹介資料
proxindustries
0
140
“プロダクトを好きになれるか“も QAエンジニア転職の大事な判断基準だと思ったの
tomodakengo
1
230
活きてなかったデータを活かしてみた話 / Shirokane Kougyou vol 19
sansan_randd
1
400
Featured
See All Featured
The Web Performance Landscape in 2024 [PerfNow 2024]
tammyeverts
8
660
Performance Is Good for Brains [We Love Speed 2024]
tammyeverts
10
910
How STYLIGHT went responsive
nonsquared
100
5.6k
Statistics for Hackers
jakevdp
799
220k
RailsConf 2023
tenderlove
30
1.1k
A better future with KSS
kneath
239
17k
Why You Should Never Use an ORM
jnunemaker
PRO
56
9.4k
Templates, Plugins, & Blocks: Oh My! Creating the theme that thinks of everything
marktimemedia
31
2.4k
Principles of Awesome APIs and How to Build Them.
keavy
126
17k
実際に使うSQLの書き方 徹底解説 / pgcon21j-tutorial
soudai
PRO
181
53k
Scaling GitHub
holman
459
140k
Adopting Sorbet at Scale
ufuk
77
9.4k
Transcript
News classification with Gensim Devashish Deshpande Undergraduate student RaRe Technologies
Incubator Program Github: dsquareindia Blogs: https://rare-technologies.com/blog/
Gensim: Topic modeling in python
Problem of News (mis)classification
Screenshots from play newsstand
Topic-word coloring with LDA Image taken from LDA paper by
David Blei
What is a good LDA model? • Come up with
good topics • Infer topic distribution (United topic): mourinho, red_devils, old_trafford, bad_team... (Arsenal topic): wenger, henry, invincibles,.... (City topic): aguero, etihad, england, premier_league (Chelsea topic): blues, football, roman, bridge,... Football LDA model
Evaluating topic models • Manually – Look at the topics.
See if they are interpretable. – Comparing different topic models Qualititative
None
Topic Coherence • Quantitave
Topic Coherence • Assign a number to the human interpretability!
Comparing topic models becomes much easier
Topic Coherence • Better LDA -> Better topics -> Better
classification Topics from topic modeling tutorial on Lee corpus
Join the community! • Pick up issues from: https://github.com/RaRe-Technologies/gensim •
Come for the sprint!