Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
OpenTalks.AI - Алексей Бурнаков, Тематическое моделирование новостей на основе детекции цитирований
Search
OpenTalks.AI
February 21, 2020
Science
0
2k
OpenTalks.AI - Алексей Бурнаков, Тематическое моделирование новостей на основе детекции цитирований
OpenTalks.AI
February 21, 2020
Tweet
Share
More Decks by OpenTalks.AI
See All by OpenTalks.AI
OpenTalks.AI - Виктор Лемпицкий, Моделирование 3Д сцен: новые подходы в 2020 году
opentalks
0
420
OpenTalks.AI - Алексей Чернявский, Нейросетевые алгоритмы для повышения качества медицинских изображений
opentalks
0
400
OpenTalks.AI - Александр Громов, Устойчивость нейросетевых моделей при анализе КТ/НДКТ-исследований
opentalks
0
350
OpenTalks.AI - Денис Тимонин, Megatron-LM: Обучение мультимиллиардных LMs при помощи техники Model Parallelism
opentalks
0
460
OpenTalks.AI - Егор Филимонов, Возможности платформы Huawei Atlas и эффективный гетерогенный инференс.
opentalks
0
110
OpenTalks.AI - Александр Прозоров, Референсная архитектура робота сервисного центра в отраслях с изменчивыми бизнес-процессами
opentalks
0
340
OpenTalks.AI - Наталья Лукашевич, Анализ тональности по отношению к компании — с чем не справился BERT
opentalks
0
320
OpenTalks.AI - Константин Воронцов, Фейковые новости и другие типы потенциально опасного дискурса: типология, подходы, датасеты, соревнования
opentalks
0
390
OpenTalks.AI - Дмитрий Ветров, Фрактальность функции потерь, эффект двойного спуска и степенные законы в глубинном обучении - фрагменты одной мозаики
opentalks
0
410
Other Decks in Science
See All in Science
Direct Preference Optimization
zchenry
0
140
Mastering Feature Engineering: Mining the Hidden Salary Formula with CakeResume
tlyu0419
0
130
同じデータでもP値が変わる話/key_considerations_in_NHST
florets1
1
1.1k
ultraArmをモニター提供してもらった話
miura55
0
110
[NeurIPS 2023 論文読み会] Wasserstein Quantum Monte Carlo
stakaya
0
350
Demucsを用いた音源分離
508shuto
0
190
Science of Scienceおよび科学計量学に関する研究論文の俯瞰可視化_LT版
hayataka88
0
470
Presenting Effectively with Data (in a Hurry)
thomaselove
1
250
研究・教育・産学連携の循環の実践
sshimizu2006
0
220
名古屋市立大学データサイエンス学部 夏のオープンキャンパス模擬授業20230818
ncu_ds
0
1.1k
2023-08-02_spatialLIBD_BioC2023_demo
lcolladotor
0
110
【論文紹介】DocTr_ Document Transformer for Structured Information Extraction in Documents / iccv2023-doctr
yuya4
3
580
Featured
See All Featured
No one is an island. Learnings from fostering a developers community.
thoeni
16
2.1k
Bash Introduction
62gerente
604
210k
Keith and Marios Guide to Fast Websites
keithpitt
408
22k
Agile that works and the tools we love
rasmusluckow
325
20k
ParisWeb 2013: Learning to Love: Crash Course in Emotional UX Design
dotmariusz
104
6.6k
The Brand Is Dead. Long Live the Brand.
mthomps
49
29k
Scaling GitHub
holman
457
140k
A Tale of Four Properties
chriscoyier
151
22k
Done Done
chrislema
178
15k
Design and Strategy: How to Deal with People Who Don’t "Get" Design
morganepeng
116
18k
Distributed Sagas: A Protocol for Coordinating Microservices
caitiem20
322
20k
Pencils Down: Stop Designing & Start Developing
hursman
117
11k
Transcript
NEWS TOPIC MODELLING BASED ON CITATION DETECTION Alexey Burnakov, TASS
ITAR-TASS www.tass.ru 115th birthday It's been a while…
News Media Market: A Complex Graph
Citation Detection
News Specific Citation Detection Personal data Headline Editorial Cite number
Cite index Citing media Date News rating Editor rating Board rating Organization rating
Methods I Cosine similarity Bag of words / tf-idf Generalized
linear models
Methods II PageRank Random-Walk Graph Partitioning
Citation Detection: results precision = 0.89 recall = 0.87 Logistic
regression output F1 score = 0.88 MCC score = 0.88 AUC: 0.998 We did good at a train dataset
PageRank: results. TOP-25 of the Russian Mass Media
NLP Pipeline Raw text Tokenization Who cites TASS Which news
was cited Topic modelling Customer facing
Topic Modelling I Motivation: Are there big topics today? Notre-Dame
de Paris’s on fire : (
Topic Modelling II Airbus emergency landing Motivation: Are there big
topics today?
Topic Modelling III Flood in the Irkutsk Region :( `Losharik`
Submarine deadly accident :( Motivation: Are there big topics today?
Topic Report Ex-Kyrgyz president Atambaev seizure by special forces
Competition Snapshot Which agency did a good job?
Daily Competition Snapshot Who is the hero of the day?
Personal data Personal data Personal data Personal data Personal data
Daily Competition Snapshot https://www.gazeta.ru/politics/2019/08/08_a_12564199.shtml
THANK YOU!