Ícaro Medeiros
May 14, 2024
Linked Data, Big Data and User Science at globo.com
Transcript
Ícaro Medeiros
[email protected]
[email protected]
I Encontro de Computação Semântica @UFRJ
11/03/2015
LINKED DATA, BIG DATA, USER SCIENCE @ globo.com
INTERACTION CONTENT USER
Signals
( icaro, home_globoesporte, pageview@23:00 )
( icaro, materia_1, scroll+2min@14:00 )
Content description
( materia_1: [messi, neymar, barcelona] )
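The tuples on this slide suggest a simple data model: an interaction signal links a user to a piece of content, and a content description links that content to entities. The sketch below is only an illustration of that idea. The `Signal` type, the `entities_seen` helper, and the split of `event@time` into two fields are assumptions, not the structures used at globo.com.

```python
from collections import Counter
from typing import NamedTuple


class Signal(NamedTuple):
    """One interaction signal: (user, content, event@time) as on the slide."""
    user: str
    content: str
    event: str  # e.g. "pageview" or "scroll+2min"
    time: str   # "HH:MM", kept as plain text here


# Values copied from the slide.
signals = [
    Signal("icaro", "home_globoesporte", "pageview", "23:00"),
    Signal("icaro", "materia_1", "scroll+2min", "14:00"),
]
content_description = {"materia_1": ["messi", "neymar", "barcelona"]}


def entities_seen(user, signals, descriptions):
    """Count the entities annotated on every content item the user touched."""
    counts = Counter()
    for signal in signals:
        if signal.user == user:
            # Content without a description (e.g. a home page) adds nothing.
            counts.update(descriptions.get(signal.content, []))
    return counts


profile = entities_seen("icaro", signals, content_description)
```

Joining the two sources this way gives each user a bag of entities, which is the raw material the later user-modeling slides build on.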
LINKED DATA (content)
Ontologies
‣ 288 classes
‣ Person: 65K
‣ Place: 50K
‣ Athlete: 22K
‣ Politicians: 32K
Annotation tool
‣ Interface follows the ontology
‣ Fields
‣ Suggest as you type
‣ Triples stored in Virtuoso
‣ Automatic entity extraction
‣ Fast search in Elasticsearch
Contextual navigation
globoesporte.com
Automatic page generation
Intelligent Search
BIG DATA
Cluster Stats
‣ 10 machines
‣ 1 TB RAM
‣ 500 TB disk
‣ 338 VCores
Signal Capturing
Beyond clicks (engagement science)
‣ Attention-based metrics
‣ Scroll
‣ Time spent on page
‣ Dwell time
‣ Social Media Analytics
http://labs.yahoo.com/publication/beyond-clicks-dwell-time-for-personalization/
Shares are noisy http://time.com/12933/what-you-think-you-know-about-the-web-is-wrong/
Scroll http://time.com/12933/what-you-think-you-know-about-the-web-is-wrong/
Recommendation
‣ TF-IDF
‣ Collaborative Filtering
‣ Users
‣ Content
‣ Latent Factor Analysis
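As an illustration of the TF-IDF item, here is a minimal pure-Python sketch that weights terms and compares articles by cosine similarity. The slide does not say how TF-IDF is used at globo.com, so the function names, the unsmoothed `log(N / df)` weighting, and the second and third toy articles are all assumptions.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Map each doc id to {term: tf * idf}, with idf = log(N / document frequency)."""
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs.values():
        doc_freq.update(set(tokens))  # count each term once per document
    vectors = {}
    for doc_id, tokens in docs.items():
        term_freq = Counter(tokens)
        vectors[doc_id] = {
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
        }
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# materia_1 comes from the slides; the other two articles are hypothetical.
docs = {
    "materia_1": ["messi", "neymar", "barcelona"],
    "materia_2": ["neymar", "barcelona", "santos"],
    "materia_3": ["election", "senate", "brasilia"],
}
vectors = tf_idf(docs)
sim_12 = cosine(vectors["materia_1"], vectors["materia_2"])
sim_13 = cosine(vectors["materia_1"], vectors["materia_3"])
```

Articles that share annotated entities score above zero, while articles with no terms in common score exactly zero, which is the basic signal a content-based recommender would rank on.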
USER SCIENCE for news reading
User Modeling (for news reading)
‣ Dynamic profiling
‣ Explicit personal data
‣ Interests (implicit)
‣ Temporal constraints: periodicity
Signal Capturing Excelsior Signals
Semantic User Modeling
‣ Annotations from engaged content
‣ Profile can answer:
‣ My favourite team
‣ City I live in
‣ My hometown
Spreading Activation
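The slide gives only the technique's name, so the following is a generic sketch of spreading activation rather than the implementation used at globo.com. Entities annotated on content the user engaged with start with some activation, which then flows along graph edges with a decay factor. The graph, the `decay` and `steps` parameters, and the function name are hypothetical.

```python
from collections import defaultdict


def spread_activation(graph, seeds, decay=0.5, steps=2):
    """Propagate activation from seed nodes along outgoing edges.

    graph: node -> list of neighbour nodes
    seeds: node -> initial activation
    Each hop passes on the sender's activation multiplied by `decay`.
    """
    activation = defaultdict(float, seeds)
    frontier = dict(seeds)
    for _ in range(steps):
        next_frontier = defaultdict(float)
        for node, value in frontier.items():
            for neighbour in graph.get(node, []):
                next_frontier[neighbour] += value * decay
        for node, value in next_frontier.items():
            activation[node] += value
        frontier = next_frontier
    return dict(activation)


# Hypothetical mini-graph: players link to a team, the team to a league.
graph = {
    "messi": ["barcelona"],
    "neymar": ["barcelona"],
    "barcelona": ["la_liga"],
}
# Seeds: entities annotated on an article the user read.
seeds = {"messi": 1.0, "neymar": 1.0}
scores = spread_activation(graph, seeds)
```

In this toy run the team accumulates activation from both players, which shows how a profile could infer a favourite team even from articles that mention only its athletes.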
My profile
City/State I live in
Hometown and State
Football team test (3.5MM users)
‣ 82% precision
‣ 95% precision@top3
* When the user has read at least one article that cites their team
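The two figures above can be read as precision at rank 1 and at rank 3. The sketch below assumes that "precision@top3" means the user's true team appears among the three highest-ranked guesses; the slide does not define the metric, and the users and team rankings are invented for illustration.

```python
def precision_at_k(predictions, truth, k=1):
    """Fraction of users whose true team is among their top-k predicted teams."""
    hits = sum(
        1 for user, ranked in predictions.items() if truth[user] in ranked[:k]
    )
    return hits / len(predictions)


# Hypothetical ranked guesses per user, best guess first.
predictions = {
    "u1": ["flamengo", "vasco", "botafogo"],
    "u2": ["santos", "corinthians", "palmeiras"],
    "u3": ["gremio", "internacional", "cruzeiro"],
    "u4": ["vasco", "fluminense", "flamengo"],
}
# Hypothetical ground truth, e.g. from explicit personal data.
truth = {"u1": "flamengo", "u2": "corinthians", "u3": "gremio", "u4": "botafogo"}

p1 = precision_at_k(predictions, truth, k=1)  # 2 of 4 users correct at rank 1
p3 = precision_at_k(predictions, truth, k=3)  # 3 of 4 users correct within top 3
```

Widening the cutoff from 1 to 3 can only keep or raise the score, which is consistent with the gap between the two reported numbers.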
How fast?
‣ 5 min between interaction and profile update
‣ 48 ms mean request time
Potential uses
‣ Personalized homepages
‣ Targeted advertising
‣ Granular user/content description
‣ Semantic Recommendation
‣ Clustering
‣ Demographic data
‣ Informed product creation/evolution
github.com/globocom/IWantToWorkAtGloboCom
Ícaro Medeiros
[email protected]
Semantic team
[email protected]
globo.com slides icaromedeiros.com.br slideshare.net/icaromedeiros