Ícaro Medeiros
May 14, 2024
Linked Data, Big Data and User Science at globo.com
Transcript
Ícaro Medeiros
[email protected]
[email protected]
I Encontro de Computação Semântica @UFRJ, 11/03/2015
LINKED DATA, BIG DATA, USER SCIENCE @ globo.com
INTERACTION CONTENT USER
Signals
( icaro, home_globoesporte, pageview@23:00 )
( icaro, materia_1, scroll+2min@14:00 )
Content description
( materia_1: [messi, neymar, barcelona] )
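To make the two kinds of record concrete, here is a minimal Python sketch. It assumes a signal is a (user, content, action, time) tuple and a content description is a mapping from content id to annotated entities. The `Signal` class and the `entities_seen` helper are hypothetical names used only for illustration, not globo.com's actual data model.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    user: str      # who interacted
    content: str   # what they interacted with
    action: str    # kind of interaction, e.g. "pageview" or "scroll+2min"
    time: str      # time of day, as written on the slide


# The two signals and the content description shown on the slide.
signals = [
    Signal("icaro", "home_globoesporte", "pageview", "23:00"),
    Signal("icaro", "materia_1", "scroll+2min", "14:00"),
]
content_description = {"materia_1": ["messi", "neymar", "barcelona"]}


def entities_seen(signals, content_description):
    """Map each user to the set of entities annotated on content they touched."""
    seen = defaultdict(set)
    for s in signals:
        # Content without a description (e.g. a home page) contributes nothing.
        seen[s.user].update(content_description.get(s.content, []))
    return dict(seen)


print(entities_seen(signals, content_description))
```

Joining signals with content descriptions in this way is what lets interaction data say something about a user's interests, which the later user-modeling slides build on.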
LINKED DATA (content)
Ontologies
‣ 288 classes
‣ Person: 65K
‣ Place: 50K
‣ Athlete: 22K
‣ Politicians: 32K
Annotation tool
‣ Interface follows the ontology
‣ Fields
‣ Suggest as you type
‣ Triples stored in Virtuoso
‣ Automatic entity extraction
‣ Fast search in Elasticsearch
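As an illustration of what "triples" means here, the sketch below turns one annotated article into subject-predicate-object statements and serializes them as N-Triples. The `http://example.org/` namespace, the `mentions` predicate and both function names are assumptions made for the example. The slides do not show the real ontology URIs or how the tool writes to Virtuoso.

```python
BASE = "http://example.org/"  # hypothetical namespace, not globo.com's real one


def annotation_triples(content_id, entity_ids, predicate="mentions"):
    """Return one (subject, predicate, object) triple per annotated entity."""
    return [
        (
            BASE + "content/" + content_id,
            BASE + "ontology/" + predicate,
            BASE + "entity/" + entity_id,
        )
        for entity_id in entity_ids
    ]


def to_ntriples(triples):
    """Serialize URI-only triples in N-Triples syntax, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


triples = annotation_triples("materia_1", ["messi", "neymar", "barcelona"])
print(to_ntriples(triples))
```

A triple store can load statements in this form and answer graph queries over them, while a search index serves the fast text lookups.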
Contextual navigation
globoesporte.com
Automatic page generation
Intelligent Search
BIG DATA
Cluster Stats
‣ 10 machines
‣ 1 TB RAM
‣ 500 TB disk
‣ 338 VCores
Signal Capturing
Beyond clicks (engagement science)
‣ Attention-based metrics
‣ Scroll
‣ Time spent on page
‣ Dwell time
‣ Social Media Analytics
http://labs.yahoo.com/publication/beyond-clicks-dwell-time-for-personalization/
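One simple way to approximate dwell time is to measure the gap between a page view and the same user's next event. The sketch below does that under stated assumptions: the event format, the timestamps, the `materia_2` id and the `dwell_times` function are all hypothetical, and the slides do not say how globo.com actually computes the metric.

```python
from datetime import datetime


def dwell_times(events):
    """events: list of (user, page, iso_timestamp) in any order.

    Returns (user, page, seconds) for every page view that is followed by
    another event from the same user. The last view per user is skipped
    because nothing marks where it ends.
    """
    by_user = {}
    for user, page, ts in events:
        by_user.setdefault(user, []).append((datetime.fromisoformat(ts), page))

    result = []
    for user, visits in by_user.items():
        visits.sort()  # chronological order within each user
        for (start, page), (end, _next_page) in zip(visits, visits[1:]):
            result.append((user, page, (end - start).total_seconds()))
    return result


# Hypothetical events, for illustration only.
events = [
    ("icaro", "home_globoesporte", "2015-03-11T23:00:00"),
    ("icaro", "materia_1", "2015-03-11T23:00:40"),
    ("icaro", "materia_2", "2015-03-11T23:03:10"),
]
print(dwell_times(events))
```

A gap-based estimate like this overstates attention when a tab sits idle, which is one reason to combine it with scroll signals.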
Shares are noisy http://time.com/12933/what-you-think-you-know-about-the-web-is-wrong/
Scroll http://time.com/12933/what-you-think-you-know-about-the-web-is-wrong/
Recommendation
‣ TF-IDF
‣ Collaborative Filtering
‣ Users
‣ Content
‣ Latent Factor Analysis
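For the TF-IDF item, a minimal content-similarity sketch follows. It weights each term by its frequency in the article times the log of inverse document frequency, then compares articles by cosine similarity. The article ids other than `materia_1`, their terms, and the function names are made up for the example. This is a textbook formulation, not necessarily the variant used in production.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """docs: dict of doc id -> non-empty list of terms.

    Returns doc id -> {term: tf * idf}, with tf = count / doc length and
    idf = log(number of docs / number of docs containing the term).
    """
    n = len(docs)
    df = Counter(term for terms in docs.values() for term in set(terms))
    vectors = {}
    for doc_id, terms in docs.items():
        tf = Counter(terms)
        vectors[doc_id] = {
            t: (c / len(terms)) * math.log(n / df[t]) for t, c in tf.items()
        }
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical corpus; only materia_1 comes from the slides.
docs = {
    "materia_1": ["messi", "neymar", "barcelona"],
    "materia_2": ["neymar", "barcelona", "santos"],
    "materia_3": ["flamengo", "maracana"],
}
vectors = tfidf_vectors(docs)
print(cosine(vectors["materia_1"], vectors["materia_2"]))
print(cosine(vectors["materia_1"], vectors["materia_3"]))
```

Articles that share rare terms score higher than articles with no terms in common, so a reader of `materia_1` would be offered `materia_2` before `materia_3`.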
USER SCIENCE for news reading
User Modeling (for news reading)
‣ Dynamic profiling
‣ Explicit personal data
‣ Interests (implicit)
‣ Temporal constraints: periodicity
Signal Capturing Excelsior Signals
Semantic User Modeling
‣ Annotations from engaged content
‣ Profile can answer:
‣ My favourite team
‣ City I live in
‣ My hometown
Spreading Activation
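Spreading activation can be sketched as follows: entities from articles a user engaged with start with some activation, and each step passes a decayed share of it along graph edges, so related concepts (a player's team, a team's city) light up too. The graph, the decay factor, the number of steps and the `spread_activation` function are illustrative assumptions. The slides do not give the actual parameters or ontology links.

```python
def spread_activation(graph, initial, decay=0.5, steps=2):
    """graph: node -> list of neighbour nodes (directed edges).
    initial: node -> starting activation.

    At each step, every node activated in the previous step passes
    decay * its activation to each neighbour. Totals accumulate.
    """
    total = dict(initial)
    frontier = dict(initial)
    for _ in range(steps):
        received = {}
        for node, value in frontier.items():
            for neighbour in graph.get(node, []):
                received[neighbour] = received.get(neighbour, 0.0) + value * decay
        for node, value in received.items():
            total[node] = total.get(node, 0.0) + value
        frontier = received
    return total


# Hypothetical ontology fragment: players -> team -> city.
graph = {
    "messi": ["barcelona_fc"],
    "neymar": ["barcelona_fc"],
    "barcelona_fc": ["barcelona_city"],
}
profile = spread_activation(graph, {"messi": 1.0, "neymar": 1.0})
print(profile)
```

In this toy run the team ends up as strongly activated as either player even though the user never read about it directly, which is the kind of inference a profile needs to answer "my favourite team".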
My profile
City/State I live in
Hometown and State
Football team test (3.5MM users)
‣ 82% precision
‣ 95% precision@top3
* When the user has read at least one article that cites their team
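The slides do not define the two metrics, so the sketch below takes one plausible reading: precision is the share of users whose top-ranked guess matches their real team, and precision@top3 is the share whose real team appears among the top three guesses. The users, teams and the `precision_at_k` function are invented for the example.

```python
def precision_at_k(predictions, truth, k=1):
    """predictions: user -> ranked list of guessed teams, best first.
    truth: user -> actual team.

    A user counts as a hit when the actual team is among the top k guesses.
    Users with no prediction are left out of the denominator.
    """
    scored = [user for user in truth if predictions.get(user)]
    if not scored:
        return 0.0
    hits = sum(truth[user] in predictions[user][:k] for user in scored)
    return hits / len(scored)


# Hypothetical data, for illustration only.
predictions = {
    "u1": ["flamengo", "vasco", "botafogo"],
    "u2": ["vasco", "flamengo", "santos"],
    "u3": ["santos", "palmeiras", "corinthians"],
    "u4": ["gremio", "internacional", "santos"],
}
truth = {"u1": "flamengo", "u2": "flamengo", "u3": "santos", "u4": "cruzeiro"}

print(precision_at_k(predictions, truth, k=1))
print(precision_at_k(predictions, truth, k=3))
```

Widening from the top guess to the top three can only keep or raise the score, which matches the direction of the 82% and 95% figures.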
How fast?
‣ 5 min between interaction and profile update
‣ 48 ms mean request time
Potential uses
‣ Personalized homepages
‣ Targeted advertising
‣ Granular user/content description
‣ Semantic Recommendation
‣ Clustering
‣ Demographic data
‣ Informed product creation/evolution
github.com/globocom/IWantToWorkAtGloboCom
Ícaro Medeiros
[email protected]
Semantic team
[email protected]
globo.com slides icaromedeiros.com.br slideshare.net/icaromedeiros