Intro to Google Refine
dcabo
May 25, 2013
Transcript
Cleaning data with Google Refine
David Cabo (@dcabo)
Cleaning data
• Refine: a tool for exploring and cleaning data
• Process
  1. Get the data
  2. Clean it with Refine
  3. Analyze it: Excel, Open Office, R...
What can it do?
• Filter and group data by different criteria
• Apply transformations to the data
• Join/split columns
• Verify against external databases: Freebase, OpenCorporates...
• Clustering: similarity-based cleaning, e.g. fixing typos
• ...
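The clustering idea in the list above can be illustrated outside Refine. What follows is a minimal Python sketch of key-collision clustering, assuming a simple "fingerprint" normalization (strip accents, lowercase, drop punctuation, sort unique words). The function names `fingerprint` and `cluster` and the sample values are hypothetical, chosen for illustration; this is not Refine's own code, and it only catches formatting variants. Genuine misspellings would need a nearest-neighbor method such as edit distance.

```python
import re
import unicodedata
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Build a normalized key so near-identical spellings collide."""
    # Strip accents: decompose characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", value)
    unaccented = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Lowercase and replace punctuation with spaces.
    cleaned = re.sub(r"[^\w\s]", " ", unaccented.lower())
    # Sort unique tokens so word order and repeated words do not matter.
    tokens = sorted(set(cleaned.split()))
    return " ".join(tokens)


def cluster(values):
    """Group values sharing a fingerprint; keep only groups with variants."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return [sorted(group) for group in groups.values() if len(group) > 1]


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the talk.
    names = [
        "Ayuntamiento de Madrid",
        "ayuntamiento de madrid.",
        "Madrid, Ayuntamiento de",
        "Ayuntamiento de Málaga",
        "Ayuntamiento de Malaga",
        "Gobierno de Aragón",
    ]
    for group in cluster(names):
        print(group)
```

Each printed group is a set of spellings that probably refer to the same entity; a person would then pick the canonical form, much as Refine asks the user to confirm each merge.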