Data Breaking Bad
Michael Hausenblas
June 03, 2013
Open Stage talk at Berlin Buzzwords 2013
Transcript
Data Breaking Bad
Michael Hausenblas, MapR Technologies
Berlin Buzzwords 2013, Open Stage Talk
Friday, 7 June 13
Nope. Not this one.
things you can influence
things that affect you
try and focus on this stuff
The awkward moment when I open the data I got from a customer
http://techcrunch.com/2012/11/25/the-big-data-fallacy-data-%E2%89%A0-information-%E2%89%A0-insights/
aka crap in, crap out
Some examples …
• Encöding hell
• Schema? Sure, I fax you a screenshot
• Dupes and other fakes
• Sampling
Encöding hell
application-specific encodings
• URL encoding
• HTML encoding
• Database escaping
non-ASCII?
a%20percent-encoded%20string%20as%20of%20RFC%203986
a <strong>HTML</strong> encoded string
Encöding hell
• Use Unicode
• Use Unicode
• Use Unicode
http://www.swedishfika.com/2010/01/19/escaping-from-encoding-hell/
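The layered encodings on the previous slides can be peeled back to plain Unicode text with the Python standard library. This is a minimal sketch and not part of the talk: the helper name `to_unicode_text` and the order of the decoding steps are assumptions, and real data may need a different order or only one of the steps.

```python
from html import unescape
from urllib.parse import unquote


def to_unicode_text(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode bytes to str, then undo percent-encoding and HTML escaping.

    Hypothetical helper: the step order is an assumption for illustration.
    """
    text = raw.decode(encoding)  # bytes -> Unicode string
    text = unquote(text)         # %20 -> space, %C3%B6 -> ö (UTF-8 by default)
    return unescape(text)        # &amp; -> &, &lt; -> <


print(to_unicode_text(b"a%20percent-encoded%20string%20as%20of%20RFC%203986"))
print(to_unicode_text(b"Enc%C3%B6ding &amp; hell"))
```

Decoding once at the boundary and working with Unicode strings everywhere else keeps the application-specific encodings from leaking into later processing steps.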
Schema? Sure, I fax you a screenshot
• There is a need for proper, formal documentation
• For humans and machines
• Basis for validation: automate!
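A machine-readable schema makes the "automate" point concrete. The sketch below is an illustration only, not from the talk: the `SCHEMA` dictionary, its field names, and the `validate` function are all hypothetical, and a real project would more likely use an established schema language.

```python
from typing import Any

# Hypothetical schema: field name -> (expected type, required?)
SCHEMA: dict[str, tuple[type, bool]] = {
    "id": (int, True),
    "name": (str, True),
    "email": (str, False),
}


def validate(record: dict[str, Any],
             schema: dict[str, tuple[type, bool]] = SCHEMA) -> list[str]:
    """Return human-readable problems; an empty list means the record is valid."""
    problems = []
    for field, (expected, required) in schema.items():
        if field not in record:
            if required:
                problems.append(f"missing required field: {field}")
            continue
        if not isinstance(record[field], expected):
            problems.append(
                f"{field}: expected {expected.__name__}, "
                f"got {type(record[field]).__name__}"
            )
    # Fields the schema does not document are reported as well.
    for field in record.keys() - schema.keys():
        problems.append(f"undocumented field: {field}")
    return problems


print(validate({"id": 1, "name": "Ada"}))
print(validate({"id": "1", "fax": "screenshot.png"}))
```

The same schema object serves both audiences from the slide: humans can read it as documentation, and the validator can run it against every incoming record.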
Dupes and other fakes
• Use plots to get an overview
• Watch out for outliers
• Try to establish the source of errors and fix it
• Document (in any case)
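Alongside plots, a quick numeric pass can surface exact duplicates and suspicious values. The following is a rough sketch, not from the talk: the function names and sample data are made up, and the 1.5 × IQR rule (Tukey's fences) is just one common convention for flagging outliers.

```python
from collections import Counter
from statistics import quantiles


def find_duplicates(rows):
    """Return rows that occur more than once, with their counts."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


def find_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's fences)."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Made-up sample data for illustration.
rows = [("alice", 30), ("bob", 25), ("alice", 30)]
print(find_duplicates(rows))

ages = [25, 27, 29, 30, 31, 33, 35, 250]
print(find_outliers(ages))
```

Flagged values are candidates for investigation rather than automatic deletion, in line with the advice to trace errors back to their source and document what was found.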
Sampling
• My data is too big. I can’t check it all.
• Why don’t you sample, then?
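One way to sample data that is too large to load at once is reservoir sampling, which draws a uniform sample in a single pass. The talk does not name a specific method, so treat this as an assumed illustration; `reservoir_sample` and its parameters are hypothetical names.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], k: int,
                     seed: Optional[int] = None) -> List[T]:
    """Uniform random sample of k items from a stream of unknown length."""
    rng = random.Random(seed)
    sample: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            sample.append(item)    # fill the reservoir first
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = item   # keep item with probability k / (i + 1)
    return sample


# Sample 10 records from a stream without holding the stream in memory.
print(reservoir_sample(range(100_000), k=10, seed=42))
```

Because only `k` items are held in memory, the same function works on a file iterator or a database cursor; fixing `seed` makes the checked sample reproducible.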
http://mortardata.com/
Go and buy this book. Now.