Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Data Breaking Bad
Search
Michael Hausenblas
June 03, 2013
Technology
1
200
Data Breaking Bad
Open Stage talk at Berlin Buzzwords 2013
Michael Hausenblas
June 03, 2013
Tweet
Share
More Decks by Michael Hausenblas
See All by Michael Hausenblas
KubeCologne keynote—Troubleshooting Kubernetes apps
mhausenblas
4
8.2k
Extending Kubernetes 101
mhausenblas
4
2.3k
Kubernetes and serverless technologies for high-performance applications
mhausenblas
1
370
Troubleshooting Kubernetes Applications
mhausenblas
1
630
Autoscaling All Things Kubernetes with Prometheus
mhausenblas
0
970
Three Billy Goats Gruff : from a monolith to containers to functions
mhausenblas
0
620
Bending Kubernetes to Your Needs
mhausenblas
2
2.9k
Kubernetes Security: from Image Hygiene to Network Policies
mhausenblas
8
4k
Hands-on Cloud Native Lifecycle Management
mhausenblas
3
490
Other Decks in Technology
See All in Technology
こんなところでも(地味に)活躍するImage Modeさんを知ってるかい?- Image Mode for OpenShift -
tsukaman
0
120
AIと新時代を切り拓く。これからのSREとメルカリIBISの挑戦
0gm
0
850
30万人の同時アクセスに耐えたい!新サービスの盤石なリリースを支える負荷試験 / SRE Kaigi 2026
genda
3
1.1k
GSIが複数キー対応したことで、俺達はいったい何が嬉しいのか?
smt7174
3
150
生成AIを活用した音声文字起こしシステムの2つの構築パターンについて
miu_crescent
PRO
1
160
Deno・Bunの標準機能やElysiaJSを使ったWebSocketサーバー実装 / ラーメン屋を貸し切ってLT会! IoTLT 2026新年会
you
PRO
0
300
SREのプラクティスを用いた3領域同時 マネジメントへの挑戦 〜SRE・情シス・セキュリティを統合した チーム運営術〜
coconala_engineer
2
620
広告の効果検証を題材にした因果推論の精度検証について
zozotech
PRO
0
150
GitLab Duo Agent Platform × AGENTS.md で実現するSpec-Driven Development / GitLab Duo Agent Platform × AGENTS.md
n11sh1
0
130
AWS Network Firewall Proxyを触ってみた
nagisa53
0
200
What happened to RubyGems and what can we learn?
mikemcquaid
0
250
プロダクト成長を支える開発基盤とスケールに伴う課題
yuu26
4
1.3k
Featured
See All Featured
Dealing with People You Can't Stand - Big Design 2015
cassininazir
367
27k
The browser strikes back
jonoalderson
0
360
Believing is Seeing
oripsolob
1
53
Marketing Yourself as an Engineer | Alaka | Gurzu
gurzu
0
130
Being A Developer After 40
akosma
91
590k
ピンチをチャンスに:未来をつくるプロダクトロードマップ #pmconf2020
aki_iinuma
128
55k
技術選定の審美眼(2025年版) / Understanding the Spiral of Technologies 2025 edition
twada
PRO
117
110k
brightonSEO & MeasureFest 2025 - Christian Goodrich - Winning strategies for Black Friday CRO & PPC
cargoodrich
3
98
Paper Plane
katiecoart
PRO
0
46k
Stewardship and Sustainability of Urban and Community Forests
pwiseman
0
110
Avoiding the “Bad Training, Faster” Trap in the Age of AI
tmiket
0
72
Unsuck your backbone
ammeep
671
58k
Transcript
Da Michael Hausenblas, MapR Technologies Berlin Buzzwords 2013, Open Stage
Talk Friday, 7 June 13
Nope. Not this one. Friday, 7 June 13
Friday, 7 June 13
things you can influence things that affect you try and
focus on this stuff Friday, 7 June 13
The awkward moment when I open the data I got
from a customer Friday, 7 June 13
http://techcrunch.com/2012/11/25/the-big-data-fallacy-data-%E2%89%A0-information-%E2%89%A0-insights/ aka crap in, crap out Friday, 7 June 13
Some examples … Friday, 7 June 13
• Encöding hell • Schema? Sure, I fax you a
screenshot • Dupes and other fakes • Sampling Friday, 7 June 13
Encöding hell application-specific encodings • URL encoding • HTML encoding
• Database escaping non-ASCII? a%20percent-encoded%20string%20as%20of%20RFC%203986 a <strong>HTML</strong> encoded string Friday, 7 June 13
• Use Unicode • Use Unicode • Use Unicode Encöding
hell http://www.swedishfika.com/2010/01/19/escaping-from-encoding-hell/ Friday, 7 June 13
• Encöding hell • Schema? Sure, I fax you a
screenshot • Dupes and other fakes • Sampling Friday, 7 June 13
Schema? Sure, I fax you a screenshot Friday, 7 June
13
Schema? Sure, I fax you a screenshot • There is
a need for proper, formal documentation • For humans and machines • Basis for validation—automate! Friday, 7 June 13
• Encöding hell • Schema? Sure, I fax you a
screenshot • Dupes and other fakes • Sampling Friday, 7 June 13
Dupes and other fakes Friday, 7 June 13
Dupes and other fakes Friday, 7 June 13
Dupes and other fakes • Use plots to get an
overview • Watch out for outliers • Try to establish source for errors and fix • Document (in any case) Friday, 7 June 13
• Encöding hell • Schema? Sure, I fax you a
screenshot • Dupes and other fakes • Sampling Friday, 7 June 13
• My data is too big. I can’t check it
all. • Why don’t you sample, then? Sampling Friday, 7 June 13
http://mortardata.com/ Friday, 7 June 13
Friday, 7 June 13
Go and buy this book. Now. Friday, 7 June 13