Data Breaking Bad
Michael Hausenblas
June 03, 2013
Open Stage talk at Berlin Buzzwords 2013
Transcript
Data Breaking Bad
Michael Hausenblas, MapR Technologies
Berlin Buzzwords 2013, Open Stage Talk
Friday, 7 June 13
Nope. Not this one.
• things you can influence
• things that affect you
• try and focus on this stuff
The awkward moment when I open the data I got from a customer
http://techcrunch.com/2012/11/25/the-big-data-fallacy-data-%E2%89%A0-information-%E2%89%A0-insights/
aka crap in, crap out
Some examples …
• Encöding hell
• Schema? Sure, I fax you a screenshot
• Dupes and other fakes
• Sampling
Encöding hell
application-specific encodings
• URL encoding
• HTML encoding
• Database escaping
non-ASCII?
a%20percent-encoded%20string%20as%20of%20RFC%203986
a <strong>HTML</strong> encoded string
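
The encodings listed on this slide can be sketched with Python's standard library alone. The two sample strings are taken from the slide; the SQLite table and the `tricky` value are hypothetical, and using parameter binding instead of hand-written database escaping is an assumption about what the slide is getting at, not something the talk states.

```python
import html
import sqlite3
from urllib.parse import quote, unquote

# URL (percent) encoding: spaces become %20, and decoding round-trips
plain = "a percent-encoded string as of RFC 3986"
url_encoded = quote(plain)
assert unquote(url_encoded) == plain

# HTML encoding: markup characters become entities
markup = "a <strong>HTML</strong> encoded string"
html_encoded = html.escape(markup)
assert html.unescape(html_encoded) == markup

# Database escaping: let the driver bind parameters instead of escaping by hand
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (body TEXT)")  # hypothetical table
tricky = "O'Reilly says: Encöding hell"         # hypothetical value
conn.execute("INSERT INTO notes (body) VALUES (?)", (tricky,))
stored = conn.execute("SELECT body FROM notes").fetchone()[0]

print(url_encoded)
print(html_encoded)
print(stored)
```

Each layer has its own decoder, so a value that passes through several systems has to be decoded in the reverse order in which it was encoded.
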
Encöding hell
• Use Unicode
• Use Unicode
• Use Unicode
http://www.swedishfika.com/2010/01/19/escaping-from-encoding-hell/
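
One way to read "Use Unicode" in practice is to decode bytes into text once, at the input boundary, with an explicitly named codec, and then normalize. The sketch below assumes UTF-8 as the default and NFC as the normal form; the helper name `to_text` is hypothetical and not from the talk.

```python
import unicodedata

def to_text(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode bytes at the input boundary and normalize the result to NFC."""
    text = raw.decode(encoding)  # raises UnicodeDecodeError on invalid input
    return unicodedata.normalize("NFC", text)

utf8_bytes = "Encöding hell".encode("utf-8")
latin1_bytes = "Encöding hell".encode("latin-1")

assert to_text(utf8_bytes) == "Encöding hell"
assert to_text(latin1_bytes, encoding="latin-1") == "Encöding hell"

# These particular Latin-1 bytes are not valid UTF-8, so decoding fails loudly
try:
    to_text(latin1_bytes)
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False

# "o" followed by a combining diaeresis normalizes to the single code point "ö"
decomposed = "Enco\u0308ding hell"
normalized = unicodedata.normalize("NFC", decomposed)
```

Not every wrong codec raises an error (Latin-1 accepts any byte sequence), so the encoding of each source still has to be known rather than guessed.
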
Schema? Sure, I fax you a screenshot
• There is a need for proper, formal documentation
• For humans and machines
• Basis for validation: automate!
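
As an illustration of a schema that is readable by both humans and machines and can drive automated validation, here is a minimal hand-rolled sketch. The field names, the `Field` class and the `validate` helper are all hypothetical; a real project would more likely use an established schema language, which the talk does not name.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Field:
    name: str
    kind: type
    required: bool = True

# Hypothetical schema: the declaration doubles as documentation
SCHEMA = [
    Field("customer_id", int),
    Field("email", str),
    Field("country", str, required=False),
]

def validate(record: dict, schema=SCHEMA) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    known = {f.name for f in schema}
    for f in schema:
        if f.name not in record:
            if f.required:
                problems.append(f"missing required field: {f.name}")
            continue
        if not isinstance(record[f.name], f.kind):
            problems.append(
                f"{f.name}: expected {f.kind.__name__}, "
                f"got {type(record[f.name]).__name__}"
            )
    # Fields the schema does not know about are reported as well
    for extra in sorted(set(record) - known):
        problems.append(f"unexpected field: {extra}")
    return problems

good = {"customer_id": 42, "email": "a@example.com"}
bad = {"customer_id": "42", "fax": "screenshot.png"}

for problem in validate(bad):
    print(problem)
```

Because the check returns every problem instead of stopping at the first, it can run over a whole delivery and produce a report to send back to the data provider.
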
Dupes and other fakes
• Use plots to get an overview
• Watch out for outliers
• Try to establish the source of errors and fix them
• Document (in any case)
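
The slide recommends plots for a first overview. As a numeric stand-in for eyeballing a box plot, the sketch below counts exact duplicate rows and flags values outside Tukey's fences (1.5 times the interquartile range). The sample data, the function names and the choice of the IQR rule are assumptions made for illustration, not the speaker's method.

```python
from collections import Counter
from statistics import quantiles

def find_duplicates(rows):
    """Return rows that occur more than once, mapped to their counts."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}

def find_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's fences)."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Made-up sample data
rows = [("alice", 31), ("bob", 27), ("alice", 31), ("carol", 29)]
ages = [31, 27, 29, 30, 28, 32, 26, 290]

print(find_duplicates(rows))
print(find_outliers(ages))
```

A flagged value is only a candidate: whether an age of 290 is a typo for 29 or something else still has to be traced back to the source and documented.
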
Sampling
• My data is too big. I can’t check it all.
• Why don’t you sample, then?
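
When the data is too big to load or even to count, a uniform sample can still be drawn in a single pass. The sketch below uses reservoir sampling (Algorithm R); this particular technique and the name `reservoir_sample` are assumptions, since the slide only says to sample.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")

def reservoir_sample(stream: Iterable[T], k: int,
                     seed: Optional[int] = None) -> List[T]:
    """Uniform random sample of up to k items from a stream of unknown length."""
    rng = random.Random(seed)
    sample: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items
            sample.append(item)
        else:
            # Keep the new item with probability k / (i + 1)
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = item
    return sample

# Stand-in for a large file or table scan
picked = reservoir_sample(range(1_000_000), k=10, seed=13)
print(picked)
```

Memory use stays at k items however long the stream is, so the same function works on a file iterator or a database cursor.
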
http://mortardata.com/
Go and buy this book. Now.