Data Breaking Bad
Michael Hausenblas
June 03, 2013
Open Stage talk at Berlin Buzzwords 2013
Transcript
Da
Michael Hausenblas, MapR Technologies
Berlin Buzzwords 2013, Open Stage Talk
Friday, 7 June 13

Nope. Not this one.

[Diagram: "things you can influence", "things that affect you"; annotation: "try and focus on this stuff"]

The awkward moment when I open the data I got from a customer

http://techcrunch.com/2012/11/25/the-big-data-fallacy-data-%E2%89%A0-information-%E2%89%A0-insights/
aka crap in, crap out

Some examples …

• Encöding hell
• Schema? Sure, I fax you a screenshot
• Dupes and other fakes
• Sampling

Encöding hell
application-specific encodings
• URL encoding
• HTML encoding
• Database escaping
non-ASCII?
a%20percent-encoded%20string%20as%20of%20RFC%203986
a <strong>HTML</strong> encoded string
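
The two example strings on this slide can be undone with the Python standard library. The following is a minimal sketch, not something shown in the talk: the helper names `decode_url` and `decode_html` are made up for illustration, and the entity-escaped form of the HTML string is an assumption about what the raw data looked like before rendering.

```python
from html import unescape
from urllib.parse import quote, unquote


def decode_url(text: str) -> str:
    """Undo one layer of percent-encoding (RFC 3986), assuming UTF-8."""
    return unquote(text)


def decode_html(text: str) -> str:
    """Undo one layer of HTML entity escaping."""
    return unescape(text)


url_encoded = "a%20percent-encoded%20string%20as%20of%20RFC%203986"
html_encoded = "a &lt;strong&gt;HTML&lt;/strong&gt; encoded string"  # assumed raw form

print(decode_url(url_encoded))
print(decode_html(html_encoded))

# Non-ASCII characters are percent-encoded byte by byte from their UTF-8 form.
print(quote("Encöding"))

# Data that was encoded twice needs two passes; each call removes one layer.
print(decode_url("a%2520b"))
```

Decoding one layer at a time makes it easier to spot data that was encoded more than once on its way through several systems.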

Encöding hell
• Use Unicode
• Use Unicode
• Use Unicode
http://www.swedishfika.com/2010/01/19/escaping-from-encoding-hell/
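
One way to read "Use Unicode" in practice: decode bytes to text once, at the boundary, with an explicit encoding, and normalize the result. A minimal sketch under that assumption; the function name `to_text` and the choice of NFC normalization are illustrative, not from the talk.

```python
import unicodedata


def to_text(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode bytes with an explicit encoding and normalize to NFC.

    Raises UnicodeDecodeError instead of silently guessing when the
    bytes do not match the declared encoding.
    """
    text = raw.decode(encoding)
    return unicodedata.normalize("NFC", text)


# "ö" as one code point and as "o" + combining diaeresis compare equal
# only after normalization.
composed = to_text("Enc\u00f6ding".encode("utf-8"))
decomposed = to_text("Enco\u0308ding".encode("utf-8"))
print(composed == decomposed)

# Latin-1 bytes are rejected when UTF-8 is declared.
try:
    to_text(b"Enc\xf6ding")
except UnicodeDecodeError as err:
    print("not valid UTF-8:", err.reason)
```

Failing loudly on a mismatch is a deliberate choice here: a decode error at ingestion is cheaper than mojibake discovered in a report.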

Schema? Sure, I fax you a screenshot
• There is a need for proper, formal documentation
• For humans and machines
• Basis for validation: automate!
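
A schema that machines can read is what makes the "automate!" part possible. The sketch below is one minimal way to do it with plain Python; the field names in `SCHEMA` and the `validate` helper are hypothetical, and a real project would more likely use a dedicated schema language.

```python
# Hypothetical schema: field name -> expected Python type.
SCHEMA = {"customer_id": int, "email": str, "amount": float}


def validate(record: dict, schema: dict = SCHEMA) -> list:
    """Return a list of human-readable problems; empty means the record passes."""
    errors = []
    for field, expected in schema.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            actual = type(record[field]).__name__
            errors.append(f"{field}: expected {expected.__name__}, got {actual}")
    # Fields the schema does not know about are reported too.
    for field in sorted(record.keys() - schema.keys()):
        errors.append(f"unexpected field: {field}")
    return errors


good = {"customer_id": 1, "email": "a@example.org", "amount": 9.5}
bad = {"customer_id": "1", "email": "a@example.org", "fax": "screenshot.png"}

print(validate(good))
for problem in validate(bad):
    print(problem)
```

The same `SCHEMA` object doubles as documentation for humans, which is the point of the slide: one formal description, two audiences.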

Dupes and other fakes
• Use plots to get an overview
• Watch out for outliers
• Try to establish the source of errors and fix it
• Document (in any case)
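
Before plotting, a quick numeric pass can already surface duplicates and suspicious values. This is a rough sketch only: the helper names are made up, and the interquartile-range rule (Tukey's fences) is one common heuristic chosen here for illustration, not a method named in the talk.

```python
from collections import Counter
from statistics import quantiles


def find_duplicates(records):
    """Map each record that occurs more than once to its count.

    Records must be hashable, e.g. tuples.
    """
    counts = Counter(records)
    return {rec: n for rec, n in counts.items() if n > 1}


def find_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


rows = [("alice", 10), ("bob", 12), ("alice", 10)]
print(find_duplicates(rows))

amounts = [10, 12, 11, 13, 12, 11, 10, 500]
print(find_outliers(amounts))
```

Anything flagged this way is a candidate to investigate and document, not proof of an error.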

Sampling
• My data is too big. I can’t check it all.
• Why don’t you sample, then?
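
When the data does not fit in memory, a uniform sample can still be drawn in a single pass. The sketch below uses reservoir sampling (Algorithm R); the talk does not name a specific technique, so treat this as one possible approach, with `reservoir_sample` being a name invented for the example.

```python
import random


def reservoir_sample(stream, k, seed=None):
    """Draw k items uniformly at random from an iterable of unknown length.

    Keeps only k items in memory. If the stream has fewer than k items,
    all of them are returned.
    """
    rng = random.Random(seed)
    sample = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            sample.append(item)
        else:
            # Item i replaces a reservoir slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = item
    return sample


# Works on any iterable, e.g. lines of a large file read lazily.
picked = reservoir_sample(range(1_000_000), k=5, seed=42)
print(len(picked))
```

Passing a fixed `seed` makes the sample reproducible, which helps when a check on the sample needs to be rerun later.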

http://mortardata.com/

Go and buy this book. Now.