Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Data Breaking Bad
Search
Michael Hausenblas
June 03, 2013
Technology
1
200
Data Breaking Bad
Open Stage talk at Berlin Buzzwords 2013
Michael Hausenblas
June 03, 2013
Tweet
Share
More Decks by Michael Hausenblas
See All by Michael Hausenblas
KubeCologne keynote—Troubleshooting Kubernetes apps
mhausenblas
4
7.9k
Extending Kubernetes 101
mhausenblas
4
2.1k
Kubernetes and serverless technologies for high-performance applications
mhausenblas
1
330
Troubleshooting Kubernetes Applications
mhausenblas
1
570
Autoscaling All Things Kubernetes with Prometheus
mhausenblas
0
930
Three Billy Goats Gruff : from a monolith to containers to functions
mhausenblas
0
540
Bending Kubernetes to Your Needs
mhausenblas
2
2.7k
Kubernetes Security: from Image Hygiene to Network Policies
mhausenblas
8
3.8k
Hands-on Cloud Native Lifecycle Management
mhausenblas
3
360
Other Decks in Technology
See All in Technology
Exadata Database Service on Cloud@Customer セキュリティ、ネットワーク、および管理について
oracle4engineer
PRO
2
1.6k
2025/3/1 公共交通オープンデータデイ2025
morohoshi
0
110
手を動かしてレベルアップしよう!
maruto
0
260
QAエンジニアが スクラムマスターをすると いいなぁと思った話
____rina____
0
150
データモデルYANGの処理系を再発明した話
tjmtrhs
0
330
DevinでAI AWSエンジニア製造計画 序章 〜CDKを添えて〜/devin-load-to-aws-engineer
tomoki10
0
220
エンジニアの健康管理術 / Engineer Health Management Techniques
y_sone
4
990
JAWS FESTA 2024「バスロケ」GPS×サーバーレスの開発と運用の舞台裏/jawsfesta2024-bus-gps-serverless
ma2shita
3
370
スクラムというコンフォートゾーンから抜け出そう!プロジェクト全体に目を向けるインセプションデッキ / Inception Deck for seeing the whole project
takaking22
3
170
Global Databaseで実現するマルチリージョン自動切替とBlue/Greenデプロイ
j2yano
0
170
マーケットプレイス版Oracle WebCenter Content For OCI
oracle4engineer
PRO
3
540
20250309 無冠のわたし これからどう先生きのこれる?
akiko_pusu
1
150
Featured
See All Featured
Mobile First: as difficult as doing things right
swwweet
223
9.5k
Design and Strategy: How to Deal with People Who Don’t "Get" Design
morganepeng
129
19k
Music & Morning Musume
bryan
46
6.4k
Scaling GitHub
holman
459
140k
The Pragmatic Product Professional
lauravandoore
32
6.4k
BBQ
matthewcrist
87
9.5k
Practical Orchestrator
shlominoach
186
10k
The Cost Of JavaScript in 2023
addyosmani
47
7.4k
RailsConf & Balkan Ruby 2019: The Past, Present, and Future of Rails at GitHub
eileencodes
134
33k
Become a Pro
speakerdeck
PRO
26
5.2k
Docker and Python
trallard
44
3.3k
What's in a price? How to price your products and services
michaelherold
244
12k
Transcript
Da Michael Hausenblas, MapR Technologies Berlin Buzzwords 2013, Open Stage
Talk Friday, 7 June 13
Nope. Not this one. Friday, 7 June 13
Friday, 7 June 13
things you can influence things that affect you try and
focus on this stuff Friday, 7 June 13
The awkward moment when I open the data I got
from a customer Friday, 7 June 13
http://techcrunch.com/2012/11/25/the-big-data-fallacy-data-%E2%89%A0-information-%E2%89%A0-insights/ aka crap in, crap out Friday, 7 June 13
Some examples … Friday, 7 June 13
• Encöding hell • Schema? Sure, I fax you a
screenshot • Dupes and other fakes • Sampling Friday, 7 June 13
Encöding hell application-specific encodings • URL encoding • HTML encoding
• Database escaping non-ASCII? a%20percent-encoded%20string%20as%20of%20RFC%203986 a <strong>HTML</strong> encoded string Friday, 7 June 13
• Use Unicode • Use Unicode • Use Unicode Encöding
hell http://www.swedishfika.com/2010/01/19/escaping-from-encoding-hell/ Friday, 7 June 13
• Encöding hell • Schema? Sure, I fax you a
screenshot • Dupes and other fakes • Sampling Friday, 7 June 13
Schema? Sure, I fax you a screenshot Friday, 7 June
13
Schema? Sure, I fax you a screenshot • There is
a need for proper, formal documentation • For humans and machines • Basis for validation—automate! Friday, 7 June 13
• Encöding hell • Schema? Sure, I fax you a
screenshot • Dupes and other fakes • Sampling Friday, 7 June 13
Dupes and other fakes Friday, 7 June 13
Dupes and other fakes Friday, 7 June 13
Dupes and other fakes • Use plots to get an
overview • Watch out for outliers • Try to establish source for errors and fix • Document (in any case) Friday, 7 June 13
• Encöding hell • Schema? Sure, I fax you a
screenshot • Dupes and other fakes • Sampling Friday, 7 June 13
• My data is too big. I can’t check it
all. • Why don’t you sample, then? Sampling Friday, 7 June 13
http://mortardata.com/ Friday, 7 June 13
Friday, 7 June 13
Go and buy this book. Now. Friday, 7 June 13