Data Breaking Bad
Michael Hausenblas
June 03, 2013
Open Stage talk at Berlin Buzzwords 2013
Transcript
Data Breaking Bad
Michael Hausenblas, MapR Technologies
Berlin Buzzwords 2013, Open Stage Talk
Friday, 7 June 13
Nope. Not this one.
Things you can influence
Things that affect you
Try and focus on this stuff
The awkward moment when I open the data I got from a customer
http://techcrunch.com/2012/11/25/the-big-data-fallacy-data-%E2%89%A0-information-%E2%89%A0-insights/
aka crap in, crap out
Some examples …
• Encöding hell
• Schema? Sure, I fax you a screenshot
• Dupes and other fakes
• Sampling
Encöding hell
Application-specific encodings:
• URL encoding
• HTML encoding
• Database escaping
Non-ASCII?
a%20percent-encoded%20string%20as%20of%20RFC%203986
a <strong>HTML</strong> encoded string
Encöding hell
• Use Unicode
• Use Unicode
• Use Unicode
http://www.swedishfika.com/2010/01/19/escaping-from-encoding-hell/
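
As an illustration of the encoding layers listed above, here is a minimal Python sketch that decodes a raw field into clean Unicode. It assumes the input bytes are UTF-8 and that URL encoding was applied on top of HTML encoding; the function name `normalize_field` and that layering order are assumptions, not something the talk prescribes. Real data may need the layers undone in a different order.

```python
import html
import unicodedata
from urllib.parse import unquote


def normalize_field(raw: bytes) -> str:
    """Decode bytes, undo URL and HTML encoding, and normalize to NFC."""
    text = raw.decode("utf-8")      # assumption: the source is UTF-8
    text = unquote(text)            # %20 -> space (RFC 3986 percent-encoding)
    text = html.unescape(text)      # &lt;strong&gt; -> <strong>
    # NFC composes sequences such as "o" + combining diaeresis into "ö"
    return unicodedata.normalize("NFC", text)


sample = b"a%20percent-encoded%20string%20as%20of%20RFC%203986"
print(normalize_field(sample))  # a percent-encoded string as of RFC 3986
```

Normalizing to one Unicode form at ingestion means later comparisons and duplicate checks work on consistent strings.
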
Schema? Sure, I fax you a screenshot
• There is a need for proper, formal documentation
• For humans and machines
• Basis for validation: automate!
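
A machine-readable schema can drive automated validation. The following is a minimal sketch in plain Python; the field names in `SCHEMA` and the `validate` helper are hypothetical and only show the idea of checking records against a formal description.

```python
# Hypothetical schema: field names and types are illustrative only.
SCHEMA = {
    "customer_id": int,
    "email": str,
    "signup_date": str,
}


def validate(record, schema=SCHEMA):
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    for field, expected in schema.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            problems.append(
                f"{field}: expected {expected.__name__}, "
                f"got {type(record[field]).__name__}"
            )
    # Fields the schema does not know about are reported too.
    for field in sorted(record.keys() - schema.keys()):
        problems.append(f"unexpected field: {field}")
    return problems


print(validate({"customer_id": "1", "email": "a@example.com"}))
```

Because the schema is data rather than a screenshot, the same definition can be published as documentation for humans and run against every incoming batch.
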
Dupes and other fakes
• Use plots to get an overview
• Watch out for outliers
• Try to establish source for errors and fix
• Document (in any case)
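
The slide recommends plots; as a numeric complement, this sketch counts exact duplicate rows and flags outliers with Tukey's interquartile-range rule. Both helper names and the threshold `k=1.5` are assumptions chosen for illustration, and rows are assumed to be hashable (for example tuples).

```python
from collections import Counter
import statistics


def find_duplicates(rows):
    """Return rows that occur more than once, with their counts."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


print(find_duplicates([("a", 1), ("b", 2), ("a", 1)]))
print(iqr_outliers([10, 12, 11, 13, 12, 11, 10, 500]))
```

Flagged values are candidates for inspection, not automatic deletions: the point is to trace them back to the source of the error and document what was found.
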
Sampling
• My data is too big. I can’t check it all.
• Why don’t you sample, then?
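
One way to sample data that is too big to load is reservoir sampling, which keeps a uniform random sample of fixed size while reading the input once. The talk does not name a specific method; this sketch of the classic Algorithm R, including the `reservoir_sample` name and the `seed` parameter, is an assumption for illustration.

```python
import random


def reservoir_sample(stream, k, seed=None):
    """Return up to k items chosen uniformly at random from an iterable."""
    rng = random.Random(seed)
    sample = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            sample.append(item)
        else:
            # Replace a kept item with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = item
    return sample


checked = reservoir_sample(range(100_000), 10, seed=42)
print(len(checked))  # 10
```

The sample can then go through the same encoding, schema, and duplicate checks as a full dataset would, at a fraction of the cost.
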
http://mortardata.com/
Go and buy this book. Now.