Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
In the beginning was TXT
Search
Markus Wein
October 02, 2014
Programming
0
110
In the beginning was TXT
A very short overview of the history of encodings, given at Vienna.rb on 2014-10-02
Markus Wein
October 02, 2014
Tweet
Share
More Decks by Markus Wein
See All by Markus Wein
Command Line Productivity
cypher
1
130
A crash intro to deliberate practice
cypher
0
110
Keeping Your PostgreSQL Data Save
cypher
0
120
Ghost in the State Machine
cypher
2
310
n Things You Didn't Know About PostgreSQL (Rubyslava & PyVo 2014 Edition)
cypher
1
240
How to Become a Better Developer
cypher
2
1.8k
An Introduction to Rust
cypher
1
8.2k
How to Become a Better Developer
cypher
1
230
A Very Short Overview of Vagrant
cypher
0
7.9k
Other Decks in Programming
See All in Programming
GraphQL×Railsアプリのデータベース負荷分散 - 月間3,000万人利用サービスを無停止で
koxya
1
1.2k
なぜあの開発者はDevRelに伴走し続けるのか / Why Does That Developer Keep Running Alongside DevRel?
nrslib
3
390
monorepo の Go テストをはやくした〜い!~最小の依存解決への道のり~ / faster-testing-of-monorepos
convto
2
460
Signals & Resource API in Angular: 3 Effective Rules for Your Architecture @BASTA 2025 in Mainz
manfredsteyer
PRO
0
120
CI_CD「健康診断」のススメ。現場でのボトルネック特定から、健康診断を通じた組織的な改善手法
teamlab
PRO
0
200
株式会社 Sun terras カンパニーデック
sunterras
0
270
階層構造を表現するデータ構造とリファクタリング 〜1年で10倍成長したプロダクトの変化と課題〜
yuhisatoxxx
3
980
Web Components で実現する Hotwire とフロントエンドフレームワークの橋渡し / Bridging with Web Components
da1chi
3
2k
非同期jobをtransaction内で 呼ぶなよ!絶対に呼ぶなよ!
alstrocrack
0
690
After go func(): Goroutines Through a Beginner’s Eye
97vaibhav
0
360
XP, Testing and ninja testing ZOZ5
m_seki
3
600
オープンソースソフトウェアへの解像度🔬
utam0k
12
2.5k
Featured
See All Featured
A Tale of Four Properties
chriscoyier
160
23k
Typedesign – Prime Four
hannesfritz
42
2.8k
Rails Girls Zürich Keynote
gr2m
95
14k
Optimizing for Happiness
mojombo
379
70k
Dealing with People You Can't Stand - Big Design 2015
cassininazir
367
27k
Cheating the UX When There Is Nothing More to Optimize - PixelPioneers
stephaniewalter
285
14k
4 Signs Your Business is Dying
shpigford
185
22k
Art, The Web, and Tiny UX
lynnandtonic
303
21k
Facilitating Awesome Meetings
lara
56
6.6k
Side Projects
sachag
455
43k
Building an army of robots
kneath
306
46k
The Art of Delivering Value - GDevCon NA Keynote
reverentgeek
15
1.7k
Transcript
In the beginning was TXT
!
EBCDIC
Source: http://en.wikipedia.org/wiki/EBCDIC
ASCII
"#$%&
None
ä, ö, or å and Ø?
Latin-1 ISO/IEC 8859-1
Latin-*
Windows code pages
Then came the €
(
None
Shift-JIS
This sucks
Unicode!
Unicode!
✈️ (planes!)
Basic Multilingual Plane
Code Points
U+0041 (LATIN SMALL LETTER A)
Source: http://codepoints.net/U+0041
Grapheme
a a a a a a a
Composite characters
U+0065 U+0301 or U+00E9
e+´ => é é
´ != ´
Unicode… is not an encoding
UTF-32
UCS-2/UTF-16
UTF-8
Source: http://en.wikipedia.org/wiki/File:UnicodeGrow2b.png
What does it look like?
Codepoint Char ASCII Latin-1 ISO-8859-15 UTF-8 UTF-16 U+0041 A 0x41
0x41 0x41 0x41 0x00 0x41 U+00C4 Ä - 0xc4 0xc4 0xc3 0x84 0x00 0xc4 U+20AC € - - 0xa4 0xe3 0x82 0xac 0x20 0xac U+C218 ࣻ - - - 0xec 0x88 0x98 0xc2 0x18 Encoding comparison Source: http://perlgeek.de/en/article/encodings-and-unicode
Remember: Just because someone claims it’s UTF-8, doesn’t mean it
is