Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
In the beginning was TXT
Search
Markus Wein
October 02, 2014
Programming
0
110
In the beginning was TXT
A very short overview of the history of encodings, given at Vienna.rb on 2014-10-02
Markus Wein
October 02, 2014
Tweet
Share
More Decks by Markus Wein
See All by Markus Wein
Command Line Productivity
cypher
1
130
A crash intro to deliberate practice
cypher
0
110
Keeping Your PostgreSQL Data Save
cypher
0
120
Ghost in the State Machine
cypher
2
320
n Things You Didn't Know About PostgreSQL (Rubyslava & PyVo 2014 Edition)
cypher
1
240
How to Become a Better Developer
cypher
2
1.8k
An Introduction to Rust
cypher
1
8.2k
How to Become a Better Developer
cypher
1
240
A Very Short Overview of Vagrant
cypher
0
7.9k
Other Decks in Programming
See All in Programming
TVerのWeb内製化 - 開発スピードと品質を両立させるまでの道のり
techtver
PRO
1
470
flutter_kaigi_2025.pdf
kyoheig3
1
290
Flutterアプリ運用の現場で役立った監視Tips 5選
ostk0069
1
420
AIの弱点、やっぱりプログラミングは人間が(も)勉強しよう / YAPC AI and Programming
kishida
9
4.4k
JEP 496 と JEP 497 から学ぶ耐量子計算機暗号入門 / Learning Post-Quantum Crypto Basics from JEP 496 & 497
mackey0225
2
240
なぜ強調表示できず ** が表示されるのか — Perlで始まったMarkdownの歴史と日本語文書における課題
kwahiro
11
5.7k
Register is more than clipboard
satorunooshie
1
470
CSC509 Lecture 13
javiergs
PRO
0
250
ビルドプロセスをデバッグしよう!
yt8492
0
310
Eloquentを使ってどこまでコードの治安を保てるのか?を新人が考察してみた
itokoh0405
0
3.1k
詳細の決定を遅らせつつ実装を早くする
shimabox
1
1k
Vueで学ぶデータ構造入門 リンクリストとキューでリアクティビティを捉える / Vue Data Structures: Linked Lists and Queues for Reactivity
konkarin
1
190
Featured
See All Featured
Optimising Largest Contentful Paint
csswizardry
37
3.5k
jQuery: Nuts, Bolts and Bling
dougneiner
65
8k
Writing Fast Ruby
sferik
630
62k
How Fast Is Fast Enough? [PerfNow 2025]
tammyeverts
3
320
Building Adaptive Systems
keathley
44
2.8k
Being A Developer After 40
akosma
91
590k
Why Our Code Smells
bkeepers
PRO
340
57k
Fantastic passwords and where to find them - at NoRuKo
philnash
52
3.5k
The Success of Rails: Ensuring Growth for the Next 100 Years
eileencodes
46
7.8k
The Straight Up "How To Draw Better" Workshop
denniskardys
239
140k
Side Projects
sachag
455
43k
Chrome DevTools: State of the Union 2024 - Debugging React & Beyond
addyosmani
9
970
Transcript
In the beginning was TXT
!
EBCDIC
Source: http://en.wikipedia.org/wiki/EBCDIC
ASCII
"#$%&
None
ä, ö, or å and Ø?
Latin-1 ISO/IEC 8859-1
Latin-*
Windows code pages
Then came the €
(
None
Shift-JIS
This sucks
Unicode!
Unicode!
✈️ (planes!)
Basic Multilingual Plane
Code Points
U+0041 (LATIN SMALL LETTER A)
Source: http://codepoints.net/U+0041
Grapheme
a a a a a a a
Composite characters
U+0065 U+0301 or U+00E9
e+´ => é é
´ != ´
Unicode… is not an encoding
UTF-32
UCS-2/UTF-16
UTF-8
Source: http://en.wikipedia.org/wiki/File:UnicodeGrow2b.png
What does it look like?
Codepoint Char ASCII Latin-1 ISO-8859-15 UTF-8 UTF-16 U+0041 A 0x41
0x41 0x41 0x41 0x00 0x41 U+00C4 Ä - 0xc4 0xc4 0xc3 0x84 0x00 0xc4 U+20AC € - - 0xa4 0xe3 0x82 0xac 0x20 0xac U+C218 ࣻ - - - 0xec 0x88 0x98 0xc2 0x18 Encoding comparison Source: http://perlgeek.de/en/article/encodings-and-unicode
Remember: Just because someone claims it’s UTF-8, doesn’t mean it
is