In the beginning was TXT
Markus Wein
October 02, 2014
Programming
A very short overview of the history of encodings, given at Vienna.rb on 2014-10-02
Transcript
In the beginning was TXT
EBCDIC
Source: http://en.wikipedia.org/wiki/EBCDIC
ASCII
ä, ö, or å and Ø?
Latin-1 ISO/IEC 8859-1
Latin-*
Windows code pages
Then came the €
Shift-JIS
This sucks
Unicode!
✈️ (planes!)
Basic Multilingual Plane
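As a small illustration of planes: Unicode code points are grouped into planes of 65,536 code points each, and plane 0 is the Basic Multilingual Plane. The sketch below computes the plane of a character; the helper names `plane_of` and `in_bmp` are made up for this example and do not come from the talk.

```python
def plane_of(ch: str) -> int:
    """Return the Unicode plane number (0 to 16) of a single character."""
    # Each plane spans 0x10000 code points, so the plane is the code point
    # shifted right by 16 bits.
    return ord(ch) >> 16


def in_bmp(ch: str) -> bool:
    """True if the character lives in the Basic Multilingual Plane (plane 0)."""
    return plane_of(ch) == 0


if __name__ == "__main__":
    for ch in ("A", "\u20ac", "\u2708", "\U0001F600"):
        print(f"U+{ord(ch):04X} is in plane {plane_of(ch)}")
```

The airplane character U+2708 itself sits in the BMP, while an emoji such as U+1F600 sits in plane 1.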
Code Points
U+0041 (LATIN CAPITAL LETTER A)
Source: http://codepoints.net/U+0041
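To make the idea of a code point concrete, here is a minimal Python sketch that looks up the number and the official name of a few characters using the standard `unicodedata` module. The helper name `describe` is hypothetical and only serves the example.

```python
import unicodedata


def describe(ch: str) -> str:
    """Format a character as 'U+XXXX NAME' using the Unicode database."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


if __name__ == "__main__":
    # ord() maps a character to its code point, chr() maps it back.
    for ch in ("A", "a", "\u00c4", "\u20ac"):
        print(describe(ch))
    print(chr(0x41))
```

Note that U+0041 is the capital letter; the small letter a is U+0061.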
Grapheme
a a a a a a a
Composite characters
U+0065 U+0301 or U+00E9
e+´ => é é
´ != ´
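The two spellings of é above look identical but compare as different strings. A rough sketch of how this shows up in practice, and how Unicode normalization (here via Python's standard `unicodedata.normalize`) makes them comparable; the variable names are chosen just for this example:

```python
import unicodedata

decomposed = "\u0065\u0301"   # 'e' followed by COMBINING ACUTE ACCENT
precomposed = "\u00e9"        # LATIN SMALL LETTER E WITH ACUTE

# Same grapheme on screen, different code point sequences underneath.
same_without_normalizing = decomposed == precomposed

# NFC composes where possible, NFD decomposes where possible.
same_after_nfc = unicodedata.normalize("NFC", decomposed) == precomposed
same_after_nfd = unicodedata.normalize("NFD", precomposed) == decomposed

if __name__ == "__main__":
    print(len(decomposed), len(precomposed))
    print(same_without_normalizing, same_after_nfc, same_after_nfd)
```

Normalizing both sides to the same form before comparing or storing text avoids this class of surprise.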
Unicode… is not an encoding
UTF-32
UCS-2/UTF-16
UTF-8
Source: http://en.wikipedia.org/wiki/File:UnicodeGrow2b.png
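One way to see the difference between the three encodings is to count how many bytes each needs per character. The sketch below assumes the big-endian variants without a byte order mark so that the counts are not inflated by a BOM; the helper name `widths` is made up for this example.

```python
def widths(ch: str) -> dict:
    """Return the encoded length in bytes of one character per encoding."""
    encodings = ("utf-32-be", "utf-16-be", "utf-8")
    return {enc: len(ch.encode(enc)) for enc in encodings}


if __name__ == "__main__":
    # An ASCII letter, the euro sign, and an emoji outside the BMP.
    for ch in ("A", "\u20ac", "\U0001F600"):
        print(f"U+{ord(ch):04X}", widths(ch))
```

UTF-32 is always four bytes, UTF-16 uses two bytes inside the BMP and a four-byte surrogate pair outside it, and UTF-8 uses between one and four bytes.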
What does it look like?
Encoding comparison

Codepoint  Char  ASCII  Latin-1  ISO-8859-15  UTF-8           UTF-16
U+0041     A     0x41   0x41     0x41         0x41            0x00 0x41
U+00C4     Ä     -      0xc4     0xc4         0xc3 0x84       0x00 0xc4
U+20AC     €     -      -        0xa4         0xe2 0x82 0xac  0x20 0xac
U+C218     수    -      -        -            0xec 0x88 0x98  0xc2 0x18

Source: http://perlgeek.de/en/article/encodings-and-unicode
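The byte values in this comparison can be reproduced with Python's built-in codecs. This is only a sketch: it assumes the UTF-16 column means big-endian without a byte order mark (codec `utf-16-be`), and the helper name `encode_or_dash` is invented for the example.

```python
ENCODINGS = ["ascii", "latin-1", "iso-8859-15", "utf-8", "utf-16-be"]


def encode_or_dash(ch: str, encoding: str) -> str:
    """Return the encoded bytes as hex, or '-' if the encoding lacks the character."""
    try:
        return " ".join(f"0x{b:02x}" for b in ch.encode(encoding))
    except UnicodeEncodeError:
        return "-"


if __name__ == "__main__":
    for ch in ("A", "\u00c4", "\u20ac", "\uc218"):
        row = [encode_or_dash(ch, enc) for enc in ENCODINGS]
        print(f"U+{ord(ch):04X}", ch, " | ".join(row))
```

The euro sign is the interesting row: Latin-1 has no slot for it, ISO-8859-15 gives it the single byte 0xa4, and UTF-8 needs three bytes.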
Remember: Just because someone claims it’s UTF-8, doesn’t mean it is
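One defensive habit that follows from this is to check incoming bytes instead of trusting the label. A minimal sketch, assuming strict decoding is an acceptable validity test for your input; the function name `is_valid_utf8` is hypothetical:

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")  # strict error handling is the default
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    really_utf8 = "\u00c4".encode("utf-8")      # b'\xc3\x84'
    actually_latin1 = "\u00c4".encode("latin-1")  # b'\xc4'
    print(is_valid_utf8(really_utf8))
    print(is_valid_utf8(actually_latin1))
```

A passing check only shows the bytes are well-formed UTF-8, not that UTF-8 was the encoding the sender intended, so it catches mislabeled Latin-1 data far more reliably than the reverse case.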