Markus Wein
October 02, 2014
Programming
In the beginning was TXT
A very short overview of the history of encodings, given at Vienna.rb on 2014-10-02
Transcript
In the beginning was TXT
EBCDIC
Source: http://en.wikipedia.org/wiki/EBCDIC
ASCII
ä, ö, or å and Ø?
Latin-1 ISO/IEC 8859-1
Latin-*
Windows code pages
Then came the €
Shift-JIS
This sucks
Unicode!
✈️ (planes!)
Basic Multilingual Plane
Code Points
U+0041 (LATIN CAPITAL LETTER A)
Source: http://codepoints.net/U+0041
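A code point is a number with an official name attached. As a minimal sketch, Python's standard `unicodedata` module can look both up; the helper name `describe` is made up for this illustration.

```python
import unicodedata


def describe(char: str) -> str:
    """Return 'U+XXXX (NAME)' for a single character."""
    # ord() gives the code point as an integer; unicodedata.name() gives
    # the character's official Unicode name.
    return f"U+{ord(char):04X} ({unicodedata.name(char)})"


print(describe("A"))  # U+0041 (LATIN CAPITAL LETTER A)
print(describe("a"))  # U+0061 (LATIN SMALL LETTER A)
```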
Grapheme
a a a a a a a
Composite characters
U+0065 U+0301 or U+00E9
e+´ => é é
´ != ´
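The two spellings of é render identically but compare as different strings until they are normalized. A small sketch using Python's standard `unicodedata.normalize` (the variable names are illustrative only):

```python
import unicodedata

decomposed = "e\u0301"   # U+0065 followed by U+0301 (combining acute accent)
precomposed = "\u00e9"   # U+00E9, the single precomposed code point

# Same grapheme on screen, different code point sequences underneath.
print(decomposed == precomposed)           # False
print(len(decomposed), len(precomposed))   # 2 1

# NFC composes where possible, NFD decomposes; after normalizing both
# sides to the same form, the comparison succeeds.
nfc = unicodedata.normalize("NFC", decomposed)
nfd = unicodedata.normalize("NFD", precomposed)
print(nfc == precomposed)  # True
print(nfd == decomposed)   # True
```

Normalizing both sides to the same form before comparing or storing strings is the usual way around this.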
Unicode… is not an encoding
UTF-32
UCS-2/UTF-16
UTF-8
Source: http://en.wikipedia.org/wiki/File:UnicodeGrow2b.png
What does it look like?
Encoding comparison

Codepoint  Char  ASCII  Latin-1  ISO-8859-15  UTF-8           UTF-16
U+0041     A     0x41   0x41     0x41         0x41            0x00 0x41
U+00C4     Ä     -      0xc4     0xc4         0xc3 0x84       0x00 0xc4
U+20AC     €     -      -        0xa4         0xe2 0x82 0xac  0x20 0xac
U+C218     수    -      -        -            0xec 0x88 0x98  0xc2 0x18

Source: http://perlgeek.de/en/article/encodings-and-unicode
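The byte values in such a comparison can be recomputed instead of copied by hand. A rough sketch follows; the helper `hex_bytes` is a made-up name, and `utf-16-be` is chosen on the assumption that the UTF-16 column is big-endian without a byte order mark.

```python
def hex_bytes(char: str, encoding: str) -> str:
    """Return the encoded bytes as '0x..' pairs, or '-' if not encodable."""
    try:
        return " ".join(f"0x{b:02x}" for b in char.encode(encoding))
    except UnicodeEncodeError:
        return "-"


ENCODINGS = ["ascii", "latin-1", "iso-8859-15", "utf-8", "utf-16-be"]

for char in ["A", "\u00c4", "\u20ac", "\uc218"]:
    cells = [hex_bytes(char, enc) for enc in ENCODINGS]
    print(f"U+{ord(char):04X}", char, " | ".join(cells))
```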
Remember: Just because someone claims it's UTF-8 doesn't mean it is
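One way to act on that warning is to decode strictly and treat a failure as evidence of a wrong label. This is a sketch, not a full detector; `is_valid_utf8` is a hypothetical helper name.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")  # strict error handling is the default
    except UnicodeDecodeError:
        return False
    return True


# Latin-1 bytes mislabeled as UTF-8 fail the strict decode.
print(is_valid_utf8("Ä".encode("latin-1")))  # False
print(is_valid_utf8("Ä".encode("utf-8")))    # True

# The opposite mistake does not fail, it just produces mojibake.
mojibake = "é".encode("utf-8").decode("latin-1")
print(mojibake)  # Ã©
```

Passing the check does not prove the data was meant as UTF-8 (pure ASCII passes too); it only rules out byte sequences that cannot be UTF-8.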