In the beginning was TXT
Markus Wein
October 02, 2014
Programming
A very short overview of the history of encodings, given at Vienna.rb on 2014-10-02
Transcript
In the beginning was TXT
EBCDIC
Source: http://en.wikipedia.org/wiki/EBCDIC
ASCII
ä, ö, or å and Ø?
Latin-1 ISO/IEC 8859-1
Latin-*
Windows code pages
Then came the €
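A minimal Python sketch of why the euro sign was a problem for the single-byte encodings listed above. The helper name `try_encode` is made up for this illustration; the codec names are the ones Python's standard library ships with.

```python
import unicodedata

EURO = "\u20ac"  # the euro sign, U+20AC


def try_encode(text, encoding):
    """Return the encoded bytes, or None if the encoding has no slot for the text."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


# Latin-1 predates the euro and cannot represent it at all.
print(try_encode(EURO, "latin-1"))      # None
# ISO-8859-15 (Latin-9) reuses byte 0xA4 for it.
print(try_encode(EURO, "iso-8859-15"))  # b'\xa4'
# Windows code page 1252 puts it at byte 0x80 instead.
print(try_encode(EURO, "cp1252"))       # b'\x80'

# The same byte therefore means different things depending on the assumed encoding.
print(unicodedata.name(b"\xa4".decode("latin-1")))      # CURRENCY SIGN
print(unicodedata.name(b"\xa4".decode("iso-8859-15")))  # EURO SIGN
```

The last two lines are the practical consequence: a byte on its own carries no meaning until an encoding is chosen for it.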
Shift-JIS
This sucks
Unicode!
Unicode!
✈️ (planes!)
Basic Multilingual Plane
Code Points
U+0041 (LATIN CAPITAL LETTER A)
Source: http://codepoints.net/U+0041
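As a small illustration of code points, the following Python sketch looks up the number and official name of a character using the standard `unicodedata` module. The function name `describe` is an invention for this example.

```python
import unicodedata


def describe(char):
    """Format a single character as 'U+XXXX NAME'."""
    return f"U+{ord(char):04X} {unicodedata.name(char)}"


print(describe("A"))          # U+0041 LATIN CAPITAL LETTER A
print(describe("a"))          # U+0061 LATIN SMALL LETTER A
print(describe(chr(0x20AC)))  # U+20AC EURO SIGN
```

`ord` and `chr` convert between a character and its code point; neither says anything yet about how that number is stored as bytes.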
Grapheme
a a a a a a a
Composite characters
U+0065 U+0301 or U+00E9
e+´ => é é
´ != ´
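The two spellings of é can be compared in Python as a rough sketch. Normalization through the standard `unicodedata` module is one common way to make them comparable; the slides themselves do not name a specific remedy, so treat that part as an assumption.

```python
import unicodedata

decomposed = "e\u0301"  # U+0065 followed by U+0301 COMBINING ACUTE ACCENT
precomposed = "\u00e9"  # U+00E9 LATIN SMALL LETTER E WITH ACUTE

# Both render as the same grapheme, but they are different code point sequences.
print(decomposed == precomposed)          # False
print(len(decomposed), len(precomposed))  # 2 1

# NFC composes where possible, NFD decomposes; after normalizing, they match.
print(unicodedata.normalize("NFC", decomposed) == precomposed)  # True
print(unicodedata.normalize("NFD", precomposed) == decomposed)  # True
```

Normalizing both sides to the same form before comparing or hashing avoids treating visually identical strings as different.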
Unicode… is not an encoding
UTF-32
UCS-2/UTF-16
UTF-8
Source: http://en.wikipedia.org/wiki/File:UnicodeGrow2b.png
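To make the difference between the three encodings concrete, here is a short Python sketch that counts how many bytes each one needs per character. The helper `encoded_sizes` is hypothetical, and the big-endian codec variants are used only so that no byte order mark is added to the counts.

```python
def encoded_sizes(char):
    """Map each encoding name to the number of bytes it needs for char."""
    encodings = ("utf-8", "utf-16-be", "utf-32-be")
    return {enc: len(char.encode(enc)) for enc in encodings}


# A character from the ASCII range.
print(encoded_sizes("A"))           # {'utf-8': 1, 'utf-16-be': 2, 'utf-32-be': 4}
# The euro sign, still inside the Basic Multilingual Plane.
print(encoded_sizes("\u20ac"))      # {'utf-8': 3, 'utf-16-be': 2, 'utf-32-be': 4}
# U+1F600 lies outside the Basic Multilingual Plane.
print(encoded_sizes("\U0001F600"))  # {'utf-8': 4, 'utf-16-be': 4, 'utf-32-be': 4}

# UTF-16 represents it as a surrogate pair, which UCS-2 cannot express.
print("\U0001F600".encode("utf-16-be").hex())  # d83dde00
```

UTF-32 is fixed width, while UTF-8 and UTF-16 are variable width; only the amount of variation differs.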
What does it look like?
Encoding comparison

Codepoint  Char  ASCII  Latin-1  ISO-8859-15  UTF-8           UTF-16
U+0041     A     0x41   0x41     0x41         0x41            0x00 0x41
U+00C4     Ä     -      0xc4     0xc4         0xc3 0x84       0x00 0xc4
U+20AC     €     -      -        0xa4         0xe2 0x82 0xac  0x20 0xac
U+C218     수    -      -        -            0xec 0x88 0x98  0xc2 0x18

Source: http://perlgeek.de/en/article/encodings-and-unicode
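A byte-level comparison like this one can be generated with a few lines of Python, sketched here under the assumption that the UTF-16 column is big-endian without a byte order mark. The names `CHARS`, `ENCODINGS` and `hex_bytes` are made up for the example.

```python
CHARS = ["A", "\u00c4", "\u20ac", "\uc218"]
ENCODINGS = ["ascii", "latin-1", "iso-8859-15", "utf-8", "utf-16-be"]


def hex_bytes(char, encoding):
    """Return the encoded bytes as '0x..' groups, or '-' if not representable."""
    try:
        raw = char.encode(encoding)
    except UnicodeEncodeError:
        return "-"
    return " ".join(f"0x{byte:02x}" for byte in raw)


# One row per character: code point first, then one cell per encoding.
for char in CHARS:
    cells = [hex_bytes(char, enc) for enc in ENCODINGS]
    print(f"U+{ord(char):04X}", *cells, sep=" | ")
```

The first row comes out identical in every single-byte column and in UTF-8, which is the ASCII compatibility that makes UTF-8 attractive.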
Remember: Just because someone claims it’s UTF-8 doesn’t mean it is.
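One cautious way to act on that reminder is to check incoming bytes instead of trusting the label. The Python sketch below relies on the strict error handling that `bytes.decode` uses by default; the function name `is_valid_utf8` is an assumption of this example.

```python
def is_valid_utf8(data):
    """Return True if data decodes as UTF-8 without errors."""
    try:
        data.decode("utf-8")  # error handling is 'strict' by default
    except UnicodeDecodeError:
        return False
    return True


print(is_valid_utf8("\u00c4".encode("utf-8")))    # True
# The Latin-1 byte 0xC4 is a UTF-8 lead byte with no continuation byte after it.
print(is_valid_utf8("\u00c4".encode("latin-1")))  # False
# Pure ASCII passes too, so passing proves very little about the true encoding.
print(is_valid_utf8(b"plain ASCII"))              # True
```

A successful decode only shows that the bytes are well-formed UTF-8. It does not prove that the sender actually meant UTF-8.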