Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
In the beginning was TXT
Search
Markus Wein
October 02, 2014
Programming
0
110
In the beginning was TXT
A very short overview of the history of encodings, given at Vienna.rb on 2014-10-02
Markus Wein
October 02, 2014
Tweet
Share
More Decks by Markus Wein
See All by Markus Wein
Command Line Productivity
cypher
1
130
A crash intro to deliberate practice
cypher
0
110
Keeping Your PostgreSQL Data Save
cypher
0
120
Ghost in the State Machine
cypher
2
320
n Things You Didn't Know About PostgreSQL (Rubyslava & PyVo 2014 Edition)
cypher
1
240
How to Become a Better Developer
cypher
2
1.8k
An Introduction to Rust
cypher
1
8.2k
How to Become a Better Developer
cypher
1
240
A Very Short Overview of Vagrant
cypher
0
7.9k
Other Decks in Programming
See All in Programming
Blazing Fast UI Development with Compose Hot Reload (droidcon London 2025)
zsmb
0
480
Inside of Swift Export
giginet
PRO
1
480
SidekiqでAIに商品説明を生成させてみた
akinko_0915
0
120
三者三様 宣言的UI
kkagurazaka
0
360
Webサーバーサイド言語としてのRustについて
kouyuume
1
5.1k
テーブル定義書の構造化抽出して、生成AIでDWH分析を試してみた / devio2025tokyo
kasacchiful
0
390
GC25 Recap: The Code You Reviewed is Not the Code You Built / #newt_gophercon_tour
mazrean
0
150
Researchlyの開発で参考にしたデザイン
adsholoko
0
110
AkarengaLT vol.38
hashimoto_kei
1
140
業務でAIを使いたい話
hnw
0
230
実践Claude Code:20の失敗から学ぶAIペアプログラミング
takedatakashi
18
9.6k
React Nativeならぬ"Vue Native"が実現するかも?_新世代マルチプラットフォーム開発フレームワークのLynxとLynxのVue.js対応を追ってみよう_Vue Lynx
yut0naga1_fa
2
2k
Featured
See All Featured
Balancing Empowerment & Direction
lara
5
720
The Art of Delivering Value - GDevCon NA Keynote
reverentgeek
16
1.7k
Exploring the Power of Turbo Streams & Action Cable | RailsConf2023
kevinliebholz
36
6.1k
[RailsConf 2023 Opening Keynote] The Magic of Rails
eileencodes
31
9.7k
Learning to Love Humans: Emotional Interface Design
aarron
274
41k
CSS Pre-Processors: Stylus, Less & Sass
bermonpainter
359
30k
Rails Girls Zürich Keynote
gr2m
95
14k
Build The Right Thing And Hit Your Dates
maggiecrowley
38
2.9k
Why You Should Never Use an ORM
jnunemaker
PRO
60
9.6k
Fantastic passwords and where to find them - at NoRuKo
philnash
52
3.5k
Scaling GitHub
holman
463
140k
Helping Users Find Their Own Way: Creating Modern Search Experiences
danielanewman
31
2.9k
Transcript
In the beginning was TXT
!
EBCDIC
Source: http://en.wikipedia.org/wiki/EBCDIC
ASCII
"#$%&
None
ä, ö, or å and Ø?
Latin-1 ISO/IEC 8859-1
Latin-*
Windows code pages
Then came the €
(
None
Shift-JIS
This sucks
Unicode!
Unicode!
✈️ (planes!)
Basic Multilingual Plane
Code Points
U+0041 (LATIN SMALL LETTER A)
Source: http://codepoints.net/U+0041
Grapheme
a a a a a a a
Composite characters
U+0065 U+0301 or U+00E9
e+´ => é é
´ != ´
Unicode… is not an encoding
UTF-32
UCS-2/UTF-16
UTF-8
Source: http://en.wikipedia.org/wiki/File:UnicodeGrow2b.png
What does it look like?
Codepoint Char ASCII Latin-1 ISO-8859-15 UTF-8 UTF-16 U+0041 A 0x41
0x41 0x41 0x41 0x00 0x41 U+00C4 Ä - 0xc4 0xc4 0xc3 0x84 0x00 0xc4 U+20AC € - - 0xa4 0xe3 0x82 0xac 0x20 0xac U+C218 ࣻ - - - 0xec 0x88 0x98 0xc2 0x18 Encoding comparison Source: http://perlgeek.de/en/article/encodings-and-unicode
Remember: Just because someone claims it’s UTF-8, doesn’t mean it
is