Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
In the beginning was TXT
Search
Markus Wein
October 02, 2014
Programming
0
99
In the beginning was TXT
A very short overview of the history of encodings, given at Vienna.rb on 2014-10-02
Markus Wein
October 02, 2014
Tweet
Share
More Decks by Markus Wein
See All by Markus Wein
Command Line Productivity
cypher
1
130
A crash intro to deliberate practice
cypher
0
110
Keeping Your PostgreSQL Data Save
cypher
0
110
Ghost in the State Machine
cypher
2
300
n Things You Didn't Know About PostgreSQL (Rubyslava & PyVo 2014 Edition)
cypher
1
230
How to Become a Better Developer
cypher
2
1.8k
An Introduction to Rust
cypher
1
8.1k
How to Become a Better Developer
cypher
1
230
A Very Short Overview of Vagrant
cypher
0
7.8k
Other Decks in Programming
See All in Programming
Haskell でアルゴリズムを抽象化する / 関数型言語で競技プログラミング
naoya
16
3.7k
Parallel::Pipesの紹介
skaji
2
900
eBPFを用いたAIネットワーク監視システム論文の実装 / eBPF Japan Meetup #4
yuukit
3
740
Development of an App for Intuitive AI Learning - Blockly Summit 2025
teba_eleven
0
110
PT AI без купюр
v0lka
0
230
機械学習って何? 5分で解説頑張ってみる
kuroneko2828
0
200
Passkeys for Java Developers
ynojima
2
830
List Unfolding - 'unfold' as the Computational Dual of 'fold', and how 'unfold' relates to 'iterate'"
philipschwarz
PRO
0
190
GoのGenericsによるslice操作との付き合い方
syumai
2
390
iOSアプリ開発もLLMで自動運転する
hiragram
6
2.3k
UPDATEがシステムを複雑にする? イミュータブルデータモデルのすすめ
shimomura
1
520
FastMCPでMCPサーバー/クライアントを構築してみる
ttnyt8701
2
130
Featured
See All Featured
Six Lessons from altMBA
skipperchong
28
3.8k
Raft: Consensus for Rubyists
vanstee
139
7k
Mobile First: as difficult as doing things right
swwweet
223
9.6k
ReactJS: Keep Simple. Everything can be a component!
pedronauck
667
120k
Become a Pro
speakerdeck
PRO
28
5.4k
Principles of Awesome APIs and How to Build Them.
keavy
126
17k
A better future with KSS
kneath
239
17k
Exploring the Power of Turbo Streams & Action Cable | RailsConf2023
kevinliebholz
32
5.9k
No one is an island. Learnings from fostering a developers community.
thoeni
21
3.3k
The Invisible Side of Design
smashingmag
299
50k
Product Roadmaps are Hard
iamctodd
PRO
53
11k
Designing for humans not robots
tammielis
253
25k
Transcript
In the beginning was TXT
!
EBCDIC
Source: http://en.wikipedia.org/wiki/EBCDIC
ASCII
"#$%&
None
ä, ö, or å and Ø?
Latin-1 ISO/IEC 8859-1
Latin-*
Windows code pages
Then came the €
(
None
Shift-JIS
This sucks
Unicode!
Unicode!
✈️ (planes!)
Basic Multilingual Plane
Code Points
U+0041 (LATIN SMALL LETTER A)
Source: http://codepoints.net/U+0041
Grapheme
a a a a a a a
Composite characters
U+0065 U+0301 or U+00E9
e+´ => é é
´ != ´
Unicode… is not an encoding
UTF-32
UCS-2/UTF-16
UTF-8
Source: http://en.wikipedia.org/wiki/File:UnicodeGrow2b.png
What does it look like?
Codepoint Char ASCII Latin-1 ISO-8859-15 UTF-8 UTF-16 U+0041 A 0x41
0x41 0x41 0x41 0x00 0x41 U+00C4 Ä - 0xc4 0xc4 0xc3 0x84 0x00 0xc4 U+20AC € - - 0xa4 0xe3 0x82 0xac 0x20 0xac U+C218 ࣻ - - - 0xec 0x88 0x98 0xc2 0x18 Encoding comparison Source: http://perlgeek.de/en/article/encodings-and-unicode
Remember: Just because someone claims it’s UTF-8, doesn’t mean it
is