Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
In the beginning was TXT
Search
Markus Wein
October 02, 2014
Programming
0
120
In the beginning was TXT
A very short overview of the history of encodings, given at Vienna.rb on 2014-10-02
Markus Wein
October 02, 2014
Tweet
Share
More Decks by Markus Wein
See All by Markus Wein
Command Line Productivity
cypher
1
150
A crash intro to deliberate practice
cypher
0
120
Keeping Your PostgreSQL Data Save
cypher
0
130
Ghost in the State Machine
cypher
2
330
n Things You Didn't Know About PostgreSQL (Rubyslava & PyVo 2014 Edition)
cypher
1
250
How to Become a Better Developer
cypher
2
1.8k
An Introduction to Rust
cypher
1
8.3k
How to Become a Better Developer
cypher
1
240
A Very Short Overview of Vagrant
cypher
0
8k
Other Decks in Programming
See All in Programming
なるべく楽してバックエンドに型をつけたい!(楽とは言ってない)
hibiki_cube
0
140
AIエージェントのキホンから学ぶ「エージェンティックコーディング」実践入門
masahiro_nishimi
5
470
KIKI_MBSD Cybersecurity Challenges 2025
ikema
0
1.3k
360° Signals in Angular: Signal Forms with SignalStore & Resources @ngLondon 01/2026
manfredsteyer
PRO
0
130
ノイジーネイバー問題を解決する 公平なキューイング
occhi
0
100
ぼくの開発環境2026
yuzneri
0
240
LLM Observabilityによる 対話型音声AIアプリケーションの安定運用
gekko0114
2
430
AWS re:Invent 2025参加 直前 Seattle-Tacoma Airport(SEA)におけるハードウェア紛失インシデントLT
tetutetu214
2
110
HTTPプロトコル正しく理解していますか? 〜かわいい猫と共に学ぼう。ฅ^•ω•^ฅ ニャ〜
hekuchan
2
690
The Past, Present, and Future of Enterprise Java
ivargrimstad
0
590
Rust 製のコードエディタ “Zed” を使ってみた
nearme_tech
PRO
0
190
開発者から情シスまで - 多様なユーザー層に届けるAPI提供戦略 / Postman API Night Okinawa 2026 Winter
tasshi
0
200
Featured
See All Featured
Designing Powerful Visuals for Engaging Learning
tmiket
0
230
The Success of Rails: Ensuring Growth for the Next 100 Years
eileencodes
47
7.9k
Navigating the Design Leadership Dip - Product Design Week Design Leaders+ Conference 2024
apolaine
0
180
Navigating the moral maze — ethical principles for Al-driven product design
skipperchong
2
250
Noah Learner - AI + Me: how we built a GSC Bulk Export data pipeline
techseoconnect
PRO
0
110
16th Malabo Montpellier Forum Presentation
akademiya2063
PRO
0
51
How People are Using Generative and Agentic AI to Supercharge Their Products, Projects, Services and Value Streams Today
helenjbeal
1
130
Gemini Prompt Engineering: Practical Techniques for Tangible AI Outcomes
mfonobong
2
280
Lightning talk: Run Django tests with GitHub Actions
sabderemane
0
120
Fight the Zombie Pattern Library - RWD Summit 2016
marcelosomers
234
17k
How to Talk to Developers About Accessibility
jct
2
130
Jamie Indigo - Trashchat’s Guide to Black Boxes: Technical SEO Tactics for LLMs
techseoconnect
PRO
0
62
Transcript
In the beginning was TXT
!
EBCDIC
Source: http://en.wikipedia.org/wiki/EBCDIC
ASCII
"#$%&
None
ä, ö, or å and Ø?
Latin-1 ISO/IEC 8859-1
Latin-*
Windows code pages
Then came the €
(
None
Shift-JIS
This sucks
Unicode!
Unicode!
✈️ (planes!)
Basic Multilingual Plane
Code Points
U+0041 (LATIN SMALL LETTER A)
Source: http://codepoints.net/U+0041
Grapheme
a a a a a a a
Composite characters
U+0065 U+0301 or U+00E9
e+´ => é é
´ != ´
Unicode… is not an encoding
UTF-32
UCS-2/UTF-16
UTF-8
Source: http://en.wikipedia.org/wiki/File:UnicodeGrow2b.png
What does it look like?
Codepoint Char ASCII Latin-1 ISO-8859-15 UTF-8 UTF-16 U+0041 A 0x41
0x41 0x41 0x41 0x00 0x41 U+00C4 Ä - 0xc4 0xc4 0xc3 0x84 0x00 0xc4 U+20AC € - - 0xa4 0xe3 0x82 0xac 0x20 0xac U+C218 ࣻ - - - 0xec 0x88 0x98 0xc2 0x18 Encoding comparison Source: http://perlgeek.de/en/article/encodings-and-unicode
Remember: Just because someone claims it’s UTF-8, doesn’t mean it
is