Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
In the beginning was TXT
Search
Markus Wein
October 02, 2014
Programming
0
110
In the beginning was TXT
A very short overview of the history of encodings, given at Vienna.rb on 2014-10-02
Markus Wein
October 02, 2014
Tweet
Share
More Decks by Markus Wein
See All by Markus Wein
Command Line Productivity
cypher
1
130
A crash intro to deliberate practice
cypher
0
110
Keeping Your PostgreSQL Data Save
cypher
0
120
Ghost in the State Machine
cypher
2
310
n Things You Didn't Know About PostgreSQL (Rubyslava & PyVo 2014 Edition)
cypher
1
240
How to Become a Better Developer
cypher
2
1.8k
An Introduction to Rust
cypher
1
8.2k
How to Become a Better Developer
cypher
1
230
A Very Short Overview of Vagrant
cypher
0
7.9k
Other Decks in Programming
See All in Programming
大規模アプリのDIフレームワーク刷新戦略 ~過去最大規模の並行開発を止めずにアプリ全体に導入するまで~
mot_techtalk
0
360
CSC509 Lecture 01
javiergs
PRO
1
430
Goで実践するドメイン駆動開発 AIと歩み始めた新規プロダクト開発の現在地
imkaoru
2
110
高度なUI/UXこそHotwireで作ろう Kaigi on Rails 2025
naofumi
4
3.2k
Breaking Up with Big ViewModels — Without Breaking Your Architecture (droidcon Berlin 2025)
steliosf
PRO
1
290
AIで開発生産性を上げる個人とチームの取り組み
taniigo
0
130
CSC305 Lecture 04
javiergs
PRO
0
230
パフォーマンスチューニングで Web 技術を深掘り直す
progfay
18
4.9k
アメ車でサンノゼを走ってきたよ!
s_shimotori
0
130
いま中途半端なSwift 6対応をするより、Default ActorやApproachable Concurrencyを有効にしてからでいいんじゃない?
yimajo
2
320
Web技術を最大限活用してRAW画像を現像する / Developing RAW Images on the Web
ssssota
2
1.1k
XP, Testing and ninja testing ZOZ5
m_seki
2
250
Featured
See All Featured
Automating Front-end Workflow
addyosmani
1371
200k
Unsuck your backbone
ammeep
671
58k
The World Runs on Bad Software
bkeepers
PRO
71
11k
Stop Working from a Prison Cell
hatefulcrawdad
271
21k
The Illustrated Children's Guide to Kubernetes
chrisshort
48
51k
Gamification - CAS2011
davidbonilla
81
5.5k
BBQ
matthewcrist
89
9.8k
How To Stay Up To Date on Web Technology
chriscoyier
791
250k
The Invisible Side of Design
smashingmag
301
51k
The Language of Interfaces
destraynor
162
25k
Testing 201, or: Great Expectations
jmmastey
45
7.7k
Responsive Adventures: Dirty Tricks From The Dark Corners of Front-End
smashingmag
252
21k
Transcript
In the beginning was TXT
!
EBCDIC
Source: http://en.wikipedia.org/wiki/EBCDIC
ASCII
"#$%&
None
ä, ö, or å and Ø?
Latin-1 ISO/IEC 8859-1
Latin-*
Windows code pages
Then came the €
(
None
Shift-JIS
This sucks
Unicode!
Unicode!
✈️ (planes!)
Basic Multilingual Plane
Code Points
U+0041 (LATIN SMALL LETTER A)
Source: http://codepoints.net/U+0041
Grapheme
a a a a a a a
Composite characters
U+0065 U+0301 or U+00E9
e+´ => é é
´ != ´
Unicode… is not an encoding
UTF-32
UCS-2/UTF-16
UTF-8
Source: http://en.wikipedia.org/wiki/File:UnicodeGrow2b.png
What does it look like?
Codepoint Char ASCII Latin-1 ISO-8859-15 UTF-8 UTF-16 U+0041 A 0x41
0x41 0x41 0x41 0x00 0x41 U+00C4 Ä - 0xc4 0xc4 0xc3 0x84 0x00 0xc4 U+20AC € - - 0xa4 0xe3 0x82 0xac 0x20 0xac U+C218 ࣻ - - - 0xec 0x88 0x98 0xc2 0x18 Encoding comparison Source: http://perlgeek.de/en/article/encodings-and-unicode
Remember: Just because someone claims it’s UTF-8, doesn’t mean it
is