In the beginning was TXT
Markus Wein
October 02, 2014
A very short overview of the history of encodings, given at Vienna.rb on 2014-10-02
Transcript
In the beginning was TXT
EBCDIC
Source: http://en.wikipedia.org/wiki/EBCDIC
ASCII
ä, ö, or å and Ø?
Latin-1 ISO/IEC 8859-1
Latin-*
Windows code pages
Then came the €
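The euro problem on the slides above can be sketched in a few lines. This is a minimal illustration, not part of the talk: it assumes Python's built-in codec names `latin-1`, `iso8859-15`, and `cp1252`, and shows that the same byte decodes to different characters depending on which code page the reader assumes.

```python
# The same byte means different things under different 8-bit code pages.
byte = b"\xa4"

as_latin1 = byte.decode("latin-1")      # generic currency sign (U+00A4)
as_latin9 = byte.decode("iso8859-15")   # euro sign (U+20AC)

# Windows-1252 placed the euro somewhere else entirely.
euro_cp1252 = "€".encode("cp1252")

# Latin-1 has no euro at all, so encoding fails.
try:
    "€".encode("latin-1")
    latin1_has_euro = True
except UnicodeEncodeError:
    latin1_has_euro = False

print(as_latin1, as_latin9, euro_cp1252, latin1_has_euro)
```

Nothing in the byte itself says which interpretation is right; that information has to travel separately.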
Shift-JIS
This sucks
Unicode!
Unicode!
✈️ (planes!)
Basic Multilingual Plane
Code Points
U+0041 (LATIN CAPITAL LETTER A)
Source: http://codepoints.net/U+0041
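A code point is just a number with a name attached. As a small sketch (using Python's standard `unicodedata` module, which is an assumption of this example and not something the slides mention), the number and name for a character can be looked up like this:

```python
import unicodedata

# A code point is an integer; the U+XXXX notation is that integer in hex.
code_point = ord("A")
name = unicodedata.name("A")
label = f"U+{code_point:04X} {name}"

# Going the other way: from the number back to the character.
lowercase_name = unicodedata.name(chr(0x61))

print(label)
print(lowercase_name)
```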
Grapheme
a a a a a a a
Composite characters
U+0065 U+0301 or U+00E9
e+´ => é é
´ != ´
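The two spellings of é compare as different strings even though they render as the same grapheme. A minimal sketch of how normalization reconciles them, assuming Python's standard `unicodedata.normalize` with the NFC and NFD forms:

```python
import unicodedata

decomposed = "e\u0301"   # U+0065 followed by U+0301 COMBINING ACUTE ACCENT
precomposed = "\u00e9"   # U+00E9 LATIN SMALL LETTER E WITH ACUTE

# Without normalization the two strings differ, in content and in length.
same_raw = decomposed == precomposed
lengths = (len(decomposed), len(precomposed))

# NFC composes where possible, NFD decomposes where possible.
nfc = unicodedata.normalize("NFC", decomposed)
nfd = unicodedata.normalize("NFD", precomposed)

print(same_raw, lengths, nfc == precomposed, nfd == decomposed)
```

Comparing user-visible text usually means normalizing both sides to the same form first.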
Unicode… is not an encoding
UTF-32
UCS-2/UTF-16
UTF-8
Source: http://en.wikipedia.org/wiki/File:UnicodeGrow2b.png
What does it look like?
Codepoint  Char  ASCII  Latin-1  ISO-8859-15  UTF-8           UTF-16
U+0041     A     0x41   0x41     0x41         0x41            0x00 0x41
U+00C4     Ä     -      0xc4     0xc4         0xc3 0x84       0x00 0xc4
U+20AC     €     -      -        0xa4         0xe2 0x82 0xac  0x20 0xac
U+C218     수    -      -        -            0xec 0x88 0x98  0xc2 0x18
Encoding comparison
Source: http://perlgeek.de/en/article/encodings-and-unicode
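The comparison can be reproduced by encoding each character and printing the resulting bytes. This is a sketch under assumptions: `hex_bytes` is a hypothetical helper written for this example, the codec names are Python's, and `utf-16-be` (big-endian, no byte order mark) is chosen because it matches the byte order shown in the UTF-16 column.

```python
def hex_bytes(text: str, encoding: str) -> str:
    """Return the encoded bytes as hex, or '-' if the encoding cannot represent the text."""
    try:
        return " ".join(f"0x{b:02x}" for b in text.encode(encoding))
    except UnicodeEncodeError:
        return "-"

chars = ["A", "Ä", "€", "\uc218"]  # \uc218 is the Hangul syllable 수
encodings = ["ascii", "latin-1", "iso8859-15", "utf-8", "utf-16-be"]

for ch in chars:
    row = [hex_bytes(ch, enc) for enc in encodings]
    print(f"U+{ord(ch):04X}", ch, *row, sep="  ")
```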
Remember: Just because someone claims it’s UTF-8, doesn’t mean it is
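One way to act on that reminder is to check incoming bytes before trusting the label. A minimal sketch, where `is_valid_utf8` is a hypothetical helper name and strict decoding is the only check performed:

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 (strict mode)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True

# "Ä" written as Latin-1 but labelled UTF-8 is caught...
mislabelled = "Ä".encode("latin-1")   # b"\xc4"
# ...while the real UTF-8 bytes pass.
genuine = "Ä".encode("utf-8")         # b"\xc3\x84"

print(is_valid_utf8(mislabelled), is_valid_utf8(genuine))
```

The check only works in one direction: bytes that fail are definitely not UTF-8, but bytes that pass may still have been written in another encoding (pure ASCII text, for example, is valid in most of them).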