Upgrade to PRO for Only $50/Year—Limited-Time Offer! 🔥
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
In the beginning was TXT
Search
Markus Wein
October 02, 2014
Programming
0
110
In the beginning was TXT
A very short overview of the history of encodings, given at Vienna.rb on 2014-10-02
Markus Wein
October 02, 2014
Tweet
Share
More Decks by Markus Wein
See All by Markus Wein
Command Line Productivity
cypher
1
140
A crash intro to deliberate practice
cypher
0
110
Keeping Your PostgreSQL Data Save
cypher
0
120
Ghost in the State Machine
cypher
2
320
n Things You Didn't Know About PostgreSQL (Rubyslava & PyVo 2014 Edition)
cypher
1
240
How to Become a Better Developer
cypher
2
1.8k
An Introduction to Rust
cypher
1
8.2k
How to Become a Better Developer
cypher
1
240
A Very Short Overview of Vagrant
cypher
0
7.9k
Other Decks in Programming
See All in Programming
配送計画の均等化機能を提供する取り組みについて(⽩⾦鉱業 Meetup Vol.21@六本⽊(数理最適化編))
izu_nori
0
140
Developing static sites with Ruby
okuramasafumi
0
250
著者と進める!『AIと個人開発したくなったらまずCursorで要件定義だ!』
yasunacoffee
0
120
dnx で実行できるコマンド、作ってみました
tomohisa
0
140
Why Kotlin? 電子カルテを Kotlin で開発する理由 / Why Kotlin? at Henry
agatan
2
6.9k
ハイパーメディア駆動アプリケーションとIslandアーキテクチャ: htmxによるWebアプリケーション開発と動的UIの局所的適用
nowaki28
0
390
開発に寄りそう自動テストの実現
goyoki
1
760
ViewファーストなRailsアプリ開発のたのしさ
sugiwe
0
430
How Software Deployment tools have changed in the past 20 years
geshan
0
28k
STYLE
koic
0
150
Go コードベースの構成と AI コンテキスト定義
andpad
0
120
堅牢なフロントエンドテスト基盤を構築するために行った取り組み
shogo4131
8
2.3k
Featured
See All Featured
Speed Design
sergeychernyshev
33
1.4k
Designing for humans not robots
tammielis
254
26k
Design and Strategy: How to Deal with People Who Don’t "Get" Design
morganepeng
132
19k
Designing Dashboards & Data Visualisations in Web Apps
destraynor
231
54k
Principles of Awesome APIs and How to Build Them.
keavy
127
17k
Fight the Zombie Pattern Library - RWD Summit 2016
marcelosomers
234
17k
Refactoring Trust on Your Teams (GOTO; Chicago 2020)
rmw
35
3.3k
Dealing with People You Can't Stand - Big Design 2015
cassininazir
367
27k
Rails Girls Zürich Keynote
gr2m
95
14k
RailsConf 2023
tenderlove
30
1.3k
CSS Pre-Processors: Stylus, Less & Sass
bermonpainter
359
30k
Understanding Cognitive Biases in Performance Measurement
bluesmoon
32
2.7k
Transcript
In the beginning was TXT
!
EBCDIC
Source: http://en.wikipedia.org/wiki/EBCDIC
ASCII
"#$%&
None
ä, ö, or å and Ø?
Latin-1 ISO/IEC 8859-1
Latin-*
Windows code pages
Then came the €
(
None
Shift-JIS
This sucks
Unicode!
Unicode!
✈️ (planes!)
Basic Multilingual Plane
Code Points
U+0041 (LATIN SMALL LETTER A)
Source: http://codepoints.net/U+0041
Grapheme
a a a a a a a
Composite characters
U+0065 U+0301 or U+00E9
e+´ => é é
´ != ´
Unicode… is not an encoding
UTF-32
UCS-2/UTF-16
UTF-8
Source: http://en.wikipedia.org/wiki/File:UnicodeGrow2b.png
What does it look like?
Codepoint Char ASCII Latin-1 ISO-8859-15 UTF-8 UTF-16 U+0041 A 0x41
0x41 0x41 0x41 0x00 0x41 U+00C4 Ä - 0xc4 0xc4 0xc3 0x84 0x00 0xc4 U+20AC € - - 0xa4 0xe3 0x82 0xac 0x20 0xac U+C218 ࣻ - - - 0xec 0x88 0x98 0xc2 0x18 Encoding comparison Source: http://perlgeek.de/en/article/encodings-and-unicode
Remember: Just because someone claims it’s UTF-8, doesn’t mean it
is