Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
In the beginning was TXT
Search
Markus Wein
October 02, 2014
Programming
0
80
In the beginning was TXT
A very short overview of the history of encodings, given at Vienna.rb on 2014-10-02
Markus Wein
October 02, 2014
Tweet
Share
More Decks by Markus Wein
See All by Markus Wein
Command Line Productivity
cypher
1
120
A crash intro to deliberate practice
cypher
0
110
Keeping Your PostgreSQL Data Save
cypher
0
92
Ghost in the State Machine
cypher
2
280
n Things You Didn't Know About PostgreSQL (Rubyslava & PyVo 2014 Edition)
cypher
1
210
How to Become a Better Developer
cypher
2
1.8k
An Introduction to Rust
cypher
1
7.9k
How to Become a Better Developer
cypher
1
220
A Very Short Overview of Vagrant
cypher
0
7.7k
Other Decks in Programming
See All in Programming
ヤプリ新卒SREの オンボーディング
masaki12
0
130
Better Code Design in PHP
afilina
PRO
0
120
CSC509 Lecture 09
javiergs
PRO
0
140
Amazon Qを使ってIaCを触ろう!
maruto
0
400
Jakarta EE meets AI
ivargrimstad
0
540
EventSourcingの理想と現実
wenas
6
2.3k
Webの技術スタックで マルチプラットフォームアプリ開発を可能にするElixirDesktopの紹介
thehaigo
2
1k
タクシーアプリ『GO』のリアルタイムデータ分析基盤における機械学習サービスの活用
mot_techtalk
4
1.4k
「今のプロジェクトいろいろ大変なんですよ、app/services とかもあって……」/After Kaigi on Rails 2024 LT Night
junk0612
5
2.1k
Compose 1.7のTextFieldはPOBox Plusで日本語変換できない
tomoya0x00
0
190
聞き手から登壇者へ: RubyKaigi2024 LTでの初挑戦が 教えてくれた、可能性の星
mikik0
1
130
CSC509 Lecture 12
javiergs
PRO
0
160
Featured
See All Featured
Side Projects
sachag
452
42k
XXLCSS - How to scale CSS and keep your sanity
sugarenia
246
1.3M
Building Better People: How to give real-time feedback that sticks.
wjessup
364
19k
The Art of Delivering Value - GDevCon NA Keynote
reverentgeek
8
860
Testing 201, or: Great Expectations
jmmastey
38
7.1k
Become a Pro
speakerdeck
PRO
25
5k
How to Think Like a Performance Engineer
csswizardry
20
1.1k
Why You Should Never Use an ORM
jnunemaker
PRO
54
9.1k
StorybookのUI Testing Handbookを読んだ
zakiyama
27
5.3k
The Pragmatic Product Professional
lauravandoore
31
6.3k
Code Reviewing Like a Champion
maltzj
520
39k
Stop Working from a Prison Cell
hatefulcrawdad
267
20k
Transcript
In the beginning was TXT
!
EBCDIC
Source: http://en.wikipedia.org/wiki/EBCDIC
ASCII
"#$%&
None
ä, ö, or å and Ø?
Latin-1 ISO/IEC 8859-1
Latin-*
Windows code pages
Then came the €
(
None
Shift-JIS
This sucks
Unicode!
Unicode!
✈️ (planes!)
Basic Multilingual Plane
Code Points
U+0041 (LATIN SMALL LETTER A)
Source: http://codepoints.net/U+0041
Grapheme
a a a a a a a
Composite characters
U+0065 U+0301 or U+00E9
e+´ => é é
´ != ´
Unicode… is not an encoding
UTF-32
UCS-2/UTF-16
UTF-8
Source: http://en.wikipedia.org/wiki/File:UnicodeGrow2b.png
What does it look like?
Codepoint Char ASCII Latin-1 ISO-8859-15 UTF-8 UTF-16 U+0041 A 0x41
0x41 0x41 0x41 0x00 0x41 U+00C4 Ä - 0xc4 0xc4 0xc3 0x84 0x00 0xc4 U+20AC € - - 0xa4 0xe3 0x82 0xac 0x20 0xac U+C218 ࣻ - - - 0xec 0x88 0x98 0xc2 0x18 Encoding comparison Source: http://perlgeek.de/en/article/encodings-and-unicode
Remember: Just because someone claims it’s UTF-8, doesn’t mean it
is