Hello, my name is __________.


Hello, my name is Nova Patch

Hello, my name is @novapatch

Hello, my name is #MyNameIs

Why don’t you support my name?
1. Unintentional bugs
2. Uninformed decisions
3. Oppressive “real” name policies

文字化け

Mojibake

æ–‡å—åŒ–ã� ‘
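
The garbled text on this slide can be reproduced with a short sketch. It assumes the usual cause of mojibake: text saved as UTF-8 and then read back as Windows-1252. The helper names `mojibake` and `repair_latin1` are hypothetical, chosen only for illustration.

```python
def mojibake(text: str, wrong_encoding: str = "cp1252") -> str:
    """Encode text as UTF-8, then misread the bytes in a legacy encoding."""
    raw = text.encode("utf-8")
    # errors="replace" substitutes U+FFFD for bytes the legacy encoding
    # leaves undefined, so information can be lost at this step.
    return raw.decode(wrong_encoding, errors="replace")


def repair_latin1(garbled: str) -> str:
    """Undo a UTF-8-read-as-Latin-1 mix-up (lossless, so it can be reversed)."""
    return garbled.encode("latin-1").decode("utf-8")


if __name__ == "__main__":
    original = "\u6587\u5b57\u5316\u3051"  # the four characters of "mojibake" in Japanese
    print(mojibake(original))
    print(repair_latin1(mojibake(original, "latin-1")) == original)
```

Latin-1 maps every byte to a character, so that particular mix-up can be undone. The Windows-1252 misreading cannot always be reversed, because undefined bytes are replaced.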

In memory of Nóirín Plunkett

Source: W3C “Character encodings: Essential concepts” by Richard Ishida; © W3C

MySQL utf8 vs. utf8mb4
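
MySQL's `utf8` character set is an alias for `utf8mb3`, which stores at most three bytes per character, while `utf8mb4` also covers four-byte UTF-8 sequences. A minimal sketch of a pre-flight check follows; the function name `fits_in_utf8mb3` is hypothetical and is not part of any MySQL client library.

```python
def fits_in_utf8mb3(text: str) -> bool:
    """Return True if every character can be stored in a 3-byte UTF-8 column.

    Code points above U+FFFF (outside the BMP) need four bytes in UTF-8,
    so they do not fit in MySQL's legacy utf8 (utf8mb3) character set.
    """
    return all(ord(ch) <= 0xFFFF for ch in text)


if __name__ == "__main__":
    print(fits_in_utf8mb3("Nova Patch"))   # ASCII only
    print(fits_in_utf8mb3("\U0001F600"))   # an emoji from Plane 1
```

A check like this only detects the problem. The usual fix is to move the column and the connection to `utf8mb4`.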

JavaScript “characters”
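
JavaScript strings are sequences of UTF-16 code units, so `String.length` counts code units rather than code points. The sketch below computes the same count in Python for comparison; `utf16_length` is a hypothetical helper name.

```python
def utf16_length(text: str) -> int:
    """Count UTF-16 code units, which is what JavaScript's String.length reports."""
    # Each UTF-16 code unit is two bytes; non-BMP characters take two units
    # (a surrogate pair).
    return len(text.encode("utf-16-le")) // 2


if __name__ == "__main__":
    emoji = "\U0001F600"
    print(len(emoji))           # code points, as Python counts them
    print(utf16_length(emoji))  # code units, as JavaScript counts them
```

Length limits, truncation, and indexing that work on code units can split a surrogate pair in half, which is one way a name ends up corrupted.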

“Almost all emoji —and all new ones— are encoded in
Plane 1”
Why support non-BMP characters?
Source: “2015 Top Ten List: Why Support Beyond-BMP Code Points?” by Dr. Ken Lunde © Adobe Systems Incorporated

“Japan’s 2,136 Jōyō Kanji requires one Extension B ideograph”
Why support non-BMP characters?
Source: “2015 Top Ten List: Why Support Beyond-BMP Code Points?” by Dr. Ken Lunde © Adobe Systems Incorporated

“JIS X 0213:2004 requires 303 Extension B ideographs”
Why support non-BMP characters?
Source: “2015 Top Ten List: Why Support Beyond-BMP Code Points?” by Dr. Ken Lunde © Adobe Systems Incorporated

Why support non-BMP characters?

“GB 18030 certification without PUA requires six Extension B ideographs”
Why support non-BMP characters?
Source: “2015 Top Ten List: Why Support Beyond-BMP Code Points?” by Dr. Ken Lunde © Adobe Systems Incorporated

“China’s 8,105 hànzì set requires 196 Extension B through E ideographs”
Why support non-BMP characters?
Source: “2015 Top Ten List: Why Support Beyond-BMP Code Points?” by Dr. Ken Lunde © Adobe Systems Incorporated

“Hong Kong SCS-2008 requires 1,702 Extension B & C ideographs”

“Modern OSes and applications support code points outside the BMP”

“As of Unicode Version 6.0, there are more characters outside the BMP”
Why support non-BMP characters?
Source: “2015 Top Ten List: Why Support Beyond-BMP Code Points?” by Dr. Ken Lunde © Adobe Systems Incorporated

“The BMP is effectively full”
Why support non-BMP characters?
Source: “2015 Top Ten List: Why Support Beyond-BMP Code Points?” by Dr. Ken Lunde © Adobe Systems Incorporated

李炘煜

李▯煜

李炘煜 U+674E U+7098 U+715C
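
When a name renders with a missing-glyph box, listing its code points helps separate a font problem from a data problem. A minimal sketch follows; `code_points` is a hypothetical helper name, and the sample name uses the code points shown on the slide.

```python
def code_points(text: str) -> str:
    """Format each character as a U+XXXX code point, separated by spaces."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


if __name__ == "__main__":
    name = "\u674e\u7098\u715c"  # the three code points listed on the slide
    print(code_points(name))
```

If the stored code points are intact, the data survived and the box most likely comes from a font that lacks the glyph.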

O'Reilly

O\'Reilly

O'Reilly

OReilly
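
Mangled apostrophes like these often come from hand-rolled escaping or stripping of "special" characters. One common alternative is to pass the name as a query parameter so it is stored unchanged. The sketch below uses Python's standard `sqlite3` module with an in-memory database; the table name `users` and the helper `store_and_load` are assumptions made for illustration.

```python
import sqlite3


def store_and_load(name: str) -> str:
    """Insert a name with a parameterized query and read it back unchanged."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE users (name TEXT)")
        # The ? placeholder lets the driver handle quoting; the name itself
        # is never spliced into the SQL string.
        conn.execute("INSERT INTO users (name) VALUES (?)", (name,))
        (stored,) = conn.execute("SELECT name FROM users").fetchone()
        return stored
    finally:
        conn.close()


if __name__ == "__main__":
    print(store_and_load("O'Reilly"))
```

With placeholders there is no need to escape, double, or remove the apostrophe, so the name is neither altered nor exposed to SQL injection.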

Unicode Usernames
1. Identifier characters
2. Case folding
3. Normalization
4. Confusable characters
5. Mixed scripts
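
Case folding and normalization from the list above can be sketched with Python's standard library. This is a simplified illustration, not a full implementation of any of the specifications. The helper `username_key` is hypothetical, and it does not cover identifier character rules, confusables, or mixed-script checks.

```python
import unicodedata


def username_key(name: str) -> str:
    """Build a comparison key: normalize to NFC, case-fold, then normalize again."""
    folded = unicodedata.normalize("NFC", name).casefold()
    # Case folding can produce sequences that are no longer normalized,
    # so NFC is applied a second time.
    return unicodedata.normalize("NFC", folded)


if __name__ == "__main__":
    precomposed = "N\u00f3ir\u00edn"    # accented letters as single code points
    decomposed = "No\u0301iri\u0301n"   # base letters plus combining accents
    print(username_key(precomposed) == username_key(decomposed))
    print(username_key("NovaPatch") == username_key("novapatch"))
```

The key is meant only for comparing and looking up usernames. The name a user typed can still be stored and displayed as entered.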

Unicode Usernames
1. UTS #31: Unicode Identifier and Pattern Syntax
2. UTR #36: Unicode Security Considerations
3. UTS #39: Unicode Security Mechanisms
4. RFC 7613: Preparation, Enforcement, and Comparison of Internationalized Strings Representing Usernames and Passwords

Preventing fake names is not worth discriminating against real users.

Nova Patch @novapatch Shutterstock

Hello, my name is __________.

