Unicode, JavaScript and the Emoji family

Stefan Judis
November 07, 2016

Transcript

  1. 2.

    Stefan Judis
    Frontend Developer, Occasional Teacher, Meetup Organizer
    ❤ Open Source, Performance and Accessibility
    ❤ @stefanjudis
  3. 11.
  4. 15.

    UNICODE ...
    01 is an international encoding standard
    02 is a mapping from each letter, digit or symbol to a numeric value
    03 works across different platforms and programs
  5. 17.

    UNICODE - overview -
    1,114,112 code points in 17 planes

    Basic Multilingual Plane                 U+0000 to U+FFFF      1 plane
    Supplementary Planes                     U+10000 to U+10FFFF   16 planes
      Supplementary Multilingual Plane       U+10000 to U+1FFFF    1 plane
      Supplementary Ideographic Plane        U+20000 to U+2FFFF    1 plane
      Planes unassigned                      U+30000 to U+DFFFF    11 planes
      Supplementary Special-purpose Plane    U+E0000 to U+EFFFF    1 plane
      Supplementary Private Use Area Planes  U+F0000 to U+10FFFF   2 planes
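Since every plane spans 0x10000 code points, the plane of a code point follows from a single division. The sketch below illustrates that; `planeOf` is a hypothetical helper name, not something from the talk.

```javascript
// Hypothetical helper: returns the plane number (0 to 16) of a code point.
function planeOf(codePoint) {
  if (!Number.isInteger(codePoint) || codePoint < 0 || codePoint > 0x10FFFF) {
    throw new RangeError('Not a valid Unicode code point: ' + codePoint);
  }
  // Each plane holds 0x10000 (65,536) code points.
  return Math.floor(codePoint / 0x10000);
}

console.log(planeOf(0x0041));   // 0  (Basic Multilingual Plane)
console.log(planeOf(0x1F468));  // 1  (Supplementary Multilingual Plane)
console.log(planeOf(0x10FFFF)); // 16 (last Private Use Area plane)
console.log(17 * 0x10000);      // 1114112 code points in total
```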
  6. 18.

    UNICODE - Basic Multilingual Plane -
    Basic Multilingual Plane (U+0000 to U+FFFF): characters for almost all modern languages + a lot of symbols
  7. 19.

    UNICODE - Supplementary Planes -
    Supplementary Planes (U+10000 to U+10FFFF): everything else
  8. 20.
  9. 21.

    EMOJIS ...
    01 were initially used by Japanese mobile operators
    02 were added to Unicode v6 in October 2010
    03 have been supported since OS X 10.7 (Lion) and Windows 8
  10. 22.

    EMOJIS - overview -
    Most emojis are in the Supplementary Multilingual Plane (U+10000 to U+1FFFF)
  11. 26.
  12. 27.

    EMOJIS - ZWJ sequences -
    ZERO WIDTH JOINER U+200D
    Indicator that a single glyph should be presented for a sequence of characters
  13. 30.

    EMOJIS - ZWJ sequences -
    family: 👨 U+1F468 + ZWJ U+200D + 👨 U+1F468 + ZWJ U+200D + 👧 U+1F467 ( 5 code points )
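To make the five code points visible, the sequence can be written with code point escapes and taken apart again. This is a minimal sketch; the variable names are illustrative.

```javascript
// Man + ZWJ + man + ZWJ + girl, written with code point escapes.
const family = '\u{1F468}\u{200D}\u{1F468}\u{200D}\u{1F467}';

// Spreading a string iterates over code points, not 16-bit code units.
const codePoints = [...family].map(
  ch => 'U+' + ch.codePointAt(0).toString(16).toUpperCase()
);

console.log(codePoints);        // [ 'U+1F468', 'U+200D', 'U+1F468', 'U+200D', 'U+1F467' ]
console.log(codePoints.length); // 5
console.log(family.length);     // 8 (three surrogate pairs plus two ZWJs)
```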
  14. 31.

    EMOJIS - ZWJ sequences -
    woman astronaut ( 4 code points )
    man artist ( 4 code points )
    man getting hair cut, joined with ZWJ + ♂ ( 4 code points )
    woman mountain biking, joined with ZWJ + ♀ ( 4 code points )
  15. 32.

    EMOJIS - ZWJ sequences -
    "David Bowie" - Singer - ZWJ sequences, compared on Apple and Google
  16. 33.

    EMOJIS - ZWJ sequences -
    "David Bowie" Emoji is not yet supported.
  17. 34.

    EMOJIS - ZWJ sequences -
    Sequences degrade gracefully!
    '\u{1F468}\u{200D}\u{1F3A4}' "👨🎤"
    '\u{1F469}\u{200D}\u{1F3A4}' "👩🎤"
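Graceful degradation means that a platform without a glyph for the whole sequence can still draw the individual emojis. A rough sketch of what remains in that case, assuming a hypothetical helper `fallbackParts` that simply splits on the ZWJ:

```javascript
const ZWJ = '\u200D';

// Hypothetical helper: the emojis a renderer can fall back to
// when it has no single glyph for the joined sequence.
function fallbackParts(sequence) {
  return sequence.split(ZWJ);
}

const manSinger = '\u{1F468}\u{200D}\u{1F3A4}';
const parts = fallbackParts(manSinger);

console.log(parts.length);                    // 2
console.log(parts[0] === '\u{1F468}');        // true (man)
console.log(parts[1] === '\u{1F3A4}');        // true (microphone)
```

Splitting on the ZWJ is safe here because U+200D is a single code unit and never part of a surrogate pair.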
  18. 35.

    EMOJIS - flags -
    ... 26 regional indicators (U+1F1E6 to U+1F1FF) used in pairs to represent regions
  19. 36.

    EMOJIS - flags -
    🇩🇪 U+1F1E9 U+1F1EA ( 2 code points )
    🇬🇧 U+1F1EC U+1F1E7 ( 2 code points )
    🇨🇽 U+1F1E8 U+1F1FD ( 2 code points )
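Because the 26 regional indicators mirror the letters A to Z, a two-letter region code maps directly onto a flag sequence. The following is a sketch under that assumption; `flagFor` is a hypothetical name, and whether a given pair renders as a flag depends on the platform's font.

```javascript
const REGIONAL_INDICATOR_A = 0x1F1E6; // regional indicator symbol letter A

// Hypothetical helper: turns a two-letter region code into
// the matching pair of regional indicator code points.
function flagFor(regionCode) {
  if (!/^[A-Za-z]{2}$/.test(regionCode)) {
    throw new TypeError('Expected a two-letter region code');
  }
  const indicators = [...regionCode.toUpperCase()].map(
    letter => REGIONAL_INDICATOR_A + letter.charCodeAt(0) - 'A'.charCodeAt(0)
  );
  return String.fromCodePoint(...indicators);
}

console.log(flagFor('DE') === '\u{1F1E9}\u{1F1EA}'); // true
console.log(flagFor('gb') === '\u{1F1EC}\u{1F1E7}'); // true
console.log([...flagFor('CX')].length);              // 2 code points
```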
  20. 37.

    EMOJIS - flags -
    www.dwitter.net/d/2708
    function() {
      x.font='96px a'
      S=String.fromCodePoint
      W=e=>x.measureText(e).width
      i=t*4%257|0
      W(S(F=0x1F1E6,F))>W(_=S(F+i%26,F+i/26|0))&&x.fillText(_,9,99)
    }
    Dweet by @veubeke
  21. 38.

    EMOJIS - overview -
    How many Emojis are out there?
    2198 (excluding incomplete singletons) (excluding duplicates) (including all combined sequences)
    unicode.org/reports/tr51/#Identification
  22. 40.

    JAVASCRIPT - string representation -
    UTF-16, the string format used by JavaScript, uses a single 16-bit code unit to represent the most common characters.
  23. 42.

    JAVASCRIPT - characters with one code unit -
    \u0000 - \uFFFF can fit into 16 bits
    'ﾂ' ('\uFF82'), '\uF8FF', '\u9731', '⛷' ('\u26F7')
  24. 43.

    JAVASCRIPT - characters with one code unit -
    \u0000 - \uFFFF can fit into 16 bits
    'ﾂ'.length, '\uF8FF'.length, '\u9731'.length, '⛷'.length // each is 1
  25. 44.

    JAVASCRIPT - surrogate pairs -
    How can we use code points outside the 16-bit range?
  26. 45.

    JAVASCRIPT - surrogate pairs -
    Surrogate Pairs
    2048 surrogate code points included in the Basic Multilingual Plane
  27. 46.

    JAVASCRIPT - surrogate pairs -
    Surrogate Pairs
    2048 surrogate code points included in the Basic Multilingual Plane
    Leading/High Surrogates: U+D800 to U+DBFF
  28. 47.

    JAVASCRIPT - surrogate pairs -
    Surrogate Pairs
    2048 surrogate code points included in the Basic Multilingual Plane
    Leading/High Surrogates: U+D800 to U+DBFF
    Trailing/Low Surrogates: U+DC00 to U+DFFF
  29. 48.

    JAVASCRIPT - surrogate pairs -
    Surrogate Pairs
    2048 surrogate code points included in the Basic Multilingual Plane
    Leading/High Surrogates: U+D800 to U+DBFF
    Trailing/Low Surrogates: U+DC00 to U+DFFF
    Formula to get code point
    C = (H - 0xD800) * 0x400 + L - 0xDC00 + 0x10000
    C = (H - 55296) * 1024 + L - 56320 + 65536
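The formula translates directly into code, and it can be inverted to split a supplementary code point back into its two surrogates. A minimal sketch, with `toCodePoint` and `toSurrogatePair` as hypothetical helper names:

```javascript
// Combines a high (leading) and low (trailing) surrogate into a code point,
// following C = (H - 0xD800) * 0x400 + L - 0xDC00 + 0x10000.
function toCodePoint(high, low) {
  return (high - 0xD800) * 0x400 + (low - 0xDC00) + 0x10000;
}

// Inverse direction: splits a code point above U+FFFF into [high, low].
function toSurrogatePair(codePoint) {
  const offset = codePoint - 0x10000;
  const high = 0xD800 + (offset >> 10);   // upper 10 bits
  const low = 0xDC00 + (offset & 0x3FF);  // lower 10 bits
  return [high, low];
}

console.log(toCodePoint(0xD83D, 0xDC68)); // 128104 (U+1F468)
console.log(toSurrogatePair(0x1F468));    // [ 55357, 56424 ] (U+D83D, U+DC68)
```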
  30. 51.

    JAVASCRIPT - surrogate pairs -
    Surrogate Pairs
    👨 U+1F468 128104
    '👨'.charCodeAt(0) // U+D83D 55357
    '👨'.charCodeAt(1) // U+DC68 56424
    '👨'.length // 2
  31. 52.

    JAVASCRIPT - surrogate pairs -
    Surrogate Pairs
    👨 U+1F468 128104
    '👨'.charCodeAt(0) // U+D83D 55357
    '👨'.charCodeAt(1) // U+DC68 56424
    '👨'.length // 2
    0x1F468 = (0xD83D - 0xD800) * 0x400 + 0xDC68 - 0xDC00 + 0x10000
    128104 = (55357 - 55296) * 1024 + 56424 - 56320 + 65536
  33. 54.

    JAVASCRIPT - surrogate pairs -
    charCodeAt() vs codePointAt()
    👨 U+1F468 128104
    '👨'.codePointAt(0) // U+1F468 128104
    '👨'.codePointAt(1) // U+DC68 56424
    '👨'.charCodeAt(0) // U+D83D 55357
    '👨'.charCodeAt(1) // U+DC68 56424
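The difference matters when walking through a string by index: `codePointAt()` still takes a code unit index, so the loop has to skip the trailing surrogate itself. A sketch of that, where `codePointsOf` is a hypothetical helper:

```javascript
const man = '\u{1F468}';

console.log(man.charCodeAt(0));  // 55357  (high surrogate U+D83D)
console.log(man.charCodeAt(1));  // 56424  (low surrogate U+DC68)
console.log(man.codePointAt(0)); // 128104 (full code point U+1F468)
console.log(man.codePointAt(1)); // 56424  (index 1 is the lone low surrogate)

// Hypothetical helper: collects all code points of a string by index.
function codePointsOf(str) {
  const result = [];
  for (let i = 0; i < str.length; ) {
    const codePoint = str.codePointAt(i);
    result.push(codePoint);
    // Code points above U+FFFF occupy two code units.
    i += codePoint > 0xFFFF ? 2 : 1;
  }
  return result;
}

console.log(codePointsOf('a\u{1F468}b')); // [ 97, 128104, 98 ]
```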
  35. 56.
  36. 58.

    JAVASCRIPT - String.prototype.length -
    String.prototype.length
    This property returns the number of code units in the string.
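Since `length` counts code units, it overstates the size of any string that contains supplementary characters. One way to count code points instead is sketched below; `countCodePoints` is a hypothetical helper, and note that it still counts a flag or a ZWJ sequence as several code points.

```javascript
// Hypothetical helper: counts code points instead of code units.
function countCodePoints(str) {
  // Array.from uses the string iterator, which yields one code point at a time.
  return Array.from(str).length;
}

const man = '\u{1F468}';
const germanFlag = '\u{1F1E9}\u{1F1EA}';

console.log('\u26F7'.length, countCodePoints('\u26F7')); // 1 1
console.log(man.length, countCodePoints(man));           // 2 1
console.log(germanFlag.length, countCodePoints(germanFlag)); // 4 2
```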
  37. 59.

    JAVASCRIPT - the spread operator -
    The spread operator works for every iterable object.
    [...'ABC']
  38. 60.

    JAVASCRIPT - the spread operator -
    The spread operator works for every iterable object.
    [...'ABC']
    > ''[Symbol.iterator]
    function [Symbol.iterator]() { [native code] }
  39. 61.

    JAVASCRIPT - the spread operator -
    String.prototype [ @@iterator ]( )
    [...] iterates over the code points of a String value, returning each code point as a String value.
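The effect of the string iterator is easiest to see next to `split('')`, which works on code units and tears surrogate pairs apart. A small sketch with an illustrative sample string:

```javascript
const text = 'A\u{1F468}B';

// Spread uses String.prototype[Symbol.iterator] and keeps the pair together.
const byCodePoint = [...text];
console.log(byCodePoint.length);                 // 3
console.log(byCodePoint[1] === '\u{1F468}');     // true

// split('') cuts between code units and leaves two lone surrogates.
const byCodeUnit = text.split('');
console.log(byCodeUnit.length);                  // 4
console.log(byCodeUnit[1] === '\uD83D');         // true

// The same iterator can also be driven by hand.
const iterator = text[Symbol.iterator]();
console.log(iterator.next().value);              // A
console.log(iterator.next().value.length);       // 2 (one code point, two code units)
```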
  40. 65.