Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
the world of characters
Search
Sponsored
·
Your Podcast. Everywhere. Effortlessly.
Share. Educate. Inspire. Entertain. You do you. We'll handle the rest.
→
orisano
September 13, 2018
1.5k
8
Share
Embed
Copy iframe code
Copy JS code
Copy link
Start on current slide
the world of characters
orisano
September 13, 2018
More Decks by orisano
See All by orisano
OSS Performance Tuning Tips
orisano
8
6.2k
Docker-Compose & BuildKit
orisano
4
1.1k
Container Build Talk
orisano
3
2.6k
dockerignore talk
orisano
2
7.3k
Better docker image+
orisano
6
6.6k
Socket.IO Introduction
orisano
0
3.3k
Profiling Go Application
orisano
11
8.1k
Multi-stage Builds Patterns & Practice
orisano
6
5.3k
better docker image
orisano
22
31k
Featured
See All Featured
No one is an island. Learnings from fostering a developers community.
thoeni
21
3.8k
Why Our Code Smells
bkeepers
PRO
340
58k
Optimising Largest Contentful Paint
csswizardry
37
3.7k
職位にかかわらず全員がリーダーシップを発揮するチーム作り / Building a team where everyone can demonstrate leadership regardless of position
madoxten
62
54k
What does AI have to do with Human Rights?
axbom
PRO
1
2.2k
The SEO identity crisis: Don't let AI make you average
varn
0
490
Joys of Absence: A Defence of Solitary Play
codingconduct
1
400
Being A Developer After 40
akosma
91
590k
Marketing to machines
jonoalderson
1
5.5k
Thoughts on Productivity
jonyablonski
76
5.2k
RailsConf 2023
tenderlove
30
1.5k
Save Time (by Creating Custom Rails Generators)
garrettdimon
PRO
32
3.5k
Transcript
1จࣈͷੈք @orisano
Έͳ͞Μ จࣈΛ͑ΒΕ·͢ΑͶʁ
a
a => 1
͋
͋ => 1
佛
佛 => 1
None
=> 1
None
=> 1
Z͑ͫ̓ͪ̂ͫ̽ ̴̙̤̞͉͚̯̞̠͍A̴̵̜̰͔ ͫ͗ ͢ L̠ͨͧͩ͘ G̴̻͈͍͔̹ ̑͗̎̅͛ ́ Ǫ̵̹̻̝̳ ͂̌
̌͘! ͖̬̰̙̗ ̿̋ ͥ ͥ̂ͣ̐́́͜͞
Z͑ͫ̓ͪ̂ͫ̽ ̴̙̤̞͉͚̯̞̠͍A̴̵̜̰͔ ͫ͗ ͢ L̠ͨͧͩ͘ G̴̻͈͍͔̹ ̑͗̎̅͛ ́ Ǫ̵̹̻̝̳ ͂̌
̌͘! ͖̬̰̙̗ ̿̋ ͥ ͥ̂ͣ̐́́͜͞ => 6
Έͳ͞Μ όΠτΛ͑ΒΕ·͔͢ʁ (UTF-8)
a
a => 1
͋
͋ => 3
佛
佛 => 4
None
=> 4
None
=> 18
Z͑ͫ̓ͪ̂ͫ̽ ̴̙̤̞͉͚̯̞̠͍A̴̵̜̰͔ ͫ͗ ͢ L̠ͨͧͩ͘ G̴̻͈͍͔̹ ̑͗̎̅͛ ́ Ǫ̵̹̻̝̳ ͂̌
̌͘! ͖̬̰̙̗ ̿̋ ͥ ͥ̂ͣ̐́́͜͞
Z͑ͫ̓ͪ̂ͫ̽ ̴̙̤̞͉͚̯̞̠͍A̴̵̜̰͔ ͫ͗ ͢ L̠ͨͧͩ͘ G̴̻͈͍͔̹ ̑͗̎̅͛ ́ Ǫ̵̹̻̝̳ ͂̌
̌͘! ͖̬̰̙̗ ̿̋ ͥ ͥ̂ͣ̐́́͜͞ => 143
͋ͳ͕ͨࢥ͏1จࣈ Ͳ͏͑Δ͖͔ʁ
byteͰ͑ΒΕͳ͍
Unicodeจࣈू߹ จࣈͱ͕ରԠ͢Δ
͋ => 3042
=> 1F914
͜ͷͷ͜ͱΛ ίʔυϙΠϯτ ͱݺͿ
͜ͷίʔυϙΠϯτΛ byteྻͰදݱ͢Δํ๏Λ ΤϯίʔσΟϯάͱ͍͏
UTF-8ͱ͔UTF-16ͱ͔ ΤϯίʔσΟϯάͷҰछ
ͱΓ͋͑ͣ ίʔυϙΠϯτΛ͑Ε ղܾʁ
͍͍͑
=> 1F468 + 200D + 1F469 + 200D + 1F466
࣮ෳͷίʔυϙΠϯτͰ ҰͭͷจࣈʹͳͬͨΓ͢Δ
ਓ͕ؒೝ͍ࣝͯ͠Δ̍จࣈ ॻهૉ(Grapheme cluster) ͱݺΕ͍ͯΔ
Ͳ͏Ε ίʔυϙΠϯτͷྻ͔Β ॻهૉΛऔΓग़ͤΔ͔
ίʔυϙΠϯτ͕ؒ ॻهૉڥքʹͳΔ͔Ͳ͏͔ͷ ݫີͳϧʔϧ͕͋Δ
UAX #29 Unicode Text Segmentation
None
͜ΕΛJSͰ࣮ͯ͠·ͨ͠ github.com/orisano/graphemesplit
ৄ͘͠ UAX #29 Λݟͯ http://unicode.org/reports/tr29/
ݟΒ͵ਓʹʓจࣈͱ ݴΘΕͨͱ͖ʹ ͪΌΜͱ֬ೝ͠Α͏ʂ
1 byte? 1 codepoint? 1 grapheme cluster?