Introduction to UTF-8 (UTF-8入門)
yn2011
December 27, 2018
Programming
Character codes / Unicode / UTF-8 decoding example / UTF-8 vulnerabilities
Transcript
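Introduction to UTF-8 (UTF-8入門) 2018/12/27 ほろよいてっく @yn2011
Self-introduction • Works with Salesforce and JavaScript • Recent hobby: シェル芸 (shell one-liner tricks)
Agenda • Character code basics • Overview of Unicode / UTF-8 • UTF-8 encoding/decoding • UTF-8 vulnerabilities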
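Character codes?
Distinguish between a character code and a character encoding scheme
Character code (coded character set) • Defines the bit combination that corresponds to each character • e.g. ASCII, JIS X 0208, Unicode
… Example of a Unicode code chart (excerpt) • Quoted from "Unicode一覧 3000-3FFF" / Wikipedia
Character encoding scheme • A character encoding scheme is the set of rules for putting a character code into practice • e.g. for Unicode: UTF-8, UTF-16, ... • Example for UTF-8 (excerpt) • Quoted from the UTF-8 code chart at オレンジ工房 ORANGE-FACTORY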
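The distinction above can be checked in a few lines of Python. This is a minimal sketch, not part of the original slides: the sample character あ and the variable names are chosen only for illustration. The code point comes from the coded character set (Unicode), while the bytes depend on which encoding scheme is applied.

```python
# A coded character set assigns each character a number (its code point).
# An encoding scheme decides which bytes represent that number.
ch = "あ"  # sample character, chosen for illustration

code_point = ord(ch)                   # position in the Unicode code chart
utf8_bytes = ch.encode("utf-8")        # bytes under the UTF-8 scheme
utf16_bytes = ch.encode("utf-16-be")   # bytes under UTF-16 (big-endian, no BOM)

print(f"U+{code_point:04X}")   # U+3042
print(utf8_bytes.hex())        # e38182
print(utf16_bytes.hex())       # 3042
```

The same code point yields different byte sequences, which is why "Unicode" alone does not say how a file is stored.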
Windows Notepad • Causes confusion (a classic)
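UTF-8?
UTF-8 • UTF-8 is one of the character encoding schemes for the character code called Unicode • Quoted from UTF-8 / Wikipedia
Unicode • A character code that gathers characters from around the world • Lowers the cost of multilingual support • Main encoding schemes include UTF-8 and UTF-16 • Quoted from Amazon, ユニコード戦記
Unicode • Quoted from 文字コード「超」研究 改訂第2版, p. 431
Unicode • Quoted from 文字コード「超」研究 改訂第2版, p. 430
UTF-8 • Unicode Transformation Format-8 • Input and output in 1-byte (8-bit) units • ASCII compatible • Variable-length code of 1 to 6 bytes in the original design (RFC 3629 now limits it to 1 to 4 bytes)
UTF-8 • Quoted from UTF-8 / Wikipedia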
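To make the variable-length layout concrete, here is a hand-written encoder sketch. The function name utf8_encode is hypothetical and it covers only the 1 to 4 byte forms that current UTF-8 permits. It does not reject surrogate code points, so it is an illustration rather than a complete implementation.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one code point by hand (1 to 4 byte forms only)."""
    if cp < 0x80:        # 0xxxxxxx (same as ASCII)
        return bytes([cp])
    if cp < 0x800:       # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:     # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


# Compare against Python's built-in encoder for one sample per length.
for ch in ["A", "Δ", "あ", "😀"]:
    encoded = utf8_encode(ord(ch))
    assert encoded == ch.encode("utf-8")
    print(f"U+{ord(ch):04X} -> {len(encoded)} byte(s): {encoded.hex()}")
```

The first branch is what makes UTF-8 ASCII compatible: any code point below 0x80 is stored as the identical single byte.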
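UTF-8 Pros/Cons • Pros • For mostly-ASCII data the size is almost the same (ASCII compatible) • Large character repertoire (Unicode) • Cons • Kanji and hiragana take 3 bytes each • An unnecessary BOM can be prepended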
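The size trade-off and the BOM can be measured directly. The sample strings below are assumptions chosen for illustration. Python's "utf-8-sig" codec is used here as one way to produce UTF-8 with a BOM.

```python
import codecs

ascii_text = "hello"       # sample ASCII data
kana_text = "こんにちは"    # sample hiragana data (5 characters)

print(len(ascii_text.encode("utf-8")))      # 5  (1 byte per ASCII character)
print(len(kana_text.encode("utf-8")))       # 15 (3 bytes per hiragana)
print(len(kana_text.encode("shift_jis")))   # 10 (2 bytes per hiragana)

# "utf-8-sig" prepends the 3-byte BOM EF BB BF when encoding.
with_bom = kana_text.encode("utf-8-sig")
print(with_bom[:3] == codecs.BOM_UTF8)                     # True
print(len(with_bom) - len(kana_text.encode("utf-8")))      # 3
```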
Excel • Excel interprets a CSV file saved as UTF-8 without a BOM as Shift_JIS • UTF-8 with BOM • UTF-8 without BOM (interpreted as Shift_JIS)
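A rough way to reproduce this effect outside Excel is to decode BOM-less UTF-8 bytes as Shift_JIS. The sketch below only simulates the misinterpretation and does not model how Excel detects encodings. The rows and the helper to_csv_bytes are hypothetical.

```python
import csv
import io

rows = [["名前", "値"], ["あ", "1"]]   # sample data, chosen for illustration


def to_csv_bytes(rows, encoding):
    """Serialize rows as CSV text, then encode with the given codec."""
    buf = io.StringIO(newline="")
    csv.writer(buf).writerows(rows)
    return buf.getvalue().encode(encoding)


no_bom = to_csv_bytes(rows, "utf-8")
with_bom = to_csv_bytes(rows, "utf-8-sig")   # same bytes, prefixed with EF BB BF

original = no_bom.decode("utf-8")

# Simulate a reader that assumes Shift_JIS when no BOM is present.
garbled = no_bom.decode("shift_jis", errors="replace")

print(garbled != original)                        # True: the text is mangled
print(with_bom.decode("utf-8-sig") == original)   # True: the BOM variant round-trips
```

Writing the file with "utf-8-sig" is a common workaround when the CSV is meant to be opened in Excel.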
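Trying out UTF-8 decoding
(Example) Find the character that corresponds to 0xCE94
UTF-8 decoding rules (excerpt) • Rules exist for leading bit patterns of up to 7 bits (omitted) • Quoted from 文字コード「超」研究 改訂第2版, p. 448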
0xCE94 • 0xCE = 11001110 • 110xxxxx → xxxxx = 01110 • 0x94 = 10010100 • 10yyyyyy → yyyyyy = 010100 • xxxxxyyyyyy = 01110010100 • U+0394 = Δ
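The same steps can be reproduced with bit operations. This is a sketch of the 2-byte case only, and the variable names are chosen for illustration.

```python
data = bytes.fromhex("CE94")
lead, cont = data                 # 0xCE, 0x94

assert lead >> 5 == 0b110         # lead byte matches 110xxxxx
assert cont >> 6 == 0b10          # continuation byte matches 10yyyyyy

x = lead & 0b0001_1111            # xxxxx  = 01110
y = cont & 0b0011_1111            # yyyyyy = 010100
code_point = (x << 6) | y         # xxxxxyyyyyy

print(f"{code_point:011b}")                       # 01110010100
print(f"U+{code_point:04X}")                      # U+0394
print(chr(code_point) == data.decode("utf-8"))    # True: both give the Greek capital delta
```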
Overlong encoding • 0xC0 = 11000000 • 110xxxxx → xxxxx = 00000 • 0xAF = 10101111 • 10yyyyyy → yyyyyy = 101111 • 0xxx xxyy yyyy = 0000 0010 1111 • U+2F = /
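The following sketch contrasts a decoder that applies the 2-byte bit rule blindly with Python's built-in strict decoder. The function decode_2byte_lenient is hypothetical and exists only to show how the overlong form maps to "/".

```python
def decode_2byte_lenient(data: bytes) -> str:
    """Hypothetical decoder: applies the 110xxxxx 10yyyyyy rule with no range check."""
    lead, cont = data
    return chr(((lead & 0x1F) << 6) | (cont & 0x3F))


# The overlong form decodes to the same character as the single byte 0x2F.
print(decode_2byte_lenient(b"\xC0\xAF"))   # /

# A conforming decoder must reject it. Python's strict decoder does.
try:
    b"\xC0\xAF".decode("utf-8")
    rejected = False
except UnicodeDecodeError as exc:
    rejected = True
    print("rejected:", exc.reason)

print(rejected)   # True
```

A valid 2-byte sequence always encodes a code point of U+0080 or higher, so a range check after decoding is enough to catch this case.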
Directory traversal • A vulnerability that allows access to unintended files • ../../../../../../../../../etc/passwd • If only / = 0x2F is anticipated… • → 0xC0AF = / is accepted, which is dangerous
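A minimal sketch of how such a bypass could occur, assuming a filter that inspects raw bytes and a later stage that decodes leniently. All function names and the payload are hypothetical. The "no slash" policy is deliberately simplistic to mirror the slide.

```python
def lenient_decode(data: bytes) -> str:
    """Hypothetical legacy decoder: accepts overlong 2-byte forms without validation."""
    out, i = [], 0
    while i < len(data):
        b = data[i]
        if b >> 5 == 0b110 and i + 1 < len(data):
            out.append(chr(((b & 0x1F) << 6) | (data[i + 1] & 0x3F)))
            i += 2
        else:
            out.append(chr(b))
            i += 1
    return "".join(out)


def naive_is_safe(raw: bytes) -> bool:
    """Flawed check: only looks for the 1-byte slash 0x2F in the raw input."""
    return b"/" not in raw


def strict_is_safe(raw: bytes) -> bool:
    """Safer ordering: decode strictly first, then inspect the decoded text."""
    try:
        text = raw.decode("utf-8")   # rejects overlong forms
    except UnicodeDecodeError:
        return False
    return "/" not in text


payload = b"..\xC0\xAF..\xC0\xAFetc\xC0\xAFpasswd"

print(naive_is_safe(payload))     # True: the filter sees no 0x2F
print(lenient_decode(payload))    # ../../etc/passwd
print(strict_is_safe(payload))    # False: invalid UTF-8 is refused
```

The design point is ordering: validate and decode once with a strict decoder, then apply path checks to the decoded result.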
On Salesforce… • Decode an arbitrary hexadecimal value as UTF-8 in Salesforce (Apex)
// Apex
System.debug(EncodingUtil.urlDecode('%e3%81%82', 'utf-8')); // あ
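For comparison, the same percent-decoding experiment can be run in Python with the standard library. This says nothing about how Apex treats malformed input. It only shows what Python's urllib.parse.unquote does, including with the overlong sequence from the earlier slide.

```python
from urllib.parse import unquote

# %e3%81%82 is the UTF-8 byte sequence E3 81 82, i.e. U+3042.
decoded = unquote("%e3%81%82", encoding="utf-8")
print(decoded == "あ")            # True
print(f"U+{ord(decoded):04X}")    # U+3042

# unquote defaults to errors="replace", so the overlong %c0%af
# does not turn into a slash.
overlong = unquote("%c0%af", encoding="utf-8")
print("/" in overlong)            # False
```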
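Reporting a suspected vulnerability
Summary • UTF-8 is one of the character encoding schemes for the character code called Unicode • ASCII compatible, variable-length code of 1 to 6 bytes (1 to 4 bytes under RFC 3629) • Watch out for BOMs and overlong encodings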
Spend the year-end and New Year holidays with character codes