Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
UTF-8入門
Search
yn2011
December 27, 2018
Programming
0
200
UTF-8入門
文字コード / Unicode / UTF-8のデコード例 / UTF-8の脆弱性
yn2011
December 27, 2018
Tweet
Share
More Decks by yn2011
See All by yn2011
シェル芸入門
yn2011
1
1k
オブジェクト指向プログラミングについて調べてみた
yn2011
0
320
初心者系エンジニアにおすすめの技術書3冊
yn2011
0
220
Other Decks in Programming
See All in Programming
[KNOTS 2026登壇資料]AIで拡張‧交差する プロダクト開発のプロセス および携わるメンバーの役割
hisatake
0
250
Amazon Bedrockを活用したRAGの品質管理パイプライン構築
tosuri13
4
260
AI時代の認知負荷との向き合い方
optfit
0
150
Automatic Grammar Agreementと Markdown Extended Attributes について
kishikawakatsumi
0
180
ThorVG Viewer In VS Code
nors
0
770
余白を設計しフロントエンド開発を 加速させる
tsukuha
7
2.1k
CSC307 Lecture 03
javiergs
PRO
1
490
メルカリのリーダビリティチームが取り組む、AI時代のスケーラブルな品質文化
cloverrose
2
510
Rust 製のコードエディタ “Zed” を使ってみた
nearme_tech
PRO
0
150
AIフル活用時代だからこそ学んでおきたい働き方の心得
shinoyu
0
130
Fragmented Architectures
denyspoltorak
0
150
AI & Enginnering
codelynx
0
110
Featured
See All Featured
The Illustrated Guide to Node.js - THAT Conference 2024
reverentgeek
0
250
Navigating the moral maze — ethical principles for Al-driven product design
skipperchong
2
240
Measuring Dark Social's Impact On Conversion and Attribution
stephenakadiri
1
120
Large-scale JavaScript Application Architecture
addyosmani
515
110k
Agile Leadership in an Agile Organization
kimpetersen
PRO
0
79
Technical Leadership for Architectural Decision Making
baasie
1
240
How To Stay Up To Date on Web Technology
chriscoyier
791
250k
Un-Boring Meetings
codingconduct
0
200
How to Grow Your eCommerce with AI & Automation
katarinadahlin
PRO
0
100
Practical Tips for Bootstrapping Information Extraction Pipelines
honnibal
25
1.7k
Unlocking the hidden potential of vector embeddings in international SEO
frankvandijk
0
170
Making the Leap to Tech Lead
cromwellryan
135
9.7k
Transcript
UTF-8ೖ 2018/12/27 ΄ΖΑ͍ͯͬ͘ @yn2011
ࣗݾհ • SalesforceͱJavaScriptͷਓ • ࠷ۙͷझຯγΣϧܳ
͢͜ͱ • จࣈίʔυͷجૅ • Unicode / UTF-8 ͷ֓ཁ • UTF-8
ͷΤϯίʔυ/σίʔυ • UTF-8 ͷ੬ऑੑ
͢͜ͱ • จࣈίʔυͷجૅ • Unicode / UTF-8 ͷ֓ཁ • UTF-8
ͷΤϯίʔυ/σίʔυ • UTF-8 ͷ੬ऑੑ
จࣈίʔυʁ
จࣈίʔυ ͱ จࣈූ߸ԽํࣜΛ ۠ผ͢Δ
จࣈίʔυʢූ߸Խจࣈू߹ʣ • ֤จࣈʹରԠ͢ΔϏοτͷΈ߹ΘͤΛఆٛ • e.g. ASCII, JIS X 0208, Unicode
… UnicodeͷจࣈίʔυදͷྫʢҰ෦ʣ UnicodeҰཡ 3000-3FFF / WikipediaΑΓҾ༻
จࣈූ߸Խํࣜ • จࣈූ߸Խํࣜจࣈίʔυͷӡ༻نଇ • e.g. Unicode:UTF-8, UTF-16.. UTF-8ͷྫʢҰ෦ʣ ΦϨϯδ ORANGE-FACTORY
UTF-8ͷจࣈίʔυදΑΓҾ༻
WindowsͷϝϞா • ࠞཚ͢Δʢఆ൪ʣ
͢͜ͱ • จࣈίʔυͷجૅ • Unicode / UTF-8 ͷ֓ཁ • UTF-8
ͷΤϯίʔυ/σίʔυ • UTF-8 ͷ੬ऑੑ
UTF-8ʁ
UTF-8 • UTF-8UnicodeͱݺΕΔจࣈίʔυͷ จࣈූ߸Խํࣜͷ̍ͭ UTF-8 / Wikipedia ΑΓҾ༻
Unicode • ੈքதͷจࣈΛूͨ͠จࣈίʔυ • ଟݴޠରԠͷίετݮ • ओͳූ߸ԽํࣜʹUTF-8ͱUTF-16 Amazon Ϣχίʔυઓه ΑΓҾ༻
Unicode จࣈίʔυʮʯݚڀɹվగୈ2൛ P431ΑΓҾ༻
Unicode จࣈίʔυʮʯݚڀɹվగୈ2൛ P430ΑΓҾ༻
UTF-8 • Unicode Transformation Format-8 • 1όΠτ୯Ґೖग़ྗʢ8bitʣ • ASCII ޓ
• 1 ~ 6όΠτͷՄมίʔυ
UTF-8 UTF-8 / WikipediaΑΓҾ༻
UTF-8 Pros/Cons • Pros • ASCII த৺ͷσʔλͷ߹΄΅ಉ͡αΠζʢASCIIޓʣ • จࣈͷछྨ͕ଟ͍ʢUnicodeʣ •
Cons • ࣈฏԾ໊͕ 3 όΠτ • ෆཁͳBOMΛ༩ग़དྷͯ͠·͏
Excel • ExcelBOMͳ͠UTF-8ܗࣜͷCSVϑΝΠϧΛ Shift_JISͰղऍ͢Δ BOM͋ΓUTF-8 BOMͳ͠UTF-8ʢShift_JISͰղऍʣ
͢͜ͱ • จࣈίʔυͷجૅ • Unicode / UTF-8 ͷ֓ཁ • UTF-8
ͷΤϯίʔυ/σίʔυ • UTF-8 ͷ੬ऑੑ
UTF-8ͷσίʔυʹઓ
ʢྫʣ 0xCE94ʹରԠ͢ΔจࣈΛ ٻΊΔ
UTF-8ͷσίʔυنଇʢҰ෦ʣ • ઌ಄7Ϗοτ·Ͱنଇ͕͋Δʢলུʣ จࣈίʔυʮʯݚڀɹվగୈ2൛ P448ΑΓҾ༻
0xCE94 • 0xCE = 11001110 • 110xxxxx → xxxxx =
01110 • 0x94 = 10010100 • 10yyyyyy → yyyyyy = 010100 • xxxxxyyyyyy = 01110010100 • U+0394 = Δ
ͳΤϯίʔυ • 0xC0 = 11000000 • 110xxxxx → xxxxx =
00000 • 0xAF = 10101111 • 10yyyyyy → yyyyyy = 101111 • 0xxx xxyy yyyy = 0000 0010 1111 • U+2F = /
͢͜ͱ • จࣈίʔυͷجૅ • Unicode / UTF-8 ͷ֓ཁ • UTF-8
ͷΤϯίʔυ/σίʔυ • UTF-8 ͷ੬ऑੑ
σΟϨΫτϦɾτϥόʔαϧ • ҙਤ͠ͳ͍ϑΝΠϧΞΫηεͰ͖Δ੬ऑੑ • ../../../../../../../../../etc/passwd • / = 0x2FͷΈΛఆ͍ͯ͠Δͱ… •
→ 0xC0AF = / ͕ڐ༰͞Εͯةݥ
Salesforceͩͱ… • SalesforceʢApexʣͰҙͷ16ਐΛUTF-8 Ͱσίʔυ͢Δ // Apex System.debug(EncodingUtil.urlDecode('%e3%81%82', ‘utf-8')); // ͋
ٙΘ͍͠੬ऑੑͷใࠂ
·ͱΊ • UTF-8UnicodeͱݺΕΔจࣈίʔυͷ จࣈූ߸Խํࣜͷ̍ͭ • ASCIIޓͰ1~6όΠτͷՄมίʔυ • BOMͱͳΤϯίʔυʹҙ
࢝ จࣈίʔυͱաͦ͝͏