Converting Legacy Applications To UTF-8
Tim Swann
June 04, 2015
What are all the angles to consider when you decide to standardise your text to UTF-8?
Transcript
CHARACTER FLAWS: CHANGING A LEGACY APPLICATION TO UTF-8
June 4th, 2015

THE PROBLEM
• Legacy applications are a mish-mash of encodings
• Ever dealt with Irish names?
• French / German / Eastern European?
• What about a €uro symbol?

THE SOLUTION
Standardise your character encoding
Simple - Right?

HEADERS
    <?php header('Content-Type: text/html; charset=UTF-8'); ?>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <meta charset="UTF-8" />
All Done… Just Like Austria in 1982 - Pack up and go home

CHARACTER IN ≠ CHARACTER OUT
Set character encodings everywhere:
• Front End Pages ✔
• Database ✔
• Tables ✔
• Columns ✔
• Forms ✔
• Server(s) ✔

DATABASE
Connections need to be UTF-8 aware
    SET NAMES utf8;
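
The connection step above can also be handled from PHP when the connection is opened. The sketch below assumes PDO with the MySQL driver; the function names, host and database name are hypothetical placeholders, not part of the original deck. It uses the same `utf8` character set as the slides. MySQL's `utf8` stores at most three bytes per character, so `utf8mb4` may be the better choice where the server supports it.

```php
<?php
/**
 * Build a PDO MySQL DSN that requests a UTF-8 connection.
 * $host and $dbName are placeholders supplied by the caller.
 */
function utf8Dsn(string $host, string $dbName, string $charset = 'utf8'): string
{
    // The charset DSN parameter is honoured by PDO MySQL since PHP 5.3.6.
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $dbName, $charset);
}

/**
 * Open the connection. Not called in this sketch because it needs a live server.
 */
function connectUtf8(string $host, string $dbName, string $user, string $pass): PDO
{
    $pdo = new PDO(utf8Dsn($host, $dbName), $user, $pass, [
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
    ]);
    // Mirrors the slide's statement; redundant once the DSN sets the charset.
    $pdo->exec('SET NAMES utf8');
    return $pdo;
}
```

Setting the charset in the DSN keeps the encoding choice in one place, so every code path that opens a connection gets the same behaviour.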

DATABASE
Set the database default to UTF-8 (this only affects tables created afterwards)
    ALTER DATABASE database_name DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci;

DATABASE
Change existing tables to UTF-8
    ALTER TABLE table_name DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci;

DATABASE
Columns need to be UTF-8
    ALTER TABLE table_name CHANGE title title VARCHAR(128) CHARACTER SET utf8 COLLATE utf8_unicode_ci;

FORMS
POST only UTF-8 data
    <form action="post.php" method="post" enctype="multipart/form-data" accept-charset="UTF-8">
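
The form attribute asks the browser for UTF-8, but a client is not forced to honour it, so it may be worth checking what actually arrives. The following is a minimal sketch that assumes the mbstring extension is loaded; `isUtf8Input` is a hypothetical helper name, not something from the deck.

```php
<?php
/**
 * Return true when every string in the (possibly nested) input array
 * is valid UTF-8. Non-string scalars are ignored.
 */
function isUtf8Input(array $input): bool
{
    foreach ($input as $value) {
        if (is_array($value)) {
            if (!isUtf8Input($value)) {
                return false;
            }
        } elseif (is_string($value) && !mb_check_encoding($value, 'UTF-8')) {
            return false;
        }
    }
    return true;
}

// "Seán" as UTF-8 bytes, and the same name as ISO-8859-1 bytes.
$good = ['name' => "Se\xC3\xA1n"];
$bad  = ['name' => "Se\xE1n"];
```

In a request handler this could be called as `isUtf8Input($_POST)` before the data goes anywhere near the database.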

PHP INI
    default_charset = "utf-8"

PHP TEXT CONVERSION - PHP Functions
    <?php utf8_encode($string);
Only encodes ISO-8859-1 strings

PHP TEXT CONVERSION - PHP Functions
    <?php
    iconv('ISO-8859-1', 'UTF-8//TRANSLIT', $string);
    // TRANSLIT = find a similar char, e.g. ä => a
    mb_convert_encoding($string, 'UTF-8', 'ISO-8859-1');
    // requires the php-mbstring extension for multi-byte support
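
Legacy data is often a mix of strings that are already UTF-8 and strings that are not, and converting the former a second time produces mojibake. The sketch below guards against that with a validity check first. It assumes mbstring is available, `toUtf8` is a hypothetical helper name, and the check is a heuristic: a short ISO-8859-1 string can, rarely, also be a valid UTF-8 byte sequence. Note that the euro sign does not exist in ISO-8859-1. Legacy data containing it is usually Windows-1252 (byte 0x80) or ISO-8859-15, so the source encoding has to be chosen per data set.

```php
<?php
/**
 * Convert a legacy string to UTF-8 unless it is already valid UTF-8.
 * $from is the encoding we assume the legacy data was stored in.
 */
function toUtf8(string $value, string $from = 'ISO-8859-1'): string
{
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value; // already UTF-8 (or plain ASCII): avoid double encoding
    }
    return mb_convert_encoding($value, 'UTF-8', $from);
}

$latin1Name = "Se\xE1n";                       // "Seán" in ISO-8859-1
$utf8Name   = toUtf8($latin1Name);             // now "Seán" in UTF-8
$euro       = toUtf8("\x80", 'Windows-1252');  // euro sign from Windows-1252
```

Running `toUtf8` over a value that has already been converted returns it unchanged, which makes a migration script safer to re-run.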

PHP TEXT CONVERSION - PHP Functions
    <?php htmlentities($string, ENT_QUOTES, 'UTF-8');
Specify encoding when escaping output
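
One reason the encoding argument matters: when told the input is UTF-8, `htmlentities()` returns an empty string for input that is not valid UTF-8, unless `ENT_SUBSTITUTE` or `ENT_IGNORE` is set. The sketch below wraps the call from the slide; `escapeHtml` is a hypothetical helper name, and adding `ENT_SUBSTITUTE` is a suggestion rather than something the deck prescribes.

```php
<?php
/**
 * Escape a string for HTML output, declaring it as UTF-8.
 * ENT_SUBSTITUTE replaces invalid byte sequences instead of
 * discarding the whole string.
 */
function escapeHtml(string $value): string
{
    return htmlentities($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

$validUtf8 = "Se\xC3\xA1n";  // "Seán" in UTF-8
$latin1    = "Se\xE1n";      // "Seán" in ISO-8859-1, invalid as UTF-8

$escaped   = escapeHtml($validUtf8);
$slideCall = htmlentities($latin1, ENT_QUOTES, 'UTF-8'); // call as on the slide
$guarded   = escapeHtml($latin1);
```

Here `$slideCall` ends up empty because the legacy bytes are not valid UTF-8, while `$guarded` keeps the rest of the string. An unexpectedly blank field on a page is a common symptom of unconverted legacy data.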

PHP TEXT CONVERSION - UConverter (since PHP 5.5, requires the intl extension)
    <?php
    $converter = new UConverter([ string $destination_encoding [, string $source_encoding ]] );
    $converter->convert($string);
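
A concrete call might look like the sketch below. It assumes the source data is ISO-8859-1, as in the earlier slides, and it returns null when the intl extension is not loaded so the caller can fall back to another converter. `transcodeToUtf8` is a hypothetical helper name.

```php
<?php
/**
 * Convert $value to UTF-8 with intl's UConverter.
 * Returns null when intl is unavailable or the conversion fails.
 */
function transcodeToUtf8(string $value, string $from = 'ISO-8859-1'): ?string
{
    if (!class_exists('UConverter')) {
        return null; // intl extension not loaded
    }
    // Destination encoding comes first, source encoding second.
    $converter = new UConverter('UTF-8', $from);
    $result = $converter->convert($value);
    return $result === false ? null : $result;
}

$converted = transcodeToUtf8("Se\xE1n"); // "Seán" in ISO-8859-1
```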

PHP SOURCE FILES - YOUR IDE
All team members should have the same settings