Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Converting Legacy Applications To UTF-8
Search
Tim Swann
June 04, 2015
Technology
0
70
Converting Legacy Applications To UTF-8
What are all the angles to be considered when you decide to standardise your text to UTF-8
Tim Swann
June 04, 2015
Tweet
Share
More Decks by Tim Swann
See All by Tim Swann
PHP Collection Classes | PHPBelfast
faffyman
0
750
Graph data with Neo4J PHP & Neoxygen
faffyman
0
2k
Other Decks in Technology
See All in Technology
データ基盤からデータベースまで?広がるユースケースのDatabricksについて教えるよ!
akuwano
3
160
助けて! XからWaylandに移行しないと新しいGNOMEが使えなくなっちゃう 2025-07-12
nobutomurata
2
140
Contributing to Rails? Start with the Gems You Already Use
yahonda
2
120
United™️ Airlines®️ Customer®️ USA Contact Numbers: Complete 2025 Support Guide
flyunitedguide
0
780
CDK Vibe Coding Fes
tomoki10
1
530
Delta airlines Customer®️ USA Contact Numbers: Complete 2025 Support Guide
deltahelp
0
1.1k
american airlines®️ USA Contact Numbers: Complete 2025 Support Guide
supportflight
1
120
〜『世界中の家族のこころのインフラ』を目指して”次の10年”へ〜 SREが導いたグローバルサービスの信頼性向上戦略とその舞台裏 / Towards the Next Decade: Enhancing Global Service Reliability
kohbis
3
1.1k
OpenTelemetryセマンティック規約の恩恵とMackerel APMにおける活用例 / SRE NEXT 2025
mackerelio
3
1.6k
モニタリング統一への道のり - 分散モニタリングツール統合のためのオブザーバビリティプロジェクト
niftycorp
PRO
1
360
オフィスビルを監視しよう:フィジカル×デジタルにまたがるSLI/SLO設計と運用の難しさ / Monitoring Office Buildings: The Challenge of Physical-Digital SLI/SLO Design & Operation
bitkey
1
350
SEQUENCE object comparison - db tech showcase 2025 LT2
nori_shinoda
0
280
Featured
See All Featured
RailsConf & Balkan Ruby 2019: The Past, Present, and Future of Rails at GitHub
eileencodes
138
34k
CoffeeScript is Beautiful & I Never Want to Write Plain JavaScript Again
sstephenson
161
15k
Large-scale JavaScript Application Architecture
addyosmani
512
110k
Writing Fast Ruby
sferik
628
62k
ReactJS: Keep Simple. Everything can be a component!
pedronauck
667
120k
Keith and Marios Guide to Fast Websites
keithpitt
411
22k
A better future with KSS
kneath
238
17k
実際に使うSQLの書き方 徹底解説 / pgcon21j-tutorial
soudai
PRO
181
54k
How to Create Impact in a Changing Tech Landscape [PerfNow 2023]
tammyeverts
53
2.9k
4 Signs Your Business is Dying
shpigford
184
22k
Responsive Adventures: Dirty Tricks From The Dark Corners of Front-End
smashingmag
251
21k
Improving Core Web Vitals using Speculation Rules API
sergeychernyshev
18
980
Transcript
CHARACTER FLAWS CHANGING A LEGACY APPLICATION TO UTF-8 June 4th
2015
• Legacy Applications are a mish-mash of encodings • Ever
dealt with Irish names? • French / German / Eastern European? • What about a €uro Symbol? THE PROBLEM
Standardise your character encoding THE SOLUTION Simple - Right?
<?php header('Content-Type: text/html; charset=UTF-8') ; ?> <meta http-equiv="Content-Type"
content="text/html; charset=UTF-8" /> <meta charset="UTF-8" /> HEADERS All Done… …Just Like Austria in 1982 - Pack up and go home
Set Character Encodings Everywhere CHARACTER IN ≠ CHARACTER OUT Front
End Pages ✔ Database ✔ Tables ✔ Columns ✔ Forms ✔ Server(s) ✔
DATABASE SET NAMES utf8 ; Connections Need to be UTF-8
Aware
DATABASE ALTER DATABASE database_name DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci
Create a UTF-8 Database
DATABASE Change Existing Tables to UTF-8 ALTER TABLE table_name DEFAULT
CHARACTER SET utf8 COLLATE utf8_unicode_ci ;
DATABASE ALTER TABLE table_name CHANGE title title VARCHAR( 128 )
CHARACTER SET utf8 COLLATE utf8_unicode_ci ; Columns Need to be UTF-8
FORMS <form action=“post.php" method="post" enctype="multipart/form-data" accept-charset="UTF-8" > POST only
UTF-8 data
PHP INI default_charset = "utf-8"
PHP TEXT CONVERSION <?php utf8_encode ( $string) ; PHP Functions
Only Encodes ISO-8859-1 Strings
PHP TEXT CONVERSION <?php iconv('ISO-8859-1', 'UTF-8//TRANSLIT', $string); // Translit
= find a similar char e.. ä => a mb_convert_encoding($string, 'UTF-8', ‘ISO-8859-1') // requires php-mbstring extension for multi-bytue support PHP Functions
PHP TEXT CONVERSION <?php htmlentities ( $string, ENT_QUOTES, 'UTF-8') ;
PHP Functions Specify encoding when escaping output
PHP TEXT CONVERSION <?php new UConverter([ string $destination_encoding [, string
$source_encoding ]] ) $unconvertor->convert( $string ) ; Since PHP 5.5 - UConvertor
PHP SOURCE FILES - YOUR IDE All Team members should
have same settings