Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Going international
Search
Sponsored
·
SiteGround - Reliable hosting with speed, security, and support you can count on.
→
Apostolis Bessas
July 03, 2012
Programming
95
2
Share
Going international
issues with internationalizing your python application.
Apostolis Bessas
July 03, 2012
Other Decks in Programming
See All in Programming
AI時代の脳疲弊と向き合う ~言語学としてのPHP~
sakuraikotone
1
1.9k
Make GenAI Production-Ready with Kubernetes Patterns
bibryam
0
110
YJITとZJITにはイカなる違いがあるのか?
nakiym
0
200
Linux Kernelの1文字のミスで 権限昇格ができた話
rqda
0
2.3k
夢の無限スパゲッティ製造機 -実装篇- #phpstudy
o0h
PRO
0
200
今年もTECHSCOREブログを書き続けます!
hiraoku101
0
250
10年分の技術的負債、完済へ ― Claude Code主導のAI駆動開発でスポーツブルを丸ごとリプレイスした話
takuya_houshima
0
2.4k
仕様漏れ実装漏れをなくすトレーサビリティAI基盤のご紹介
orgachem
PRO
9
5.6k
UIの境界線をデザインする | React Tokyo #15 メイントーク
sasagar
1
280
ふりがな Deep Dive try! Swift Tokyo 2026
watura
0
190
RSAが破られる前に知っておきたい 耐量子計算機暗号(PQC)入門 / Intro to PQC: Preparing for the Post-RSA Era
mackey0225
3
130
How Swift's Type System Guides AI Agents
koher
0
230
Featured
See All Featured
From π to Pie charts
rasagy
0
160
Applied NLP in the Age of Generative AI
inesmontani
PRO
4
2.2k
Chrome DevTools: State of the Union 2024 - Debugging React & Beyond
addyosmani
10
1.1k
Templates, Plugins, & Blocks: Oh My! Creating the theme that thinks of everything
marktimemedia
31
2.7k
Building an army of robots
kneath
306
46k
brightonSEO & MeasureFest 2025 - Christian Goodrich - Winning strategies for Black Friday CRO & PPC
cargoodrich
3
680
Accessibility Awareness
sabderemane
0
99
[RailsConf 2023] Rails as a piece of cake
palkan
59
6.5k
Deep Space Network (abreviated)
tonyrice
0
110
Beyond borders and beyond the search box: How to win the global "messy middle" with AI-driven SEO
davidcarrasco
3
110
XXLCSS - How to scale CSS and keep your sanity
sugarenia
250
1.3M
The Pragmatic Product Professional
lauravandoore
37
7.2k
Transcript
. . . . . . . Going International Apostolis
Bessas
[email protected]
@mpessas July 3, 2012
. . Transifex
Unicode
Bits & Bytes . . 10010001100101110110011011001101111010110001000001010111110 111111100101101100110010001000010001010 4 / 37
Bits & Bytes . . 10010001100101110110011011001101111010110001000001010111110 111111100101101100110010001000010001010 . . Encodings
4 / 37
Character encoding . . A character encoding system consists of
a code that pairs each character from a given repertoire with something else. Wikipedia 5 / 37
ASCII § 7-bit code § Characters for the English alphabet.
6 / 37
ASCII § 7-bit code § Characters for the English alphabet.
6 / 37
Unicode . . Assign every possible character a unique code
point. 7 / 37
Unicode . . Assign every possible character a unique code
point. § A → U+0041 § a → U+0061 7 / 37
UTF-8 . . Just another character encoding for Unicode. 8
/ 37
Python 2.x 9 / 37
str s = 'A string' § Encoded strings. § ASCII
by default. 10 / 37
unicode # -*- coding: utf-8 -*- u = u'A string'
§ Strings stored in the internal representation. § Unicode literals 11 / 37
Conversion u.encode('UTF-8').decode('UTF-8') 12 / 37
Best practices § Always use unicode strings. § Decode in
input and encode in output. § Test against unicode strings. 13 / 37
Best practices § Always use unicode strings. § Decode in
input and encode in output. § Test against unicode strings. import codecs codecs.open(filename, encoding=encoding) 13 / 37
Python 3 § Strings and bytes 14 / 37
Python 3 § Strings and bytes (Unicode literals are back
in 3.3) 14 / 37
Python 3 § Strings and bytes (Unicode literals are back
in 3.3) § No need to use the codecs module any more. 14 / 37
i18n & l10n
Formats § Gettext (PO files) § TS files (Qt) §
YAML 16 / 37
Choice? Use a real format: § Plurals support § Context
§ Comments § Suggestions 17 / 37
Gettext § Mark translation strings. § Extract them (PO files).
§ Translate them. § Compile them (MO files). § Load in the application. 18 / 37
Source code . . https://github.com/mpessas/going_international/ 19 / 37
Initialization import gettext # Set up message catalog access t
= gettext.translation( 'myapplication', 'locale', fallback=True ) _ = t.ugettext 20 / 37
Usage def greet_user(user): print _(u'Hello, %s.') % user 21 /
37
Plurals children = {'John': 1, 'Mary': 3} def report_children(user): print
t.ungettext( u'You have %s child', u'You have %s children', children[user] ) % children[user] 22 / 37
Extract xgettext -d myapplication -o app.pot l10n.py vim app.pot 23
/ 37
POT file headers #, fuzzy msgid "" msgstr "" "Project-Id-Version:
0.1\n" "Report-Msgid-Bugs-To: http://github.com/mpessas/ going_international/issues\n" "POT-Creation-Date: 2012-06-30 09:45+0300\n" "PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n" "Last-Translator: FULL NAME <EMAIL@ADDRESS>\n" "Language-Team: LANGUAGE <
[email protected]
>\n" "Language: \n" "MIME-Version: 1.0\n" "Content-Type: text/plain; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" "Plural-Forms: nplurals=INTEGER; plural=EXPRESSION;\n" 24 / 37
POT file content #: l10n.py:10 #, python-format msgid "Hello, %s."
msgstr "" #: l10n.py:17 #, python-format msgid "You have %s child" msgid_plural "You have %s children" msgstr[0] "" msgstr[1] "" 25 / 37
PO files mkdir -p locale/en/LC_MESSAGES/ msginit -i app.pot -o locale/en/LC_MESSAGES/en.po
-l en msgfmt locale/en/LC_MESSAGES/en.po -o \ locale/en/LC_MESSAGES/myapplication.mo mkdir -p locale/el/LC_MESSAGES/ msginit -i app.pot -o locale/el/LC_MESSAGES/el.po -l el vim locale/el/LC_MESSAGES/el.po msgfmt locale/el/LC_MESSAGES/el.po -o \ locale/el/LC_MESSAGES/myapplication.mo mkdir -p locale/it/LC_MESSAGES/ msginit -i app.pot -o locale/it/LC_MESSAGES/it.po -l it vim locale/it/LC_MESSAGES/it.po msgfmt locale/el/LC_MESSAGES/el.po -o \ locale/el/LC_MESSAGES/myapplication.mo 26 / 37
PO header msgid "" msgstr "" "Project-Id-Version: 0.1\n" "Report-Msgid-Bugs-To: \
http://github.com/mpessas/going_international/issues\n" "POT-Creation-Date: 2012-06-30 09:45+0300\n" "PO-Revision-Date: 2012-06-30 09:51+0300\n" "Last-Translator: <
[email protected]
>\n" "Language-Team: Italian\n" "Language: it\n" "MIME-Version: 1.0\n" "Content-Type: text/plain; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" "Plural-Forms: nplurals=2; plural=(n != 1);\n" 27 / 37
PO content #: l10n.py:10 #, python-format msgid "Hello, %s." msgstr
"Ciao, %s." #: l10n.py:17 #, python-format msgid "You have %s child" msgid_plural "You have %s children" msgstr[0] "" msgstr[1] "" 28 / 37
Execution bash> LANG=it python2 l10n.py Ciao, John. You have 1
child Ciao, Mary. You have 3 children 29 / 37
Plural equation for arabic n == 0 ? 0 :
n == 1 ? 1 : n == 2 ? 2 : n % 100 >= 3 && n % 100 <= 10 ? 3 : n % 100 >= 11 && n % 100 <= 99 ? 4 : 5 30 / 37
Timezone handling
The mess with timezones § Daylight Saving Time (DST) §
Past changes 32 / 37
UTC § Coordinated Universal Time § All timezones are based
on that. 33 / 37
UTC § Coordinated Universal Time § All timezones are based
on that. . . Internally, only use times based on UTC. Convert them to localtime on output. 33 / 37
datetime § Naive (does not have timezone information attached) §
Aware (has timezone information attached) 34 / 37
datetime § Naive (does not have timezone information attached) §
Aware (has timezone information attached) . . The two do not work together. 34 / 37
pytz § Timezone database § Saner conversions 35 / 37
Usage import pytz from datetime import datetime u = datetime.utcnow().replace(tzinfo=pytz.utc)
r = u.astimezone(pytz.timezone('Europe/Rome')) 36 / 37
Questions?