Going international
Apostolis Bessas
July 03, 2012
Issues with internationalizing your Python application.
Transcript
Going International
Apostolis Bessas
[email protected]
@mpessas
July 3, 2012
Transifex
Unicode
Bits & Bytes
    10010001100101110110011011001101111010110001000001010111110
    111111100101101100110010001000010001010
Encodings
Character encoding
"A character encoding system consists of a code that pairs each character from a given repertoire with something else." (Wikipedia)
ASCII
§ 7-bit code
§ Characters for the English alphabet.
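To make the 7-bit limit concrete, here is a minimal sketch in Python 3. The helper name `is_ascii` is made up for this example and is not part of the talk.

```python
def is_ascii(text):
    """Return True if every character fits in ASCII's 7-bit range (0-127)."""
    return all(ord(ch) < 128 for ch in text)


print(is_ascii("Hello"))   # True: English letters fit in 7 bits
print(is_ascii("Γειά"))    # False: Greek letters fall outside ASCII
```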
Unicode
Assign every possible character a unique code point.
§ A → U+0041
§ a → U+0061
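The two code points above can be checked with the built-ins `ord()` and `chr()`. This is a small Python 3 sketch; the helper `code_point` is a hypothetical name used only for illustration.

```python
def code_point(ch):
    """Format a single character's code point in the usual U+XXXX notation."""
    return "U+{:04X}".format(ord(ch))


print(code_point("A"))   # U+0041
print(code_point("a"))   # U+0061
print(chr(0x0041))       # A  (chr() goes from code point back to character)
```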
UTF-8
Just another character encoding for Unicode.
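As a rough illustration of UTF-8 being one encoding of Unicode code points, the Python 3 sketch below shows that different characters take a different number of bytes. The helper `utf8_lengths` is an illustrative name, not something from the slides.

```python
def utf8_lengths(text):
    """Return the number of UTF-8 bytes needed for each character of text."""
    return [len(ch.encode("utf-8")) for ch in text]


print(utf8_lengths("Aα€"))   # [1, 2, 3]
print("€".encode("utf-8"))   # b'\xe2\x82\xac'
```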
Python 2.x
str
    s = 'A string'
§ Encoded strings.
§ ASCII by default.
unicode
    # -*- coding: utf-8 -*-
    u = u'A string'
§ Strings stored in the internal representation.
§ Unicode literals
Conversion
    u.encode('UTF-8').decode('UTF-8')
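The same round trip, sketched for Python 3, where `str` plays the role of Python 2's `unicode` and `bytes` the role of `str`. The sample text is an arbitrary choice for illustration.

```python
text = "Γειά σου"                # str: a sequence of Unicode code points
data = text.encode("utf-8")      # bytes: what actually goes to files and sockets

# Decoding with the same codec gives back the original text.
assert data.decode("utf-8") == text

# Decoding with the wrong codec fails (or, with other codecs, yields garbage).
try:
    data.decode("ascii")
except UnicodeDecodeError as exc:
    print("cannot decode as ASCII:", exc.reason)
```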
Best practices
§ Always use unicode strings.
§ Decode on input and encode on output.
§ Test against unicode strings.
    import codecs
    codecs.open(filename, encoding=encoding)
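One way to picture "decode on input, encode on output" is a function that touches bytes only at its boundaries. This is a Python 3 sketch; the function `shout` and its default encoding are assumptions made for the example.

```python
def shout(raw_input, encoding="utf-8"):
    """Decode bytes at the boundary, work on text, encode on the way out."""
    text = raw_input.decode(encoding)    # input: bytes -> text
    result = text.upper()                # all processing happens on text
    return result.encode(encoding)       # output: text -> bytes


print(shout("καλημέρα".encode("utf-8")).decode("utf-8"))
```

Keeping the conversions at the edges means the middle of the program never has to guess which encoding a value is in.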
Python 3
§ Strings and bytes (Unicode literals are back in 3.3)
§ No need to use the codecs module any more.
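In Python 3 the built-in `open()` accepts an `encoding` argument, which is why `codecs.open` is no longer needed. The sketch below assumes a temporary file; `roundtrip` and the file name are made up for the example.

```python
import os
import tempfile


def roundtrip(text, encoding="utf-8"):
    """Write text to a file, then read it back both as bytes and as text."""
    path = os.path.join(tempfile.mkdtemp(), "greeting.txt")
    with open(path, "w", encoding=encoding) as fh:   # text in, bytes on disk
        fh.write(text)
    with open(path, "rb") as fh:                     # binary mode returns bytes
        raw = fh.read()
    with open(path, encoding=encoding) as fh:        # text mode returns str
        decoded = fh.read()
    return raw, decoded


raw, decoded = roundtrip("Γειά")
print(type(raw).__name__, type(decoded).__name__)    # bytes str
```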
i18n & l10n
Formats
§ Gettext (PO files)
§ TS files (Qt)
§ YAML
Choice?
Use a real format:
§ Plurals support
§ Context
§ Comments
§ Suggestions
Gettext
§ Mark translation strings.
§ Extract them (PO files).
§ Translate them.
§ Compile them (MO files).
§ Load in the application.
Source code
https://github.com/mpessas/going_international/
Initialization
    import gettext

    # Set up message catalog access
    t = gettext.translation(
        'myapplication', 'locale', fallback=True
    )
    _ = t.ugettext
Usage
    def greet_user(user):
        print _(u'Hello, %s.') % user
Plurals
    children = {'John': 1, 'Mary': 3}

    def report_children(user):
        print t.ungettext(
            u'You have %s child',
            u'You have %s children',
            children[user]
        ) % children[user]
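For Python 3, `ungettext` becomes `ngettext`. The sketch below mirrors the slide but returns the string instead of printing it. It assumes no compiled catalog is present, in which case the fallback object applies the English rule (singular only when the count is 1).

```python
import gettext

t = gettext.translation("myapplication", "locale", fallback=True)
children = {"John": 1, "Mary": 3}


def report_children(user):
    """Pick the singular or plural message based on the user's child count."""
    count = children[user]
    message = t.ngettext("You have %s child", "You have %s children", count)
    return message % count


print(report_children("John"))
print(report_children("Mary"))
```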
Extract
    xgettext -d myapplication -o app.pot l10n.py
    vim app.pot
POT file headers
    #, fuzzy
    msgid ""
    msgstr ""
    "Project-Id-Version: 0.1\n"
    "Report-Msgid-Bugs-To: http://github.com/mpessas/going_international/issues\n"
    "POT-Creation-Date: 2012-06-30 09:45+0300\n"
    "PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
    "Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
    "Language-Team: LANGUAGE <[email protected]>\n"
    "Language: \n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=UTF-8\n"
    "Content-Transfer-Encoding: 8bit\n"
    "Plural-Forms: nplurals=INTEGER; plural=EXPRESSION;\n"
POT file content
    #: l10n.py:10
    #, python-format
    msgid "Hello, %s."
    msgstr ""

    #: l10n.py:17
    #, python-format
    msgid "You have %s child"
    msgid_plural "You have %s children"
    msgstr[0] ""
    msgstr[1] ""
PO files
    mkdir -p locale/en/LC_MESSAGES/
    msginit -i app.pot -o locale/en/LC_MESSAGES/en.po -l en
    msgfmt locale/en/LC_MESSAGES/en.po -o \
        locale/en/LC_MESSAGES/myapplication.mo

    mkdir -p locale/el/LC_MESSAGES/
    msginit -i app.pot -o locale/el/LC_MESSAGES/el.po -l el
    vim locale/el/LC_MESSAGES/el.po
    msgfmt locale/el/LC_MESSAGES/el.po -o \
        locale/el/LC_MESSAGES/myapplication.mo

    mkdir -p locale/it/LC_MESSAGES/
    msginit -i app.pot -o locale/it/LC_MESSAGES/it.po -l it
    vim locale/it/LC_MESSAGES/it.po
    msgfmt locale/it/LC_MESSAGES/it.po -o \
        locale/it/LC_MESSAGES/myapplication.mo
PO header
    msgid ""
    msgstr ""
    "Project-Id-Version: 0.1\n"
    "Report-Msgid-Bugs-To: http://github.com/mpessas/going_international/issues\n"
    "POT-Creation-Date: 2012-06-30 09:45+0300\n"
    "PO-Revision-Date: 2012-06-30 09:51+0300\n"
    "Last-Translator: <[email protected]>\n"
    "Language-Team: Italian\n"
    "Language: it\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=UTF-8\n"
    "Content-Transfer-Encoding: 8bit\n"
    "Plural-Forms: nplurals=2; plural=(n != 1);\n"
PO content
    #: l10n.py:10
    #, python-format
    msgid "Hello, %s."
    msgstr "Ciao, %s."

    #: l10n.py:17
    #, python-format
    msgid "You have %s child"
    msgid_plural "You have %s children"
    msgstr[0] ""
    msgstr[1] ""
Execution
    bash> LANG=it python2 l10n.py
    Ciao, John.
    You have 1 child
    Ciao, Mary.
    You have 3 children
Plural equation for Arabic
    n == 0 ? 0 :
    n == 1 ? 1 :
    n == 2 ? 2 :
    n % 100 >= 3 && n % 100 <= 10 ? 3 :
    n % 100 >= 11 && n % 100 <= 99 ? 4 :
    5
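The C-style expression above translates directly into Python. The function name `arabic_plural_index` is hypothetical; the returned value is the index of the `msgstr[N]` form that gettext would select for a count `n`.

```python
def arabic_plural_index(n):
    """Return which of the six Arabic plural forms applies to the count n."""
    if n == 0:
        return 0
    if n == 1:
        return 1
    if n == 2:
        return 2
    if 3 <= n % 100 <= 10:
        return 3
    if 11 <= n % 100 <= 99:
        return 4
    return 5


for n in (0, 1, 2, 3, 11, 100, 103):
    print(n, arabic_plural_index(n))
```

Compare this with the Italian header shown earlier, where `plural=(n != 1)` yields only two forms.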
Timezone handling
The mess with timezones
§ Daylight Saving Time (DST)
§ Past changes
UTC
§ Coordinated Universal Time
§ All timezones are based on that.
Internally, only use times based on UTC. Convert them to local time on output.
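A minimal Python 3 sketch of "UTC inside, local time on output", using only the standard library. The fixed UTC+2 offset stands in for a user's zone and is an assumption for the example; real zones with DST need a timezone database, which the deck turns to next.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical fixed offset standing in for a user's summer-time zone (UTC+2).
USER_TZ = timezone(timedelta(hours=2), "UTC+2")


def to_local(utc_dt, tz=USER_TZ):
    """Convert an aware UTC datetime to the user's zone, for display only."""
    return utc_dt.astimezone(tz)


stored = datetime(2012, 7, 3, 12, 0, tzinfo=timezone.utc)   # kept in UTC internally
print(to_local(stored).isoformat())   # 2012-07-03T14:00:00+02:00
```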
datetime
§ Naive (does not have timezone information attached)
§ Aware (has timezone information attached)
The two do not work together.
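A short Python 3 sketch of what "do not work together" means in practice: arithmetic between a naive and an aware datetime raises `TypeError`. The helper `can_subtract` is an illustrative name only.

```python
from datetime import datetime, timezone

naive = datetime(2012, 7, 3, 12, 0)                        # no tzinfo attached
aware = datetime(2012, 7, 3, 12, 0, tzinfo=timezone.utc)   # tzinfo attached


def can_subtract(a, b):
    """Return True if a - b is allowed, False if datetime refuses to mix them."""
    try:
        a - b
    except TypeError:
        return False
    return True


print(can_subtract(aware, aware))   # True
print(can_subtract(naive, aware))   # False

# Attaching a timezone to the naive value makes the two compatible.
fixed = naive.replace(tzinfo=timezone.utc)
print(can_subtract(fixed, aware))   # True
```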
pytz
§ Timezone database
§ Saner conversions
Usage
    import pytz
    from datetime import datetime

    u = datetime.utcnow().replace(tzinfo=pytz.utc)
    r = u.astimezone(pytz.timezone('Europe/Rome'))
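The slide uses pytz. As an alternative, not the author's method, Python 3.9+ ships `zoneinfo` in the standard library, which can do the same conversion. The sketch assumes a timezone database is available on the system (or via the `tzdata` package); `in_rome` is a made-up helper name.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo   # standard library since Python 3.9

ROME = ZoneInfo("Europe/Rome")


def in_rome(utc_dt):
    """Convert an aware UTC datetime to local time in Rome."""
    return utc_dt.astimezone(ROME)


# Current time: start from an aware UTC value rather than a naive one.
print(in_rome(datetime.now(timezone.utc)))

# DST is handled by the database: noon UTC is 14:00 in July but 13:00 in January.
summer = in_rome(datetime(2012, 7, 3, 12, 0, tzinfo=timezone.utc))
winter = in_rome(datetime(2012, 1, 15, 12, 0, tzinfo=timezone.utc))
print(summer.hour, winter.hour)   # 14 13
```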
Questions?