Going international
Apostolis Bessas
July 03, 2012
Issues with internationalizing your Python application.
Transcript

Going International
Apostolis Bessas
[email protected]
@mpessas
July 3, 2012

Transifex

Unicode

Bits & Bytes
10010001100101110110011011001101111010110001000001010111110
111111100101101100110010001000010001010
Encodings

Character encoding
"A character encoding system consists of a code that pairs each character from a given repertoire with something else." (Wikipedia)

ASCII
§ 7-bit code
§ Characters for the English alphabet.
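
A minimal sketch of what the 7-bit limit means in practice. It is written for Python 3 (an assumption; the deck itself targets Python 2.x), and the sample strings are illustrative only.

```python
# ASCII defines 128 code values (0-127): enough for unaccented English text.
text = "Hello"
encoded = text.encode("ascii")      # str -> bytes
print(list(encoded))                # one byte per character, each below 128

# A character outside the ASCII repertoire cannot be encoded at all.
try:
    "caf\u00e9".encode("ascii")     # the e-acute has no ASCII code
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True
print(ascii_failed)
```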

Unicode
Assign every possible character a unique code point.
§ A → U+0041
§ a → U+0061
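
The mapping on this slide can be checked from the interpreter. A short sketch, assuming Python 3, where `ord()` returns a character's code point and `chr()` is the inverse; the Greek alpha is an extra sample not taken from the slide.

```python
# Map each sample character to its code point in the usual U+XXXX notation.
samples = ("A", "a", "\u03b1")   # the last one is Greek small alpha
code_points = {ch: "U+%04X" % ord(ch) for ch in samples}

for ch in samples:
    print(ch, code_points[ch])

# chr() goes the other way: code point -> character.
assert chr(0x0041) == "A"
```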

UTF-8
Just another character encoding for Unicode.
Python 2.x

str
    s = 'A string'
§ Encoded strings.
§ ASCII by default.

unicode
    # -*- coding: utf-8 -*-
    u = u'A string'
§ Strings stored in the internal representation.
§ Unicode literals

Conversion
    u.encode('UTF-8').decode('UTF-8')
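
The round trip above, spelled out step by step. This is a sketch assuming Python 3, where the deck's `unicode` type is called `str` and the deck's `str` type is called `bytes`; the sample word is illustrative.

```python
u = "caf\u00e9"                   # text: 4 code points
b = u.encode("UTF-8")             # text -> bytes
round_trip = b.decode("UTF-8")    # bytes -> text

# UTF-8 is variable width: the accented letter takes two bytes.
print(len(u), len(b))
print(round_trip == u)
```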

Best practices
§ Always use unicode strings.
§ Decode on input and encode on output.
§ Test against unicode strings.
    import codecs
    # Returns a file object that decodes to unicode on read
    codecs.open(filename, encoding=encoding)
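
One way to apply "decode on input, encode on output" is to confine all byte handling to two small helpers at the edges of the program. The sketch below uses `io.open`, which exists in both Python 2.6+ and Python 3; the helper names `read_text` and `write_text` and the temporary file are hypothetical, not from the deck.

```python
import io
import os
import tempfile

def read_text(path, encoding="utf-8"):
    """Input boundary: bytes on disk become text in memory."""
    with io.open(path, encoding=encoding) as f:
        return f.read()

def write_text(path, text, encoding="utf-8"):
    """Output boundary: text in memory becomes bytes on disk."""
    with io.open(path, "w", encoding=encoding) as f:
        f.write(text)

# Everything between the two boundaries works on text only.
path = os.path.join(tempfile.mkdtemp(), "greeting.txt")
write_text(path, u"Ciao, Jos\u00e9.\n")
text = read_text(path)
print(text.upper())
```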

Python 3
§ Strings and bytes (Unicode literals are back in 3.3)
§ No need to use the codecs module any more.
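
A short sketch of the strings/bytes split in Python 3. The sample word is illustrative; the point is that the two types no longer mix implicitly, so a missing encode or decode fails loudly instead of producing garbage.

```python
text = "na\u00efve"             # str: a sequence of code points
data = text.encode("utf-8")     # bytes: a sequence of integers 0-255

# Concatenating text and bytes is an error in Python 3.
try:
    text + data
    mixed_ok = True
except TypeError:
    mixed_ok = False

print(type(text).__name__, type(data).__name__, mixed_ok)
```

The built-in `open()` also accepts an `encoding` argument in Python 3, which is why `codecs.open` is no longer needed there.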
i18n & l10n

Formats
§ Gettext (PO files)
§ TS files (Qt)
§ YAML

Choice?
Use a real format:
§ Plurals support
§ Context
§ Comments
§ Suggestions

Gettext
§ Mark translation strings.
§ Extract them (PO files).
§ Translate them.
§ Compile them (MO files).
§ Load in the application.

Source code
https://github.com/mpessas/going_international/

Initialization
    import gettext

    # Set up message catalog access
    t = gettext.translation(
        'myapplication', 'locale', fallback=True
    )
    # Alias the unicode-returning lookup as _ (Python 2)
    _ = t.ugettext

Usage
    def greet_user(user):
        # Translate the template first, then interpolate
        print _(u'Hello, %s.') % user

Plurals
    children = {'John': 1, 'Mary': 3}

    def report_children(user):
        # ungettext picks the singular or plural form for the count
        print t.ungettext(
            u'You have %s child',
            u'You have %s children',
            children[user]
        ) % children[user]
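
A sketch of how the plural lookup behaves when no catalog is loaded, assuming Python 3 (where `ungettext` is spelled `ngettext`). `NullTranslations` is used directly here so the result does not depend on any files on disk; it falls back to the English-style rule of singular for 1 and plural otherwise.

```python
import gettext

# No catalog: NullTranslations returns the singular msgid for n == 1
# and the plural msgid for every other count.
t = gettext.NullTranslations()
children = {"John": 1, "Mary": 3}

def report_children(user):
    n = children[user]
    return t.ngettext("You have %s child", "You have %s children", n) % n

for name in ("John", "Mary"):
    print(report_children(name))
```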

Extract
    xgettext -d myapplication -o app.pot l10n.py
    vim app.pot

POT file headers
    #, fuzzy
    msgid ""
    msgstr ""
    "Project-Id-Version: 0.1\n"
    "Report-Msgid-Bugs-To: http://github.com/mpessas/going_international/issues\n"
    "POT-Creation-Date: 2012-06-30 09:45+0300\n"
    "PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
    "Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
    "Language-Team: LANGUAGE <[email protected]>\n"
    "Language: \n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=UTF-8\n"
    "Content-Transfer-Encoding: 8bit\n"
    "Plural-Forms: nplurals=INTEGER; plural=EXPRESSION;\n"

POT file content
    #: l10n.py:10
    #, python-format
    msgid "Hello, %s."
    msgstr ""

    #: l10n.py:17
    #, python-format
    msgid "You have %s child"
    msgid_plural "You have %s children"
    msgstr[0] ""
    msgstr[1] ""

PO files
    mkdir -p locale/en/LC_MESSAGES/
    msginit -i app.pot -o locale/en/LC_MESSAGES/en.po -l en
    msgfmt locale/en/LC_MESSAGES/en.po -o \
        locale/en/LC_MESSAGES/myapplication.mo

    mkdir -p locale/el/LC_MESSAGES/
    msginit -i app.pot -o locale/el/LC_MESSAGES/el.po -l el
    vim locale/el/LC_MESSAGES/el.po
    msgfmt locale/el/LC_MESSAGES/el.po -o \
        locale/el/LC_MESSAGES/myapplication.mo

    mkdir -p locale/it/LC_MESSAGES/
    msginit -i app.pot -o locale/it/LC_MESSAGES/it.po -l it
    vim locale/it/LC_MESSAGES/it.po
    msgfmt locale/it/LC_MESSAGES/it.po -o \
        locale/it/LC_MESSAGES/myapplication.mo

PO header
    msgid ""
    msgstr ""
    "Project-Id-Version: 0.1\n"
    "Report-Msgid-Bugs-To: http://github.com/mpessas/going_international/issues\n"
    "POT-Creation-Date: 2012-06-30 09:45+0300\n"
    "PO-Revision-Date: 2012-06-30 09:51+0300\n"
    "Last-Translator: <[email protected]>\n"
    "Language-Team: Italian\n"
    "Language: it\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=UTF-8\n"
    "Content-Transfer-Encoding: 8bit\n"
    "Plural-Forms: nplurals=2; plural=(n != 1);\n"

PO content
    #: l10n.py:10
    #, python-format
    msgid "Hello, %s."
    msgstr "Ciao, %s."

    #: l10n.py:17
    #, python-format
    msgid "You have %s child"
    msgid_plural "You have %s children"
    msgstr[0] ""
    msgstr[1] ""

Execution
    bash> LANG=it python2 l10n.py
    Ciao, John.
    You have 1 child
    Ciao, Mary.
    You have 3 children

Plural equation for Arabic
    n == 0 ? 0 :
    n == 1 ? 1 :
    n == 2 ? 2 :
    n % 100 >= 3 && n % 100 <= 10 ? 3 :
    n % 100 >= 11 && n % 100 <= 99 ? 4 :
    5
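
The expression above is a chain of C-style ternaries that gettext evaluates to pick one of six `msgstr[i]` entries. A direct transcription into Python may make it easier to read; the function name `arabic_plural_index` is hypothetical.

```python
def arabic_plural_index(n):
    """Return the msgstr index (0-5) chosen by the Arabic plural rule."""
    if n == 0:
        return 0
    if n == 1:
        return 1
    if n == 2:
        return 2
    if 3 <= n % 100 <= 10:
        return 3
    if 11 <= n % 100 <= 99:
        return 4
    return 5   # remaining counts: 100, 101, 102, 200, ...

for n in (0, 1, 2, 3, 11, 100, 103):
    print(n, arabic_plural_index(n))
```

Compare this with the Italian header earlier in the deck, where `nplurals=2; plural=(n != 1)` needs only two forms.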
Timezone handling

The mess with timezones
§ Daylight Saving Time (DST)
§ Past changes

UTC
§ Coordinated Universal Time
§ All timezones are based on that.
Internally, only use times based on UTC. Convert them to local time on output.
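
A minimal sketch of "UTC inside, local time at the edge", using only the Python 3 standard library. The fixed +02:00 offset is an assumption standing in for Rome's summer time; a real application would take the zone from a timezone database so that DST and past rule changes are handled.

```python
from datetime import datetime, timedelta, timezone

# Stored value: always UTC.
stored = datetime(2012, 7, 3, 12, 0, tzinfo=timezone.utc)

# Assumption: a fixed offset in place of a real tz database entry.
rome_summer = timezone(timedelta(hours=2), "CEST")

# Convert only when presenting the value to the user.
shown = stored.astimezone(rome_summer)
print(shown.isoformat())

# Both objects still denote the same instant.
print(shown == stored)
```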

datetime
§ Naive (does not have timezone information attached)
§ Aware (has timezone information attached)
The two do not work together.
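
What "do not work together" looks like in practice, sketched for Python 3 with the standard library's `timezone.utc` (an assumption; the deck uses pytz on Python 2). Ordering comparisons and subtraction between the two kinds raise `TypeError`.

```python
from datetime import datetime, timezone

naive = datetime(2012, 7, 3, 12, 0)                        # no tzinfo
aware = datetime(2012, 7, 3, 12, 0, tzinfo=timezone.utc)   # tzinfo attached

try:
    naive < aware
    comparable = True
except TypeError:
    comparable = False

try:
    aware - naive
    subtractable = True
except TypeError:
    subtractable = False

print(comparable, subtractable)

# Attaching a timezone to the naive value makes the two compatible.
fixed = naive.replace(tzinfo=timezone.utc)
print(fixed == aware)
```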

pytz
§ Timezone database
§ Saner conversions

Usage
    import pytz
    from datetime import datetime

    # Current time as an aware UTC datetime
    u = datetime.utcnow().replace(tzinfo=pytz.utc)
    # Convert to local time for display
    r = u.astimezone(pytz.timezone('Europe/Rome'))

Questions?