Going international
Apostolis Bessas
July 03, 2012
Issues with internationalizing your Python application.
Transcript
Going International
Apostolis Bessas
[email protected]
@mpessas
July 3, 2012

Transifex

Unicode

Bits & Bytes
10010001100101110110011011001101111010110001000001010111110 111111100101101100110010001000010001010
Encodings

Character encoding
"A character encoding system consists of a code that pairs each character from a given repertoire with something else." (Wikipedia)

ASCII
§ 7-bit code
§ Characters for the English alphabet.

Unicode
Assign every possible character a unique code point.
§ A → U+0041
§ a → U+0061
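
The mapping from characters to code points can be inspected directly. The following is a minimal Python 3 sketch (the deck itself targets Python 2); `code_point` is a helper name made up for this illustration, and the Greek letter is an extra example that is not on the slide.

```python
def code_point(ch):
    """Return the Unicode code point of a single character in U+XXXX form."""
    # ord() gives the integer code point; format it as four hex digits.
    return "U+%04X" % ord(ch)


print(code_point("A"))  # U+0041
print(code_point("a"))  # U+0061
print(code_point("α"))  # U+03B1 (Greek small letter alpha)

# chr() is the inverse: from code point back to character.
print(chr(0x41))        # A
```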

UTF-8
Just another character encoding for Unicode.
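
To make the difference between a code point and its encoding concrete, here is a small Python 3 sketch that encodes three characters as UTF-8 and prints how many bytes each one takes. `utf8_bytes` is a hypothetical helper, and the sample characters are chosen only for illustration.

```python
def utf8_bytes(ch):
    """Encode a string as UTF-8 and return the resulting bytes."""
    return ch.encode("utf-8")


# ASCII characters take one byte; other characters take two to four.
for ch in ("A", "α", "€"):
    encoded = utf8_bytes(ch)
    print(repr(ch), len(encoded), encoded.hex())
```

ASCII text is therefore already valid UTF-8, which is one reason the encoding is so widely used.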

Python 2.x

str
    s = 'A string'
§ Encoded strings.
§ ASCII by default.

unicode
    # -*- coding: utf-8 -*-
    u = u'A string'
§ Strings stored in the internal representation.
§ Unicode literals

Conversion
    u.encode('UTF-8').decode('UTF-8')
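
The round trip on the slide can be sketched as follows. This is written for Python 3, where `str` plays the role of Python 2's `unicode` and `bytes` the role of Python 2's `str`; the sample text and variable names are illustrative only.

```python
text = "Γειά σου"            # Unicode text
data = text.encode("utf-8")  # encoded bytes

# Encoding and then decoding with the same codec is lossless.
assert isinstance(data, bytes)
assert data.decode("utf-8") == text

# Decoding with a codec that cannot represent the data raises an error.
error = None
try:
    data.decode("ascii")
except UnicodeDecodeError as exc:
    error = exc
    print("cannot decode as ASCII:", exc.reason)
```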

Best practices
§ Always use unicode strings.
§ Decode on input and encode on output.
§ Test against unicode strings.
    import codecs
    codecs.open(filename, encoding=encoding)
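
One way to picture "decode on input, encode on output" is a function that touches bytes only at its edges. The sketch below is Python 3, where the built-in `open` accepts `encoding` and takes the place of `codecs.open`; the `shout` function, the file names, and the sample text are all invented for this example.

```python
import os
import tempfile


def shout(in_path, out_path, encoding="utf-8"):
    """Read text, transform it as Unicode, and write it back encoded."""
    # Decode at the input boundary.
    with open(in_path, encoding=encoding) as src:
        text = src.read()
    # Work only with Unicode text in between.
    result = text.upper()
    # Encode at the output boundary.
    with open(out_path, "w", encoding=encoding) as dst:
        dst.write(result)
    return result


# Set up a throwaway input file containing UTF-8 encoded Greek text.
workdir = tempfile.mkdtemp()
in_path = os.path.join(workdir, "in.txt")
out_path = os.path.join(workdir, "out.txt")
with open(in_path, "wb") as f:
    f.write("καλημέρα".encode("utf-8"))

result = shout(in_path, out_path)
print(result)
```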

Python 3
§ Strings and bytes (Unicode literals are back in 3.3)
§ No need to use the codecs module any more.
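
A short Python 3 sketch of the two points above: the `u''` prefix is accepted again (since 3.3) and means the same as a plain literal, and `str` and `bytes` are distinct types that are never mixed implicitly. The variable names are illustrative.

```python
text = "naïve"      # str is always Unicode in Python 3
legacy = u"naïve"   # the u'' prefix is accepted again since Python 3.3
data = text.encode("utf-8")

assert text == legacy
assert type(data) is bytes

# Concatenating str and bytes is an error rather than an implicit conversion.
try:
    text + data
except TypeError:
    mixed = False
else:
    mixed = True

print("implicit mixing allowed:", mixed)
```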
i18n & l10n

Formats
§ Gettext (PO files)
§ TS files (Qt)
§ YAML

Choice?
Use a real format:
§ Plurals support
§ Context
§ Comments
§ Suggestions

Gettext
§ Mark translation strings.
§ Extract them (PO files).
§ Translate them.
§ Compile them (MO files).
§ Load in the application.

Source code
https://github.com/mpessas/going_international/

Initialization
    import gettext

    # Set up message catalog access
    t = gettext.translation(
        'myapplication', 'locale', fallback=True
    )
    # Python 2 only: ugettext returns unicode strings
    _ = t.ugettext
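
For readers on Python 3, a rough equivalent of the initialization is sketched below. It assumes the same `myapplication` domain and `locale` directory as the slide; `ugettext` does not exist in Python 3, where `gettext` already returns `str`. With `fallback=True`, a missing catalog yields a `NullTranslations` object, so the untranslated message is returned instead of an error being raised.

```python
import gettext

# Domain and locale directory as on the slide.
t = gettext.translation("myapplication", "locale", fallback=True)
_ = t.gettext  # Python 3: gettext() returns str


def greet_user(user):
    # Translate the template first, then interpolate the user name.
    return _("Hello, %s.") % user


print(greet_user("John"))
```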

Usage
    def greet_user(user):
        print _(u'Hello, %s.') % user

Plurals
    children = {'John': 1, 'Mary': 3}

    def report_children(user):
        print t.ungettext(
            u'You have %s child',
            u'You have %s children',
            children[user]
        ) % children[user]
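
The same plural lookup in Python 3 might look like the sketch below, where `ngettext` replaces the Python 2-only `ungettext`. It reuses the slide's domain, directory, and data, and returns the string instead of printing it so that the result can be inspected.

```python
import gettext

t = gettext.translation("myapplication", "locale", fallback=True)

children = {"John": 1, "Mary": 3}


def report_children(user):
    count = children[user]
    # ngettext selects the form for `count`: the matching msgstr[i] from the
    # catalog, or one of the two msgids when no translation is available.
    template = t.ngettext("You have %s child", "You have %s children", count)
    return template % count


for name in ("John", "Mary"):
    print(report_children(name))
```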

Extract
    xgettext -d myapplication -o app.pot l10n.py
    vim app.pot

POT file headers
    #, fuzzy
    msgid ""
    msgstr ""
    "Project-Id-Version: 0.1\n"
    "Report-Msgid-Bugs-To: http://github.com/mpessas/going_international/issues\n"
    "POT-Creation-Date: 2012-06-30 09:45+0300\n"
    "PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
    "Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
    "Language-Team: LANGUAGE <[email protected]>\n"
    "Language: \n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=UTF-8\n"
    "Content-Transfer-Encoding: 8bit\n"
    "Plural-Forms: nplurals=INTEGER; plural=EXPRESSION;\n"

POT file content
    #: l10n.py:10
    #, python-format
    msgid "Hello, %s."
    msgstr ""

    #: l10n.py:17
    #, python-format
    msgid "You have %s child"
    msgid_plural "You have %s children"
    msgstr[0] ""
    msgstr[1] ""

PO files
    mkdir -p locale/en/LC_MESSAGES/
    msginit -i app.pot -o locale/en/LC_MESSAGES/en.po -l en
    msgfmt locale/en/LC_MESSAGES/en.po -o \
        locale/en/LC_MESSAGES/myapplication.mo

    mkdir -p locale/el/LC_MESSAGES/
    msginit -i app.pot -o locale/el/LC_MESSAGES/el.po -l el
    vim locale/el/LC_MESSAGES/el.po
    msgfmt locale/el/LC_MESSAGES/el.po -o \
        locale/el/LC_MESSAGES/myapplication.mo

    mkdir -p locale/it/LC_MESSAGES/
    msginit -i app.pot -o locale/it/LC_MESSAGES/it.po -l it
    vim locale/it/LC_MESSAGES/it.po
    msgfmt locale/it/LC_MESSAGES/it.po -o \
        locale/it/LC_MESSAGES/myapplication.mo

PO header
    msgid ""
    msgstr ""
    "Project-Id-Version: 0.1\n"
    "Report-Msgid-Bugs-To: http://github.com/mpessas/going_international/issues\n"
    "POT-Creation-Date: 2012-06-30 09:45+0300\n"
    "PO-Revision-Date: 2012-06-30 09:51+0300\n"
    "Last-Translator: <[email protected]>\n"
    "Language-Team: Italian\n"
    "Language: it\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=UTF-8\n"
    "Content-Transfer-Encoding: 8bit\n"
    "Plural-Forms: nplurals=2; plural=(n != 1);\n"

PO content
    #: l10n.py:10
    #, python-format
    msgid "Hello, %s."
    msgstr "Ciao, %s."

    #: l10n.py:17
    #, python-format
    msgid "You have %s child"
    msgid_plural "You have %s children"
    msgstr[0] ""
    msgstr[1] ""

Execution
    bash> LANG=it python2 l10n.py
    Ciao, John.
    You have 1 child
    Ciao, Mary.
    You have 3 children

Plural equation for Arabic
    n == 0 ? 0 :
    n == 1 ? 1 :
    n == 2 ? 2 :
    n % 100 >= 3 && n % 100 <= 10 ? 3 :
    n % 100 >= 11 && n % 100 <= 99 ? 4 :
    5
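
To see what such a rule computes, the expression can be transcribed into Python. The sketch below renders the Arabic expression from the slide next to the two-form `plural=(n != 1)` rule from the Italian PO header; the function names are made up for this illustration, and in practice gettext evaluates the `Plural-Forms` header itself.

```python
def plural_two_forms(n):
    """Plural-Forms: nplurals=2; plural=(n != 1)"""
    return int(n != 1)


def plural_arabic(n):
    """Index (0-5) of the msgstr[] form selected by the Arabic rule."""
    if n == 0:
        return 0
    if n == 1:
        return 1
    if n == 2:
        return 2
    if 3 <= n % 100 <= 10:
        return 3
    if 11 <= n % 100 <= 99:
        return 4
    return 5


samples = (0, 1, 2, 3, 10, 11, 99, 100, 102)
print([plural_arabic(n) for n in samples])     # [0, 1, 2, 3, 3, 4, 4, 5, 5]
print([plural_two_forms(n) for n in samples])  # [1, 0, 1, 1, 1, 1, 1, 1, 1]
```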
Timezone handling

The mess with timezones
§ Daylight Saving Time (DST)
§ Past changes

UTC
§ Coordinated Universal Time
§ All timezones are based on that.
Internally, only use times based on UTC. Convert them to local time on output.
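
A minimal Python 3 sketch of this rule, using only the standard library: values are stored and computed in UTC and converted at the point of display. The fixed +03:00 offset is a stand-in for a real time zone (it matches the `+0300` in the POT header dates), and `render`, `created_at`, and `expires_at` are names invented for the example.

```python
from datetime import datetime, timedelta, timezone

# Store and compute in UTC.
created_at = datetime(2012, 7, 3, 9, 30, tzinfo=timezone.utc)
expires_at = created_at + timedelta(hours=24)

# A fixed offset used purely for illustration; a real application would
# use a zone from the time zone database instead.
display_tz = timezone(timedelta(hours=3))


def render(moment, tz):
    """Convert an aware datetime to tz and format it for display."""
    return moment.astimezone(tz).strftime("%Y-%m-%d %H:%M %z")


print(render(expires_at, display_tz))  # 2012-07-04 12:30 +0300
```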

datetime
§ Naive (does not have timezone information attached)
§ Aware (has timezone information attached)
The two do not work together.
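
What "do not work together" means in practice can be shown with a short Python 3 sketch: ordering comparisons between a naive and an aware `datetime` raise `TypeError`. `is_aware` is a hypothetical helper that follows the definition in the `datetime` documentation.

```python
from datetime import datetime, timezone

naive = datetime(2012, 7, 3, 12, 0)
aware = datetime(2012, 7, 3, 12, 0, tzinfo=timezone.utc)


def is_aware(dt):
    """True if dt carries usable time zone information."""
    return dt.tzinfo is not None and dt.tzinfo.utcoffset(dt) is not None


print(is_aware(naive), is_aware(aware))  # False True

# Ordering a naive datetime against an aware one is rejected outright.
comparison_error = None
try:
    naive < aware
except TypeError as exc:
    comparison_error = str(exc)
    print("TypeError:", comparison_error)
```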

pytz
§ Timezone database
§ Saner conversions

Usage
    import pytz
    from datetime import datetime

    # Current time as an aware datetime in UTC
    u = datetime.utcnow().replace(tzinfo=pytz.utc)
    # The same instant expressed in Rome's local time
    r = u.astimezone(pytz.timezone('Europe/Rome'))
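
The same conversion can be sketched without pytz on Python 3.9 or later, using the standard-library `zoneinfo` module. This swaps in a different library from the one on the slide and assumes a time zone database is available on the system (or through the `tzdata` package). The two fixed dates are chosen to show the DST effect mentioned earlier.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo  # Python 3.9+; needs a tz database

rome = ZoneInfo("Europe/Rome")

# The same UTC wall-clock time on a summer day and a winter day.
summer = datetime(2012, 7, 3, 12, 0, tzinfo=timezone.utc).astimezone(rome)
winter = datetime(2012, 1, 3, 12, 0, tzinfo=timezone.utc).astimezone(rome)

print(summer.isoformat())  # 2012-07-03T14:00:00+02:00
print(winter.isoformat())  # 2012-01-03T13:00:00+01:00
```

The offset differs between the two dates because Rome observes DST in July, which is the kind of detail a time zone database handles.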
Questions?