Slide 1

Slide 1 text

Internationalization and Localization Done Right Ruchi Varshney PYCON 2013

Slide 2

Slide 2 text

Overview
§ Internationalization
§ Localization
§ Testing and Maintenance
§ Tips and Gotchas

Slide 3

Slide 3 text

Internationalization vs Localization
§ Internationalization (I18N) is the process of preparing your app to handle multiple languages
§ Localization (L10N) is the process of adding the right resources to handle a particular language in your app

Slide 4

Slide 4 text

Why Do It?
§ Support an international audience
§ Easy to make your app ready for localization from day one
§ Drive some good code practices

Slide 5

Slide 5 text

Internationalization (I18N) Python Source, Templates and JS

Slide 6

Slide 6 text

Python I18N with gettext
§ gettext provides the baseline for all internationalization in Python
§ Wrapper around the GNU gettext catalog API
§ Requires the GNU gettext package on all your server machines
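
A minimal sketch of the standard-library gettext API this slide refers to. The domain name `myapp` and the `locale` directory are hypothetical; with `fallback=True`, a missing catalog yields the untranslated message instead of raising an error.

```python
import gettext

# "myapp" and "locale" are hypothetical names used only for illustration.
translations = gettext.translation(
    "myapp", localedir="locale", languages=["de"], fallback=True
)
_ = translations.gettext

# Without a compiled locale/de/LC_MESSAGES/myapp.mo file,
# the msgid itself is returned.
greeting = _("Hello")
print(greeting)
```

Once a compiled German catalog exists at that path, the same call would return the translated string without any change to the calling code.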

Slide 7

Slide 7 text

Python I18N with gettext
• Django provides a simple wrapper around Python gettext functions
    from django.utils.translation import ugettext, ungettext
    # Simple
    msg = ugettext("Today is %(month)s %(day)s.") % {'month': m, 'day': d}
    # Plural
    msg = ungettext("%(num)d apple", "%(num)d apples", count) % {'num': count}
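
The same calling convention can be tried without Django. This is a sketch using the standard library's `gettext.NullTranslations`, which returns msgids unchanged and applies the English singular/plural rule; the helper names `apples_message` and `today_message` are made up for illustration. On current Django versions the equivalent functions are named `gettext` and `ngettext`.

```python
import gettext

# NullTranslations returns the msgid unchanged, which is enough to
# show named placeholders and plural selection.
catalog = gettext.NullTranslations()


def apples_message(count):
    # ngettext picks the singular or plural msgid based on count;
    # the named placeholder is filled in afterwards.
    template = catalog.ngettext("%(num)d apple", "%(num)d apples", count)
    return template % {"num": count}


def today_message(month, day):
    # Named placeholders let translators reorder month and day.
    template = catalog.gettext("Today is %(month)s %(day)s.")
    return template % {"month": month, "day": day}


print(apples_message(1))
print(apples_message(3))
print(today_message("February", 17))
```

Interpolating after the lookup matters: the catalog is keyed on the template string, not on the formatted result.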

Slide 8

Slide 8 text

Django Template I18N
• Django templates provide trans and blocktrans tags
    {% trans "My Fruit Store" %}
    {% blocktrans %}
    Click here for more info
    {% endblocktrans %}
    {% blocktrans count counter=apple|length %}
    Check out my apple
    {% plural %}
    Check out my {{ counter }} apples
    {% endblocktrans %}

Slide 9

Slide 9 text

Handling Dates and Numbers
• Babel is a package that provides standard date and number formatting
    $ pip install Babel
    from babel import dates, numbers
    print dates.format_datetime(date, format='full', locale='de', tzinfo=tz)
    >> Sonntag, 17. Februar 2013 06:30:00 Vereinigte Staaten (Los Angeles)
    print numbers.format_decimal(1.234, locale='de')
    >> 1,234

Slide 10

Slide 10 text

JavaScript I18N
§ JavaScript does not have access to gettext functions by default
§ Django provides a view that returns a JS library with gettext, ngettext and interpolate functions
    urlpatterns = patterns('',
        (r'^jsi18n/$', 'django.views.i18n.javascript_catalog')
    )
    // Named interpolation
    var msg = interpolate(gettext("Welcome, %(user)s!"), {user: name}, true);

Slide 11

Slide 11 text

Locale Detection
§ Django LocaleMiddleware determines the locale to activate, checking these sources in order:
§ request.session['django_language']
§ django_language cookie
§ Accept-Language HTTP header set by the browser based on language preferences
§ settings.LANGUAGE_CODE = 'en-us'
§ If you save user language preferences in your database, override the request session key in your own middleware class:
    request.session['django_language'] = request.user.lang_pref
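
The lookup order above can be sketched in plain Python. This is a simplified illustration, not Django's implementation: `detect_language` and `parse_accept_language` are hypothetical helpers, and details such as regional fallbacks (for example `de-at` to `de`) are left out.

```python
def parse_accept_language(header):
    """Return language tags from an Accept-Language header, best first."""
    ranked = []
    for position, part in enumerate(header.split(",")):
        tag, _, params = part.strip().partition(";")
        tag = tag.strip().lower()
        if not tag:
            continue
        quality = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        # Sort by descending quality, then by position in the header.
        ranked.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(ranked)]


def detect_language(session, cookies, accept_language, supported,
                    default="en-us"):
    """Try session, cookie, Accept-Language, then the default."""
    candidates = [
        session.get("django_language"),
        cookies.get("django_language"),
        *parse_accept_language(accept_language or ""),
    ]
    for candidate in candidates:
        if candidate and candidate.lower() in supported:
            return candidate.lower()
    return default


supported = {"en-us", "de", "fr"}
print(detect_language({}, {"django_language": "fr"}, "de", supported))
```

Because the session is consulted first, writing the stored user preference into the session key overrides both the cookie and the browser header.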

Slide 12

Slide 12 text

Localization (L10N) with Babel

Slide 13

Slide 13 text

Localization Process
[Diagram] Source files (_("Hello")) → Babel extraction → PO file ("Hello") → translation service (files exchanged over FTP) → translated PO file ("Bonjour") → Babel catalog init/update → Django app message catalog → Babel compile → MO file → website ("Bonjour")

Slide 14

Slide 14 text

String Extraction with Babel
• Babel also handles string extraction from source code
    $ pybabel extract --mapping-file babel.cfg --output out.po .
    # babel.cfg
    [python: **.py]
    [django: **/templates/**.html]
    [extractors]
    python = babel.messages.extract:extract_python
    django = babeldjango.extract:extract_django

Slide 15

Slide 15 text

JavaScript String Extraction
• JS strings are extracted into a separate domain to limit the number of strings sent from the server
    $ pybabel extract --mapping-file babel_js.cfg --output out_js.po .
    # babel_js.cfg
    [javascript: **.js]
    extract_messages = $._, jQuery._
    [extractors]
    javascript = babel.messages.extract:extract_javascript

Slide 16

Slide 16 text

Message Catalog (PO) File Format
    # Translations for MYAPP.
    # Copyright (c) 2013 ORGANIZATION
    #: app/views.py:20
    #, python-format
    msgid "Welcome, %(username)s!"
    msgstr ""
    #: templates/app.html:35
    msgid "%(num)d apple."
    msgid_plural "%(num)d apples."
    msgstr[0] ""
    msgstr[1] ""

Slide 17

Slide 17 text

Translation
• Message catalog (PO) files are shipped to external translation services or translation APIs to get back translations
    # German Translations for MYAPP.
    # Copyright (c) 2013 ORGANIZATION
    #: app/views.py:20
    #, python-format
    msgid "Welcome, %(username)s!"
    msgstr "Willkommen, %(username)s!"
    ...
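
Returned translations can be checked automatically before they are integrated. The sketch below is one possible check, not part of Babel: the hypothetical helper `placeholder_mismatch` compares the named placeholders in a msgid with those in its msgstr, and handles only the simple `%(name)s` form.

```python
import re

# Matches simple named placeholders such as %(username)s or %(num)d.
PLACEHOLDER = re.compile(r"%\((\w+)\)[a-zA-Z]")


def placeholder_mismatch(msgid, msgstr):
    """Return placeholder names missing from, or unexpected in, msgstr."""
    expected = set(PLACEHOLDER.findall(msgid))
    found = set(PLACEHOLDER.findall(msgstr))
    return {"missing": expected - found, "unexpected": found - expected}


ok = placeholder_mismatch("Welcome, %(username)s!",
                          "Willkommen, %(username)s!")
bad = placeholder_mismatch("Welcome, %(username)s!",
                           "Willkommen, %(benutzername)s!")
print(ok)
print(bad)
```

A translated placeholder name like the one in `bad` would raise a `KeyError` at render time, so catching it at integration time is cheaper.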

Slide 18

Slide 18 text

Integrating Translations
§ Django app catalog is initialized or updated with the received translation file
    $ pybabel <init|update> --domain django --locale de --input-file <input.po>
    updating catalog /de/LC_MESSAGES/django.po based on input.po
    updating catalog /de/LC_MESSAGES/djangojs.po based on inputjs.po
§ Catalog is then compiled into efficient binary (.mo) files
    $ pybabel compile --domain django --locale de
    compiling catalog to /de/LC_MESSAGES/django.mo
    compiling catalog to /de/LC_MESSAGES/djangojs.mo
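
To make the compile step less opaque, here is a sketch of what a compiled catalog contains. It writes a minimal GNU .mo byte string by hand and loads it with the standard library. `build_mo` is a hypothetical helper, not Babel's compiler, and it ignores plural forms and the optional hash table.

```python
import gettext
import io
import struct


def build_mo(messages):
    """Serialize a {msgid: msgstr} dict of str into GNU .mo bytes."""
    # The empty msgid carries catalog metadata, including the charset.
    entries = {b"": b"Content-Type: text/plain; charset=UTF-8\n"}
    for msgid, msgstr in messages.items():
        entries[msgid.encode("utf-8")] = msgstr.encode("utf-8")

    keys = sorted(entries)
    count = len(keys)
    key_table = 7 * 4                      # header is seven 32-bit fields
    value_table = key_table + count * 8    # each table row: length, offset
    key_pos = value_table + count * 8
    key_blob = b"".join(key + b"\0" for key in keys)
    value_blob = b"".join(entries[key] + b"\0" for key in keys)
    value_pos = key_pos + len(key_blob)

    key_index, value_index = [], []
    for key in keys:
        value = entries[key]
        key_index += [len(key), key_pos]
        value_index += [len(value), value_pos]
        key_pos += len(key) + 1
        value_pos += len(value) + 1

    # magic, version, count, key table, value table, hash size, hash offset
    header = struct.pack("<Iiiiiii", 0x950412DE, 0, count,
                         key_table, value_table, 0, 0)
    tables = struct.pack("<%di" % (4 * count), *(key_index + value_index))
    return header + tables + key_blob + value_blob


mo_bytes = build_mo({"Welcome, %(username)s!": "Willkommen, %(username)s!"})
catalog = gettext.GNUTranslations(io.BytesIO(mo_bytes))
message = catalog.gettext("Welcome, %(username)s!") % {"username": "Ruchi"}
print(message)
```

The format is a pair of offset tables followed by the raw strings, so a lookup at runtime needs no parsing of the PO text.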

Slide 19

Slide 19 text

Testing and Maintenance

Slide 20

Slide 20 text

Test Translations
• Enable a test browser locale in Django settings and generate test translations with Potpie
    $ pip install potpie
    $ potpie --type <type> in.po out.po
    <type> = brackets, planguage, unicode, extend, mixed
    #: Potpie mixed mode translation
    #: templates/settings.html:18
    msgid "User Settings for %(username)s"
    msgstr "[Ŭşḗř Şḗŧŧīƞɠş ƒǿř %(username)s ẛLjϖDž 衋 ſNjΐϕ]"

Slide 21

Slide 21 text

Test Translations
• Find missing strings and push the limits of your UI by testing with extended-length Unicode strings
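
The idea behind such test translations can be sketched in a few lines. This is an illustration of the technique, not Potpie's implementation: the hypothetical `pseudo_translate` accents vowels, pads the string by roughly a third, wraps it in brackets, and leaves `%(name)s` placeholders untouched.

```python
import re

# Capturing group so that re.split keeps the placeholders in the result.
PLACEHOLDER = re.compile(r"(%\(\w+\)[a-zA-Z])")
ACCENTED = str.maketrans("AEIOUaeiou", "ÀÉÎÕÜàéîõü")


def pseudo_translate(msgid):
    """Accent the text, pad it, and wrap it in brackets."""
    pieces = PLACEHOLDER.split(msgid)
    # With a capturing group, placeholders land at the odd indexes.
    converted = [
        piece if index % 2 else piece.translate(ACCENTED)
        for index, piece in enumerate(pieces)
    ]
    padding = "~" * (len(msgid) // 3)
    return "[" + "".join(converted) + padding + "]"


result = pseudo_translate("User Settings for %(username)s")
print(result)
```

Any string that shows up in the UI without brackets was never marked for translation, and any layout that breaks with the padding is likely to break with a real translation too.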

Slide 22

Slide 22 text

Working with Translation Services
§ Integration with third-party services
§ File transfers, API integrations, PM intervention
§ Latency
§ Control feature roll-out by locale
§ Intermediate translations through Google APIs
§ Cost
§ Translation Memory
§ Scales better as you get translations for more locales

Slide 23

Slide 23 text

Tips and Gotchas

Slide 24

Slide 24 text

Comments for Translators
• Translators need context for strings in your application
    # TRANSLATOR: The date fruit, not calendar date
    str = ugettext("%(num)s dates") % {'num': date_count}
    {% comment %}TRANSLATOR: The date fruit, not calendar date{% endcomment %}
    $ pybabel extract --add-comments TRANSLATOR:
    #. TRANSLATOR: The date fruit, not calendar date
    #: fruits/app.py:20
    msgid "%(num)s dates"

Slide 25

Slide 25 text

Same But Different?
• What if we needed the same word but in different contexts?
    # Use pgettext/npgettext to add context
    fruit_str = pgettext("Date Fruit", "Date")
    calendar_str = pgettext("Calendar Date", "Date")
    # Message catalog has msgctxt to ensure unique mapping
    #: fruits/app.py:20
    msgctxt "Date Fruit"
    msgid "Date"
    msgstr ""
    ...
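
How msgctxt keeps the two entries apart can be shown with a toy lookup. This is a sketch, not Django's code: the `CATALOG` dict and its German strings are made up for illustration. The separator is the one GNU gettext uses when it stores a context-qualified msgid in a compiled catalog.

```python
# GNU gettext joins context and msgid with the EOT character (\x04).
CONTEXT_SEPARATOR = "\x04"

# Hypothetical German catalog used only for illustration.
CATALOG = {
    "Date Fruit" + CONTEXT_SEPARATOR + "Date": "Dattel",
    "Calendar Date" + CONTEXT_SEPARATOR + "Date": "Datum",
}


def pgettext(context, message):
    """Look up message within context, falling back to the msgid."""
    return CATALOG.get(context + CONTEXT_SEPARATOR + message, message)


fruit_str = pgettext("Date Fruit", "Date")
calendar_str = pgettext("Calendar Date", "Date")
print(fruit_str, calendar_str)
```

The same msgid maps to two different translations because the context is part of the lookup key.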

Slide 26

Slide 26 text

Custom Template Tags
• Babel does not parse translations that are in custom or extension template tags
    {% autoescape false %}
    {% trans %}"This text is safe, but Babel might miss me"{% endtrans %}
    {% endautoescape %}
    # Add comma-separated extensions option in babel.cfg
    [jinja2: **.html]
    extensions=jinja2.ext.autoescape,...

Slide 27

Slide 27 text

Lazy Translations
§ Locale is unknown at module load time when constants, model and form fields are initialized
§ ugettext_lazy returns lazy string references that can be later evaluated in a locale-aware context
    from django.utils.translation import ugettext_lazy

    class MyFruit(models.Model):
        name = models.CharField(help_text=ugettext_lazy('Name your fruit!'))
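
The mechanism can be sketched without Django. In the toy version below, `CATALOGS`, `activate`, `translate_lazy` and `LazyString` are hypothetical stand-ins (the German string is made up as well); the point is only that the lookup happens when the value is rendered, not when it is defined.

```python
# Hypothetical per-locale catalogs and "active locale" state.
CATALOGS = {"de": {"Name your fruit!": "Benenne deine Frucht!"}}
_active = {"locale": "en"}


def activate(locale):
    _active["locale"] = locale


def translate(message):
    return CATALOGS.get(_active["locale"], {}).get(message, message)


class LazyString:
    """Defers translation until the value is rendered as text."""

    def __init__(self, message):
        self._message = message

    def __str__(self):
        return translate(self._message)


def translate_lazy(message):
    return LazyString(message)


# Created once at import time, before any request locale is known.
HELP_TEXT = translate_lazy("Name your fruit!")

activate("de")
german = str(HELP_TEXT)
activate("en")
english = str(HELP_TEXT)
print(german, "/", english)
```

An eager call at module level would have frozen the string in whatever locale happened to be active at import time.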

Slide 28

Slide 28 text

Lazy Translations
§ Lazy string references are proxy objects that do not know how to convert themselves to a bytestring
    In [1]: ugettext_lazy("Hello")
    Out[1]: <django.utils.functional.__proxy__ object at 0x...>
§ Watch out for lazy string concatenation, bytestring interpolation, exception handlers and JSON encoding (needs LazyEncoder)
    In [1]: "Hello %s" % ugettext_lazy("World")
    Out[1]: 'Hello <django.utils.functional.__proxy__ object at 0x...>'
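
The JSON case can be reproduced with the standard library alone. This sketch assumes a stand-in `LazyString` class rather than Django's proxy, and a `LazyEncoder` modeled on the pattern the Django serialization docs describe: force lazy objects to text inside `default()`.

```python
import json


class LazyString:
    """Stand-in for a lazy translation proxy (not Django's class)."""

    def __init__(self, message):
        self._message = message

    def __str__(self):
        return self._message


class LazyEncoder(json.JSONEncoder):
    def default(self, obj):
        # Force lazy strings to real text; defer everything else.
        if isinstance(obj, LazyString):
            return str(obj)
        return super().default(obj)


payload = {"greeting": LazyString("Hello")}

try:
    json.dumps(payload)
    plain_encoder_failed = False
except TypeError:
    # The stock encoder does not know how to serialize the proxy.
    plain_encoder_failed = True

encoded = json.dumps(payload, cls=LazyEncoder)
print(plain_encoder_failed, encoded)
```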

Slide 29

Slide 29 text

Import Aliases
• Babel does not detect gettext import aliases since it simply parses through files one line at a time
    # Babel will miss this by default
    from django.utils.translation import ugettext_lazy as _lazy
    lazy_str = _lazy("Lazy")
    # Need to explicitly mention other aliases
    pybabel extract --keyword "_lazy"
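
Why the alias is missed becomes clearer with a toy extractor that, like keyword-based extraction in general, matches calls by function name and never resolves imports. This sketch uses the standard `ast` module and is not how Babel is implemented; `extract_messages` and its default keyword list are assumptions for illustration.

```python
import ast


def extract_messages(source, keywords=("_", "gettext", "ugettext")):
    """Collect string literals passed to calls whose name is in keywords."""
    messages = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in keywords
            and node.args
            and isinstance(node.args[0], ast.Constant)
            and isinstance(node.args[0].value, str)
        ):
            messages.append(node.args[0].value)
    return messages


# The source is only parsed, never executed.
SOURCE = '''
from django.utils.translation import ugettext_lazy as _lazy
lazy_str = _lazy("Lazy")
plain_str = ugettext("Plain")
'''

default_found = extract_messages(SOURCE)
with_alias = extract_messages(
    SOURCE, keywords=("_", "gettext", "ugettext", "_lazy")
)
print(default_found, sorted(with_alias))
```

Adding the alias to the keyword list plays the same role as passing `--keyword "_lazy"` on the command line.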

Slide 30

Slide 30 text

Smarter JavaScript I18N
• The JS catalog view runs on every request. Because browsers can cache JS files, the catalog can instead be pre-generated at deploy time and served statically.
    // i18n_de.js
    var catalog = new Array();
    catalog["Welcome, %(user)s!"] = "Willkommen, %(user)s!";
    ...
    function gettext(msgid) {...};
    function ngettext(singular, plural, count) {...};
    function interpolate(fmt, obj, named) {...};
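
A deploy-time generator for such a file could look roughly like this. It is a sketch under assumptions: `render_js_catalog` is a hypothetical helper, it emits only `gettext`, and the caller is expected to write the returned source to a per-locale file such as `i18n_de.js`.

```python
import json


def render_js_catalog(translations):
    """Return JavaScript source defining a catalog and a gettext function."""
    # json.dumps yields a valid JS object literal with proper escaping.
    catalog_literal = json.dumps(translations, sort_keys=True)
    return (
        "var catalog = " + catalog_literal + ";\n"
        "function gettext(msgid) {\n"
        "  var value = catalog[msgid];\n"
        "  return (typeof value === 'undefined') ? msgid : value;\n"
        "}\n"
    )


js_source = render_js_catalog(
    {"Welcome, %(user)s!": "Willkommen, %(user)s!"}
)
print(js_source)
```

Serving the generated file as a static asset lets it be cached and versioned like any other JS file.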

Slide 31

Slide 31 text

Database Strings
§ Avoid storing strings that need translation, build rich enum classes instead
    # Maps enum values, canonical names, display names
    class MyFruitEnum(CoolEnum):
        APPLE = _MyFruitValue(1, 'apple', ugettext_lazy('Apple'))
        BANANA = _MyFruitValue(2, 'banana', ugettext_lazy('Banana'))
§ If you really need to do so, use django-dbgettext to make DB strings available in the message catalog
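
One way to build such an enum with only the standard library is sketched below. `CoolEnum` and `_MyFruitValue` are not shown in the slides, so `FruitValue`, `MyFruit.from_db` and the `translate_lazy` stand-in are assumptions for illustration.

```python
from collections import namedtuple
from enum import Enum


def translate_lazy(message):
    # Stand-in for ugettext_lazy; returns the msgid so the sketch
    # runs without Django.
    return message


FruitValue = namedtuple("FruitValue", ["db_value", "canonical", "display"])


class MyFruit(Enum):
    APPLE = FruitValue(1, "apple", translate_lazy("Apple"))
    BANANA = FruitValue(2, "banana", translate_lazy("Banana"))

    @classmethod
    def from_db(cls, db_value):
        """Map the integer stored in the database back to a member."""
        for member in cls:
            if member.value.db_value == db_value:
                return member
        raise ValueError("unknown fruit id: %r" % (db_value,))


fruit = MyFruit.from_db(2)
print(fruit.name, fruit.value.display)
```

Only the integer is stored in the database; the display name stays in source code, where string extraction can find it.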

Slide 32

Slide 32 text

Takeaways
§ Babel and Potpie are powerful internationalization tools for Python apps
§ Support for JS internationalization works well for mobile web apps
§ Internationalize early and be ready for an international audience from day one!

Slide 33

Slide 33 text

Other Resources
§ Python gettext: http://docs.python.org/dev/library/gettext
§ Babel: http://babel.edgewall.org
§ Django i18n: https://docs.djangoproject.com/en/dev/topics/i18n

Slide 34

Slide 34 text

Other Resources
§ Potpie: http://pypi.python.org/pypi/potpie
§ Lazy JSON Encoding: https://docs.djangoproject.com/en/dev/topics/serialization/#id2
§ Jinja2 i18n: http://jinja.pocoo.org/docs/integration

Slide 35

Slide 35 text

Questions?
@rvarshney
github.com/rvarshney
Slides @ http://bit.ly/pyconi18n