Internationalization and Localization Done Right by Ruchi Varshney

PyCon 2013
March 18, 2013


Transcript

  1. Internationalization vs Localization
     § Internationalization (I18N) is the process of prepping your app to handle multiple languages
     § Localization (L10N) is the process of adding the right resources to handle a particular language in your app
  2. Why Do It?
     § Supports an international audience
     § It is easy to make your app ready for localization from day one
     § Drives good coding practices
  3. Python I18N with gettext
     § gettext provides the baseline for all internationalization in Python
     § Wrapper around the GNU gettext catalog API
     § Requires the GNU gettext package on all your server machines
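The baseline API can be tried with nothing but the standard library. The sketch below is an illustration, not part of the talk: it assumes Python 3 and uses `gettext.NullTranslations`, the fallback catalog that returns source strings unchanged and applies English plural rules when no compiled catalog is loaded. The `describe` helper is a hypothetical name.

```python
import gettext

# NullTranslations is the stdlib fallback catalog: with no .mo file loaded,
# it returns the source string and picks the plural form with an n == 1 test.
translations = gettext.NullTranslations()
_ = translations.gettext
ngettext = translations.ngettext


def describe(month, day, count):
    """Return a date message and a count message from the active catalog."""
    # Named placeholders let translators reorder the values freely.
    date_msg = _("Today is %(month)s %(day)s.") % {"month": month, "day": day}
    # ngettext chooses between the singular and plural msgid based on count.
    apple_msg = ngettext("%(num)d apple", "%(num)d apples", count) % {"num": count}
    return date_msg, apple_msg


print(describe("March", 18, 1))
print(describe("March", 18, 3))
```

Swapping `NullTranslations` for a catalog returned by `gettext.translation(...)` keeps the calling code unchanged, which is the point of marking strings from day one.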
  4. Python I18N with gettext
     • Django provides a simple wrapper around Python gettext functions
     • from django.utils.translation import ugettext, ungettext
     • # Simple
     • msg = ugettext("Today is %(month)s %(day)s.") % {'month': m, 'day': d}
     • # Plural
     • msg = ungettext("%(num)d apple", "%(num)d apples", count) % {'num': count}
  5. Django Template I18N
     • Django templates provide trans and blocktrans tags
     • {% trans "My Fruit Store" %}
     • {% blocktrans %}
     • Click <a href="{{ link }}">here</a> for more info
     • {% endblocktrans %}
     • {% blocktrans count counter=apple|length %}
     • Check out my apple
     • {% plural %}
     • Check out my {{ counter }} apples
     • {% endblocktrans %}
  6. Handling Dates and Numbers
     • Babel is a package that provides standard date and number formatting
     • $ pip install Babel
     • from babel import dates, numbers
     • print dates.format_datetime(date, format='full', locale='de', tzinfo=tz)
     • >> Sonntag, 17. Februar 2013 06:30:00 Vereinigte Staaten (Los Angeles)
     • print numbers.format_decimal(1.234, locale='de')
     • >> 1,234
  7. JavaScript I18N
     § JavaScript does not have access to gettext functions by default
     § Django provides a view that returns a JS library with gettext, ngettext and interpolate functions
     • urlpatterns = patterns('',
     •     (r'^jsi18n/$', 'django.views.i18n.javascript_catalog')
     • )
     • // Named interpolation
     • var msg = interpolate(gettext("Welcome, %(user)s!"), {user: name}, true);
  8. Locale Detection
     § Django LocaleMiddleware determines the locale to activate by checking, in order:
     § request.session['django_language']
     § django_language cookie
     § Accept-Language HTTP header set by the browser based on language preferences
     § settings.LANGUAGE_CODE = 'en-us'
     § If you save user language preferences in your database, override the request session key in your own middleware class
     • request.session['django_language'] = request.user.lang_pref
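The lookup order on this slide can be modeled in a few lines. This is a simplified sketch of the idea, not Django's actual implementation: `detect_locale`, `parse_accept_language` and the `supported` set are hypothetical names, plain dicts stand in for the session and cookies, and language-prefix fallback (for example `en` to `en-us`) is deliberately left out.

```python
def parse_accept_language(header):
    """Return the language tags of an Accept-Language header, best first."""
    ranked = []
    for position, part in enumerate(header.split(",")):
        tag, _, params = part.strip().partition(";")
        tag = tag.strip().lower()
        if not tag or tag == "*":
            continue
        quality = 1.0  # a tag without an explicit q-value counts as q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        # Sort by descending quality, then by original position in the header.
        ranked.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(ranked)]


def detect_locale(session, cookies, accept_language, supported, default="en-us"):
    """Pick a locale: session key, then cookie, then header, then the default."""
    candidates = [session.get("django_language"), cookies.get("django_language")]
    candidates.extend(parse_accept_language(accept_language or ""))
    for candidate in candidates:
        if candidate and candidate.lower() in supported:
            return candidate.lower()
    return default


SUPPORTED = {"en-us", "de", "fr"}
print(detect_locale({}, {}, "fr;q=0.5, de;q=0.9, en", SUPPORTED))
print(detect_locale({"django_language": "fr"}, {"django_language": "de"}, "de", SUPPORTED))
```

Because the session key wins, writing the stored user preference into the session (as the slide suggests) overrides both the cookie and the browser header.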
  9. Localization Process
     [Flow diagram: source files (_("Hello")) → Babel extraction → PO file ("Hello") → FTP → translation service → translation → PO file ("Bonjour") → Babel catalog init/update → Django app message catalog → Babel compile → MO file → website ("Bonjour")]
  10. String Extraction with Babel
      • Babel also handles string extraction from source code
      • $ pybabel extract --mapping-file babel.cfg --output out.po <APP_DIR>
      • # babel.cfg
      • [python: **.py]
      • [django: **/templates/**.html]
      • [extractors]
      • python = babel.messages.extract:extract_python
      • django = babeldjango.extract:extract_django
  11. JavaScript String Extraction
      • JS strings are extracted in a different domain to optimize the number of strings sent from the server
      • $ pybabel extract --mapping-file babel_js.cfg --output out_js.po .
      • # babel_js.cfg
      • [javascript: **.js]
      • extract_messages = $._, jQuery._
      • [extractors]
      • javascript = babel.messages.extract:extract_javascript
  12. Message Catalog (PO) File Format
      • # Translations for MYAPP.
      • # Copyright (c) 2013 ORGANIZATION
      • #: app/views.py:20
      • #, python-format
      • msgid "Welcome, %(username)s!"
      • msgstr ""
      • #: templates/app.html:35
      • msgid "%(num)d apple."
      • msgid_plural "%(num)d apples."
      • msgstr[0] ""
      • msgstr[1] ""
  13. Translation
      • Message catalog (PO) files are shipped to external translation services or translation APIs to get back translations
      • # German Translations for MYAPP.
      • # Copyright (c) 2013 ORGANIZATION
      • #: app/views.py:20
      • #, python-format
      • msgid "Welcome, %(username)s!"
      • msgstr "Willkommen, %(username)s!"
      • ...
  14. Integrating Translations
      § Django app catalog is initialized or updated with the received translation file
      • $ pybabel <init/update> --domain django<js> --locale de --input-file <po>
      • updating catalog <LOCALE_PATH>/de/LC_MESSAGES/django.po based on input.po
      • updating catalog <LOCALE_PATH>/de/LC_MESSAGES/djangojs.po based on inputjs.po
      § Catalog is then compiled into efficient binary machine object (.mo) files
      • $ pybabel compile --domain django<js> --locale de
      • compiling catalog to <LOCALE_PATH>/de/LC_MESSAGES/django.mo
      • compiling catalog to <LOCALE_PATH>/de/LC_MESSAGES/djangojs.mo
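To make the compile step less opaque, here is a rough sketch of what a compiled catalog contains. It is not how `pybabel compile` is implemented; it is a minimal writer for the little-endian GNU `.mo` layout, assuming Python 3, with a hypothetical `build_mo` helper and made-up German strings. The result is loaded back through the standard library's `gettext.GNUTranslations` to show that the binary form is what the runtime actually reads.

```python
import gettext
import io
import struct


def build_mo(messages):
    """Serialize {msgid: msgstr} (both str) into little-endian GNU .mo bytes."""
    # Sort by msgid; the empty msgid (catalog header) therefore comes first.
    items = sorted((k.encode("utf-8"), v.encode("utf-8")) for k, v in messages.items())
    count = len(items)
    header_size = 7 * 4                    # seven 32-bit header fields
    ids_start = header_size + count * 16   # two index tables, 8 bytes per entry each
    ids_blob = b"".join(key + b"\0" for key, _ in items)
    strs_blob = b"".join(value + b"\0" for _, value in items)
    strs_start = ids_start + len(ids_blob)

    id_index, str_index = [], []
    id_offset, str_offset = ids_start, strs_start
    for key, value in items:
        id_index += [len(key), id_offset]      # (length, file offset) per msgid
        str_index += [len(value), str_offset]  # (length, file offset) per msgstr
        id_offset += len(key) + 1              # +1 skips the NUL terminator
        str_offset += len(value) + 1

    header = struct.pack(
        "<7I",
        0x950412DE,               # magic number
        0,                        # format version
        count,                    # number of entries
        header_size,              # offset of the msgid index table
        header_size + count * 8,  # offset of the msgstr index table
        0, 0,                     # hash table size and offset (unused here)
    )
    index = struct.pack("<%dI" % (4 * count), *(id_index + str_index))
    return header + index + ids_blob + strs_blob


# Hypothetical German catalog. Plural entries join their forms with NUL bytes.
catalog_de = {
    "": "Content-Type: text/plain; charset=UTF-8\n"
        "Plural-Forms: nplurals=2; plural=(n != 1);\n",
    "Welcome, %(username)s!": "Willkommen, %(username)s!",
    "%(num)d apple\0%(num)d apples": "%(num)d Apfel\0%(num)d Äpfel",
}

translations = gettext.GNUTranslations(io.BytesIO(build_mo(catalog_de)))
print(translations.gettext("Welcome, %(username)s!") % {"username": "Ada"})
print(translations.ngettext("%(num)d apple", "%(num)d apples", 3) % {"num": 3})
```

In practice the `.mo` files are always produced by tooling such as `pybabel compile`; the sketch only shows why a PO file has to be compiled before the app can serve translations.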
  15. Test Translations
      • Enable a test browser locale in Django settings and generate test translations with Potpie
      • $ pip install potpie
      • $ potpie --type <type> in.po out.po
      • <type> = brackets, planguage, unicode, extend, mixed
      • #: Potpie mixed mode translation
      • #: templates/settings.html:18
      • msgid "User Settings for %(username)s"
      • msgstr "[Ŭşḗř Şḗŧŧīƞɠş ƒǿř %(username)s ẛLjϖDž 衋 ſNjΐϕ]"
  16. Test Translations
      • Find missing strings and push the limits of your UI by testing with extended-length Unicode strings
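The idea behind test translations can be sketched without any tooling. The function below is not Potpie's algorithm; it is a small illustration, with hypothetical names (`pseudo_localize`, `ACCENTED`, `PLACEHOLDER`), of the three properties a pseudo-translation usually has: visibly altered letters, untouched format placeholders, and extra length wrapped in brackets so truncation is easy to spot.

```python
import re

# Map plain vowels to accented look-alikes so translated strings stand out.
ACCENTED = str.maketrans("aeiouAEIOU", "àéîõüÀÉÎÕÜ")

# Python format placeholders such as %(username)s, %s or %d must be preserved.
PLACEHOLDER = re.compile(r"(%\([^)]+\)[sd]|%[sd])")


def pseudo_localize(msgid, expansion_percent=30):
    """Return a bracketed, accented, padded variant of msgid for UI testing."""
    parts = PLACEHOLDER.split(msgid)
    # With one capturing group, re.split alternates text (even index)
    # and placeholders (odd index); only the text is accented.
    converted = [
        part if index % 2 else part.translate(ACCENTED)
        for index, part in enumerate(parts)
    ]
    # Pad to simulate languages that need more room than English.
    padding = "~" * max(1, len(msgid) * expansion_percent // 100)
    return "[" + "".join(converted) + " " + padding + "]"


result = pseudo_localize("User Settings for %(username)s")
print(result % {"username": "ada"})
```

Any string that still shows up in plain English under the test locale was never marked for translation, and any string missing its closing bracket is being truncated by the layout.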
  17. Working with Translation Services
      § Integration with third-party services
      § File transfers, API integrations, PM intervention
      § Latency
      § Control feature roll-out by locale
      § Intermediate translations through Google APIs
      § Cost
      § Translation Memory
      § Scales better as you get translations for more locales
  18. Comments for Translators
      • Translators need context for strings in your application
      • # TRANSLATOR: The date fruit, not calendar date
      • str = ugettext("%(num)s dates") % {'num': date_count}
      • {% comment %}TRANSLATOR: The date fruit, not calendar date{% endcomment %}
      • $ pybabel extract --add-comments TRANSLATOR: <options> <dir>
      • #. TRANSLATOR: The date fruit, not calendar date
      • #: fruits/app.py:20
      • msgid "%(num)s dates"
  19. Same But Different?
      • What if we needed the same word but in different contexts?
      • # Use pgettext/npgettext to add context
      • fruit_str = pgettext("Date Fruit", "Date")
      • calendar_str = pgettext("Calendar Date", "Date")
      • # Message catalog has msgctxt to ensure unique mapping
      • #: fruits/app.py:20
      • msgctxt "Date Fruit"
      • msgid "Date"
      • msgstr ""
      • ...
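How a context keeps two identical msgids apart can be shown with a plain dictionary. This is a sketch rather than Django's or gettext's code: the `pgettext` function and `CATALOG_DE` are hypothetical, and the German strings are made up for illustration. The one detail taken from the GNU catalog format is that context and msgid are joined with the EOT character (`\x04`) to form the lookup key.

```python
# GNU gettext catalogs store a contextual entry under "msgctxt\x04msgid".
CONTEXT_SEPARATOR = "\x04"

# Hypothetical German catalog: the same msgid maps to two different words.
CATALOG_DE = {
    "Date Fruit" + CONTEXT_SEPARATOR + "Date": "Dattel",
    "Calendar Date" + CONTEXT_SEPARATOR + "Date": "Datum",
}


def pgettext(context, msgid, catalog=CATALOG_DE):
    """Look up msgid within context, falling back to the source string."""
    return catalog.get(context + CONTEXT_SEPARATOR + msgid, msgid)


print(pgettext("Date Fruit", "Date"))
print(pgettext("Calendar Date", "Date"))
```

Without the context both calls would share one catalog entry, and the translator would have to pick a single word for both meanings.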
  20. Custom Template Tags
      • Babel does not parse translations that are in custom or extension template tags
      • {% autoescape false %}
      • {% trans %}"This text is safe, but Babel might miss me"{% endtrans %}
      • {% endautoescape %}
      • # Add comma-separated extensions option in babel.cfg
      • [jinja2: **.html]
      • extensions=jinja2.ext.autoescape,...
  21. Lazy Translations
      § Locale is unknown at module load time when constants, model and form fields are initialized
      § ugettext_lazy returns lazy string references that can be later evaluated in a locale-aware context
      • from django.db import models
      • from django.utils.translation import ugettext_lazy
      • class MyFruit(models.Model):
      •     name = models.CharField(help_text=ugettext_lazy('Name your fruit!'))
  22. Lazy Translations
      § Lazy string references are proxy objects that do not know how to convert themselves to a bytestring
      • In [1]: ugettext_lazy("Hello")
      • Out[1]: <django.utils.functional.__proxy__ at 0x6c8ec90>
      § Watch out for lazy string concatenation, bytestring interpolation, exception handlers and JSON encoding (needs LazyEncoder)
      • In [1]: "Hello %s" % ugettext_lazy("World")
      • Out[1]: 'Hello <django.utils.functional.__proxy__ object at 0x6d7c250>'
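The proxy behavior and the JSON pitfall can be reproduced with a toy model. Everything here is an assumption made for illustration: `LazyString`, `activate`, `translate` and the tiny `CATALOGS` dict are hypothetical stand-ins, not Django's classes, and the `LazyEncoder` below only follows the general pattern of overriding `json.JSONEncoder.default`. The sketch assumes Python 3, where `%s` interpolation calls `__str__`, so it does not reproduce the Python 2 output shown on the slide.

```python
import json

# Hypothetical catalogs and a module-level "active locale" switch.
CATALOGS = {"en": {}, "de": {"Hello": "Hallo"}}
_active = {"locale": "en"}


def activate(locale):
    """Select the locale used for all later lazy evaluations."""
    _active["locale"] = locale


def translate(msgid):
    """Eager lookup in the currently active catalog."""
    return CATALOGS[_active["locale"]].get(msgid, msgid)


class LazyString:
    """Stores a msgid now and translates it only when converted to str."""

    def __init__(self, msgid):
        self.msgid = msgid

    def __str__(self):
        return translate(self.msgid)


class LazyEncoder(json.JSONEncoder):
    """Teach the JSON encoder to force lazy strings into real strings."""

    def default(self, obj):
        if isinstance(obj, LazyString):
            return str(obj)
        return super().default(obj)


# Created at import time, before any request has chosen a locale.
GREETING = LazyString("Hello")

activate("de")
print(str(GREETING))
print(json.dumps({"greeting": GREETING}, cls=LazyEncoder))
```

Calling `json.dumps` on the same payload without `cls=LazyEncoder` raises `TypeError`, and so does `"Hi " + GREETING`, which mirrors the class of surprises the slide warns about.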
  23. Import Aliases
      • Babel does not detect gettext import aliases since it simply parses through files one line at a time
      • # Babel will miss this by default
      • from django.utils.translation import ugettext_lazy as _lazy
      • lazy_str = _lazy("Lazy")
      • # Need to explicitly mention other aliases
      • $ pybabel extract --keyword "_lazy" <options> <dir>
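Why an alias goes unnoticed is easier to see with a toy extractor. The sketch below is not Babel's extractor: it swaps in Python's `ast` module for simplicity and matches call names against a fixed keyword list, which is the part that behaves like the `--keyword` option. `extract_msgids`, `DEFAULT_KEYWORDS` and `SOURCE` are hypothetical names, and Python 3.8 or later is assumed.

```python
import ast

# Function names treated as translation markers (an illustrative subset).
DEFAULT_KEYWORDS = ("_", "gettext", "ugettext", "ugettext_lazy")


def extract_msgids(source, keywords=DEFAULT_KEYWORDS):
    """Return sorted (line, msgid) pairs for calls to any keyword function."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Handle both bare names (_lazy(...)) and attributes (t.gettext(...)).
        name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
        if name in keywords and node.args:
            first = node.args[0]
            # Only literal strings can be extracted into a catalog.
            if isinstance(first, ast.Constant) and isinstance(first.value, str):
                found.append((node.lineno, first.value))
    return sorted(found)


SOURCE = '''
from django.utils.translation import ugettext_lazy
from django.utils.translation import ugettext_lazy as _lazy
plain_str = ugettext_lazy("Plain")
lazy_str = _lazy("Lazy")
'''

print(extract_msgids(SOURCE))
print(extract_msgids(SOURCE, DEFAULT_KEYWORDS + ("_lazy",)))
```

The extractor matches on the name at the call site and never follows the import, so the alias has to be listed explicitly, just like passing `--keyword "_lazy"` to `pybabel extract`.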
  24. Smarter JavaScript I18N
      • The JS catalog view runs on every request. Browsers can cache JS files, so the catalog can be pre-generated at deploy time and served statically instead.
      • // i18n_de.js
      • var catalog = new Array();
      • catalog["Welcome, %(user)s!"] = "Willkommen, %(user)s!";
      • ...
      • function gettext(msgid) {...};
      • function ngettext(singular, plural, count) {...};
      • function interpolate(fmt, obj, named) {...};
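A deploy-time generator for such a file could look roughly like this. It is a sketch under stated assumptions, not the speaker's build script: `render_js_catalog` and `JS_TEMPLATE` are hypothetical, the translations arrive as a plain dict, and only a minimal `gettext` function is emitted (the real file would also carry `ngettext` and `interpolate`).

```python
import json

# Template for the generated file; only the two named fields are substituted.
JS_TEMPLATE = """// i18n_%(locale)s.js (generated at deploy time)
var catalog = %(catalog)s;
function gettext(msgid) {
  var value = catalog[msgid];
  return (typeof value === 'undefined') ? msgid : value;
}
"""


def render_js_catalog(locale, translations):
    """Render a static JavaScript catalog for one locale as a string."""
    # json.dumps yields a valid JS object literal with correct escaping.
    catalog_json = json.dumps(translations, ensure_ascii=False, sort_keys=True, indent=2)
    return JS_TEMPLATE % {"locale": locale, "catalog": catalog_json}


js_source = render_js_catalog("de", {"Welcome, %(user)s!": "Willkommen, %(user)s!"})
print(js_source)
```

Writing the returned string to a per-locale file in the static directory during deployment lets the browser cache it like any other asset.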
  25. Database Strings
      § Avoid storing strings that need translation, build rich enum classes instead
      • # Maps enum values, canonical names, display names
      • class MyFruitEnum(CoolEnum):
      •     APPLE = _MyFruitValue(1, 'apple', ugettext_lazy('Apple'))
      •     BANANA = _MyFruitValue(2, 'banana', ugettext_lazy('Banana'))
      § If you really need to do so, use django-dbgettext to make DB strings available in the message catalog
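The slide does not show `CoolEnum` or `_MyFruitValue`, so the following is only one possible shape for a rich enum, built on the standard library's `enum` module. The class `MyFruit`, the `translate` helper and the tiny `CATALOGS` dict are hypothetical; in a Django app the display name would come from `ugettext_lazy` rather than a hand-rolled lookup.

```python
from enum import Enum

# Hypothetical catalog standing in for the app's real message catalog.
CATALOGS = {"de": {"Apple": "Apfel", "Banana": "Banane"}}


def translate(msgid, locale):
    """Look up msgid for locale, falling back to the source string."""
    return CATALOGS.get(locale, {}).get(msgid, msgid)


class MyFruit(Enum):
    """Each member maps a DB value, a canonical name and a display msgid."""

    APPLE = (1, "apple", "Apple")
    BANANA = (2, "banana", "Banana")

    def __init__(self, db_value, canonical_name, display_msgid):
        self.db_value = db_value              # the integer stored in the database
        self.canonical_name = canonical_name  # stable identifier for code and APIs
        self.display_msgid = display_msgid    # source string sent to translators

    def display_name(self, locale):
        """Translate the display name at render time, not at storage time."""
        return translate(self.display_msgid, locale)

    @classmethod
    def from_db(cls, db_value):
        """Return the member whose stored value matches db_value."""
        for member in cls:
            if member.db_value == db_value:
                return member
        raise ValueError("unknown fruit value: %r" % (db_value,))


print(MyFruit.from_db(2).display_name("de"))
```

Only the integer reaches the database, so adding a locale never requires a data migration; the display strings live in source code where the extractor can find them.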
  26. Takeaways
      § Babel and Potpie are powerful internationalization tools for Python apps
      § Support for JS internationalization works really well for mobile web apps
      § Internationalize early and be ready for an international audience from day one!
  27. Other Resources
      § Python gettext: http://docs.python.org/dev/library/gettext
      § Babel: http://babel.edgewall.org
      § Django i18n: https://docs.djangoproject.com/en/dev/topics/i18n
  28. Other Resources
      § Potpie: http://pypi.python.org/pypi/potpie
      § Lazy JSON Encoding: https://docs.djangoproject.com/en/dev/topics/serialization/#id2
      § Jinja2 i18n: http://jinja.pocoo.org/docs/integration