Internationalization and Localization Done Right by Ruchi Varshney

PyCon 2013

March 18, 2013

Transcript

  1. Internationalization and Localization Done Right Ruchi Varshney PYCON 2013

  2. Overview
    § Internationalization
    § Localization
    § Testing and Maintenance
    § Tips and Gotchas
  3. Internationalization vs Localization
    § Internationalization (I18N) is the process of prepping your app to handle multiple languages
    § Localization (L10N) is the process of adding the right resources to handle a particular language in your app
  4. Why Do It?
    § Support an international audience
    § Easy to make your app ready for localization from day one
    § Drive some good code practices
  5. Internationalization (I18N) Python Source, Templates and JS

  6. Python I18N with gettext
    § gettext provides the baseline for all internationalization in Python
    § Wrapper around the GNU gettext catalog API
    § Requires the GNU gettext package on all your server machines
  7. Python I18N with gettext
    • Django provides a simple wrapper around Python gettext functions
        from django.utils.translation import ugettext, ungettext
        # Simple
        msg = ugettext("Today is %(month)s %(day)s.") % {'month': m, 'day': d}
        # Plural
        msg = ungettext("%(num)d apple", "%(num)d apples", count) % {'num': count}
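The same pattern can be tried without Django using the standard-library gettext module, which the Django functions wrap. This is a minimal sketch: NullTranslations is a stand-in that returns the source strings unchanged, and fruit_message is a hypothetical helper named only for this example.

```python
import gettext

# NullTranslations has no catalog, so it returns the msgid unchanged.
# A real app would load a compiled .mo catalog instead.
translations = gettext.NullTranslations()
_ = translations.gettext
ngettext = translations.ngettext


def today_message(month, day):
    # Named placeholders let translators reorder the values freely.
    return _("Today is %(month)s %(day)s.") % {"month": month, "day": day}


def fruit_message(count):
    # ngettext picks the singular or plural msgid based on count.
    return ngettext("%(num)d apple", "%(num)d apples", count) % {"num": count}


if __name__ == "__main__":
    print(today_message("March", 18))
    print(fruit_message(1))
    print(fruit_message(3))
```

Note that the count is passed twice: once to choose the plural form and once to fill the placeholder.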
  8. Django Template I18N
    • Django templates provide trans and blocktrans tags
        {% trans "My Fruit Store" %}
        {% blocktrans %}
        Click <a href="{{ link }}">here</a> for more info
        {% endblocktrans %}
        {% blocktrans count counter=apple|length %}
        Check out my apple
        {% plural %}
        Check out my {{ counter }} apples
        {% endblocktrans %}
  9. Handling Dates and Numbers
    • Babel is a package that provides standard date and number formatting
        $ pip install Babel
        from babel import dates, numbers
        print(dates.format_datetime(date, format='full', locale='de', tzinfo=tz))
        >> Sonntag, 17. Februar 2013 06:30:00 Vereinigte Staaten (Los Angeles)
        print(numbers.format_decimal(1.234, locale='de'))
        >> 1,234
  10. JavaScript I18N
    § JavaScript does not have access to gettext functions by default
    § Django provides a view that returns a JS library with gettext, ngettext and interpolate functions
        urlpatterns = patterns('',
            (r'^jsi18n/$', 'django.views.i18n.javascript_catalog')
        )

        // Named interpolation
        var msg = interpolate(gettext("Welcome, %(user)s!"), {user: name}, true);
  11. Locale Detection
    § Django LocaleMiddleware determines the locale to activate by checking, in order:
        – request.session['django_language']
        – django_language cookie
        – Accept-Language HTTP header set by the browser based on language preferences
        – settings.LANGUAGE_CODE = 'en-us'
    § If you save user language preferences in your database, override the request session key in your own middleware class
        request.session['django_language'] = request.user.lang_pref
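The lookup order on this slide can be sketched in plain Python. This is an illustration of the fallback chain only, not Django's implementation: detect_language and parse_accept_language are hypothetical helpers, and the real middleware does more (for example, matching language prefixes).

```python
def parse_accept_language(header):
    """Return language tags from an Accept-Language header, best first."""
    ranked = []
    for part in header.split(","):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0  # a tag without q= counts as fully preferred
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        ranked.append((quality, tag.strip().lower()))
    # sort() is stable, so equal-quality tags keep their header order.
    ranked.sort(key=lambda item: item[0], reverse=True)
    return [tag for quality, tag in ranked if quality > 0]


def detect_language(session, cookies, accept_language, supported,
                    default="en-us"):
    """Pick a locale: session key, then cookie, then header, then default."""
    candidates = [session.get("django_language"),
                  cookies.get("django_language")]
    candidates += parse_accept_language(accept_language or "")
    for candidate in candidates:
        if candidate and candidate.lower() in supported:
            return candidate.lower()
    return default


if __name__ == "__main__":
    supported = {"en-us", "de", "fr"}
    print(detect_language({}, {}, "fr;q=0.5, de;q=0.9", supported))
```

Because the session key is checked first, writing the user's saved preference into the session (as the slide suggests) overrides both the cookie and the browser header.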
  12. Localization (L10N) with Babel

  13. Localization Process
    [Diagram] Source files (_("Hello")) → Babel extraction → PO file ("Hello") → translation service (files exchanged over FTP) → translated PO file ("Bonjour") → Babel catalog init/update → Django app message catalog → Babel compile → MO file → website ("Bonjour")
  14. String Extraction with Babel
    • Babel also handles string extraction from source code
        $ pybabel extract --mapping-file babel.cfg --output out.po <APP_DIR>
        # babel.cfg
        [python: **.py]
        [django: **/templates/**.html]
        [extractors]
        python = babel.messages.extract:extract_python
        django = babeldjango.extract:extract_django
  15. JavaScript String Extraction
    • JS strings are extracted in a different domain to optimize the number of strings sent from the server
        $ pybabel extract --mapping-file babel_js.cfg --output out_js.po .
        # babel_js.cfg
        [javascript: **.js]
        extract_messages = $._, jQuery._
        [extractors]
        javascript = babel.messages.extract:extract_javascript
  16. Message Catalog (PO) File Format
        # Translations for MYAPP.
        # Copyright (c) 2013 ORGANIZATION
        #: app/views.py:20
        #, python-format
        msgid "Welcome, %(username)s!"
        msgstr ""
        #: templates/app.html:35
        msgid "%(num)d apple."
        msgid_plural "%(num)d apples."
        msgstr[0] ""
        msgstr[1] ""
  17. Translation
    • Message catalog (PO) files are shipped to external translation services or translation APIs to get back translations
        # German Translations for MYAPP.
        # Copyright (c) 2013 ORGANIZATION
        #: app/views.py:20
        #, python-format
        msgid "Welcome, %(username)s!"
        msgstr "Willkommen, %(username)s!"
        ...
  18. Integrating Translations
    § Django app catalog is initialized or updated with the received translation file
        $ pybabel <init/update> --domain django<js> --locale de --input-file <po>

        updating catalog <LOCALE_PATH>/de/LC_MESSAGES/django.po based on input.po
        updating catalog <LOCALE_PATH>/de/LC_MESSAGES/djangojs.po based on inputjs.po
    § Catalog is then compiled into efficient binary (.mo) files
        $ pybabel compile --domain django<js> --locale de

        compiling catalog to <LOCALE_PATH>/de/LC_MESSAGES/django.mo
        compiling catalog to <LOCALE_PATH>/de/LC_MESSAGES/djangojs.mo
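To make the compile step less of a black box, here is a rough sketch of what a compiled catalog contains, using only the standard library. build_mo is a hypothetical helper written for this example; it follows the GNU MO layout (a header, two index tables, then NUL-terminated strings) and is not a replacement for pybabel compile.

```python
import gettext
import io
import struct


def build_mo(messages):
    """Serialize {msgid: msgstr} into GNU MO bytes (little-endian)."""
    keys = sorted(messages)  # the MO format stores msgids in sorted order
    ids = b""
    strs = b""
    spans = []
    for key in keys:
        k = key.encode("utf-8")
        v = messages[key].encode("utf-8")
        spans.append((len(ids), len(k), len(strs), len(v)))
        ids += k + b"\0"
        strs += v + b"\0"
    count = len(keys)
    key_start = 7 * 4 + 16 * count      # 28-byte header + two index tables
    value_start = key_start + len(ids)
    key_index = []
    value_index = []
    for k_off, k_len, v_off, v_len in spans:
        key_index += [k_len, k_off + key_start]
        value_index += [v_len, v_off + value_start]
    header = struct.pack("<Iiiiiii", 0x950412DE, 0, count,
                         7 * 4, 7 * 4 + count * 8, 0, 0)
    index = struct.pack("<%di" % (4 * count), *(key_index + value_index))
    return header + index + ids + strs


CATALOG = {
    # The empty msgid carries catalog metadata such as charset and plurals.
    "": ("Content-Type: text/plain; charset=UTF-8\n"
         "Plural-Forms: nplurals=2; plural=(n != 1);\n"),
    "Welcome, %(username)s!": "Willkommen, %(username)s!",
    # Plural entries join singular/plural forms with a NUL byte.
    "%(num)d apple\0%(num)d apples": "%(num)d Apfel\0%(num)d Äpfel",
}

german = gettext.GNUTranslations(io.BytesIO(build_mo(CATALOG)))

if __name__ == "__main__":
    print(german.gettext("Welcome, %(username)s!") % {"username": "Ruchi"})
    print(german.ngettext("%(num)d apple", "%(num)d apples", 3) % {"num": 3})
```

The sorted index tables are what make lookups in the .mo file fast compared with parsing the text .po file on every request.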
  19. Testing and Maintenance

  20. Test Translations
    • Enable a test browser locale in Django settings and generate test translations with Potpie
        $ pip install potpie
        $ potpie --type <type> in.po out.po
        <type> = brackets, planguage, unicode, extend, mixed
        #: Potpie mixed mode translation
        #: templates/settings.html:18
        msgid "User Settings for %(username)s"
        msgstr "[Ŭşḗř Şḗŧŧīƞɠş ƒǿř %(username)s ẛLjϖDž 衋 ſNjΐϕ]"
  21. Test Translations
    • Find missing strings and push the limits of your UI testing with extended length unicode strings
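The idea behind test translations can be sketched in a few lines. This is not Potpie's algorithm; pseudo_translate is a hypothetical helper that accents vowels and adds brackets while leaving %(name)s style placeholders intact, which is the property that matters for a test locale.

```python
import re

# Matches named Python format placeholders such as %(username)s or %(num)d.
PLACEHOLDER = re.compile(r"%\([^)]+\)[sd]")

# Swap plain vowels for accented ones so translated text stands out.
ACCENTS = str.maketrans("aeiouAEIOU", "áéíóúÁÉÍÓÚ")


def pseudo_translate(msgid):
    """Return a bracketed, accented copy of msgid with placeholders intact."""
    parts = []
    last = 0
    for match in PLACEHOLDER.finditer(msgid):
        parts.append(msgid[last:match.start()].translate(ACCENTS))
        parts.append(match.group(0))  # copy the placeholder verbatim
        last = match.end()
    parts.append(msgid[last:].translate(ACCENTS))
    return "[" + "".join(parts) + "]"


if __name__ == "__main__":
    print(pseudo_translate("User Settings for %(username)s"))
```

Any string that shows up in the UI without brackets or accents was never marked for translation, and a clipped closing bracket points to a layout that truncates text.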
  22. Working with Translation Services
    § Integration with third-party services
        – File transfers, API integrations, PM intervention
    § Latency
        – Control feature roll-out by locale
        – Intermediate translations through Google APIs
    § Cost
        – Translation Memory
        – Scales better as you get translations for more locales
  23. Tips and Gotchas

  24. Comments for Translators
    • Translators need context for strings in your application
        # TRANSLATOR: The date fruit, not calendar date
        msg = ugettext("%(num)s dates") % {'num': date_count}
        {% comment %}TRANSLATOR: The date fruit, not calendar date{% endcomment %}

        $ pybabel extract --add-comments TRANSLATOR: <options> <dir>
        #. TRANSLATOR: The date fruit, not calendar date
        #: fruits/app.py:20
        msgid "%(num)s dates"
  25. Same But Different?
    • What if we needed the same word but in different contexts?
        # Use pgettext/npgettext to add context
        fruit_str = pgettext("Date Fruit", "Date")
        calendar_str = pgettext("Calendar Date", "Date")
        # Message catalog has msgctxt to ensure unique mapping
        #: fruits/app.py:20
        msgctxt "Date Fruit"
        msgid "Date"
        msgstr ""
        ...
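A small sketch of why msgctxt gives a unique mapping: in compiled GNU catalogs the context and msgid are joined with an EOT character (\x04) to form the lookup key. The dictionary catalog and the German strings below are assumptions for illustration, and this pgettext is a stand-in for the Django function of the same name.

```python
CONTEXT_SEPARATOR = "\x04"  # EOT, used by GNU gettext to join context and msgid


def catalog_key(context, msgid):
    """Build the lookup key used for a message that has a msgctxt."""
    return context + CONTEXT_SEPARATOR + msgid


# Hypothetical German catalog: one English msgid, two distinct entries.
CATALOG = {
    catalog_key("Date Fruit", "Date"): "Dattel",
    catalog_key("Calendar Date", "Date"): "Datum",
}


def pgettext(context, msgid):
    """Return the translation for msgid in context, or msgid if missing."""
    return CATALOG.get(catalog_key(context, msgid), msgid)


if __name__ == "__main__":
    print(pgettext("Date Fruit", "Date"))
    print(pgettext("Calendar Date", "Date"))
```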
  26. Custom Template Tags
    • Babel does not parse translations that are in custom or extension template tags
        {% autoescape false %}
        {% trans %}"This text is safe, but Babel might miss me"{% endtrans %}
        {% endautoescape %}
        # Add comma-separated extensions option in babel.cfg
        [jinja2: **.html]
        extensions=jinja2.ext.autoescape,...
  27. Lazy Translations
    § Locale is unknown at module load time when constants, model and form fields are initialized
    § ugettext_lazy returns lazy string references that can be later evaluated in a locale-aware context
        from django.utils.translation import ugettext_lazy

        class MyFruit(models.Model):
            name = models.CharField(help_text=ugettext_lazy('Name your fruit!'))
  28. Lazy Translations
    § Lazy string references are proxy objects that do not know how to convert themselves to bytestring
        In [1]: ugettext_lazy("Hello")
        Out[1]: <django.utils.functional.__proxy__ at 0x6c8ec90>
    § Watch out for lazy string concats, bytestring interpolation, exception handlers and JSON encoding (needs LazyEncoder)
        In [1]: "Hello %s" % ugettext_lazy("World")
        Out[1]: 'Hello <django.utils.functional.__proxy__ object at 0x6d7c250>'
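The behaviour of a lazy reference, and the JSON encoder it needs, can be sketched without Django. Everything here is a simplified stand-in: LazyString, activate and the in-memory CATALOGS are hypothetical, and Django's real proxy objects are more elaborate. LazyEncoder mirrors the approach of the encoder referenced on the slide.

```python
import json

CATALOGS = {"de": {"Hello": "Hallo"}}  # hypothetical in-memory catalogs
_active = {"locale": "en"}


def activate(locale):
    """Switch the locale used by later translations."""
    _active["locale"] = locale


def translate(msgid):
    return CATALOGS.get(_active["locale"], {}).get(msgid, msgid)


class LazyString:
    """Holds a msgid and translates it only when converted to text."""

    def __init__(self, msgid):
        self.msgid = msgid

    def __str__(self):
        return translate(self.msgid)


class LazyEncoder(json.JSONEncoder):
    """JSON encoder that forces lazy strings to text before serializing."""

    def default(self, obj):
        if isinstance(obj, LazyString):
            return str(obj)
        return super().default(obj)


# Created at import time, before any request has chosen a locale.
GREETING = LazyString("Hello")

if __name__ == "__main__":
    activate("de")
    print(str(GREETING))
    print(json.dumps({"greeting": GREETING}, cls=LazyEncoder))
```

Without the custom encoder, json.dumps raises TypeError because it does not know how to serialize the proxy object.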
  29. Import Aliases
    • Babel does not detect gettext import aliases since it simply parses through files one line at a time
        # Babel will miss this by default
        from django.utils.translation import ugettext_lazy as _lazy
        lazy_str = _lazy("Lazy")

        # Need to explicitly mention other aliases
        pybabel extract --keyword "_lazy" <options> <dir>
  30. Smarter JavaScript I18N
    • JS view function runs on every request. Browsers can cache JS files, so the catalog can be pre-generated at deploy time and served statically instead.
        // i18n_de.js
        var catalog = new Array();
        catalog["Welcome, %(user)s!"] = "Willkommen, %(user)s!";
        ...
        function gettext(msgid) {...};
        function ngettext(singular, plural, count) {...};
        function interpolate(fmt, obj, named) {...};
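One way to pre-generate such a file at deploy time is a small Python script. This is a sketch under assumptions: render_js_catalog is a hypothetical helper, the translations are passed in as a plain dict, and only gettext is emitted (the slide's ngettext and interpolate are left out for brevity).

```python
import json


def render_js_catalog(translations):
    """Return JavaScript source defining a catalog object and gettext()."""
    lines = ["var catalog = {};"]
    for msgid in sorted(translations):
        # json.dumps produces correctly escaped JavaScript string literals.
        lines.append("catalog[%s] = %s;" % (
            json.dumps(msgid), json.dumps(translations[msgid])))
    lines.append("function gettext(msgid) {")
    lines.append("  var value = catalog[msgid];")
    lines.append("  return (typeof value === 'undefined') ? msgid : value;")
    lines.append("}")
    return "\n".join(lines) + "\n"


GERMAN = {"Welcome, %(user)s!": "Willkommen, %(user)s!"}
js_source = render_js_catalog(GERMAN)

if __name__ == "__main__":
    print(js_source)
```

The output could be written to a per-locale file such as i18n_de.js during deployment and served like any other static asset.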
  31. Database Strings
    § Avoid storing strings that need translation, build rich enum classes instead
        # Maps enum values, canonical names, display names
        class MyFruitEnum(CoolEnum):
            APPLE = _MyFruitValue(1, 'apple', ugettext_lazy('Apple'))
            BANANA = _MyFruitValue(2, 'banana', ugettext_lazy('Banana'))
    § If you really need to do so, use django-dbgettext to make DB strings available in the message catalog
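The slide's CoolEnum and _MyFruitValue are not shown, so the following is only one possible shape for such an enum, built on the standard-library enum module. MyFruit, from_db and display_name are hypothetical names, and NullTranslations stands in for a real catalog.

```python
import enum
import gettext

# Stand-in translator that returns msgids unchanged; a real app would use
# the active locale's catalog (ugettext_lazy in the slide's Django code).
translate = gettext.NullTranslations().gettext


class MyFruit(enum.Enum):
    """Maps a stored integer to a canonical name and a display msgid."""

    APPLE = (1, "apple", "Apple")
    BANANA = (2, "banana", "Banana")

    def __init__(self, db_value, canonical, msgid):
        self.db_value = db_value    # what gets stored in the database
        self.canonical = canonical  # stable, untranslated identifier
        self.msgid = msgid          # source string for translation

    @property
    def display_name(self):
        # Translated on access, so the result follows the current locale.
        return translate(self.msgid)

    @classmethod
    def from_db(cls, db_value):
        for member in cls:
            if member.db_value == db_value:
                return member
        raise ValueError("unknown fruit value: %r" % (db_value,))


if __name__ == "__main__":
    fruit = MyFruit.from_db(2)
    print(fruit.canonical, fruit.display_name)
```

Only the integer is stored, so adding a locale never requires touching existing rows.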
  32. Takeaways
    § Babel and Potpie are powerful internationalization tools for Python apps
    § Support for JS internationalization works really well for mobile web apps
    § Internationalize early and be ready for an international audience from day one!
  33. Other Resources
    § Python gettext: http://docs.python.org/dev/library/gettext
    § Babel: http://babel.edgewall.org
    § Django i18n: https://docs.djangoproject.com/en/dev/topics/i18n
  34. Other Resources
    § Potpie: http://pypi.python.org/pypi/potpie
    § Lazy JSON Encoding: https://docs.djangoproject.com/en/dev/topics/serialization/#id2
    § Jinja2 i18n: http://jinja.pocoo.org/docs/integration
  35. Questions? varshney.ruchi@gmail.com @rvarshney github.com/rvarshney Slides @ http://bit.ly/pyconi18n