
Internationalization and Localization Done Right by Ruchi Varshney

PyCon 2013
March 18, 2013

Transcript

  1. Internationalization and Localization Done Right
    Ruchi Varshney
    PYCON 2013

  2. Overview
    § Internationalization
    § Localization
    § Testing and Maintenance
    § Tips and Gotchas

  3. Internationalization vs Localization
    § Internationalization (I18N) is the process of prepping your app to handle
      multiple languages
    § Localization (L10N) is the process of adding the right resources to handle a
      particular language in your app

  4. Why Do It?
    § Support an international audience
    § Easy to make your app ready for localization from day one
    § Drive some good code practices

  5. Internationalization (I18N)
    Python Source, Templates and JS

  6. Python I18N with gettext
    § gettext provides the baseline for all internationalization in Python
    § Wrapper around the GNU gettext catalog API
    § Requires the GNU gettext package on all your server machines
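
As a minimal sketch of what the standard library module offers on its own, the snippet below loads a catalog with `gettext.translation` and falls back to the untranslated message when no compiled catalog is found. The domain name `myapp` and the `locale` directory are assumptions made for illustration; they do not come from the talk.

```python
import gettext

# "myapp" and "locale" are hypothetical names. fallback=True returns a
# NullTranslations object instead of raising when no .mo file exists.
translations = gettext.translation(
    "myapp", localedir="locale", languages=["de"], fallback=True
)
_ = translations.gettext

greeting = _("Today is %(month)s %(day)s.") % {"month": "March", "day": 18}
print(greeting)
```

With no German catalog on disk, the call simply returns the original English string, which is why marking strings early costs nothing at runtime.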

  7. Python I18N with gettext

    Django provides a simple wrapper around Python gettext functions

    from django.utils.translation import ugettext, ungettext

    # Simple

    msg = ugettext("Today is %(month)s %(day)s.") % {'month': m, 'day': d}

    # Plural

    msg = ungettext("%(num)d apple", "%(num)d apples", count) % {'num': count}
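
The plural call can be tried without Django using the standard library's `ngettext`. This is only a sketch: with no catalog installed, `NullTranslations` applies the English rule (singular only when the count is 1), and the helper name `apples` is made up for the example.

```python
import gettext

# NullTranslations applies the English rule: singular only when n == 1.
catalog = gettext.NullTranslations()


def apples(count):
    """Pick the singular or plural template, then interpolate by name."""
    template = catalog.ngettext("%(num)d apple", "%(num)d apples", count)
    return template % {"num": count}


print(apples(1))
print(apples(3))
```

Named placeholders matter here because translators may need to reorder them, and a real catalog may define more than two plural forms.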

  8. Django Template I18N

    Django templates provide trans and blocktrans tags

    {% trans "My Fruit Store" %}

    {% blocktrans %}

    Click here for more info

    {% endblocktrans %}

    {% blocktrans count counter=apple|length %}

    Checkout my apple

    {% plural %}

    Checkout my {{ counter }} apples

    {% endblocktrans %}

  9. Handling Dates and Numbers

    Babel is a package that provides standard date and number formatting

    $ pip install Babel

    from babel import dates, numbers

    print dates.format_datetime(date, format='full', locale='de', tzinfo=tz)

    >> Sonntag, 17. Februar 2013 06:30:00 Vereinigte Staaten (Los Angeles)

    print numbers.format_decimal(1.234, locale='de')

    >> 1,234

  10. JavaScript I18N
    § JavaScript does not have access to gettext functions by default
    § Django provides a view that returns a JS library with gettext, ngettext and
      interpolate functions

    urlpatterns = patterns('',

    (r'^jsi18n/$', 'django.views.i18n.javascript_catalog')

    )


    // Named interpolation

    var msg = interpolate(gettext("Welcome, %(user)s!"), {user: name}, true);

  11. Locale Detection
    § Django LocaleMiddleware determines the locale to activate, checking in order:
      § request.session['django_language']
      § django_language cookie
      § Accept-Language HTTP header set by the browser based on language preferences
      § settings.LANGUAGE_CODE = 'en-us'
    § If you save user language preferences in your database, override the
      request session key in your own middleware class

      – request.session['django_language'] = request.user.lang_pref
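
The precedence above can be sketched in plain Python. This is not Django's implementation: the function names `parse_accept_language` and `pick_locale` are hypothetical, and the sketch skips details a real middleware handles, such as matching `de-at` to `de`.

```python
def parse_accept_language(header):
    """Return language tags from an Accept-Language header, best first."""
    weighted = []
    for part in header.split(","):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        weighted.append((quality, tag.strip().lower()))
    # sorted() is stable, so equal weights keep their header order.
    return [tag for quality, tag in sorted(weighted, key=lambda w: -w[0])]


def pick_locale(session, cookies, accept_language, supported, default="en-us"):
    """Apply the precedence listed above: session, cookie, header, default."""
    candidates = [session.get("django_language"), cookies.get("django_language")]
    candidates.extend(parse_accept_language(accept_language))
    for candidate in candidates:
        if candidate and candidate.lower() in supported:
            return candidate.lower()
    return default


print(pick_locale({}, {}, "de;q=0.8, fr", {"de", "en-us"}))
```

Writing a stored user preference into the session key, as the slide suggests, makes it win over the cookie and the browser header.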

  12. Localization (L10N) with Babel

  13. Localization Process
    [Diagram: localization pipeline]
    Source Files: _("Hello")
      -> Babel Extraction -> PO File: "Hello"
      -> FTP -> Translation Service (Translation) -> PO File: "Bonjour"
      -> Babel Catalog Init/Update -> Django App Message Catalog
      -> Babel Compile -> MO File
      -> Website: Bonjour

  14. String Extraction with Babel

    Babel also handles string extraction from source code

    $ pybabel extract --mapping-file babel.cfg --output out.po .

    # babel.cfg

    [python: **.py]

    [django: **/templates/**.html]

    [extractors]

    python = babel.messages.extract:extract_python

    django = babeldjango.extract:extract_django

  15. JavaScript String Extraction

    JS strings are extracted into a separate domain to minimize the number of
    strings sent from the server

    $ pybabel extract --mapping-file babel_js.cfg --output out_js.po .

    # babel_js.cfg

    [javascript: **.js]

    extract_messages = $._, jQuery._

    [extractors]

    javascript = babel.messages.extract:extract_javascript

  16. Message Catalog (PO) File Format

    # Translations for MYAPP.

    # Copyright (c) 2013 ORGANIZATION

    #: app/views.py:20

    #, python-format

    msgid "Welcome, %(username)s!"

    msgstr ""

    #: templates/app.html:35

    msgid "%(num)d apple."

    msgid_plural "%(num)d apples."

    msgstr[0] ""

    msgstr[1] ""

  17. Translation

    Message catalog (PO) files are shipped to external translation services or
    translation APIs to get back translations

    # German Translations for MYAPP.

    # Copyright (c) 2013 ORGANIZATION

    #: app/views.py:20

    #, python-format

    msgid "Welcome, %(username)s!"

    msgstr "Willkommen, %(username)s!"

    ...

  18. Integrating Translations
    § Django app catalog is initialized or updated with the received translation file

    $ pybabel update --domain django --locale de --input-file input.po

    • updating catalog /de/LC_MESSAGES/django.po based on input.po
    • updating catalog /de/LC_MESSAGES/djangojs.po based on inputjs.po
    § Catalog is then compiled into efficient binary (.mo) files


    $ pybabel compile --domain django --locale de

    • compiling catalog to /de/LC_MESSAGES/django.mo
    • compiling catalog to /de/LC_MESSAGES/djangojs.mo
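
To make the "binary (.mo)" step less abstract, here is a rough sketch that serializes a dictionary into the GNU .mo layout and loads it back with the standard library. It is illustrative only: `build_mo` is a hypothetical helper, it ignores plurals and the optional hash table, and a real project would rely on `pybabel compile` instead.

```python
import gettext
import io
import struct


def build_mo(messages):
    """Serialize {msgid: msgstr} into GNU .mo bytes (no hash table, no plurals)."""
    # The empty msgid carries catalog metadata, including the charset.
    entries = {"": "Content-Type: text/plain; charset=UTF-8\n"}
    entries.update(messages)
    keys = sorted(entries)  # the format expects msgids in sorted order
    ids = [key.encode("utf-8") for key in keys]
    strs = [entries[key].encode("utf-8") for key in keys]

    count = len(keys)
    ids_table_at = 7 * 4                      # right after the 7-word header
    strs_table_at = ids_table_at + count * 8  # each table row is two uint32s
    data_at = strs_table_at + count * 8

    blob = b""
    index = []
    for raw in ids + strs:
        index.append((len(raw), data_at + len(blob)))
        blob += raw + b"\0"  # strings are NUL-terminated

    out = struct.pack("<7I", 0x950412DE, 0, count, ids_table_at, strs_table_at, 0, 0)
    for length, offset in index:
        out += struct.pack("<2I", length, offset)
    return out + blob


mo_bytes = build_mo({"Welcome, %(username)s!": "Willkommen, %(username)s!"})
catalog = gettext.GNUTranslations(io.BytesIO(mo_bytes))
welcome = catalog.gettext("Welcome, %(username)s!") % {"username": "ruchi"}
print(welcome)
```

The lookup tables at fixed offsets are what make the compiled form cheap to read compared with parsing the PO text on every start.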

  19. Testing and Maintenance

  20. Test Translations

    Enable a test browser locale in Django settings and generate test
    translations with Potpie

    $ pip install potpie

    $ potpie --type TYPE in.po out.po

    TYPE = brackets, planguage, unicode, extend, mixed

    #: Potpie mixed mode translation

    #: templates/settings.html:18

    msgid "User Settings for %(username)s"

    msgstr "[Ŭşḗř Şḗŧŧīƞɠş ƒǿř %(username)s ẛLjϖDž 衋 ſNjΐϕ]"
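
The idea behind such test translations can be sketched in a few lines. This is not Potpie's algorithm: `pseudolocalize` is a hypothetical helper that accents vowels, pads the string, wraps it in brackets, and leaves Python format placeholders untouched so interpolation still works.

```python
import re

# Matches named and positional %s / %d placeholders, which must survive intact.
PLACEHOLDER = re.compile(r"%\([^)]+\)[sd]|%[sd]")
ACCENTED = str.maketrans("aeiouAEIOU", "áéíóúÁÉÍÓÚ")


def pseudolocalize(msgid, pad="~~"):
    """Return a bracketed, accented, padded variant of msgid."""
    parts = []
    last = 0
    for match in PLACEHOLDER.finditer(msgid):
        parts.append(msgid[last:match.start()].translate(ACCENTED))
        parts.append(match.group(0))  # keep the placeholder verbatim
        last = match.end()
    parts.append(msgid[last:].translate(ACCENTED))
    return "[" + "".join(parts) + " " + pad + "]"


print(pseudolocalize("User Settings for %(username)s"))
```

Any string that shows up in the UI without brackets was never marked for translation, and the padding exposes layouts that break on longer text.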

  21. Test Translations

    Find missing strings and push the limits of your UI testing with
    extended-length Unicode strings

  22. Working with Translation Services
    § Integration with third-party services
      § File transfers, API integrations, PM intervention
    § Latency
      § Control feature roll-out by locale
      § Intermediate translations through Google APIs
    § Cost
    § Translation Memory
      § Scales better as you get translations for more locales

  23. Tips and Gotchas

  24. Comments for Translators

    Translators need context for strings in your application

    # TRANSLATOR: The date fruit, not calendar date

    label = ugettext("%(num)s dates") % {'num': date_count}

    {% comment %}TRANSLATOR: The date fruit, not calendar date{% endcomment %}



    $ pybabel extract --add-comments TRANSLATOR:

    #. TRANSLATOR: The date fruit, not calendar date

    #: fruits/app.py:20

    msgid "%(num)s dates"

  25. Same But Different?

    What if we needed the same word but in different contexts?

    # Use pgettext/npgettext to add context

    fruit_str = pgettext("Date Fruit", "Date")

    calendar_str = pgettext("Calendar Date", "Date")

    # Message catalog has msgctxt to ensure unique mapping

    #: fruits/app.py:20

    msgctxt "Date Fruit"

    msgid "Date"

    msgstr ""
    • ...
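
A small sketch of why msgctxt gives a unique mapping: GNU gettext catalogs key a contextual message as the context and the msgid joined by an EOT (`\x04`) character. The dictionary catalog and its German entries below are hypothetical stand-ins for a compiled catalog.

```python
# GNU gettext joins context and msgid with an EOT (\x04) character.
CONTEXT_SEPARATOR = "\x04"

# Hypothetical German catalog; the entries are for illustration only.
catalog = {
    "Date Fruit" + CONTEXT_SEPARATOR + "Date": "Dattel",
    "Calendar Date" + CONTEXT_SEPARATOR + "Date": "Datum",
}


def pgettext(context, msgid):
    """Look up msgid under a context, falling back to the msgid itself."""
    return catalog.get(context + CONTEXT_SEPARATOR + msgid, msgid)


print(pgettext("Date Fruit", "Date"))
print(pgettext("Calendar Date", "Date"))
```

The same English word resolves to two different translations because the context is part of the lookup key.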

  26. Custom Template Tags

    Babel does not parse translations that are in custom or extension template
    tags

    {% autoescape false %}

    {% trans %}"This text is safe, but Babel might miss me"{% endtrans %}

    {% endautoescape %}

    # Add comma-separated extensions option in babel.cfg

    [jinja2: **.html]

    extensions=jinja2.ext.autoescape,...

  27. Lazy Translations
    § Locale is unknown at module load time when constants, model and form
      fields are initialized
    § ugettext_lazy returns lazy string references that can be later evaluated in a
      locale-aware context

    from django.db import models
    from django.utils.translation import ugettext_lazy


    class MyFruit(models.Model):

        name = models.CharField(help_text=ugettext_lazy('Name your fruit!'))
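
Why laziness matters can be shown without Django. In this sketch, `CATALOG`, `active_locale`, `translate` and `LazyString` are all hypothetical stand-ins: the eager call freezes the text at import time, while the lazy object looks it up only when rendered.

```python
# Hypothetical stand-ins for the message catalog and the active locale.
CATALOG = {"de": {"Name your fruit!": "Benenne deine Frucht!"}}
active_locale = "en"


def translate(msgid):
    """Eager lookup: uses whatever locale is active right now."""
    return CATALOG.get(active_locale, {}).get(msgid, msgid)


class LazyString:
    """Remember the msgid and translate only when rendered as text."""

    def __init__(self, msgid):
        self.msgid = msgid

    def __str__(self):
        return translate(self.msgid)


# Both values are created at "import time", before any locale is known.
eager_help = translate("Name your fruit!")
lazy_help = LazyString("Name your fruit!")

active_locale = "de"  # later, a request activates German

print(eager_help)
print(str(lazy_help))
```

The eager value stays English for the life of the process, while the lazy one follows whichever locale is active at render time.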

  28. Lazy Translations
    § Lazy string references are proxy objects that do not know how to convert
      themselves to bytestrings

    In [1]: ugettext_lazy("Hello")

    Out[1]: <django.utils.functional.__proxy__ object at 0x...>

    § Watch out for lazy string concatenation, bytestring interpolation, exception
      handlers and JSON encoding (needs LazyEncoder)

    In [2]: "Hello %s" % ugettext_lazy("World")

    Out[2]: 'Hello <django.utils.functional.__proxy__ object at 0x6d7c250>'
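
The JSON case can be sketched with the standard library alone. `LazyText` is a hypothetical stand-in for a lazy proxy, and the encoder follows the general "force to text in `default`" pattern rather than reproducing any particular Django class.

```python
import json


class LazyText:
    """Hypothetical stand-in for a lazy translation proxy."""

    def __init__(self, text):
        self._text = text

    def __str__(self):
        return self._text


class LazyEncoder(json.JSONEncoder):
    """Force lazy proxies to text; defer everything else to the base class."""

    def default(self, obj):
        if isinstance(obj, LazyText):
            return str(obj)
        return super().default(obj)


payload = {"greeting": LazyText("Hello")}
encoded = json.dumps(payload, cls=LazyEncoder)
print(encoded)
```

Without the custom encoder, `json.dumps(payload)` raises `TypeError` because the proxy is not a type the encoder knows how to serialize.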

  29. Import Aliases

    Babel does not detect gettext import aliases since it simply parses through
    files one line at a time


    # Babel will miss this by default

    from django.utils.translation import ugettext_lazy as _lazy

    lazy_str = _lazy("Lazy")


    # Need to explicitly mention other aliases

    $ pybabel extract --keyword "_lazy"
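
The effect of the keyword list can be illustrated with a toy extractor. Note the swap in technique: this sketch walks Python's `ast` rather than using Babel's own extractor, and `extract` is a hypothetical function. What it shares with the slide's point is that matching is done on the called name, not on what was imported.

```python
import ast

SOURCE = '''
from django.utils.translation import ugettext_lazy as _lazy
title = ugettext("Fruit Store")
lazy_str = _lazy("Lazy")
'''


def extract(source, keywords):
    """Collect first-argument string literals of calls to known keywords."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in keywords
            and node.args
            and isinstance(node.args[0], ast.Constant)
            and isinstance(node.args[0].value, str)
        ):
            found.append(node.args[0].value)
    return sorted(found)


print(extract(SOURCE, {"ugettext"}))
print(extract(SOURCE, {"ugettext", "_lazy"}))
```

The aliased call is only picked up once `_lazy` is added to the keyword set, which mirrors what the `--keyword` option does.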

  30. Smarter JavaScript I18N

    The JS catalog view runs on every request. Browsers can cache static JS
    files, so the catalog can be pre-generated at deploy time and served
    statically instead.

    // i18n_de.js

    var catalog = new Array();

    catalog["Welcome, %(user)s!"] = "Willkommen, %(user)s!";

    ...

    function gettext(msgid) {...};

    function ngettext(singular, plural, count) {...};

    function interpolate(fmt, obj, named) {...};
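
A deploy-time generator for such a file might look like the sketch below. `render_js_catalog` is a hypothetical helper, it emits only a minimal `gettext`, and the resulting file layout is an assumption rather than what Django's view produces.

```python
import json


def render_js_catalog(translations):
    """Return JavaScript source that defines a catalog and gettext()."""
    lines = ["var catalog = {};"]
    for msgid, msgstr in sorted(translations.items()):
        # json.dumps yields escaped string literals that are valid JavaScript.
        lines.append("catalog[%s] = %s;" % (json.dumps(msgid), json.dumps(msgstr)))
    lines.append(
        "function gettext(msgid) {"
        " var value = catalog[msgid];"
        " return typeof value === 'undefined' ? msgid : value; }"
    )
    return "\n".join(lines) + "\n"


js_source = render_js_catalog({"Welcome, %(user)s!": "Willkommen, %(user)s!"})
print(js_source)
```

Writing one such file per locale (for example `i18n_de.js`) lets the web server and browser cache handle it like any other static asset.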

  31. Database Strings
    § Avoid storing strings that need translation, build rich enum classes instead

    # Maps enum values, canonical names, display names

    class MyFruitEnum(CoolEnum):

        APPLE = _MyFruitValue(1, 'apple', ugettext_lazy('Apple'))

        BANANA = _MyFruitValue(2, 'banana', ugettext_lazy('Banana'))

    § If you really need to do so, use django-dbgettext to make DB strings
      available in the message catalog
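
The talk does not show `CoolEnum` or `_MyFruitValue`, so the following is only a guess at the shape of such a rich enum. `FruitValue`, `lazy` and `from_db` are hypothetical names; the point is that the database stores the integer while the display name stays in code, where extraction can find it.

```python
from collections import namedtuple

# Hypothetical value type: DB integer, canonical name, display-name callable.
FruitValue = namedtuple("FruitValue", ["value", "name", "display"])


def lazy(msgid, translate=lambda s: s):
    """Stand-in for ugettext_lazy: defer translation until called."""
    return lambda: translate(msgid)


class MyFruit:
    APPLE = FruitValue(1, "apple", lazy("Apple"))
    BANANA = FruitValue(2, "banana", lazy("Banana"))

    @classmethod
    def from_db(cls, stored):
        """Map the integer stored in the database back to its enum value."""
        for member in (cls.APPLE, cls.BANANA):
            if member.value == stored:
                return member
        raise ValueError("unknown fruit value: %r" % (stored,))


fruit = MyFruit.from_db(2)
print(fruit.name, fruit.display())
```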

  32. Takeaways
    § Babel and Potpie are powerful internationalization tools for Python apps
    § Support for JS internationalization works really well for mobile web apps
    § Internationalize early and be ready for an international audience from day
      one!

  33. Other Resources
    § Python gettext
      § http://docs.python.org/dev/library/gettext
    § Babel
      § http://babel.edgewall.org
    § Django i18n
      § https://docs.djangoproject.com/en/dev/topics/i18n

  34. Other Resources
    § Potpie
      § http://pypi.python.org/pypi/potpie
    § Lazy JSON Encoding
      § https://docs.djangoproject.com/en/dev/topics/serialization/#id2
    § Jinja2 i18n
      § http://jinja.pocoo.org/docs/integration

  35. Questions?
    @rvarshney
    github.com/rvarshney
    Slides @ http://bit.ly/pyconi18n
