An explanation of GNU gettext with an internationalized 'hello world' program, and a developer's checklist for writing i18n programs. Presented at MahaOnline Limited in Sion, Mumbai (MH), India.
Internationalization and localization (i18n) are means of adapting computer software to different languages, regional differences and technical requirements of a target market. Internationalization combines the developer's task with localization: it enables a product to be used with multiple scripts and cultures by separating user interface resources into a localizable format. This concept is also known as NLS (National Language Support or Native Language Support).
*.cpp & convert them into objects
• Compile objects to libraries
• Compile parent CPP with the required library
Run-Time Localization
• Set locale (LANG/LANGUAGE environment variable) and bind the text domain
• Trigger 'gettext' to fetch strings from message catalogs as per the set locale
text file that includes the original texts and the translations; language independent. Machine Object (MO) files include exactly the same contents as the PO file, compiled to a binary format that is read by the machine (the gettext runtime) rather than edited by translators. Using Poedit. Translation filters: .sdf, .xml, .properties, .ini, .rc, .yml, .wordfast, .json, .sub. Native Formats
available in every technology in terms of APIs, frameworks, libraries etc., and they work on a similar concept of run-time injection, fetching strings from a native format. Examples:
• GNU gettext for C, C++ and open source tools
• Microsoft Localization Framework (resource.dll based)
• For Java: Apache Tapestry and International Components for Unicode
• BabelFx for Flash and Flex Rich Internet Applications
• Rails Internationalization (I18n) API for Ruby on Rails
http://www.endlesslycurious.com/2008/10/
The locale program writes information about the current locale environment, or all locales, to standard output. Environment variables available to locale-aware programs:
1. LC_CTYPE (Character classification and case conversion)
2. LC_COLLATE (Collation order)
3. LC_TIME (Date and time formats)
4. LC_NUMERIC (Non-monetary numeric formats)
5. LC_MONETARY (Monetary formats)
6. LC_MESSAGES (Formats of informative, diagnostic messages and interactive responses)
7. LC_PAPER (Paper size)
8. LC_NAME (Name formats)
9. LC_ADDRESS (Address formats and location information)
10. LC_TELEPHONE (Telephone number formats)
11. LC_MEASUREMENT (Measurement units)
12. LC_IDENTIFICATION (Metadata about the locale information)
LOCPATH: where locale data is stored. Default is /usr/lib/locale
A way to handle localization levels easily…
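As a rough illustration of how a program can see these categories, the sketch below queries the locale in effect for a category by passing NULL to setlocale(). The helper names locale_for_category and print_locale_summary are hypothetical, and LC_MESSAGES is assumed to be available (it is a POSIX category, not ISO C).

```c
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper: returns the locale currently in effect for one category. */
const char *locale_for_category(int category)
{
    /* A NULL second argument only queries; it does not change the locale. */
    const char *name = setlocale(category, NULL);
    return name ? name : "(unknown)";
}

/* Hypothetical helper: prints a few of the categories from the list above. */
void print_locale_summary(FILE *out)
{
    fprintf(out, "LC_CTYPE=%s\n", locale_for_category(LC_CTYPE));
    fprintf(out, "LC_COLLATE=%s\n", locale_for_category(LC_COLLATE));
    fprintf(out, "LC_TIME=%s\n", locale_for_category(LC_TIME));
    fprintf(out, "LC_NUMERIC=%s\n", locale_for_category(LC_NUMERIC));
    fprintf(out, "LC_MONETARY=%s\n", locale_for_category(LC_MONETARY));
    fprintf(out, "LC_MESSAGES=%s\n", locale_for_category(LC_MESSAGES));
}
```

A program would normally call setlocale(LC_ALL, "") once at startup so that these values follow the user's environment variables; until then every category reports the default "C" locale.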
Compiler)
2. gettext (GNU Internationalization utilities)
3. gettext-base (GNU Internationalization utilities for the base system)
4. libc6 (GNU C Shared Libraries)
5. libc6-dev (GNU C Development Libraries)
6. locales (Common files for locale support)
7. libintl (Message translations system compatible i18n library)
8. php-gettext (read gettext MO files directly through PHP)
9. gtranslator (PO file editor for the GNOME desktop)
10. poedit (gettext catalog editor)
Things we need in place…
Working with GNU Gettext
#include <stdio.h>
#include <locale.h>
#include <libintl.h>

int main(void)
{
    /* initializes the entire current locale as per environment variables set by the user */
    setlocale(LC_ALL, "");
    /* sets the base directory for the message catalogs */
    bindtextdomain("hello", ".");
    /* set domain for future gettext() calls */
    textdomain("hello");
    /* gettext() allows the translator to work independently from the programmer;
       "%s" keeps the translated text from being interpreted as a format string */
    printf("%s", gettext("Hello World\n"));
    return 0;
}

Internationalized 'Hello World' Program
man setlocale, textdomain xgettext, msginit, translate msgfmt, set lang and chmod
Steps…
xgettext
• Extract strings from source file
• Create the template for translations
msginit
• Create the files to translate using the template
• Edit and translate file
• Set Project-Id-Version to {TextDomain}
msgfmt
• Create target directories in the bound text domain location
• Compile and install translations
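The target directories in the last step follow a fixed layout: gettext looks for a compiled catalog at <bound directory>/<locale>/LC_MESSAGES/<domain>.mo. The sketch below only builds that path as a string, to make the layout concrete; catalog_path is a hypothetical helper and is not part of libintl.

```c
#include <stdio.h>

/* Hypothetical helper: writes "<base_dir>/<locale>/LC_MESSAGES/<domain>.mo"
   into buf. Returns 0 on success, -1 if the buffer is too small. */
int catalog_path(char *buf, size_t size, const char *base_dir,
                 const char *locale, const char *domain)
{
    int n = snprintf(buf, size, "%s/%s/LC_MESSAGES/%s.mo",
                     base_dir, locale, domain);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

For the program bound with bindtextdomain("hello", "."), an assumed Marathi locale name of mr_IN gives ./mr_IN/LC_MESSAGES/hello.mo, which is where the msgfmt output would be installed.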
will avoid code duplication, will let localizers and developers work on updates simultaneously, and will remove the possibility of damaging code during translation. Externalize all translatable content: take the text out of the code and place it in resource files
make sure to attach the validation rule to the specific country, or have the validation rule update when the country selection changes. Allow input of international data and foreign scripts
written for a specific language. Avoid constructing strings through concatenation as this makes translation hard – even impossible in certain cases. Avoid string concatenation
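One way to avoid concatenation is to keep the whole sentence in a single format string, so the translator sees it as one unit. The sketch below uses ngettext() from libintl for the singular/plural choice; format_copy_message and the message text are hypothetical examples, not taken from the slides.

```c
#include <libintl.h>
#include <stdio.h>

/* Hypothetical helper: formats "N file(s) copied to DIR" as one translatable
   sentence instead of gluing fragments together. */
int format_copy_message(char *buf, size_t size, unsigned long count,
                        const char *dir)
{
    /* ngettext() returns the singular or plural msgid, or its translation,
       according to the plural rules of the active message catalog. */
    const char *fmt = ngettext("%lu file copied to %s",
                               "%lu files copied to %s", count);
    return snprintf(buf, size, fmt, count, dir);
}
```

Because the sentence is a single msgid, a translator is free to change the word order for the target language, which is not possible when the pieces are concatenated in code.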
languages, as the verb will be different depending on the product name. Further, do not use a noun as a parameter in a sentence, and avoid reusing strings. Translation tools let linguists recycle previously translated strings during the translation pass. Avoid using a given string variable in more than one context
during the input > database > output route. Do all string handling with Unicode: an internationalized application uses Unicode for all handling of strings and text. This applies to the static text as well as the dynamic text that is communicated between the application and the database.
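A small sketch of why Unicode-aware handling matters in C: strlen() counts bytes, while a UTF-8 string may use several bytes per character. The helper utf8_code_points below is hypothetical, assumes well-formed UTF-8 input, and does no validation.

```c
#include <stddef.h>

/* Hypothetical helper: counts Unicode code points in a UTF-8 string.
   Assumes well-formed input; it does not validate the encoding. */
size_t utf8_code_points(const char *s)
{
    size_t count = 0;
    for (; *s != '\0'; ++s) {
        /* Continuation bytes have the form 10xxxxxx; any other byte
           starts a new code point. */
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++count;
    }
    return count;
}
```

For a two-letter Devanagari string, strlen() reports 6 bytes while this function reports 2 code points, so buffer sizes and display lengths have to be treated separately.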
the exception of some languages where it may shrink. Leave enough room on the layout for expansion and avoid static sizing. If there are strings that should not exceed a certain size, always include comments in the resource file for those items. Provide extra room for text expansion – User Interface
Indian language in many different ways. It is very important to provide context information in the resource file when necessary. Add context information to strings using comments
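With gettext, context can travel with the string as a source comment placed directly before the call. The sketch below assumes the catalog template is extracted with xgettext using --keyword=_ and --add-comments=TRANSLATORS:, so the comment is copied into the PO file; the function name and the label are hypothetical.

```c
#include <libintl.h>

/* Common shorthand for gettext(); xgettext must be told about it
   (for example with --keyword=_). */
#define _(msgid) gettext(msgid)

/* Hypothetical UI helper returning a translatable button label. */
const char *open_button_label(void)
{
    /* TRANSLATORS: verb on a button that opens a file,
       not the adjective "open" as in "the file is open". */
    return _("Open");
}
```

The translator then sees the note next to the msgid in the PO file and can pick the right word for the target language.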
the regions that speak the same language. Example: dd.mm.yyyy in Bengali; dd-mm-yyyy in Kannada, Gujarati, Hindi, Marathi, Punjabi, Tamil; d-m-yyyy in Telugu, no leading zeroes. Use system functions for date/time and numeric formatting
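In C, the system function for locale-aware dates is strftime(): the "%x" conversion asks for the date representation of the current LC_TIME locale instead of a hard-coded pattern. The wrapper name format_local_date is hypothetical; this is a minimal sketch, not a complete formatting layer.

```c
#include <locale.h>
#include <time.h>

/* Hypothetical helper: formats a date in the style of the current locale.
   Returns the number of bytes written, or 0 if the buffer is too small. */
size_t format_local_date(char *buf, size_t size, const struct tm *when)
{
    /* "%x" expands to the locale's preferred date representation. */
    return strftime(buf, size, "%x", when);
}
```

The result depends on the locale selected with setlocale(), so the program should call setlocale(LC_ALL, "") at startup; in the default "C" locale "%x" is equivalent to "%m/%d/%y".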
different for some languages. Inline styling will prevent these modifications from being made, or will require code duplication. Always use external style sheets to define styles for a web application. Avoid using styling tags such as "em", "strong" and "italic" text. Bold font faces cause problems, as bold strokes may result in a big blob of ink when the font size is small in printing. If a string needs to be emphasized with a bold font face, we can do it by externalizing the style. This way, localizers can decide on the font size as needed. Externalize all styles and formatting
comparison. This example has been taken from Microsoft MSDN. An internationalized application does not use any manual sorting logic and relies on the underlying framework's API for string comparison. This applies to database data as well as the strings that come from resource files, which may be used in form elements and others such as combo boxes. http://msdn.microsoft.com/en-us/goglobal/bb688122