The Localization of Stack Overflow - QCon China 2014

Slides supporting the "Localization of Stack Overflow" talk presented at QCon China in April 2014.

Marco Cecconi

April 25, 2014

Transcript

  1. Javascript
     - No GC pressure, so we don’t care about interned strings
     - Can’t really precompile either
     - We simply create one set of JS files per language, e.g. “stub.en.js” and “stub.pt.js”
     - For all that follows, the same APIs are available to Javascript
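  2. Plural categories for Chinese and English

     | Language Name | Code | Category | Examples | Rules |
     |---|---|---|---|---|
     | Chinese | zh | other | 0-999; 1.2... | other → everything |
     | English | en | one | 1 | one → n is 1 |
     | English | en | other | 0, 2-999; 1.2, 2.07... | other → everything else |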
  3. Ukrainian!

     | Category | Examples | Rules |
     |---|---|---|
     | one | 1, 21, 31, 41, 51, 61, 71, 81, 101, 1001, … | i % 10 = 1 and i % 100 != 11 |
     | few | 2~4, 22~24, 32~34, 42~44, 52~54, 62, 102, 1002, … | i % 10 = 2..4 and i % 100 != 12..14 |
     | many | 0, 5~19, 100, 1000, 10000, 100000, 1000000, … | i % 10 = 0 or i % 10 = 5..9 or i % 100 = 11..14 |
     | other | 0.0~1.5, 10.0, 100.0, 1000.0, 10000.0, 100000.0, 1000000.0, … | |

     other 0~15, 100, 1000, 10000, 100000, 1000000, …
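The rules on these slides can be written as a small function. The following is a minimal sketch in Python, not the code used by Stack Overflow: the name `plural_category` and the language codes as dictionary-style keys are assumptions made for illustration, and non-integer input for Ukrainian is assumed to fall into "other", which is what the decimal examples on the slide suggest.

```python
def plural_category(lang: str, n) -> str:
    """Return the plural category ("one", "few", "many" or "other") for n."""
    if lang == "zh":
        # Chinese has a single category: everything is "other".
        return "other"
    if lang == "en":
        # English: "one" when n is 1, "other" for everything else.
        return "one" if n == 1 else "other"
    if lang == "uk":
        # Assumption: any non-integer value falls through to "other",
        # matching the decimal examples (0.0~1.5, 10.0, ...) on the slide.
        if not isinstance(n, int):
            return "other"
        i = abs(n)
        if i % 10 == 1 and i % 100 != 11:
            return "one"
        if 2 <= i % 10 <= 4 and not 12 <= i % 100 <= 14:
            return "few"
        # Every remaining integer ends in 0 or 5..9, or has i % 100 in 11..14.
        return "many"
    raise ValueError(f"no plural rules defined for {lang!r}")
```

With this sketch, `plural_category("uk", 21)` is "one", `plural_category("uk", 22)` is "few", and `plural_category("uk", 11)` is "many", which lines up with the example columns above.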
  4. Behind the scenes
     All combinations are generated for each language and sent to translators:
     - For Chinese: “$num:other$ chickens” will be sent
     - For a 2 mode language: “$num:one$ chickens” and “$num:other$ chickens” will be sent

     Rules have to be evaluated at runtime to choose the correct translation.
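The two steps on this slide, generating one string per plural category and picking one at runtime, might look roughly like the sketch below. It is an illustration under assumptions rather than the actual implementation: the unannotated source form `$num$ chickens`, the helper names `variants`, `category` and `render`, and the sample translation table are all hypothetical. Only the `$num:one$` and `$num:other$` key format comes from the slide.

```python
# Plural categories per language, as listed on the earlier slides.
CATEGORIES = {"zh": ["other"], "en": ["one", "other"]}


def variants(template: str, param: str, lang: str) -> list[str]:
    """Generate every string that would be sent to translators."""
    return [
        template.replace(f"${param}$", f"${param}:{cat}$")
        for cat in CATEGORIES[lang]
    ]


def category(lang: str, n: int) -> str:
    """Evaluate the plural rule at runtime (only zh and en in this sketch)."""
    return "one" if lang == "en" and n == 1 else "other"


def render(translations: dict, template: str, param: str, lang: str, n: int) -> str:
    """Pick the translation for n's category and substitute the number."""
    marker = f"${param}:{category(lang, n)}$"
    key = template.replace(f"${param}$", marker)
    return translations[key].replace(marker, str(n))


# Hypothetical translation table for a 2 mode language (English here).
EN = {
    "$num:one$ chickens": "$num:one$ chicken",
    "$num:other$ chickens": "$num:other$ chickens",
}

print(variants("$num$ chickens", "num", "zh"))
print(variants("$num$ chickens", "num", "en"))
print(render(EN, "$num$ chickens", "num", "en", 1))
print(render(EN, "$num$ chickens", "num", "en", 3))
```

Keeping the category inside the placeholder means each translated string carries its own key, so the runtime only has to evaluate the rule and do a lookup.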
  5. Uganda
     10 classes called Class I to Class X and containing all sorts of arbitrary groupings but often characterised as
     - people,
     - long objects,
     - animals,
     - miscellaneous objects,
     - large objects and liquids,
     - small objects,
     - languages,
     - pejoratives,
     - infinitives,
     - mass nouns
  6. Some numbers
     - 700 views localized
     - 100,000 lines of code
     - A lot of javascript
     - A LOT of refactoring/fixing/tech debt repaid
     - Very little performance impact
     - ~6 months of work (team of ~3)
  7. More numbers
     - Portuguese released Dec. 12
     - 4k Questions
     - 7k Answers
     - 8k Users
     - One of the best performing new communities ever
  8. Never put non-content text data in the DB
     It’s A Good Thing™ if all the text to be localized is in the views or javascript.
  9. 1. It’s possible to internationalize quickly and cheaply, without performance hits.
     2. Localization is a surprisingly rich problem. There are many gotchas that can be painful later, like pluralization “bugs”. Fun!
     3. Localization is a very healthy choice for Stack Overflow and we hope to provide more and more users with a native interface some day :-)