
Crossing the Language Divide in Open Source Projects


In The Hitchhiker’s Guide to the Galaxy, the Babel Fish is a universal translator. By allowing all beings to communicate regardless of language, it ‘neatly crosses the language divide between any species’. Most programming languages are designed for English speakers, and most open source projects require some familiarity with English. How do we create a multilingual programming language and a multilingual open source community, allowing developers who only speak English to collaborate on open source projects with developers who don't speak English at all?

Aditya Mukerjee

September 15, 2016



Transcript

  1. Crossing the Language Divide in
    Open Source Projects
    Aditya Mukerjee
    Risk Engineer at Stripe
    @chimeracoder


  2. •We write open-source software because we want
    to have an impact
    •We want to solve big problems
    •We want to change the world

  3. •Basic rule of product development: understand
    the end users
    •The people best equipped to find and solve
    problems are the ones who experience them

  4. 95%
    of the world doesn’t speak
    English as their first language
    89%
    of the world doesn't speak English
    at all

  5. •Reading code requires proficiency in English
    •Most error messages are not localized
    •Project documentation is only available in English
    •Community meetups and conferences are
    English-only
    •Participating in mailing lists requires knowledge
    of English

  6. [No transcribed text]

  7. ...and yet
    •10% of the world's programmers live in
    China
    •Only 1.4% of Stack Overflow visits come from
    China

  8. Where do the other 86% of programmers in
    China go?
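
The slide does not show where the 86% comes from. It appears to follow from the two shares on the previous slide, assuming (the deck does not state this) that Stack Overflow visits would otherwise be proportional to the number of programmers in a country:

```latex
\[
\frac{1.4\%}{10\%} = 0.14, \qquad 1 - 0.14 = 0.86
\]
```

Read this way, traffic from China is about 14% of what its share of programmers would predict, which leaves roughly 86% unaccounted for.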


  9. [No transcribed text]

  10. [No transcribed text]

  11. [No transcribed text]

  12. How do we defragment open-source
    communities?


  13. Why not use a lingua franca?
    •English is difficult to learn
    •A lingua franca doesn’t eliminate language
    barriers
    •Languages evolve. Even if we tried to restrict
    them, we couldn't
    •Embrace multilingual workflows instead of
    fighting an uphill battle

  14. Making Code Multilingual


  15. [No transcribed text]

  16. “I would have thought it would be useful to NATO, because they had the
    common verbs for the things they were going to do. And the nouns, they’d
    just have to have a dictionary for things they were referring to for inventory
    control…. They’d have common nouns throughout NATO, and they could
    make a dictionary of common verbs and translate the program. You could
    write one in English and you could translate it and it could go to [the other
    language]. No problem, you’d have communication. It would be a limited
    vocabulary.”
    - Grace Hopper

  17. package main
    import "fmt"
    func main() {
        if true {
            fmt.Printf("hello, world!\n")
        }
    }

  18. [No transcribed text]

  19. [No transcribed text]

  20. gofmt
    korofmt

  21. Automatic source code translation
    •Bidirectional translation layers
    •Localize source code as a commit hook
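
A bidirectional translation layer of the kind listed above could be sketched roughly as follows. This is a minimal illustration, not the tool shown in the talk: the Spanish vocabulary in `toLocal` and the helper names `translate` and `invert` are hypothetical. It uses the standard `go/scanner` package to tokenize the source so that only keyword and identifier tokens are swapped, while string literals, comments, and whitespace pass through unchanged.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
	"strings"
)

// toLocal is a hypothetical vocabulary: a few Go keywords mapped to Spanish.
var toLocal = map[string]string{
	"package": "paquete",
	"import":  "importar",
	"func":    "funcion",
	"if":      "si",
}

// invert builds the reverse dictionary so the mapping works in both directions.
func invert(m map[string]string) map[string]string {
	out := make(map[string]string, len(m))
	for k, v := range m {
		out[v] = k
	}
	return out
}

// translate replaces keyword and identifier tokens that appear in dict and
// copies every other byte of src through untouched.
func translate(src string, dict map[string]string) string {
	fset := token.NewFileSet()
	file := fset.AddFile("", fset.Base(), len(src))
	var s scanner.Scanner
	s.Init(file, []byte(src), nil, scanner.ScanComments)

	var out strings.Builder
	last := 0 // offset of the first byte not yet copied to out
	for {
		pos, tok, lit := s.Scan()
		if tok == token.EOF {
			break
		}
		if !tok.IsKeyword() && tok != token.IDENT {
			continue
		}
		if repl, ok := dict[lit]; ok {
			off := file.Offset(pos)
			out.WriteString(src[last:off])
			out.WriteString(repl)
			last = off + len(lit)
		}
	}
	out.WriteString(src[last:])
	return out.String()
}

func main() {
	english := "package main\n\nimport \"fmt\"\n\nfunc main() {\n\tif true {\n\t\tfmt.Println(\"if\")\n\t}\n}\n"
	local := translate(english, toLocal)
	fmt.Print(local)
	// Translating back should reproduce the original source exactly.
	fmt.Println(translate(local, invert(toLocal)) == english)
}
```

One limitation of this sketch is that the reverse direction treats any identifier spelled like a localized keyword (for example a variable named `si`) as that keyword, so a real tool would need reserved words or escaping. A commit hook could run such a translator so that the repository stores one canonical form while each contributor edits a localized view; the slide does not say in which direction the hook would run.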

  22. Structuring Code


  23. Naming Schemes
    GetFoo()
    SetFoo()
    Build()
    to_foo
    Fooপড়()
    Fooলেখ()
    (etc.)


  24. Translation and Interpretation


  25. Automatic Translation?


  26. The secret of automated translation….
    •We don't have to bridge all communication
    between two arbitrary languages
    •We just have to bridge communication in a
    specific context between two languages

  27. A regular expression (or RE) specifies a set
    of strings that matches it; the functions in
    this module let you check if a particular
    string matches a given regular expression
    (or if a given regular expression matches a
    particular string, which comes down to the
    same thing).
    Regular expressions can be concatenated to
    form new regular expressions; if A and B are
    both regular expressions, then AB is also a
    regular expression. In general, if a
    string p matches A and another
    string q matches B, the string pq will match
    AB. This holds unless A or B contain low
    precedence operations; boundary
    conditions between A and B; or have
    numbered group references. Thus, complex
    expressions can easily be constructed from
    simpler primitive expressions like the ones
    described here. For details of the theory
    and implementation of regular expressions,
    consult the Friedl book referenced above,
    or almost any textbook about compiler
    construction.
    A brief explanation of the format of regular
    expressions follows. For further information
    and a gentler presentation, consult
    the Regular Expression HOWTO.
    [Bengali translation of the passage above; the script did not survive text extraction]


  28. [English original repeated from slide 27]
    A regular expression (again) to determine a
    set of strings that matches; This module
    functions in a particular string matches a
    given regular expression not only in the day
    (or a regular expression given a particular
    string, which comes down to the same
    thing in a match).
    Regular expressions can be concatenated to
    form new regular expressions; If A and B
    are both regular expressions, then AB is a
    regular expression. In general, if you are a
    match for a string P and another string q
    matches B, the string will match AB PQ.
    This holds unless A or B contain low
    precedence operations; Boundary
    conditions between A and B; Or reference
    group are numbered. Thus, complex
    expressions like the ones described here
    easily can be created from simple primitive
    expressions. For more information about
    the theory and implementation of regular
    expressions, Friedl book referenced above,
    or almost any textbook about compiler
    construction advice.
    Below is a brief explanation of the format of
    regular expressions. For further information
    and a gentler presentation, consult the
    article regular expressions.


  29. [English original repeated from slide 27]
    A regular expression (re) define a set of
    strings to match it; This module with a
    specific function in a given string matches
    the regular expression (or a regular
    expression given a special string, which is
    the same amount of matches).
    Regular expressions can be concatenated to
    form new regular expressions; If A and B
    are both regular expressions, then AB is a
    regular expression. In general, if you are a
    match for a string P and another string q
    matches B, the string will match AB PQ.
    This is less than the A or B containing low-
    priority operation; Boundary conditions
    between A and B; Or they are numbered
    reference group. Thus, complex expressions
    described as easily could be constructed
    from simple primitive expressions. For more
    information about the theory and
    implementation of regular expressions,
    Friedl book referenced above, or almost any
    textbook about compiler construction
    advice.
    Below is a brief explanation of the format of
    regular expressions. For more information
    and a gentler presentation, consult the
    article regular expressions.


  30. •Automatic translation is not a substitute for human translation
    •The point is to signal interest to draw other
    speakers to your project
    •Growing a multilingual OSS community doesn’t
    happen overnight
    • But it doesn’t happen automatically, either – we have to work
    for it

  31. Community Translations
    •Actively reach out within your community
    •We are already looking for ways to get people to
    contribute to OSS projects
    •You have to ask

  32. Localizing Error Messages
    Cannot cast parameter type from type “System.String” to
    argument type “System.String”
    The only thing worse than a cryptic error message
    is a cryptic error message in a foreign language
    that you can’t understand
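
One way to localize error messages without making them harder to look up is to pair every message with a stable identifier that stays the same in every language. The sketch below assumes a small in-memory catalog; the `catalog` contents, the `cast_failed` ID, the Spanish wording, and the `localize` helper are all illustrative and not taken from the talk or from any particular library.

```go
package main

import "fmt"

// catalog maps a stable message ID to per-locale format strings.
// The entries here are illustrative only.
var catalog = map[string]map[string]string{
	"cast_failed": {
		"en": "cannot cast value of type %s to type %s",
		"es": "no se puede convertir un valor de tipo %s al tipo %s",
	},
}

// localize renders the message for the requested locale, falling back to
// English when no translation exists. The ID is always included so the error
// can be searched for regardless of the language it was shown in.
func localize(locale, id string, args ...interface{}) string {
	templates, ok := catalog[id]
	if !ok {
		return id // unknown message: show the bare ID rather than nothing
	}
	tmpl, ok := templates[locale]
	if !ok {
		tmpl = templates["en"]
	}
	return fmt.Sprintf("[%s] ", id) + fmt.Sprintf(tmpl, args...)
}

func main() {
	fmt.Println(localize("es", "cast_failed", "System.Int32", "System.String"))
	// No Bengali entry exists in this sketch, so this falls back to English.
	fmt.Println(localize("bn", "cast_failed", "System.Int32", "System.String"))
}
```

Keeping the ID in the output means a developer reading the Spanish message and one reading the English message can both search for `cast_failed` and find the same discussion.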

  33. Localizing Documentation

  34. Localizing Live Events
    •Live interpretation
    •Live stenography can assist non-native speakers
    • StenoKnight CART
    • White Coat Captioning

  35. Localizing Mailing Lists
    •Encourage people to write in their native
    languages
    •Provide archive links with community translations

  36. Where do we go from here?


  37. •Social change is rarely about building the fanciest
    use of technology
    •It's about figuring out the right way to improve
    and mobilize the technologies we already have
    •And it's about actually committing ourselves to
    doing it

  38. Aditya Mukerjee
    @chimeracoder
    https://github.com/ChimeraCoder
