করো: Translating Go to Other (Human) Languages, and Back Again (GopherCon 2020)

In The Hitchhiker’s Guide to the Galaxy, the Babel Fish is a universal translator. By allowing all beings to communicate regardless of language, it ‘neatly crosses the language divide between any species’.

Go uses English keywords, but because of the way Go’s lexer and parser are designed, we can easily port Go to other languages and still maintain interoperability between different dialects of Go. gofmt already bridges the divide between two seemingly incompatible groups of developers (those who prefer tabs and those who prefer spaces) and allows them to collaborate seamlessly, with no extra effort for either group. We can extend this approach further, and allow developers who only speak English to collaborate seamlessly with developers who don’t speak English at all.

We will look at করো (koro), which adds Bengali support to the Go toolchain. করো lets native Bengali speakers program in the language most familiar to them, and provides bidirectional translation layers so that every Go programmer only ever sees code written in their own native language.

This same technique can be used to add support in the Go toolchain for Spanish, Russian, Korean, Arabic, or any other natural language that Go programmers want to use when programming, making Go the ultimate Babel Fish for programmers everywhere.

Aditya Mukerjee

November 13, 2020

Transcript

  1. করো: Translating Code to
    Other (Human) Languages
    and Back Again
    @chimeracoder
    Aditya Mukerjee
    Systems Engineer, Stripe

  4. 95%
    of the world doesn’t speak
    English as their first language
    89%
    of the world doesn't speak
    English at all

  7. Software reflects the people who build it

  9. “I would have thought it would be useful to NATO, because they had
    the common verbs for the things they were going to do. And the
    nouns, they’d just have to have a dictionary for things they were
    referring to for inventory control…. They’d have common nouns
    throughout NATO, and they could make a dictionary of common
    verbs and translate the program. You could write one in English
    and you could translate it and it could go to [the other
    language]. No problem, you’d have communication. It would be a
    limited vocabulary.”
    - Grace Hopper

  10. github.com/ChimeraCoder/koro
    করো
    (koro)

  11. package main
    import "fmt"
    func main() {
        if true {
            fmt.Printf("hello, world!\n")
        }
    }

  12. প্যাকেজ main
    আমদানি "fmt"
    ফ main(){
        যদি true {
            fmt.Println("ওহে বিশ্ব!\n")
        }
    }

  17. gofmt
    korofmt

  19. Automatic source code translation
    • Bidirectional translation layers
    • Localize source code as a commit hook

  20. Structuring Code

  21. Naming Schemes
    ReadFoo()
    WriteFoo()
    Close()
    (etc.)
    Fooপড়()
    Fooলেখ()
    বন্ধ()

  22. Localizing Error Messages

  23. type error interface {
        Error() string
    }

  24. Cannot cast parameter type from type “System.String” to
    argument type “System.String”

  25. The only thing worse than a cryptic error message is a cryptic error
    message in a foreign language that you can’t understand

  26. What about documentation?

  27. The secret of automated translation…
    • We don't have to bridge all communication between two
    arbitrary languages
    • We just have to bridge communication in a specific context
    between two languages

  28. Pipe creates a synchronous in-
    memory pipe. It can be used to
    connect code expecting an
    io.Reader with code expecting an
    io.Writer.
    Reads and Writes on the pipe are
    matched one to one except when
    multiple Reads are needed to
    consume a single Write. That is,
    each Write to the PipeWriter blocks
    until it has satisfied one or more
    Reads from the PipeReader that fully
    consume the written data. The data
    is copied directly from the Write to
    the corresponding Read (or Reads);
    there is no internal buffering.
    It is safe to call Read and Write in
    parallel with each other or with
    Close. Parallel calls to Read and
    parallel calls to Write are also safe:
    the individual calls will be gated
    sequentially.
    পাইপ একটি সিঙ্ক্রোনাস ইন-মেমরি পাইপ
    তৈরি করে। এটি একটি io.Reader কোড
    কোড একটি io.Writer আশা সঙ্গে কোড
    সংযোগ করতে ব্যবহার করা যেতে পারে।
    পাইপের উপর লেখা এবং লিখনগুলি এক
    থেকে এক সাথে মেলানো হয় যখন একক
    লেখার জন্য একাধিক রিডগুলির প্রয়োজন
    হয়। যে, প্রত্যেক পাইপ ওয়ারিটার
    ব্লকগুলিতে লিখন যতক্ষণ না লিখিত ডেটা
    সম্পূর্ণভাবে ব্যবহার করে PipeReader
    থেকে এক বা একাধিক রিড সন্তুষ্ট হয়। তথ্য
    সংশ্লিষ্ট পাঠ (বা পাঠ) থেকে লিখন থেকে
    সরাসরি কপি করা হয়; কোন অভ্যন্তরীণ
    বাফার আছে।
    এটা একে অপরের সঙ্গে বা বন্ধ সঙ্গে
    সমান্তরাল পড়ুন পড়ুন এবং লিখন
    নিরাপদ। লেখার জন্য সমান্তরাল কলগুলি
    পড়ুন এবং সমান্তরাল কলগুলিও নিরাপদ:
    ব্যক্তিগত কলগুলি ক্রমানুসারে গেট হয়ে
    যাবে।

  29. Pipe creates a synchronous in-
    memory pipe. It can be used to
    connect code expecting an
    io.Reader with code expecting an
    io.Writer.
    Reads and Writes on the pipe are
    matched one to one except when
    multiple Reads are needed to
    consume a single Write. That is,
    each Write to the PipeWriter blocks
    until it has satisfied one or more
    Reads from the PipeReader that fully
    consume the written data. The data
    is copied directly from the Write to
    the corresponding Read (or Reads);
    there is no internal buffering.
    It is safe to call Read and Write in
    parallel with each other or with
    Close. Parallel calls to Read and
    parallel calls to Write are also safe:
    the individual calls will be gated
    sequentially.
    The pipe creates a synchronous in-
    memory pipe. It can be used to
    connect the code with an io.Writer
    Expect an io.Reader code code.
    Writing and writing on pipe
    matches one to one when multiple
    reads are required for single
    writing. Write down each pipe
    warrior block until one or more
    reads from PipeReader are
    satisfied using the written data
    completely. The information is
    copied directly from the
    corresponding text (or text); There
    are no internal buffers.
    Read it to read parallel with each
    other or close and safe to enter.
    Read parallel calls for writing and
    parallel calls are also safe: Private
    calls will be gated in sequence.

  30. Pipe creates a synchronous in-
    memory pipe. It can be used to
    connect code expecting an
    io.Reader with code expecting an
    io.Writer.
    Reads and Writes on the pipe are
    matched one to one except when
    multiple Reads are needed to
    consume a single Write. That is,
    each Write to the PipeWriter blocks
    until it has satisfied one or more
    Reads from the PipeReader that fully
    consume the written data. The data
    is copied directly from the Write to
    the corresponding Read (or Reads);
    there is no internal buffering.
    It is safe to call Read and Write in
    parallel with each other or with
    Close. Parallel calls to Read and
    parallel calls to Write are also safe:
    the individual calls will be gated
    sequentially.
    Piping creates a synchronous
    pipeline in memory. It is a discount
    code io.Reader code to connect
    with a io.Writer hope can be used.
    Writing and writing in the pipeline
    when the units correspond one by
    one with the need to write more
    than one ridagulira. That is, each
    tube will write blocks until the data
    using the one or more read
    PipeReader is satisfied. Relevant
    text information (or text) is copied
    directly from the site; There is no
    internal buffer.
    It is read in parallel with each other
    or with the read and write
    insurance. Reading calls to write
    parallel and parallel calls is safe:
    the door will be in the order of
    personal calls.

  32. Growing a multilingual OSS community doesn’t happen overnight
    Automatic translation is not a substitute - it’s an invitation
    (but it doesn’t happen automatically either)

  33. It’s surprisingly easy to get community translations…
    but you do have to ask

  34. Localizing Live Events

  35. We need to bring events to the global community
    …but we also need to bring the global community to our events
    To truly eliminate linguistic barriers, both are necessary

  36. Localizing Project Communication

  37. Encourage people to write in their native languages
    Provide archive links with community translations

  38. Where do we go from here?

  39. 95%
    of the world doesn’t speak
    English as their first language
    89%
    of the world doesn't speak
    English at all

  40. Aditya Mukerjee
    @chimeracoder
    https://github.com/ChimeraCoder
