করো: Translating Go to Other (Human) Languages, and Back Again

In The Hitchhiker’s Guide to the Galaxy, the Babel Fish is a universal translator. By allowing all beings to communicate regardless of language, it ‘neatly crosses the language divide between any species’.

While Go uses English keywords, because of the way Go’s lexer and parser are designed, we can easily port Go to other languages and still maintain interoperability between different dialects of Go. gofmt already bridges the divide between two seemingly-incompatible groups of developers — those who prefer tabs and those who prefer spaces — and allows them to collaborate seamlessly, with no extra effort for either group. We can extend this approach further, and allow developers who only speak English to collaborate seamlessly with developers who don’t speak English at all.
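The gofmt point can be checked with the standard library. The sketch below runs the same program, once indented with spaces and once with a tab, through `go/format`, the package that applies gofmt-style formatting. The helper name `canonical` is made up for this illustration.

```go
package main

import (
	"fmt"
	"go/format"
)

// canonical returns src in gofmt's standard layout.
// The name is hypothetical; format.Source is the real standard-library call.
func canonical(src string) (string, error) {
	out, err := format.Source([]byte(src))
	return string(out), err
}

func main() {
	// The same program, indented with four spaces and with a tab.
	spaces := "package main\n\nfunc main() {\n    println(\"hi\")\n}\n"
	tabs := "package main\n\nfunc main() {\n\tprintln(\"hi\")\n}\n"

	a, err := canonical(spaces)
	if err != nil {
		panic(err)
	}
	b, err := canonical(tabs)
	if err != nil {
		panic(err)
	}
	// Reports whether both inputs normalize to identical text.
	fmt.Println(a == b)
}
```

Whatever indentation each developer types, the formatted output is the same, which is the property the talk proposes to extend from whitespace to keywords.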

In this talk, we will look at koro, which adds Bengali support for the Go toolchain. The koro extension lets native Bengali speakers program in the language most familiar to them, but provides bidirectional translation layers so that all Go programmers only ever see code written in their native language.
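As a rough illustration of what a bidirectional translation layer could look like, here is a minimal sketch. It is not koro's actual implementation. The keyword table is a small hypothetical subset, its Bengali spellings are reconstructed from the slides and may differ from koro's real table, and the function names `translate` and `invert` are invented for this example. The sketch swaps whole words found in the table and leaves string literals untouched. It does not handle comments or rune literals, which a real tool built on Go's scanner would need to cover.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// enToBn is a tiny, hypothetical keyword table (assumption: spellings are
// reconstructed from the slides, not taken from koro's source).
var enToBn = map[string]string{
	"package": "প্যাকেজ",
	"import":  "আমদানি",
	"if":      "যদি",
}

// invert builds the reverse table so the same code translates both ways.
func invert(m map[string]string) map[string]string {
	out := make(map[string]string, len(m))
	for k, v := range m {
		out[v] = k
	}
	return out
}

// isWordRune reports whether r can be part of a keyword or identifier.
// Combining marks are included because Bengali vowel signs are marks,
// not letters.
func isWordRune(r rune) bool {
	return r == '_' || unicode.IsLetter(r) || unicode.IsDigit(r) || unicode.IsMark(r)
}

// translate replaces whole words found in table and copies everything
// else, including the contents of string literals, unchanged.
func translate(src string, table map[string]string) string {
	var out strings.Builder
	rs := []rune(src)
	for i := 0; i < len(rs); {
		r := rs[i]
		switch {
		case r == '"' || r == '`':
			// Copy a string literal verbatim up to its closing quote.
			j := i + 1
			for j < len(rs) && rs[j] != r {
				if r == '"' && rs[j] == '\\' {
					j++ // skip the escaped rune
				}
				j++
			}
			j++ // step past the closing quote
			if j > len(rs) {
				j = len(rs) // unterminated literal: copy to end of input
			}
			out.WriteString(string(rs[i:j]))
			i = j
		case isWordRune(r):
			j := i
			for j < len(rs) && isWordRune(rs[j]) {
				j++
			}
			word := string(rs[i:j])
			if t, ok := table[word]; ok {
				word = t
			}
			out.WriteString(word)
			i = j
		default:
			out.WriteRune(r)
			i++
		}
	}
	return out.String()
}

func main() {
	english := "package main\n\nimport \"fmt\"\n\nfunc main() {\n\tif true {\n\t\tfmt.Println(\"if only\")\n\t}\n}\n"
	bengali := translate(english, enToBn)
	fmt.Print(bengali)
	// Translating back with the inverted table should restore the input.
	fmt.Println(translate(bengali, invert(enToBn)) == english)
}
```

Because the table is a one-to-one mapping, inverting it gives the reverse direction for free, so each developer could read and write the same file in their own dialect.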

This same technique can be used to add support in the Go toolchain for Korean, Russian, Tagalog, or any other natural language that Go programmers want to use when programming, making Go the ultimate Babel Fish for programmers everywhere.

Aditya Mukerjee

April 25, 2017

Transcript

  1. করো: Translating Go to Other
    (Human) Languages, and Back
    Again
    Aditya Mukerjee
    Systems Engineer at Stripe
    @chimeracoder

  6. “I would have thought it would be useful to NATO, because they had the
    common verbs for the things they were going to do. And the nouns, they’d
    just have to have a dictionary for things they were referring to for inventory
    control…. They’d have common nouns throughout NATO, and they could
    make a dictionary of common verbs and translate the program. You could
    write one in English and you could translate it and it could go to [the other
    language]. No problem, you’d have communication. It would be a limited
    vocabulary.”
    - Grace Hopper

  8. github.com/ChimeraCoder/koro
    করো
    (koro)

  9. package main
    import "fmt"
    func main() {
        if true {
            fmt.Printf("hello, world!\n")
        }
    }

  10. প্যাকেজ main
    আমদানি "fmt"
    ফ main() {
        যদি true {
            fmt.Println("ওহে বিশ্ব!\n")
        }
    }

  13. gofmt
    korofmt

  15. Automatic source code translation
    • Bidirectional translation layers
    • Localize source code as a commit hook

  16. Structuring Code

  17. Naming Schemes
    ReadFoo()
    WriteFoo()
    Close()
    (etc.)
    Fooপড়()
    Fooলেখ()
    বন্ধ()

  18. type error interface {
        Error() string
    }

  19. Localizing Error Messages
    Cannot cast parameter type from type “System.String” to argument type “System.String”
    The only thing worse than a cryptic error message is a cryptic error message in a foreign language that you can’t understand

  20. What about documentation?

  21. The secret of automated translation…
    • We don't have to bridge all communication between two arbitrary languages
    • We just have to bridge communication in a specific context between two languages

  22. Pipe creates a synchronous in-memory pipe. It can be used to connect code expecting an io.Reader with code expecting an io.Writer.
    Reads and Writes on the pipe are matched one to one except when multiple Reads are needed to consume a single Write. That is, each Write to the PipeWriter blocks until it has satisfied one or more Reads from the PipeReader that fully consume the written data. The data is copied directly from the Write to the corresponding Read (or Reads); there is no internal buffering.
    It is safe to call Read and Write in parallel with each other or with Close. Parallel calls to Read and parallel calls to Write are also safe: the individual calls will be gated sequentially.
    পাইপ এক িসেKানাস ইন- মমির পাইপ
    তির কের। এ এক io.Reader কাড
    কাড এক io.Writer আশা সে কাড
    সংেযাগ করেত ব বহার করা যেত পাের।
    পাইেপর উপর লখা এবং িলখন িল এক
    থেক এক সােথ মলােনা হয় যখন একক
    লখার জন একািধক িরড িলর েয়াজন
    হয়। য, েত ক পাইপ ওয়ািরটার
    ক িলেত িলখন যত ন না িলিখত ডটা
    স ূণভােব ব বহার কের PipeReader থেক
    এক বা একািধক িরড স হয়। তথ সংি
    পাঠ (বা পাঠ) থেক িলখন থেক সরাসির
    কিপ করা হয়; কান অভ রীণ বাফার
    আেছ।
    এটা এেক অপেরর সে বা ব সে
    সমা রাল প ন প ন এবং িলখন িনরাপদ।
    লখার জন সমা রাল কল িল প ন এবং
    সমা রাল কল িলও িনরাপদ: ব ি গত
    কল িল মানুসাের গট হেয় যােব।

  23. Pipe creates a synchronous in-memory pipe. It can be used to connect code expecting an io.Reader with code expecting an io.Writer.
    Reads and Writes on the pipe are matched one to one except when multiple Reads are needed to consume a single Write. That is, each Write to the PipeWriter blocks until it has satisfied one or more Reads from the PipeReader that fully consume the written data. The data is copied directly from the Write to the corresponding Read (or Reads); there is no internal buffering.
    It is safe to call Read and Write in parallel with each other or with Close. Parallel calls to Read and parallel calls to Write are also safe: the individual calls will be gated sequentially.

    The pipe creates a synchronous in-memory pipe. It can be used to connect the code with an io.Writer Expect an io.Reader code code.
    Writing and writing on pipe matches one to one when multiple reads are required for single writing. Write down each pipe warrior block until one or more reads from PipeReader are satisfied using the written data completely. The information is copied directly from the corresponding text (or text); There are no internal buffers.
    Read it to read parallel with each other or close and safe to enter. Read parallel calls for writing and parallel calls are also safe: Private calls will be gated in sequence.

  24. Pipe creates a synchronous in-memory pipe. It can be used to connect code expecting an io.Reader with code expecting an io.Writer.
    Reads and Writes on the pipe are matched one to one except when multiple Reads are needed to consume a single Write. That is, each Write to the PipeWriter blocks until it has satisfied one or more Reads from the PipeReader that fully consume the written data. The data is copied directly from the Write to the corresponding Read (or Reads); there is no internal buffering.
    It is safe to call Read and Write in parallel with each other or with Close. Parallel calls to Read and parallel calls to Write are also safe: the individual calls will be gated sequentially.

    Piping creates a synchronous pipeline in memory. It is a discount code io.Reader code to connect with a io.Writer hope can be used.
    Writing and writing in the pipeline when the units correspond one by one with the need to write more than one ridagulira. That is, each tube will write blocks until the data using the one or more read PipeReader is satisfied. Relevant text information (or text) is copied directly from the site; There is no internal buffer.
    It is read in parallel with each other or with the read and write insurance. Reading calls to write parallel and parallel calls is safe: the door will be in the order of personal calls.

  25. Where do we go from here?

  26. 95% of the world doesn’t speak English as their first language
    89% of the world doesn’t speak English at all

  27. Aditya Mukerjee
    @chimeracoder
    https://github.com/ChimeraCoder
