
Collaborative Translation Efforts (CTE)

Linux Foundation Japan

September 20, 2019


  1. whoami

    - Marketing, Business Planning in DX (Digital Transformation)
    - 10+ years at the (so-called) OSPO at NEC
    - Translation work with the LF Translation team
      • Open Source Audits in M&A transaction (Dr. Ibrahim Haddad)
      • OpenChain Spec/Ref. Training Material 1.1, 1.2 (Congrats, 2.0 release!)
      • LF Certification prep guide
    - I'm NOT a developer, nor a professional translator

    [Pictured: Open Source Audits in M&A transaction; OpenChain Reference Training Material; OpenChain Spec 1.2; LF Certification prep guide]
  2. Power of a Word

    - A word is a meme of culture
    - A word sometimes reflects history, values, how people think and live, and what they desire

    Word and Culture (Takao Suzuki)
    "ありがとう" (Japanese for "thank you") / "Gracias" / "Thank you"
    親 "Parent(s)" = 立 "Standing" + 木 "Tree" + 見 "To look/care"
    "Pacific" sea = "太平洋"
    "忖度" (sontaku) = "Read the atmosphere"
  3. Power of a Phrase

    - Good words and phrases sometimes, even often, change someone's mind, or even her/his life (in a good way)

    "継続は力なり" / "Endurance makes you stronger." / "El que persevera alcanza."
  4. Power of Translation

    - Good translations bring positive change across cultures
    - Awareness, new notions, inspiration, even innovation
    - And such good translations are brought by translators

    [Pictured: "A Translator in Edo era"; Natsuko TODA, subtitler and film industry interpreter -> Translated by Naoko Ota]
  5. Translation work is…

    - A wonderful experience
      • Enlightening
      • Sympathy with the author
      • Deep understanding
      • Feeling of achievement
      • It's fun, if only for a blink
    - But basically a (very^2) painful experience
      • Frustrating
      • Requires stoicism
      • Takes time
      • Lonely battle (nobody knows that)
      • Feeling exhausted after the achievement
      • Little feedback
      • Little reward (especially if you are a volunteer)

    So… Fun << Painful
  6. Motivation: the OSS Way is not widespread enough

    - For developers/engineers, it's natural, even in Japan
    - The OSS Way is more and more important in DX business, but for traditional managers and the C-suite it is not natural
    - They are not aware of the power of mass collaboration (yet)
    - One cause may be translation: OSS-related translations are not ready enough, or are unseen or unreachable…?
  7. Vision of CTE (Tentative)

    More productive translation in the "Open Source" way: with more Openness, more Collaboration, and much more Fun!

    Note: CTE is an idea from LF Japan, Nori (Fukuyasu) san
  8. Current: Translations in LF resources

    - 60% (97/162) in Press (2018)
    - 7% (12/161) in Blog (2018)
    - 17% (5/30) in Publication
    - 91% (11/12) in Open Source Guide
    - 42% (9/21) in Open Source Good Readings

    [*] LF resources
    https://www.linuxfoundation.jp/newsroom/press/
    https://www.linuxfoundation.jp/newsroom/blog/
    https://www.linuxfoundation.jp/resources/publications/
    https://www.linuxfoundation.jp/resources/open-source-guides/
    https://www.linuxfoundation.org/open-source-guides-reading-list/
  9. Challenges in translation (especially for OSS resources)

    [Diagram: Select -> Translate -> Review -> Publish]

    - Challenges about "Process" (like the OpenChain initiative)
    - Each process has a bottleneck
  10. Challenges in translation (especially for OSS resources)

    - "Quality (質)": Dilemma between OSS expertise and translation skills
      • OSS-related translation strongly requires OSS expertise, in both technology and culture. It tends to be hard even for professional translators.
      • Meanwhile, even if you understand OSS well enough, your translation quality may not be good enough
    - "Quantity (量)": Unscalable processes
      • There are bottlenecks in each process
      • Processes such as Choose, Prepare, Translate, Review, and Publish
      • Each process is too slow
  11. Challenges in translation (especially for OSS resources)

    - Critical thinking and communication are important, but…
    - "忖度" (sontaku: reading the atmosphere) is a challenge in the review process
    - You should focus on the translation work itself
    - But "who translated it" weighs heavily on your mind
    - This can be a risk to translation quality

    "Who translated this?" "Is this really a good translation?"
    [Pictured: "Japanese Society", author: Chie Nakane]
  12. Quality vs. Quantity (the quantity-and-quality problem)

    - Essentially a "trade-off": improving both at once looks hard
    - Good translation needs much time and brain power
    - This leads to lower productivity

    Quantity (量) / Quality (質): which do we tackle first?
  13. CTE Basic Approach

    - Putting priority on "Quality" requires money (sometimes a lot)
    - So, focus on "Quantity" first
    - To be "prolific", above all
    - Put priority on "speed" and "scalability" (interestingly, similar to the cloud native approach)
    - Of course, with "fun"

    [Diagram labels: Quantity (量), Quality (質)]
  14. Major bottlenecks through the translation process

    [Diagram: process stages Select, Prepare, Translate, Review, Publish, plus Management/Visualize, across Projects A, B, C]

    1. A single main translator must translate everything
       -> It takes time, depending on the time and skill the translator has available
       -> Besides, quality also depends on his/her skill, and that quality affects the review process
    2. Sequential reviews
       -> Reviews are done one by one, sequentially, and that takes time
       -> Besides, you often have to do a "review of the review", which also takes time
       -> E-mail based communication takes time (and needs your brain power)
    3. Release
       -> The workload is concentrated on the few people who have DTP or document cosmetics skills. This delays publishing.

    We focus on these processes here
  15. Approach (Cont'd)

    Current processes:
      Preparation -> Translation (main translator; weeks/months) -> Sequential Review (Reviewer A, then B, then C; weeks/months) -> Publish

    CTE approach:
      Preparation (none in some cases) -> Translation (machine; seconds/minutes) -> Simultaneous Review (Reviewers A, B, C; weeks) -> Publish (GOTO next project)
  16. Approach

    - To be "prolific" (and scalable), above all
    - Invigorate peer review, make it more fun
    - More collaborators
    - Use tools to automate
      • Tool 1: OmegaT as translation memory
      • Tool 2: Google Machine Translation -> Shorten the time spent in the Translation process, and focus on the Review process
      • Tool 3: HackMD (CodiMD) -> Simultaneous, real-time review (editing Markdown content)
      • Tool 4: Slack -> Simultaneous, real-time review (communication)
    - Define metrics and measure (bigger is better)
      1. Speed: Words / day
      2. Efficiency: 1 / [Hours / (person * words)], i.e. 1 / (time spent per word, per person)
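
    The two metrics above can be sketched in a few lines of Python. This is a minimal sketch, not the speaker's tooling: the function names `speed` and `efficiency` are hypothetical, `hours` is assumed to be the total recorded work time of everyone involved, and the sample numbers are invented purely for illustration.

```python
def speed(words: int, days: float) -> float:
    """Metric 1: words translated per calendar day."""
    if days <= 0:
        raise ValueError("days must be positive")
    return words / days


def efficiency(words: int, persons: int, hours: float) -> float:
    """Metric 2: 1 / [hours / (persons * words)] = persons * words / hours.

    Assumption: `hours` is the total recorded work time of the whole team.
    """
    if hours <= 0:
        raise ValueError("hours must be positive")
    return persons * words / hours


# Hypothetical project: 1,000 words, 2 people, 10 days, 8 recorded hours.
sample_speed = speed(1000, 10)
sample_efficiency = efficiency(1000, 2, 8)
print(sample_speed, sample_efficiency)
```

    Both values grow when a project finishes sooner or spends fewer hours per word, which matches the "bigger is better" reading on the slide.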
  17. Starting point: OpenChain Spec 1.0 Translation

    - 2,200 words
    - E-mail based discussion
    - Translation (by me):
      • Duration: 1st March 2017 ~ 9th April 2017
      • Work time: 50 h (including e-mail communication)
    - Review: 3 reviewers (LF Kunai san / LF Sato san / Sony Fukuchi-san) ~ 26th April 2017
      • 10 h per person

    https://github.com/OpenChain-Project/Specification-Translation-JP/blob/master/RELEASE/v1.0/openchainspec-1.0_jp.pdf

    Metric 1: Speed: 55
    Metric 2: Efficiency: 100
  18. Trials: Basic Rule

    - Measure/record the time you spend on each activity
      -> This is an important input for the evaluation metrics.
      -> Rough measurement is OK
    - PERIODIC online cross reviews in a short time (1-2 hours)
      -> Eliminate the time you spend thinking hard on your own.
      -> Do not discuss for a long time. Let's make reviews more casual
      -> Extract and manage a ToDo list
    - Online reviews should be done via a chat tool
      -> Chat is more casual for many. We need more, and more diverse, collaborators!
    - Above all, make it fun!
      -> Enjoy the original content (the author's ideas and thoughts)
      -> Enjoy the interaction (going off topic is also important sometimes)
      -> Enjoy the progress (we are getting closer to the goal!)
  19. 1st Trial (by 1 person)

    - Objective: Measure and evaluate how much time a single person takes
    - Target resource:
      • TODO Group "Building Leadership in an Open Source Community"
      • Approx. 3,600 words (24,000 letters)
    - # of persons: 1 (Taniguchi)
    - Started: 28th Oct, 2018
  20. 1st Trial (by 1 person): Work in progress

    [Screenshot: edit pane (Markdown) and view pane]
    * Paste the machine translation output into the edit pane first
  21. 1st Trial (by 1 person): Result

    - Output: 「オープンソース コミュニティのリーダーシップ」 ("Leadership in an Open Source Community") https://hackmd.io/aibsz3_JTqStRbyTdVO7rA
    - Duration: 32 days (28th Oct, 2018 to 26th Nov, 2018)
      • Translation: Google Machine Translation -> 30 min. (a little manual work)
      • Review -> 870 min. (14.5 h)
      • Review (correction by tool) -> 30 min.
      • Release (image linking) -> 30 min.
      • Sum: 960 min. (16 h)

    Note: the scores would be worse in practice, because peer review is not included

    Metric 1: Speed: 113 (up)
    Metric 2: Efficiency: 225 (up)
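
    As a worked check of the 1st trial figures, the sketch below totals the recorded minutes and recomputes both metrics from the roughly 3,600 words of the target text. It assumes Efficiency means persons x words / total hours (a rearrangement of the formula on the Approach slide); the variable names are illustrative only.

```python
# Recorded work time per phase in the 1st trial, in minutes (from the slide).
phase_minutes = {
    "machine translation": 30,
    "review": 870,
    "review (correction by tool)": 30,
    "release (image linking)": 30,
}

words = 3600    # approximate size of the target text
persons = 1
days = 32       # duration as stated on the slide

total_minutes = sum(phase_minutes.values())
total_hours = total_minutes / 60

speed_value = words / days                        # Metric 1: words per day
efficiency_value = persons * words / total_hours  # Metric 2 (assumed form)

print(total_minutes, total_hours)  # 960 16.0
print(speed_value)                 # 112.5, shown on the slide as 113
print(efficiency_value)            # 225.0
```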
  22. 2nd Trial

    + … A little bit of collaboration (real-time review)

    [Diagram labels: Community; Developer's support]
  23. 2nd Trial (2 persons)

    - Objective: Evaluate the effectiveness of collaboration by 2 people
    - Target material:
      • LF "Certification Preparation Guide"
        (1) DL site with introduction (104 words)
        (2) PDF slides (21 slides, 4,212 words)
    - # of persons: 2 (Mieko-san and Taniguchi)
      ※ Besides that, Inou-san from NEC Solution Innovator joined and did a technical check as an engineer
    - Started: 27th Dec 2018

    [Screenshots: DL site; PDF slide]
  24. 2nd Trial (2 persons): Work in progress

    [Screenshots: simultaneous, real-time online review; chats in Slack]
  25. 2nd Trial (2 persons): Result

    - Duration: 32 days (27th Dec, 2018 to 28th Jan, 2019), incl. release process by InDesign DTP
      • Translation: Google Machine Translation -> 35 min. (mainly copy and paste)
      • Review (self review) -> 1145 min. (19 h: 609 min. for Taniguchi, 540 min. for Mieko-san)
      • Review (online cross-review) -> 595 min. (5 times)
      • Release (DTP by InDesign) -> 420 min.
      • Sum: 2190 min. (36 h)
      • # of ToDo items in online review: approx. 20 (all closed)

    Metric 1: Speed: 135 (up)
    Metric 2: Efficiency: 299 (up)
    * Note: the actual scores should be (much) better than this, because this includes the "Publish" process
  26. 3rd Trial (Ongoing): #kubernetes-docs-ja

    + … A little bit more collaboration (launched a HackMD site in the Tokyo region to improve real-time response)
  27. Case studies

    - General info, but essential
    - Business interest
    - Good entry point for beginners
    - But low priority for engineers
    - We may contribute to the community (a little)
  28. 3rd Trial (3 persons, ongoing)

    - Objective: Evaluate the effectiveness of collaboration by 3+ people
    - Target material: Kubernetes Case Studies (40+ case studies)
      • 1st case study: China Unicom
      • 1016 words
    - # of persons: 3
    - Started: 15th June 2019
  29. Result: 3rd Trial (for now)

    - From 15th to 30th June 2019
    - Translation: 5 minutes
    - Review
      • Self: 290 min. (4.8 h) for 3 reviewers
      • Online: 210 min. (3.5 h, held twice)
    - Publish
      • HTML format

    Metric 1: Speed: 72 (down)
    Metric 2: Efficiency: 367 (up)
  30. Trials Round-up (for now)

    [Chart: Speed (words/day) vs. Efficiency; points are (Speed, Efficiency)]
    - Start, 4 persons: (55, 100)
    - 1st trial, 1 person: (113, 225) *1
    - 2nd trial, 2 persons (1 supporter): (135, 299) *2
    - 3rd trial (ongoing), 3 persons: (75, 367) *2

    *1: Note: the actual scores would be worse, because peer review is not included
    *2: Note: the actual scores should be better than this, because this includes the "Publish" process
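
    To make the round-up easier to compare, the sketch below expresses each trial as a multiple of the starting point. The (Speed, Efficiency) pairs are copied as printed on this round-up slide; the helper name `gains_over_start` is hypothetical, and the ratios are simple arithmetic rather than figures from the talk.

```python
# (label, speed in words/day, efficiency) as printed on the round-up slide.
TRIALS = [
    ("Start", 55, 100),
    ("1st trial", 113, 225),
    ("2nd trial", 135, 299),
    ("3rd trial (ongoing)", 75, 367),
]


def gains_over_start(trials):
    """Return (label, speed ratio, efficiency ratio) relative to the first entry."""
    _, base_speed, base_efficiency = trials[0]
    return [
        (label, spd / base_speed, eff / base_efficiency)
        for label, spd, eff in trials
    ]


for label, speed_ratio, efficiency_ratio in gains_over_start(TRIALS):
    print(f"{label}: speed x{speed_ratio:.2f}, efficiency x{efficiency_ratio:.2f}")
```

    Because the footnotes say some scores include or omit whole process steps, these ratios are only a rough comparison.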
  31. Thoughts (so far)

    - To some extent, translation can be made faster and more efficient in the Translation and Review processes
    - And we see that each process can be broken down into "sub-processes"
    - Above all, it's becoming enjoyable
    - Measuring metrics and evaluating them is meaningful
    - But this is not solid yet. We need to look for better approaches.
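  32. Thoughts (so far): Tasks needed within each process

    Select
      • Decide what to translate
      • Check the license
      • Coordinate with the community
      • Decide the rules
      • Check the volume
      • Recruit and bring in members with the required knowledge
      • Select and prepare tools
      • Clarify the output and the scope of work

    Translate
      • Open an issue
      • Sign the CLA
      • Fork the repositories (community, translation memory)
      • Run machine translation
      • Create the bilingual text
      • Prepare the online review
      • Measure work time

    Review
      • Share rules and terminology
      • Online review
      • Measure work time
      • Manage issues/TODOs
      • Check with the community
      • Merge work (content, memory, dictionaries, terminology, and so on)

    Publish
      • Decide design rules
      • Handle file formats
      • Measure work time
      • Contribute to upstream
      • Post on the site
      • Credits
      • Announce

    Management
      • Manage and facilitate progress
      • Provide a venue
      • Manage issues
      • Provide incentives
      • Keep track of people and skills
      • Measure feedback from readers
      • Maintain notation, rules, etc.
      • Quality checks
      • Prepare the necessary data and tools
      • Evaluate metrics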
  33. Thoughts (so far)

    - Quality? Yes, Quality and Quantity are "trade-offs"
    - But both may be improved (especially in the 2nd trial)
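  34. Challenges in translation (a broader view)

    • Process bottlenecks (sequential, time-consuming, cannot release, unproductive)
    • Everyone suffers in the same way
    • Incentives

    [Quantity issues]
    • Redundancy and fragmentation (local optimization, too scattered)
      - Tools, policies, processes, know-how, and resources are scattered
    • Bottlenecks in each process
      - Sequential processes
      - CLA
      - File format dependence (.html, .idd, .docx, .pptx, .md, .txt, ODF…)
      - Communication…
    • Securing translators and reviewers

    [Quality issues]
    • Holding back, sontaku (reading the atmosphere)
    • People and skills (tool skills in particular are quite a barrier)
    • No feedback from readers in the first place (= no mechanism for trial and error)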
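  35. Remaining challenges (findings from this activity)

    The challenges surrounding translation are diverse, numerous, and complex. Trial and error is needed while taking a bird's-eye view of the whole and clarifying the issues.

    • Are tools the bottleneck after all? (OmegaT/Git/InDesign)
    • Developing people, improving skills, sharing know-how ⇒ Meetup
    • Selection/Publish processes
    • Overall management process, visualization
    • Reducing fragmentation: unify and share resources (notation, translation memory, term definitions, know-how…)
    • Simplifying procedures such as the CLA (Community Hub)
    • Copyright matters
    • Quality of machine translation
    • Further automation (automatic metric measurement)
    • Further visualization (overall progress)
    • Defining roles (casual reviewers and core reviewers)
    • What incentives should casual reviewers get?
    • Verifying market value (measuring feedback from readers)

    A PoC-style activity. Start by sharing information where we can, and proceed by trial and error, DevOps-style.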
  36. Quality of machine translation

    Source text:
    “And this is why building and maintaining leadership in open source projects is key to corporate strategies and goals. However, it isn’t as easy as pounding desks and throwing around cash-based clout.”
    Open Source Guide: Building Leadership in an Open Source Community
    https://github.com/todogroup/guides/blob/master/building-leadership-in-an-open-source-community.md

    Machine translation output (Japanese):
    “オープンソースプロジェクトでリーダーシップを築き維持することが企業の戦略と目標の鍵となるのはこのためです。ただし、デスクを叩いたり、現金を使用した効果的な効果を出したりするのと同じくらい簡単ではありません。”

    Back-translation into English:
    "That's why building and maintaining leadership with open source projects is the key to a company's strategy and goals. However, it's not as easy as hitting a desk or using cash for an effective effect."

    The translated Japanese is weird… (the grammar is OK, but…)