Slide 1

Slide 1 text

Collaborative Translation Efforts (CTE)
18th July, 2019 @ OSSJ
Masao Taniguchi, NEC

Slide 2

Slide 2 text

whoami
• Marketing and business planning in DX (Digital Transformation)
• 10+ years at the (so-called) OSPO at NEC
• Translation work with the LF translation team
• Open Source Audits in M&A Transactions (Dr. Ibrahim Haddad)
• OpenChain Spec / Reference Training Material 1.1, 1.2 (Congrats on the 2.0 release!)
• LF Certification prep guide
• I’m NOT a developer, nor a professional translator
[Images: Open Source Audits in M&A Transactions; OpenChain Reference Training Material; OpenChain Spec 1.2; LF Certification prep guide]

Slide 3

Slide 3 text

Power of a Word
• A word is a meme of culture
• A word sometimes reflects history, values, how people think, how they live, and what they desire
Word and Culture (Takao Suzuki)
“ありがとう” / “Gracias” / “Thank you”
親 (parent) = 立 (“standing”) + 木 (“tree”) + 見 (“to look/care”)
“Pacific” sea = “太平洋”
“忖度” (sontaku) = “read the atmosphere”

Slide 4

Slide 4 text

Power of a Phrase
• Good words and phrases sometimes, even often, change someone’s mind, or even his or her life (in a good way)
“継続は力なり”
“Endurance makes you stronger.”
“El que persevera alcanza.”

Slide 5

Slide 5 text

Power of Translation
• Good translations bring positive change across cultures
• Awareness, new notions, inspiration, even innovation
• And such good translations are brought to us by translators
“A Translator in Edo era”
Natsuko TODA, subtitler, film industry interpreter
-> Translated by Naoko Ota

Slide 6

Slide 6 text

Translation work is…
• A wonderful experience
  - Enlightening
  - Sympathy with the author
  - Deep understanding
  - A feeling of achievement
  - It’s fun, a fleeting moment of fun
• But basically a (very^2) painful experience
  - Frustrating
  - Requires stoicism
  - Takes time
  - A lonely battle (nobody knows about it)
  - You feel exhausted after the achievement
  - Little feedback
  - Little reward (especially if you are a volunteer)
So… Fun << Painful

Slide 7

Slide 7 text

Motivation: Much more Fun in Translation
Fun >> Painful

Slide 8

Slide 8 text

Motivation: OSS Way is not widespread enough
• For developers and engineers, it’s natural, even in Japan
• The OSS Way is more and more important in DX business, but for traditional managers and the C-suite it isn’t natural yet
• They are not aware of the power of mass collaboration (yet)
• One cause may be translation: OSS-related translations are not ready enough, or are unseen or unreachable…?

Slide 9

Slide 9 text

Vision of CTE (Tentative)
More productive translation in the “Open Source” way: with more Openness, more Collaboration, and much more Fun!
Note: CTE is an idea from Nori (Fukuyasu)-san of LF Japan

Slide 10

Slide 10 text

Translation (CTE)

Slide 11

Slide 11 text

Current: Translations in LF resources
• 60% (97/162) in Press (2018)
• 7% (12/161) in Blog (2018)
• 17% (5/30) in Publication
• 91% (11/12) in Open Source Guide
• 42% (9/21) in Open Source Good Readings
[*] LF resources
https://www.linuxfoundation.jp/newsroom/press/
https://www.linuxfoundation.jp/newsroom/blog/
https://www.linuxfoundation.jp/resources/publications/
https://www.linuxfoundation.jp/resources/open-source-guides/
https://www.linuxfoundation.org/open-source-guides-reading-list/

Slide 12

Slide 12 text

Contents
1. Challenges and Approach
2. Demo
3. Trials
4. Thoughts so far
5. Next steps

Slide 13

Slide 13 text

Challenges and Approach

Slide 14

Slide 14 text

Challenges in translation (especially for OSS resources)
[Diagram: Select → Translate → Review → Publish]
• The challenges are about “Process” (as in the OpenChain initiative)
• Each process has a bottleneck

Slide 15

Slide 15 text

Challenges in translation (especially for OSS resources)
• “Quality (質)”: dilemma between OSS expertise and translation skills
  - OSS-related translation strongly requires OSS expertise, in both technology and culture. It tends to be hard even for professional translators.
  - Meanwhile, someone who understands OSS well enough may not have the translation skills to produce good enough quality
• “Quantity (量)”: unscalable processes
  - There are bottlenecks in each process
  - The processes are Choose, Prepare, Translate, Review, and Publish
  - Each process is too slow

Slide 16

Slide 16 text

Challenges in translation (especially for OSS resources)
• Critical thinking and communication are important, but…
• “忖度” (sontaku: reading the atmosphere) is a challenge in the review process
• You should focus on the translation work itself
• But “who translated it” weighs heavily on your mind
• This can be a risk to translation quality
Who translated this? Is this really a good translation?
“Japanese Society”, author: Chie Nakane

Slide 17

Slide 17 text

Quality vs. Quantity (the problem of quantity and quality)
• Essentially a “trade-off”: it seems hard to improve both
• Good translation needs much time and brain power
• This leads to lower productivity
Quantity (量) / Quality (質)
Which do we tackle first?

Slide 18

Slide 18 text

CTE Basic Approach
• Putting priority on “Quality” requires money (sometimes a lot)
• So, focus on “Quantity” first
• To be “prolific”, above all
• Put priority on “speed” and “scalability” (interestingly, similar to the cloud-native approach)
• Of course, with “fun”
Quantity (量) / Quality (質)

Slide 19

Slide 19 text

Major bottlenecks through the translation processes
[Diagram: Select → Prepare → Translate → Review → Publish for Projects A, B, C, with Management / Visualize across them]
1. A single main translator must translate everything
   -> It takes time, depending on the time and skill the translator has available
   -> Besides, quality also depends on his/her skill, and that quality affects the review process
2. Sequential reviews
   -> Reviews are done one by one, in sequence, which takes time
   -> Besides, you often have to do a “review of the review”, which also takes time
   -> E-mail-based communication takes time (and needs your brain power)
3. Release
   -> The workload is concentrated on the few people who have DTP or document-formatting skills, which delays publishing
We focus on these processes here

Slide 20

Slide 20 text

Approach (Cont’d)
[Diagram comparing the two flows]
Current processes: Preparation → Translation (main translator; weeks/months) → Sequential review (Reviewer A → Reviewer B → Reviewer C; weeks/months) → Publish
CTE approach: Preparation (none in some cases) → Translation (machine; seconds/minutes) → Simultaneous review (Reviewers A, B, C; weeks) → Publish (then go to the next project)

Slide 21

Slide 21 text

Approach
• To be “prolific” (and scalable), above all
• Invigorate peer review, make it more fun
• More collaborators
• Use tools to automate
  - Tool 1: OmegaT as translation memory
  - Tool 2: Google Machine Translation -> Shorten the time in the Translation process, and focus on the Review process
  - Tool 3: HackMD (CodiMD) -> Simultaneous, real-time review (editing Markdown content)
  - Tool 4: Slack -> Simultaneous, real-time review (communication)
• Define metrics and measure (bigger is better)
  1. Speed: words / day
  2. Efficiency: 1 / [hours / (persons × words)], i.e. 1 / (time spent per word, per person)
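The two metrics can be sketched in Python roughly as follows. This is a minimal sketch, not the tooling used in the trials: it assumes that "hours" means the total recorded work hours, so that Efficiency simplifies to persons × words / hours. The function names `speed` and `efficiency` and the sample project are hypothetical.

```python
def speed(words: int, days: float) -> float:
    """Metric 1: words translated per calendar day (bigger is better)."""
    if days <= 0:
        raise ValueError("days must be positive")
    return words / days


def efficiency(words: int, persons: int, hours: float) -> float:
    """Metric 2: 1 / [hours / (persons * words)] (bigger is better).

    Assumption: `hours` is the total recorded work time in hours.
    """
    if hours <= 0:
        raise ValueError("hours must be positive")
    return persons * words / hours


# Hypothetical project: 1,000 words, 10 days, 2 people, 20 hours in total.
print(speed(1000, 10))
print(efficiency(1000, 2, 20))
```

Because both metrics depend only on word count, head count, days and hours, rough time records are enough to compute them.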

Slide 22

Slide 22 text

Approach
[Diagram: Prepare → Translate → Review → Publish]
We focus on two processes

Slide 23

Slide 23 text

Demo

Slide 24

Slide 24 text

Trials

Slide 25

Slide 25 text

Starting point: OpenChain Spec 1.0 Translation
• 2,200 words
• E-mail-based discussion
• Translation (by me):
  - Duration: 1st March 2017 ~ 9th April 2017
  - Work time: 50 h (including e-mail communication)
• Review: 3 reviewers (LF Kunai-san / LF Sato-san / Sony Fukuchi-san) ~ 26th April 2017
  - 10 h per person
https://github.com/OpenChain-Project/Specification-Translation-JP/blob/master/RELEASE/v1.0/openchainspec-1.0_jp.pdf
Metric 1: Speed: 55
Metric 2: Efficiency: 100

Slide 26

Slide 26 text

Trials: Basic Rules
• Measure and record the time spent on your activity
  -> This is an important input for the evaluation metrics
  -> Rough measurement is OK
• PERIODIC online cross-reviews in a short time (1-2 hours)
  -> Eliminate the time you spend thinking hard on your own
  -> Do not discuss for a long time. Let’s make reviews more casual
  -> Extract and manage a ToDo list
• Online review should be done via a chat tool
  -> Chat is more casual for many people. We need more, and more diverse, collaborators!
• Above all, make it fun!
  -> Enjoy the original content (the author’s ideas and thoughts)
  -> Enjoy the interaction (going off topic is also important sometimes)
  -> Enjoy the progress (we are getting closer to the goal!)

Slide 27

Slide 27 text

1st Trial … Alone, flying solo. (Try to measure the metrics)

Slide 28

Slide 28 text

1st Trial (by 1 person)
• Objective: measure and evaluate how much time a single person takes
• Target resource:
  - TODO Group “Building Leadership in an Open Source Community”
  - Approx. 3,600 words (24,000 characters)
• # of persons: 1 (Taniguchi)
• Started: 28th Oct, 2018

Slide 29

Slide 29 text

1st Trial (by 1 person): Work in progress
[Screenshot: edit pane (Markdown), where the machine translation output is pasted first, and view pane]

Slide 30

Slide 30 text

1st Trial (by 1 person): Result
• Output: 「オープンソース コミュニティのリーダーシップ」 https://hackmd.io/aibsz3_JTqStRbyTdVO7rA
• Duration: 32 days (28th Oct, 2018 - 26th Nov, 2018)
  - Translation: Google Machine Translation -> 30 min. (a little manual work)
  - Review -> 870 min. (14.5 h)
  - Review (correction by tool) -> 30 min.
  - Release (image linking) -> 30 min.
  - Sum: 960 min. (16 h)
Note: actual scores would be worse, because peer review is not included
Metric 1: Speed: 113 (up)
Metric 2: Efficiency: 225 (up)
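As a worked check of the figures on this slide, the sketch below sums the recorded minutes and recomputes both metrics. It assumes Speed = words / days and Efficiency = persons × words / hours, following the metric definitions on the Approach slide; the variable names are illustrative.

```python
# Recorded minutes per step in the 1st trial (figures from the slide).
minutes = {
    "machine translation": 30,
    "review": 870,
    "review (correction by tool)": 30,
    "release (image linking)": 30,
}
WORDS = 3600   # approximate length of the source text
DAYS = 32      # duration as reported on the slide
PERSONS = 1

total_minutes = sum(minutes.values())
total_hours = total_minutes / 60

speed = WORDS / DAYS                        # words per day
efficiency = PERSONS * WORDS / total_hours  # persons * words per hour
review_share = minutes["review"] / total_minutes

print(total_minutes, total_hours)  # 960 16.0
print(speed, efficiency)           # 112.5 225.0
print(f"{review_share:.0%} of the recorded time went into review")
```

The computed speed of 112.5 matches the reported 113 after rounding up, and the efficiency matches the reported 225. Nearly all of the recorded time is review, which is the step the later trials try to parallelize.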

Slide 31

Slide 31 text

1st Trial (by 1 person): Result https://hackmd.io/zgthoZZcTl-s3JXAg1pgkw?view

Slide 32

Slide 32 text

2nd Trial + … a little bit of collaboration (real-time review)
Community developer’s support

Slide 33

Slide 33 text

2nd Trial (2 persons)
• Objective: evaluate the effectiveness of collaboration between 2 people
• Target material:
  - LF “Certification Preparation Guide”
    (1) DL site with introduction (104 words)
    (2) PDF slides (21 slides, 4,212 words)
• # of persons: 2 (Mieko-san and Taniguchi)
  * In addition, Inou-san from NEC Solution Innovator joined and did a technical check as an engineer
• Started: 27th Dec 2018
[Images: DL site, PDF slide]

Slide 34

Slide 34 text

2nd Trial (2 persons): Work in progress
[Screenshots: simultaneous, real-time online review; chats in Slack]

Slide 35

Slide 35 text

2nd Trial (2 persons): Result
• Duration: 32 days (27th Dec, 2018 to 28th Jan, 2019), incl. the release process by InDesign DTP
  - Translation: Google Machine Translation -> 35 min. (mainly copy and paste)
  - Review (self-review) -> 1145 min. (19 h: 609 min. for Taniguchi, 540 min. for Mieko-san)
  - Review (online cross-review) -> 595 min. (5 sessions)
  - Release (DTP by InDesign) -> 420 min.
  - Sum: 2190 min. (36 h)
  - # of ToDo items in online review: approx. 20 (all closed)
Metric 1: Speed: 135 (up)
Metric 2: Efficiency: 299 (up)
* Note: actual scores would be (much) better than this, because this includes the “Publish” process

Slide 36

Slide 36 text

2nd Trial: Result (Outcome)
[Images: DL site introduction; PDF slides (Prep Guide)]

Slide 37

Slide 37 text

3rd Trial (Ongoing)
#kubernetes-docs-ja
+ … a little bit more collaboration (launched a HackMD site in the Tokyo region to improve real-time responsiveness)

Slide 38

Slide 38 text

Case studies
• General information, but essential
• Of business interest
• A good entry point for beginners
• But low priority for engineers
• We may contribute to the community (a little)

Slide 39

Slide 39 text

3rd Trial (3 persons, ongoing)
• Objective: evaluate the effectiveness of collaboration among 3+ people
• Target material: Kubernetes Case Studies (40+ case studies)
  - 1st case study: China Unicom
  - 1,016 words
• # of persons: 3
• Started: 15th June 2019

Slide 40

Slide 40 text

Result: 3rd Trial (for now)
• From 15th to 30th June 2019
• Translation: 5 minutes
• Review
  - Self: 290 min. (4.8 h) for 3 reviewers
  - Online: 210 min. (3.5 h, held twice)
• Publish
  - HTML format
Metric 1: Speed: 72 (down)
Metric 2: Efficiency: 367 (up)

Slide 41

Slide 41 text

Trials Round-up (for now)
[Chart: Efficiency plotted against Speed (words / day); points are (Speed, Efficiency)]
• Start: (55, 100), 4 persons
• 1st trial: (113, 225), 1 person *1
• 2nd trial: (135, 299), 2 persons (1 supporter) *2
• 3rd trial (ongoing): (75, 367), 3 persons *2
*1: Note: actual scores would be worse, because peer review is not included
*2: Note: actual scores would be better than this, because this includes the “Publish” process
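To read the round-up as improvement over the starting point, each (Speed, Efficiency) pair can be divided by the baseline. The sketch below does that with the values as listed on this slide. The dictionary layout and the `improvement` helper are illustrative, and the ratios inherit the caveats in notes *1 and *2, so they are rough indications rather than strict comparisons.

```python
# (speed, efficiency) pairs as listed on the round-up slide.
scores = {
    "Start": (55, 100),
    "1st trial": (113, 225),
    "2nd trial": (135, 299),
    "3rd trial": (75, 367),
}

base_speed, base_eff = scores["Start"]


def improvement(label: str) -> tuple[float, float]:
    """Return (speed ratio, efficiency ratio) relative to the starting point."""
    s, e = scores[label]
    return s / base_speed, e / base_eff


for label in scores:
    s_ratio, e_ratio = improvement(label)
    print(f"{label:10s} speed x{s_ratio:.2f}  efficiency x{e_ratio:.2f}")
```

On these numbers efficiency rises in every trial, while speed peaks in the 2nd trial and drops in the ongoing 3rd.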

Slide 42

Slide 42 text

Up next (currently in progress)
https://todogroup.org/guides/marketing-open-source-projects/
https://kubernetes.io/case-studies/appdirect/

Slide 43

Slide 43 text

Thoughts (so far)
• To some extent, translation can be made faster and more efficient in the Translation and Review processes
• And we see that each process can be broken down into “sub-processes”
• Above all, it’s becoming enjoyable
• Measuring metrics and evaluating them is meaningful
• But this is not solid yet. We need to keep looking for better approaches.

Slide 44

Slide 44 text

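Thoughts (so far): Tasks required within each process
[Diagram: Select → Translate → Review → Publish, with Management across all of them]
Select
• Decide what to translate
• Check the license
• Coordinate with the community
• Decide the rules
• Check the volume
• Recruit and involve members with the required knowledge
• Select and prepare tools
• Clarify the output and the scope of work
Translate
• Open an issue
• Sign the CLA
• Fork the repositories (community, translation memory)
• Run machine translation
• Create the bilingual (source/target) text
• Prepare the online review
• Measure work time
Review
• Share rules and terminology
• Online review
• Measure work time
• Issue/ToDo management
• Confirmation by the community
Publish
• Merge work (content, memory, dictionaries, terminology, and so on)
• Decide design rules
• Format conversion work
• Measure work time
• Contribute upstream
• Post to the site
• Credits
• Announcement
Management
• Progress tracking and facilitation
• Providing a venue
• Issue management
• Providing incentives
• Keeping track of people and skills
• Measuring feedback from readers
• Maintaining notation and style rules
• Quality checks
• Preparing the required data and tools
• Metric evaluation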

Slide 45

Slide 45 text

Thoughts (so far)
• Quality? Yes, Quality and Quantity are “trade-offs”
• But both may be improved (especially in the 2nd trial)

Slide 46

Slide 46 text

Summary

Slide 47

Slide 47 text

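Challenges in translation (a broader view)
• Process bottlenecks (sequential, time-consuming, cannot release, unproductive)
• Everyone struggles in the same way
• Incentives
[Quantity issues]
• Redundancy and fragmentation (local optimization, too scattered)
  - Tools, policies, processes, know-how, and resources are scattered
• Bottlenecks in each process
  - Sequential processes
  - CLA
  - File format dependence (.html, .idd, .docx, .pptx, .md, .txt, ODF…)
  - Communication…
• Securing translators and reviewers
[Quality issues]
• Reserve and sontaku (reading the atmosphere)
• People and skills (tool skills in particular are a considerable barrier)
• No feedback from readers in the first place (= no mechanism for trial and error)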

Slide 48

Slide 48 text

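Approach (findings from this activity)
We caught a glimpse of the potential for more efficiency, less burden, and more fun in translation
• Process efficiency: make full use of machine translation (reduces reserve and sontaku)
• Define metrics and make them visible
• Break processes down: what needs to be done becomes visible at the task level
• Speed up review progress: simultaneous online reviews at a fixed date and time (sequential ⇒ parallel)
• More fun (casual participation, chat)
• Incentives: contributions in the community (as the incentive)
• Raise quality by having engineers join translation reviews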

Slide 49

Slide 49 text

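Remaining challenges (findings from this activity)
The challenges surrounding translation are diverse, numerous, and complex. Trial and error is needed while taking a bird’s-eye view of the whole and clarifying the issues.
• Are tools the bottleneck after all? (OmegaT / Git / InDesign)
• Training people, improving skills, sharing know-how ⇒ Meetup
• Selection/Publish processes
• Overall management process, visualization
• Reduce fragmentation: unify and share resources (style conventions, translation memory, term definitions, know-how…)
• Simplify procedures such as the CLA (Community Hub)
• Copyright-related matters
• Quality of machine translation
• Further automation (automatic metric measurement)
• Further visualization (overall progress)
• Define roles (casual reviewers and core reviewers)
• What should the incentive for casual reviewers be?
• Validate market value (measure feedback from readers)
This is a PoC-like activity. Start by sharing information where we can, and proceed by trial and error, DevOps-style.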

Slide 50

Slide 50 text

Centralizing resources (starting where it is needed and where it is possible)
https://hackmd.io/@maabou512/S1DlLROBH

Slide 51

Slide 51 text

Quality of machine translation
Original:
“And this is why building and maintaining leadership in open source projects is key to corporate strategies and goals. However, it isn’t as easy as pounding desks and throwing around cash-based clout.”
Open Source Guide: Building Leadership in an Open Source Community
https://github.com/todogroup/guides/blob/master/building-leadership-in-an-open-source-community.md
Machine translation into Japanese:
“オープンソースプロジェクトでリーダーシップを築き維持することが企業の戦略と目標の鍵となるのはこのためです。ただし、デスクを叩いたり、現金を使用した効果的な効果を出したりするのと同じくらい簡単ではありません。”
Translated back into English:
That's why building and maintaining leadership with open source projects is the key to a company's strategy and goals. However, it's not as easy as hitting a desk or using cash for an effective effect
The translated Japanese is weird… (the grammar is OK, but…)

Slide 52

Slide 52 text

No content

Slide 53

Slide 53 text

Casual reviewers / Core reviewers

Slide 54

Slide 54 text

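Ambitions beyond that
• Our own machine translation models (personal, or shared within a group)
• Solidify the foundation, then move toward a more creative world (translator’s skill level ⇒ translator’s individuality)
• Connection and cooperation with the translation industry
• Toward more cross-industry activities
• Eventually, major works, other genres, even best-selling books?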

Slide 55

Slide 55 text

Thank you very much.
CollaboTrans.slack.com