How we replaced a 10-year-old Perl product using Scala

Rikito Taniguchi

June 29, 2019
Transcript

  1. How we replaced a 10-year-old Perl product using Scala

    Scala Matsuri 2019, 2019/06/29
  2. Legacy project?

    “A project that has become difficult to maintain or extend”
    • Long-running and poorly documented
    • Inflexible code
    • Outdated dependencies
    • ...
    ©Hatena Co., Ltd.
  3. Escape from being legacy

    • Refactoring
    • Re-architecting
    • Rewrite
  4. Escape from being legacy

    • Refactoring
    • Re-architecting
    • Rewrite
      ◦ This talk: why and how we replaced a 10-year-old Perl product (Hatena-Bookmark) using Scala
  5. About me

    • Rikito Taniguchi
    • @tanishiking / github: tanishiking
    • id:tanishiking24
    • Hatena (2017~)
      ◦ Hatena-Bookmark team; I have worked on the Hatena-Bookmark replacement since joining in 2017
  6. Agenda

    • What we did
    • Why did we decide on a full rewrite?
    • Why did we choose Scala?
    • The big rewrite / data migration
  7. Agenda (repeated): first, the overview (“What we did”)
  8. Hatena-Bookmark

    • Social bookmarking platform operated in Japan
    • Launched in 2005
  9. Hatena-Bookmark

    • Monolithic Perl application
      ◦ 400,000 lines of Perl code (excluding tests)
      ◦ 270,000 lines of tests
      ◦ About 70,000 lines of HTML templates
      ◦ (as of November 2016)
    • And many git submodules...
  10. Hatena-Bookmark was a “legacy” project

    • Inconsistent wording
    • Homegrown ORM and web framework
    • Models that no longer reflect reality
    • Fat models / fat controllers
    • Slow tests
    • Complicated release process
    • Development environment that is difficult to set up
    The bloated, aging source code and DB made the software hard to optimize or change.
  11. Rewrite Hatena-Bookmark using Scala

    In 2015 we decided to rewrite Hatena-Bookmark, and we adopted Scala.
    • Reduce maintenance costs
    • Optimize the application
    Source: Hatena Bookmark in Scala https://www.slideshare.net/oarat/2015-0801-scala (Scala Kansai Summit 2015)
  12. Team members (engineers)

    Development team for Hatena-Bookmark (web):
    • Server-side / web frontend engineers
      ◦ 2 to 4 members
      ◦ Develop the new system and take care of the original system
    • Infrastructure engineer
      ◦ 1 member
  13. Brand new database

    • Create a new database for the new application
      ◦ The existing database is not shared with the old application
      ◦ Requires data migration
    (Diagram: Original App with Original DB; New App with New DB)
  14. New software architecture (overview)

    Split into:
    • Backend For Frontend (Perl), the part that handles requests from users
    • Core App Server (Scala)
    • (and some microservices)
    (Diagram components: CDN, reverse proxy, BFF (Perl), Core App Server (Scala), microservices (Go/Python/Perl))
  15. 4 YEARS LATER

  16. Now, Hatena-Bookmark’s Core App Server is built on Scala!
  17. The original app server is no longer running!

    The old system has been stopped completely.
    (Figure: CPU usage on the original app hosts)
  18. Benefits of rewrite

    Improved performance: on the comment list page, the proportion of requests answered in under 250 ms rose from about 40% to about 90%.
  19. Benefits of rewrite

    • Adding and changing features became much easier
      ◦ We now release to production almost every day
    • Substantial savings in the computing resources needed to run the application
  20. Was rewrite the best option?

    • Were there options other than a rewrite for revitalizing the project?
    • Rewriting from scratch is not the only option:
      ◦ Refactoring
      ◦ Re-architecting
      ◦ Full rewrite
  21. Rewrite is basically undesirable...

    • Risk
      ◦ Usually takes months or even years
      ◦ Risk of regressions
    • Overhead
      ◦ We may have to freeze development on the original software while rewriting
  22. So why did we decide to rewrite, in spite of the risks?
  23. Agenda (repeated): next, “Why did we decide on a full rewrite?”
  24. Hatena-Bookmark was a “legacy” project

    • Homegrown ORM
    • Models that no longer reflect reality
    • Fat models / fat controllers
    • and more ...
  25. (Homegrown) ORM, web app framework

    • Designed around “convention over configuration”
    • They were useful for rapid development, but...
      ◦ They are no longer maintained
      ◦ People started to deviate from the “convention”
    • Tightly coupled with the system
      ◦ This hinders large-scale refactoring and optimization
  26. The difference between model and reality (example)

    In the real world, a single piece of content (entry) may have multiple URLs.
    (Diagram: http://example.com/, https://example.com/ and https://foo.bar/ are tied to one Entry through 301 redirect / canonical; Bookmarks attach to that Entry)
  27. The difference between model and reality (example)

    In the old system, each URL was modeled as its own separate entry.
    (Diagram: http://example.com/, https://example.com/ and https://foo.bar/ with separate Entry and Bookmark boxes, annotated “Same contents!”)
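The revised model described on the two slides above could be sketched roughly as follows. This is only an illustration under assumptions: `Entry`, `EntryId`, `aliasUrls` and `findEntry` are hypothetical names, not the schema Hatena actually uses.

```scala
object EntryModelSketch {
  // Hypothetical types for illustration; not the real Hatena-Bookmark schema.
  final case class EntryId(value: Long)

  // One entry owns a canonical URL plus any URLs that redirect to it
  // or declare it as canonical.
  final case class Entry(id: EntryId, canonicalUrl: String, aliasUrls: Set[String]) {
    def allUrls: Set[String] = aliasUrls + canonicalUrl
  }

  // A bookmark points at the entry, not at one particular URL.
  final case class Bookmark(userName: String, entryId: EntryId)

  // Resolve any known URL to the single entry it belongs to.
  def findEntry(entries: Seq[Entry], url: String): Option[Entry] =
    entries.find(_.allUrls.contains(url))

  val example: Entry = Entry(
    EntryId(1L),
    canonicalUrl = "https://example.com/",
    aliasUrls = Set("http://example.com/", "https://foo.bar/")
  )
}
```

With this shape, bookmarks made through any of the three URLs end up on the same entry, which is the behavior the old one-entry-per-URL model could not express.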
  28. Fat model / fat controller

    • Fat model
      ◦ A model that holds more logic than its own behavior
      ◦ $ wc -l lib/Hatena/Bookmark/MoCo/Entry.pm
        ▪ 4611 lib/Hatena/Bookmark/MoCo/Entry.pm
    • Fat controller
      ◦ Controllers sometimes contain logic that represents a model’s behavior
  29. and more ...

    There were various other problems as well:
    • Inconsistent wording
      ◦ “favorite” and “follow” mean the same thing
    • Tests that take too long
    • Overly complicated release process
    • Development environment that is difficult to set up
  30. So why did we decide to rewrite, in spite of the risks?

    There were two main reasons:
    • Fundamental changes
    • Past failures at refactoring
  31. Fundamental changes

    We knew that fundamental changes were necessary for the software to keep thriving:
    • Revise the DB schema / model
    • Remove the dependency on the homegrown ORM and framework
  32. Past failures at refactoring

    Several of our earlier attempts at large-scale refactoring had ended in failure:
    • We tried to replace the framework and gave up
    • We tried to refactor around the database architecture / connection and failed
  33. Why rewrite

    It was virtually impossible to keep the system thriving with refactoring alone, so we judged a full rewrite to be the best option.
  34. Agenda (repeated): next, “Why did we choose Scala?”
  35. Why Scala for the Core App Server?

    • Well suited to a complex problem domain, which it can express concisely
      ◦ Expressive type system
      ◦ Scalability
      ◦ Type safety
    • Concise syntax
    • Scala was already in use in other projects at Hatena
  36. New software architecture (overview, shown again)

    Split into:
    • Backend For Frontend (Perl)
    • Core App Server (Scala)
    • (and some microservices)
    (Diagram components: CDN, reverse proxy, BFF (Perl), Core App Server (Scala), microservices (Go/Python/Perl))
  37. Why Perl for the BFF?

    • Hatena has a lot of Perl developers, and Perl has a track record in-house
    • Rapid development
      ◦ Easy to use / learn
      ◦ No compilation step
    • Thin layer
  38. Learning curve for Scala

    Scala isn’t easy to learn. To lower the barrier to joining the project, we:
    • Prepared learning materials
    • Tried to avoid “difficult” libraries as much as possible
      ◦ Monocle / cats / scalaz ...
      ◦ They are quite useful, but they make it harder for non-Scala engineers to get on board
  39. Tech stack for Scala

    • Libraries
      ◦ Scalatra
      ◦ Slick (Plain SQL queries)
      ◦ circe
      ◦ Elastic4s
      ◦ etc.
    • Cake pattern
  40. Domain-Driven Design

    • To avoid the problems of the old system, we designed the architecture around Domain-Driven Design and applied it thoroughly.
    • Problems in the old system
      ◦ The gap between the models and the real world
      ◦ Fat models / fat controllers
      ◦ Inconsistent wording
  41. Ubiquitous language

    • A common, rigorous language shared by developers and everyone else involved in the project
    • Domain models are named after terms in the ubiquitous language
    We discussed and redefined the ubiquitous language, then shared it.
    ✅ Addresses: inconsistent wording
  42. Layered architecture

    We adopted a layered architecture to make the responsibilities of each layer clear.
    ✅ Separation of concerns
  43. Dependency inversion principle

    Limits the impact that changes in the infrastructure layer have on the other layers.
    ✅ Ease of database refactoring ...
  44. (Code: repository interface in the domain layer, implementation in the infrastructure layer)

    package domain.repository

    // Cake pattern
    trait BookmarkComponent {
      // Wrap the repository interface
      def bookmarkLoader: BookmarkLoader

      trait BookmarkLoader {
        // Domain repository has only the interface.
        def find(bookmarkId: BookmarkId): Option[BookmarkEntity]
      }
    }

    package infrastructure

    trait BookmarkComponent extends domain.repository.BookmarkComponent {
      // Concrete implementations here
      def bookmarkLoader: BookmarkLoader = BookmarkLoaderImpl
    }
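The slide code above leaves out its supporting types and the wiring. A self-contained sketch of the same idea might look like the following; the in-memory implementation, `BookmarkCommentService` and the `comment` field are assumptions added only so the example compiles and runs, and they are not part of the original system.

```scala
object CakePatternSketch {
  final case class BookmarkId(value: Long)
  final case class BookmarkEntity(id: BookmarkId, comment: String)

  // Domain layer: declares only the repository interface.
  trait BookmarkComponent {
    def bookmarkLoader: BookmarkLoader

    trait BookmarkLoader {
      def find(bookmarkId: BookmarkId): Option[BookmarkEntity]
    }
  }

  // Infrastructure layer: a hypothetical in-memory implementation
  // standing in for the real database access code.
  trait InMemoryBookmarkComponent extends BookmarkComponent {
    protected def rows: Map[BookmarkId, BookmarkEntity]

    lazy val bookmarkLoader: BookmarkLoader = new BookmarkLoader {
      def find(bookmarkId: BookmarkId): Option[BookmarkEntity] = rows.get(bookmarkId)
    }
  }

  // Application code depends only on the domain-level component (self-type),
  // so swapping the infrastructure does not touch this trait.
  trait BookmarkCommentService { self: BookmarkComponent =>
    def commentOf(id: BookmarkId): Option[String] =
      bookmarkLoader.find(id).map(_.comment)
  }

  // Wiring happens in one place by mixing the traits together.
  object App extends BookmarkCommentService with InMemoryBookmarkComponent {
    protected val rows: Map[BookmarkId, BookmarkEntity] =
      Map(BookmarkId(1L) -> BookmarkEntity(BookmarkId(1L), "nice article"))
  }
}
```

Because `BookmarkCommentService` only names `BookmarkComponent` in its self-type, the database-backed component can be replaced without changing the service, which is the dependency inversion the previous slide refers to.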
  45. Relations between entities

    • In the old system, the model had methods for retrieving and resolving its relationships with other models
      ◦ Fat model
    • In the new system, these are defined as extension methods in a domain service (domain relation)
  46. Extension method in domain service

    package domain.relation

    trait BookmarkLocationComponent {
      self: repository.LocationComponent =>

      implicit class BookmarkSeqLocationsRelation(
        bookmarks: Seq[BookmarkEntity]
      ) {
        // In the real system, the return value is something like
        // Bookmark with { def location: Location }
        def withLocations: Stream[(BookmarkEntity, Location)] = …
      }
    }
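To make the extension-method idea concrete, here is a minimal runnable sketch. The entity fields, `findLocations`, and the `Demo` wiring are hypothetical, and it returns a `Seq` rather than the `Stream` shown on the slide to stay independent of the Scala version.

```scala
object RelationSketch {
  final case class LocationId(value: Long)
  final case class Location(id: LocationId, url: String)
  final case class BookmarkEntity(user: String, locationId: LocationId)

  // Hypothetical repository component; the slide only names it.
  trait LocationComponent {
    def findLocations(ids: Seq[LocationId]): Map[LocationId, Location]
  }

  trait BookmarkLocationComponent { self: LocationComponent =>
    // The relation lives outside BookmarkEntity, so the entity stays thin.
    implicit class BookmarkSeqLocationsRelation(bookmarks: Seq[BookmarkEntity]) {
      // One batched lookup for the whole sequence; bookmarks whose
      // location is unknown are dropped from the result.
      def withLocations: Seq[(BookmarkEntity, Location)] = {
        val byId = findLocations(bookmarks.map(_.locationId).distinct)
        bookmarks.flatMap(b => byId.get(b.locationId).map(l => (b, l)))
      }
    }
  }

  object Demo extends BookmarkLocationComponent with LocationComponent {
    private val known: Map[LocationId, Location] =
      Map(LocationId(1L) -> Location(LocationId(1L), "https://example.com/"))

    def findLocations(ids: Seq[LocationId]): Map[LocationId, Location] =
      known.filter { case (id, _) => ids.contains(id) }

    val result: Seq[(BookmarkEntity, Location)] =
      Seq(
        BookmarkEntity("alice", LocationId(1L)),
        BookmarkEntity("bob", LocationId(2L))
      ).withLocations
  }
}
```

Defining the relation on `Seq[BookmarkEntity]` rather than on a single bookmark makes it natural to resolve all locations in one lookup instead of one query per bookmark.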
  47. Why Scala for the Core App Server?

    • Well suited to a complex problem domain, which it can express concisely
      ◦ Expressive type system
      ◦ Scalability
      ◦ Type safety
    • Concise syntax
    • Scala was already in use in other projects at Hatena
  48. Agenda (repeated): next, “The big rewrite / data migration”
  49. Project goal

    • Make the system maintainable and easy to change
    • Revise the models and DB schema
    • Optimize the system and save computing resources
  50. Project scope

    • Don’t add any big new features while rewriting
    • Continue to provide the main features
    • Retire some minor features
  51. THE BIG REWRITE

    • Rewrite everything at once? or
    • Rewrite incrementally?
  52. Incremental rewrite

    Instead of replacing everything at once, split the rewrite into a number of smaller phases:
    • Aug 2017: Replace comment list page
    • Nov 2017: Replace user page
    • Mar 2018: Replace top page
    • Mar 2018: Replace search feature
    • ...
  53. Incremental rewrite

    • Pros
      ◦ Each release phase makes the progress and business value visible
      ◦ Safer than a big-bang rewrite
    • Cons
      ◦ We have to run both the new and the original system until the rewrite is complete
  54. Thorough investigation of the old system

    LIST ALL THE FEATURES and LIST ALL THE RESOURCES EACH FEATURE DEPENDS ON (BY READING THE SOURCE CODE)
    • Choose which features to re-implement and which to drop
    • Prioritize based on dependencies and business impact
    • Group them into components
      ◦ Rewrite each group one by one
  55. Thorough investigation of the old system

    • Where are we in the project?
      ◦ The list helps clarify the progress
    • Running into unexpected features / dependencies in the middle of the rewrite...
      ◦ The only way to avoid this is to list all features and dependencies thoroughly before the rewrite
  56. Switch upstream on the reverse proxy

    The reverse proxy (nginx) routes each request to either the old or the new system.
    (Diagram: components such as Listing user comments, User page, Setting and Recommend sit behind nginx, which routes each of them to the old or the new system)
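The talk does this routing in nginx; the decision itself can be illustrated in Scala as below. This is purely a sketch of the logic: the path prefixes are invented for the example and do not come from the slides.

```scala
object UpstreamRoutingSketch {
  sealed trait Upstream
  case object OldSystem extends Upstream
  case object NewSystem extends Upstream

  // Path prefixes already served by the new system (hypothetical values).
  // Each rewrite phase would add one more prefix to this set.
  val migratedPrefixes: Set[String] = Set("/entry/", "/user/")

  // Everything that has not been migrated yet keeps going to the old system.
  def route(path: String): Upstream =
    if (migratedPrefixes.exists(prefix => path.startsWith(prefix))) NewSystem
    else OldSystem
}
```

Keeping the old system as the default means a feature only moves when its prefix is added, so each phase can be rolled back by removing one entry.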
  57. Split a component out as a microservice

    Some features can also be split out as microservices.
    (Diagram: the same components behind nginx as on the previous slide, with one of them split out as a microservice)
  58. Data migration

    Because we created a new database with a brand-new structure for the new application, all the data in the old database had to be migrated to the new one.
    (Diagram: Original App with Original DB; New App with New DB)
  59. Downtime for maintenance vs zero-downtime

    Downtime for maintenance
    • Stop the service for each data migration
    • The maintenance window might last several hours
      ◦ Large scale
      ◦ Complex ETL process
    Zero-downtime
    • No downtime
    • Requires real-time data replication
    • Replication delay
  60. Data migration with zero downtime

    We decided to migrate the data with zero downtime:
    • Given how many downtimes would have been required, stopping the service repeatedly wasn’t acceptable
    • Replication delay was not that critical
  61. Real-time and batch data migration

    Combining real-time and batch data migration achieves zero downtime:
    • 1. Start real-time data migration
      ◦ Replicate writes on the old system to the new system
    • 2. Batch data migration
      ◦ Copy all existing data into the new database
    • 3. Data verification
    • 4. Replace
  62. Finally, released all the replacements!!

    • Aug 2017: Replace comment list page
    • Nov 2017: Replace user page
    • Mar 2018: Replace top page
    • …
    • May 2019: All data migration and replacement work completed; the old system was stopped
  63. Review

    • Great improvements in non-functional requirements
      ▪ Faster response time
      ▪ Improved algorithms
    • Development cost exceeded the estimate; it took longer than planned
      ◦ It is hard to estimate the exact cost of a rewrite
      ◦ Rewriting big legacy software always takes years
    • We didn’t have any big rework
      ◦ Thanks to the thorough investigation
  64. Summary

    • Refactoring or rewrite?
      ▪ Consider carefully / refactoring first
      ▪ A rewrite is really powerful but tough
    • Solved the problems of the old system thanks to Scala!
      ◦ Thank you!!!
    • Consider an incremental rewrite for a big rewrite
      ◦ Clarifies the progress / safer / cost
    • Thorough research on the original system
      ◦ Prevents big rework / lists all tasks
  65. Questions?

  66. If we have time, I’m going to talk about data migration in more depth.
  67. Real-time data migration

    • Options
      ◦ Push from the application
      ◦ Push from the datastore
      ◦ Poll the old datastore periodically
  68. Real-time data migration (from app)

    The original app pushes every update made on the original system to the new app.
    (Diagram: the Original App writes to the Original DB and enqueues the update; the New App writes it to the New DB)
  69. Real-time data migration (from app)

    • Pros
      ◦ Easy to validate and transform data so that it fits the new DB structure
    • Cons
      ◦ Code must be added to the original app to send updates to the queue
      ◦ We need to know every source of updates in the old system (otherwise some updates will be lost)
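A rough sketch of the push-from-application flow is shown below. The event and row shapes, the in-memory queue and the validation rule are all assumptions chosen for illustration; a real setup would use an actual message queue and database.

```scala
import scala.collection.mutable

object PushFromAppSketch {
  // Hypothetical event the original app emits after each write.
  final case class OldBookmarkWritten(userName: String, url: String, comment: String)
  // Hypothetical row shape in the new DB.
  final case class NewBookmark(userName: String, canonicalUrl: String, comment: String)

  // In-memory stand-ins for the message queue and the new database.
  val queue: mutable.Queue[OldBookmarkWritten] = mutable.Queue.empty
  val newDb: mutable.Map[(String, String), NewBookmark] = mutable.Map.empty

  // Original app side: after writing to the original DB (omitted here),
  // enqueue the update. Every write path must call this, or updates are lost.
  def onWriteInOriginalApp(event: OldBookmarkWritten): Unit =
    queue.enqueue(event)

  // New app side: validate and transform into the new structure.
  def transform(event: OldBookmarkWritten): Option[NewBookmark] =
    if (event.url.startsWith("http://") || event.url.startsWith("https://"))
      Some(NewBookmark(event.userName, event.url, event.comment))
    else None // rejected by validation; a real system would log this

  // New app side: consume everything currently in the queue.
  def drain(): Unit =
    while (queue.nonEmpty) {
      transform(queue.dequeue()).foreach { b =>
        newDb.update((b.userName, b.canonicalUrl), b)
      }
    }
}
```

The transformation step runs in ordinary application code, which is the advantage listed under “Pros”; the need to call `onWriteInOriginalApp` from every write path is the drawback listed under “Cons”.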
  70. Real-time data migration (from DB)

    Define triggers on the old database that write the updates to the queue.
    (Diagram: the Original App writes to the Original DB; the Original DB enqueues the update; the New App writes it to the New DB)
  71. Real-time data migration (from DB)

    • Pros
      ◦ No work needed on the old application
      ◦ Comprehensive (no worry about missing writes to any table)
    • Cons
      ◦ Need to maintain complex triggers and UDFs that write the updates to the queue
      ◦ The migration logic is limited by what SQL can express
  72. Real-time data migration (poll)

    Fetch data from the original system periodically and migrate it to the new system.
    (Diagram: Original App, Original DB, New App and New DB, with a cron job polling the original side and writing to the new side)
  73. Real-time data migration (poll)

    • Pros
      ◦ No work needed on the original system
      ◦ The migration system can be built independently of the old system
    • Cons
      ◦ Replication is delayed, and the delay can be large
  74. Our choice

    We chose to push from the application, because:
    • The original and new DBs had to be synchronized with only a small delay
    • The data transformation process is complex
  75. Batch data migration

    While real-time data migration replicates new updates on the original system to the new system, batch data migration copies all the data that already exists in the original system.
  76. Tips for writing a batch data migration script

    • Write an idempotent script
      ◦ It is hard to migrate all the data to the new system in a single attempt
      ◦ We will need to re-run the migration to complete the job
      ◦ Idempotency makes the trial-and-error cycle easier
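One common way to get idempotency, sketched here under assumptions, is to write each row as an upsert keyed by its primary key. `Row` and the in-memory `target` map are hypothetical stand-ins for the real tables.

```scala
import scala.collection.mutable

object IdempotentBatchSketch {
  final case class Row(id: Long, value: String)

  // Upsert keyed by primary key: writing the same row twice leaves exactly
  // one copy, so re-running the whole script does not duplicate data.
  def migrate(source: Seq[Row], target: mutable.Map[Long, Row]): Unit =
    source.foreach(row => target.update(row.id, row))
}
```

Running `migrate` a second time over the same source leaves `target` unchanged, which is what makes a failed run safe to repeat. An append-only insert would not have this property.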
  77. Tips for writing a batch data migration script

    • Estimate the execution time of the batch script
      ◦ Try to estimate how long the script will take to run
      ◦ If it is too long, consider
        ▪ Running the script on a dedicated server
        ▪ Scaling up the original or new database server
        ▪ Optimizing the script’s performance
  78. Tips for writing a batch data migration script

    • Retry plan
      ◦ The script may stop in the middle of the migration because of an unexpected error
      ◦ Designing the script so that it can re-run from a specific point in the migration saves time
    (Diagram: already migrated | re-run from here | not yet migrated)
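A resumable script can be sketched as processing rows in ascending key order and remembering the last key that was migrated. The names `migrateFrom`, `lastMigratedId` and `batchSize` are assumptions for this example, and where the checkpoint is persisted is left out.

```scala
object ResumableBatchSketch {
  final case class Row(id: Long, value: String)

  // Migrates up to batchSize rows with id > lastMigratedId, in ascending id
  // order, and returns the new checkpoint. If `write` throws, no checkpoint
  // is returned, so the caller retries the same batch from its old checkpoint.
  def migrateFrom(source: Seq[Row], lastMigratedId: Long, batchSize: Int)(
      write: Row => Unit
  ): Long = {
    val batch = source.filter(_.id > lastMigratedId).sortBy(_.id).take(batchSize)
    batch.foreach(write)
    batch.lastOption.map(_.id).getOrElse(lastMigratedId)
  }
}
```

A caller would store the returned checkpoint after each batch and pass it back in on the next call or after a crash. Since a failed batch may be partially written, this works best together with idempotent writes.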
  79. Steps of data migration

    The order of real-time and batch data migration:
    1. Start real-time data migration
    2. Batch data migration
    3. Replace the application
    (Diagram: timeline in which real-time data migration starts first and the batch data migration runs afterwards)
  80. Steps of data migration - Otherwise...

    1. Start real-time data migration
    2. Batch data migration
    3. Replace the application
    If steps 1 and 2 are reversed, some data won’t be migrated.
    (Diagram: timeline in which the batch data migration runs before real-time data migration starts; data written in the period between them is lost)
  81. Data collision between real-time and batch

    For update-intensive data there is a risk of data collision (lost update anomaly) between batch and real-time migration. Suppose we are migrating data “X” from the original DB to the new DB.
    (Diagram: Original DB with X = 1, and the New DB)
  82. Data collision between real-time and batch

    First, the batch data migration script reads X from the original DB.
    (Diagram: Original DB X = 1; the batch data migration script holds X = 1)
  83. Data collision between real-time and batch

    Next, before the batch script writes to the new DB, X is updated to 2 on the original DB and real-time data migration synchronizes the update to the new DB.
    (Diagram: Original DB X = 2; New DB X = 2; the batch data migration script still holds X = 1)
  84. Data collision between real-time and batch

    Finally, the batch data migration script overwrites X in the new DB with X = 1, which leaves the two DBs inconsistent.
    (Diagram: Original DB X = 2; New DB X = 1. X in the new DB should equal X in the original DB; the update made on the original DB is lost)
  85. To avoid the lost update

    Compare the updated_at values before writing to the new DB, and keep the newer one.
    (Diagram: the batch data migration script holds X = 1, updated_at = 1970-01-01 12:00:00; the new DB already has X = 2, updated_at = 1970-01-01 12:00:01, so the script does not update it because the existing data is newer)
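The updated_at comparison can be sketched as a guarded write. `Versioned` and `writeIfNewer` are hypothetical names, and the in-memory map stands in for the new DB; against a real database the check and the write would have to happen atomically (for example in a single conditional update), which this sketch does not show.

```scala
import java.time.Instant
import scala.collection.mutable

object LostUpdateGuardSketch {
  final case class Versioned(value: Int, updatedAt: Instant)

  // Writes `incoming` only if the target has no row for `key` or holds an
  // older one. Returns true when the write was applied.
  def writeIfNewer(
      db: mutable.Map[String, Versioned],
      key: String,
      incoming: Versioned
  ): Boolean =
    db.get(key) match {
      case Some(existing) if !incoming.updatedAt.isAfter(existing.updatedAt) =>
        false // existing data is as new or newer: keep it
      case _ =>
        db.update(key, incoming)
        true
    }
}
```

In the scenario from the slides, the batch script arrives with X = 1 stamped 12:00:00 while the new DB already holds X = 2 stamped 12:00:01, so `writeIfNewer` returns false and X stays 2.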
  86. Should we always implement it?

    The lost update anomaly can occur for update-intensive data, but in most cases, especially for rarely updated data, the probability of a collision is negligible. It is then sufficient to validate the data and re-run the migration only if something went wrong.