History of Falcon, the way to production release

The history of failures and successes on the way to releasing "Falcon", ChatWork's Scala-based product

かとじゅん

March 07, 2017

Transcript

  1. History of Falcon,
    the way to production release
    Junichi Kato (@j5ik2o)
    The history of failures and successes on the way to releasing "Falcon", ChatWork's Scala-based product

  2. Self Introduction
    ● Approximately 6 years of Scala experience.
    ● A backend software engineer developing the "business
    chat" service ChatWork.
    ● Responsible for architecting and developing ChatWork's
    backends.
    Self-introduction: I work as an engineer at ChatWork.

  3. Agenda
    ● The history of the "Falcon" project, which was released at
    the end of 2016
    ○ “ChatWork” chat service
    ○ History of Falcon Project
    ■ Phase-1: Live Migration Project
    ■ Rebooting Falcon
    ■ POC(= Proof Of Concept)
    ■ Phase-2: Production Development
    ■ DevOps
    ■ Finally Released
    ○ Conclusion
    Agenda.

  4. About ChatWork
    ● ChatWork is a business chat service that replaces email and consumer chat tools.
    ChatWork is a business chat service that replaces email and chat.
    ● Number of clients
    ○ 124,000 companies (as
    of the end of January 2017)
    ● Countries / regions
    ○ 205
    ● Best of Business Chat
    ● Support for iOS, Android, and Web
    ● ISO 27001 (ISMS) and
    ISO 27018 certified
    ● Functions
    ○ Group Messaging
    ○ Task Management
    ○ File Sharing
    ○ Video Conferencing

  5. Scale of User Generated Data
    The scale of data generated by ChatWork users
    Rapid increase of messages!
                    5th Anniversary    6th Anniversary
    Chat Rooms      2.4 million        4.2 million
    Messages        1 billion          1.8 billion
    Tasks           37 million         60 million
    Files           64 million         133 million

  6. Background of Developing ChatWork
    ● In 2010, ChatWork was developed as an internal product, built
    on a PHP framework.
    ● Development driven by business opportunities led to technical debt.
    ● The system could not keep up with the increasing data volume and load.
    Background of ChatWork's development

  7. Way to re-implementation
    ● The technical debt caused problems: delayed
    delivery, outages due to SPoFs, increasing
    workloads, etc.
    ● The technical debt was later partially addressed, but
    these were only stopgap countermeasures.
    ● Eventually, we decided to re-implement the system because it
    had become too difficult to extend any further.
    ● Of course, that is not easy.
    Re-implementing ChatWork

  8. We chose Scala
    ● Scala won in our training camp.
    ● The reasons:
    ○ High maintainability and performance
    ○ Success stories of migrating from dynamic
    to static typing
    ○ The AWS SDK for Java is the most complete.
    ○ Scala is a good fit for the real-time
    processing a chat service needs.
    ○ Even PHP engineers were able to start
    coding in Scala quickly.
    The decision to adopt Scala

  9. I joined ChatWork
    ● In July 2014, I joined ChatWork to work on
    the migration to Scala.
    ● Approximately 6 years of Scala experience.
    ○ A REST API server with Play2 for a VOD
    service
    ○ A chat server with Finagle and Akka
    ● After that, we started the server-side
    project adopting Scala at ChatWork.
    I joined the company at this point.

  10. Phase1: Live Migration Project
    P1: Live Migration Project

  11. Phase1: Strategies for Migrating Architecture
    ● Minimize the impact on the stable legacy system.
    ○ Modify existing code as little as possible.
    ○ Migrate to the new system without maintenance
    downtime.
    ○ Don't migrate existing data.
    ● Include rooms, messages, tasks, files, and contacts in the
    function scope.
    P1: Strategies for migrating the architecture

  12. Phase1: Our Project Team Structure
    ● Since July 2014
    ● Team Structure (19 members in total)
    ○ Falcon Team (New Server Side by Scala)
    ■ 8 members (I belong to this team.)
    ○ Phoenix Team (Legacy Service Side by PHP)
    ■ 5 members
    ○ iOS Team(New-Version iOS Application Team)
    ■ 6 members
    ● Note: These member counts are the final numbers; the teams grew as
    Scala engineers were hired.
    P1: Project team structure

  13. Phase1: Function Scope
    ● Chat Room (a collection of messages)
    ○ Creating a chat room or updating its metadata
    ○ Posting messages, updating them, deleting them
    ○ Adding members, removing them, modifying their roles
    ○ Uploading files, deleting them
    ○ Adding tasks, updating them, deleting them
    ● Contact (indicates connections between users)
    ○ Applying for contacts, rejecting them, approving them
    ● FalconID (a 64-bit ID generated by distributed id-workers; see the sketch below)
    ○ Generating 64-bit IDs with a distributed id-worker
    ○ Mapping old IDs to the new ones
    P1: Function scope
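    The deck does not show the id-worker itself; below is a minimal sketch of the
    Snowflake-style scheme such a distributed id-worker typically implements. The
    bit layout (41-bit timestamp, 10-bit worker id, 12-bit sequence) and the epoch
    are illustrative assumptions, not the actual Falcon implementation.

```scala
// A minimal sketch of a Snowflake-style 64-bit id-worker (illustrative bit layout):
// 41 bits of milliseconds since a custom epoch | 10-bit worker id | 12-bit sequence.
class IdWorker(workerId: Long, epoch: Long = 1420070400000L) {
  require(workerId >= 0 && workerId < 1024, "workerId must fit in 10 bits")

  private[this] var lastTimestamp = -1L
  private[this] var sequence      = 0L

  def nextId(): Long = synchronized {
    var now = System.currentTimeMillis()
    if (now == lastTimestamp) {
      sequence = (sequence + 1) & 0xFFF        // 12-bit per-millisecond sequence
      if (sequence == 0) {                     // sequence exhausted: spin until the next millisecond
        while (now <= lastTimestamp) now = System.currentTimeMillis()
      }
    } else {
      sequence = 0L
    }
    lastTimestamp = now
    ((now - epoch) << 22) | (workerId << 12) | sequence
  }
}
```

    Running one such worker per node, each with a unique workerId, yields roughly
    time-ordered, collision-free 64-bit IDs without coordination at generation time.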

  14. Phase1 : Architecture Overview
    ● The former "master data" remains persisted in
    the RDS of the legacy system.
    Access to the master data goes through the
    Phoenix API.
    ● Falcon receives IOEvents that occur in the
    legacy system. Triggered by an IOEvent,
    it constructs the events to be
    delivered via the stream and the model
    cache in DynamoDB.
    ● The new client uses Falcon's external API
    and stream API.
    ● The internal API performs ID generation and ID
    mapping.
    P1: Architecture overview

  15. Phase1: Context Map of DDD
    ● The downstream customer depends on
    the upstream supplier.
    ● At planning time, the downstream
    behaves as the customer to the upstream.
    At running time, the upstream behaves as
    the interface supplier.
    ● In practice, communication between our
    teams was very complicated. It was a
    difficult problem on top of the
    technical issues.
    P1: Context map of DDD
    (Context map roles: ChatWork Web as Supplier; Falcon as Customer/Supplier;
    Phoenix as Customer/Supplier; iOS Team as Customer)

  16. Phase1: Various Problems that Occurred
    ● Specification and implementation side
    ○ Missing specifications kept surfacing one after another.
    ○ DynamoDB I/O cost was too high due to overused secondary indexes.
    ○ The Phoenix API server was more overloaded than expected.
    ○ High ID-mapping cost.
    ○ We hit the performance limits of the managed services.
    ● Project side
    ○ The project definition was ambiguous.
    ○ The review of each sprint was not sufficient.
    ○ Integration testing between subsystems was delayed.
    ○ Exhaustion due to long-term development.
    ● Scala itself caused no major problems. The true problems were project
    management, function scope, and performance.
    P1: Various problems that occurred

  17. Phase1: Make a Tough Decision
    ● The project was repeatedly rescheduled throughout 2015.
    It was eventually suspended in January 2016...
    ● We reviewed why we had failed.
    ● There were many problems, but good results were also obtained.
    ○ The size and complexity of our challenge was reconfirmed
    concretely.
    ○ A strong team had been organized to solve complex issues.
    ○ Our practice of Akka and DDD had deepened. In particular,
    we wanted to apply Akka's capabilities more effectively to
    our applications.
    A tough decision

  18. Rebooting the Project
    ● We welcomed a new leader, and the project management and strategy were totally
    revised.
    ● New project strategy
    ○ Build a robust architecture as an infrastructure system.
    ○ Clarify the business and technical issues to be solved.
    ○ A POC is a must.
    ○ Clarify the final non-functional requirements.
    ■ Decrease infrastructure cost by 30%
    ■ 15 billion messages / month
    ■ 500k writes/s, 5,000k reads/s (100 times the legacy system)
    ○ Data migration with downtime was accepted instead of live migration,
    in order to cope with the rapidly increasing data volume.
    Rebooting the project

  19. POC: Objective of the "Proof of Concept"
    ● POC Bootcamp (January 2016)
    ○ Each member prototyped and reviewed their own "My Best Falcon Application".
    ● Properties the system should satisfy
    ○ Scalability (high throughput, low latency)
    ○ Resiliency (no SPoF, backoff recovery)
    ○ Twice the number of concurrent connections and R/W throughput
    ○ Low cost
    ○ Functionality (based on DDD)
    ● Requirements
    ○ AWS
    ○ CQRS + Event Sourcing
    ○ Reactive Systems
    POC: The objective of the POC

  20. POC: Verification for Risk Hedging
    ● Since February 2016
    ● The target scope is the messaging function, including chat rooms and members.
    ● CQRS+ES was adopted as the architecture because, given the characteristics
    of chat, read requests far outnumber write requests.
    ○ akka-http, akka-actor, akka-stream, akka-persistence(-query)
    ○ Our components are write-api, read-api, and read-model-updater.
    ○ The application's layered architecture is a hexagonal architecture.
    ● Infrastructure and middleware
    ○ AWS EC2, ELB
    ○ The deployment tool is Lightbend ConductR.
    ○ The write DB is Cassandra, the read DB is Aurora.
    ■ These DBs were selected as a temporary option because they are easy to
    handle with Akka. Other options were chosen for production.
    POC: Verification for risk hedging

  21. POC: Architecture Overview
    ● The Write API uses ClusterSharding and
    PersistentActor as the aggregate.
    ● The aggregate generates domain events from
    the received commands and appends them
    to the write DB (see the sketch below).
    ● The ReadModelUpdater consumes domain
    events and constructs read models
    asynchronously.
    ● The Read API is non-clustered and stateless; it
    returns flattened read models.
    ● The application has multiple hexagonal-architecture
    layers (Interface, UseCase, Domain, etc.),
    and each layer is composed with the
    akka-stream DSL.
    POC: Architecture overview
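    As a concrete illustration of the write side described above, here is a minimal
    sketch of an aggregate built on akka-persistence. The command and event names
    and the reply protocol are assumptions for illustration; in the POC such actors
    were distributed across the cluster with ClusterSharding.

```scala
import akka.actor.Props
import akka.persistence.PersistentActor

// Command and domain event for a message aggregate (illustrative names,
// not the actual Falcon protocol).
final case class PostMessage(roomId: Long, accountId: Long, body: String)
final case class MessagePosted(roomId: Long, accountId: Long, body: String)

// One PersistentActor per aggregate: a command is validated, turned into a
// domain event and appended to the write DB (the akka-persistence journal);
// the ReadModelUpdater consumes the journal asynchronously.
class MessageAggregate extends PersistentActor {
  override def persistenceId: String = s"message-${self.path.name}"

  private var messageCount = 0L

  override def receiveCommand: Receive = {
    case cmd: PostMessage =>
      val event = MessagePosted(cmd.roomId, cmd.accountId, cmd.body)
      persist(event) { e =>
        applyEvent(e)
        sender() ! e                       // acknowledge with the persisted event
      }
  }

  override def receiveRecover: Receive = {
    case e: MessagePosted => applyEvent(e) // rebuild state from the journal on recovery
  }

  private def applyEvent(e: MessagePosted): Unit = messageCount += 1
}

object MessageAggregate {
  def props: Props = Props(new MessageAggregate)
}
```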

  22. POC: Result of the POC (1/2)
    ● Instance types
    ○ c3.2xlarge (vCPU = 8, Mem = 15 GB)
    ○ Cassandra (m3.xlarge x 3)
    ○ Aurora (db.r3.2xlarge, writer x 1, replica x 2)
    ● Throughput (from write to read)
    ○ Random requests
    ○ About 5,000 concurrent users
    ○ Almost linear; scale-out is possible.
    ○ Zero KOs (failed requests).
    ● Posting messages
    ○ 3 nodes, 2,000 concurrent users, 2,000 rps (120k rpm);
    90th-percentile response time at most 30 ms!
    POC: Results (1/2)

  23. POC: Result of POC (2/2)
    POC: Results (2/2)
    ● Our akka-cluster setup had too many operational problems to reach production
    service level within a short period of time.
    ○ How do we solve the split-brain problem with only 2 AZs? It's impossible.
    ○ For our requirements, stateful actors were overkill and carried a high operational cost.
    ■ Stateful actors are not effective because old data is rarely retrieved.
    ■ Stateful actors require ClusterSharding.
    ○ Other methods with lower operational costs could also satisfy our
    requirements.
    ● Cassandra
    ○ Estimated 24 hours to re-create a failed node.
    ○ The data distribution via DHT and virtual nodes is not intuitive and is difficult to
    understand.
    ● Aurora
    ○ Write performance cannot scale well in a single-master configuration. Sharding could
    solve this but would require expensive development and operation.

  24. Phase2: Production Development
    P2: Development for production

  25. Phase2: Re-Architecture from POC
    P2: Re-architecture from the POC
    ● akka-cluster was not adopted, in order to reduce operational cost; the APIs
    use stateless actors instead.
    ● For the write DB, Kafka replaced Cassandra as the write storage (see the
    producer sketch below).
    ○ A straightforward append-only domain event store with excellent
    produce/consume throughput.
    ● For the read DB, HBase replaced Aurora as the read storage.
    ○ Auto-sharding by row key at the storage level; the master/slave
    configuration is intuitive and easy to understand.
    ○ The underlying HDFS is fault tolerant and easy to manage.
    ● We focused only on the messaging system.
    ○ It is the core function that many other features depend on (e.g. tasks, files).
    ○ It carries the highest business risk.
    ○ It is the largest business opportunity.
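    To make the "Kafka as append-only domain event storage" idea concrete, here is
    a minimal producer sketch. The topic name, the JSON payload, and keying by room
    id are illustrative assumptions, not the actual Falcon schema; keying by room id
    keeps the events of a single chat room ordered within one partition.

```scala
import java.util.Properties

import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
import org.apache.kafka.common.serialization.{LongSerializer, StringSerializer}

// Appends a domain event to a Kafka topic used as the append-only write storage.
object DomainEventProducer {
  private val props = new Properties()
  props.put("bootstrap.servers", "localhost:9092")
  props.put("acks", "all")                       // wait for full replication before acknowledging

  private val producer =
    new KafkaProducer[java.lang.Long, String](props, new LongSerializer, new StringSerializer)

  def publishMessagePosted(roomId: Long, accountId: Long, body: String): Unit = {
    val payload =
      s"""{"type":"MessagePosted","roomId":$roomId,"accountId":$accountId,"body":"$body"}"""
    // The room id is the partitioning key, so events of one room stay ordered.
    producer.send(new ProducerRecord("falcon-domain-events", Long.box(roomId), payload))
  }
}
```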

  26. Phase2: Our Project Structure
    ● Since July 2016
    ● Team structure (11 members in total)
    ○ Falcon Team
    ■ 4 members (I belong to this team)
    ○ Data Migration Team
    ■ 1 member
    ○ Sparrow Team (legacy service side, PHP)
    ■ 3 members
    ○ Infrastructure Team
    ■ 3 members
    ● Note: The above is the starting membership from the early stage of the project.
    P2: Project structure

  27. Phase2: Architecture Overview
    ● Concept
    ○ A backend service providing the
    messaging function to the legacy
    system.
    ○ The storage selection changed,
    but CQRS+ES was kept.
    ● Components
    ○ The ReadModelUpdater uses Kafka
    Streams (see the sketch below).
    ○ Sparrow is a mediator system
    bridging Falcon and the legacy
    system.
    ○ The SparrowForwarder propagates the
    domain events destined for the legacy
    system to Sparrow.
    P2: Architecture overview
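    A minimal sketch of a ReadModelUpdater on Kafka Streams, assuming the
    domain-event topic and serialization from the producer sketch earlier; the
    updateReadModel helper standing in for the HBase upsert is hypothetical.

```scala
import java.util.Properties

import org.apache.kafka.common.serialization.Serdes
import org.apache.kafka.streams.kstream.ForeachAction
import org.apache.kafka.streams.{KafkaStreams, StreamsBuilder, StreamsConfig}

// Consumes the domain-event topic and upserts flattened read models into the read DB.
object ReadModelUpdater {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "falcon-read-model-updater")
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")
    props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, classOf[Serdes.LongSerde])
    props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, classOf[Serdes.StringSerde])

    val builder = new StreamsBuilder()
    builder
      .stream[java.lang.Long, String]("falcon-domain-events")
      .foreach(new ForeachAction[java.lang.Long, String] {
        override def apply(roomId: java.lang.Long, eventJson: String): Unit =
          updateReadModel(roomId, eventJson)   // apply the event to the read model
      })

    new KafkaStreams(builder.build(), props).start()
  }

  // Hypothetical helper: deserialize the event and write the read model to HBase.
  private def updateReadModel(roomId: java.lang.Long, eventJson: String): Unit = ()
}
```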

  28. Phase2: Context Map of DDD
    ● A simpler DDD context map than in Phase 1.
    ● The inter-team communication structure became
    simpler as well.
    P2: Context map
    (Context map roles: Web, iOS, Android as existing Customers; Falcon as
    Supplier; ChatWork (containing Sparrow) as Customer/Supplier)

  29. Phase2: Results of Stress Tests
    ● System configuration
    ○ c3.xlarge (vCPU = 4, Mem = 7.5 GB) x 7
    ■ Write API x 2, Read API x 4,
    ■ ReadModelUpdater x 2, SparrowForwarder x 2
    ● Post Message API
    ○ 3,000 concurrent users, mean throughput 2.6K req/s (95th-percentile latency
    104 ms)
    ■ vs. max 70 req/s on the existing system (37x the throughput)
    ● Get Message API
    ○ 1,340 concurrent users, mean throughput 1.2K req/s (95th-percentile latency
    62.9 ms)
    ■ vs. max 1.3K req/s on the existing system
    P2: Stress test results

  30. Phase2: Data Migration (1/2)
    ● The data migration project aimed to migrate message data from Aurora to HBase.
    Minimizing service downtime was the most important mission.
    ● With this in mind, the migration strategy was decided as follows:
    ○ Basic migration
    ■ All data except the 4 days before the final maintenance.
    ○ Incremental migration
    ■ For INSERTs, the delta is determined by the ID increase since the previous migration.
    ■ For UPDATEs, the delta is determined from the binlog since the previous migration.
    ○ Verification after migration
    ■ Check that the column data on HBase matches the column data on
    Aurora.
    P2: Data migration (1/2)

  31. Phase2: Data Migration (2/2)
    ● Data migration engine:
    ○ Spark (see the sketch below)
    ● Performance
    ○ Execution of the basic migration
    ■ 3.5 hours (1.6 billion messages, 60 million chat rooms)
    ○ Verification of the basic migration
    ■ 7.5 hours
    ○ Execution of the incremental migration
    ■ 1 hour
    ○ Verification of the incremental migration
    ■ 1 hour
    P2: Data migration (2/2)
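    The deck does not include the Spark job itself; below is a minimal sketch of
    the basic-migration step under assumed table names, column names, and HBase
    row-key layout: reading the messages table from Aurora over JDBC and writing
    each row into HBase.

```scala
import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.spark.sql.SparkSession

// Reads the messages table from Aurora over JDBC and writes each row into HBase.
// A real migration at this scale would batch the puts or use HBase bulk load.
object BasicMigration {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("falcon-basic-migration").getOrCreate()

    val messages = spark.read
      .format("jdbc")
      .option("url", "jdbc:mysql://aurora-endpoint:3306/chatwork")
      .option("dbtable", "messages")
      .option("user", "migration")
      .option("password", "********")
      .option("partitionColumn", "message_id")   // parallelize the JDBC read
      .option("lowerBound", "1")
      .option("upperBound", "1800000000")
      .option("numPartitions", "256")
      .load()

    messages.rdd.foreachPartition { rows =>
      // One HBase connection per partition, created on the executor.
      val conn  = ConnectionFactory.createConnection(HBaseConfiguration.create())
      val table = conn.getTable(TableName.valueOf("message"))
      rows.foreach { row =>
        val put = new Put(Bytes.toBytes(row.getAs[Long]("message_id")))
        put.addColumn(Bytes.toBytes("m"), Bytes.toBytes("body"),
          Bytes.toBytes(row.getAs[String]("body")))
        table.put(put)
      }
      table.close()
      conn.close()
    }
    spark.stop()
  }
}
```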

  32. DevOps: Improving Development Efficiency
    ● Existing issues
    ○ It isn't easy for developers to flexibly construct infrastructure for application development, because
    collaboration with infrastructure personnel is required. That collaboration needs to become more
    efficient, and the design of deployment, provisioning, scaling, etc. needs to be flexible.
    ● Countermeasures
    ○ coreos/kube-aws was adopted.
    ■ kube-aws is a tool and a set of installation artifacts for Kubernetes on AWS, developed by CoreOS.
    ● Create, update and destroy Kubernetes clusters on AWS.
    ● Highly available and scalable Kubernetes clusters backed by multi-AZ deployment and
    Node Pools.
    ● Powered by various AWS services including CloudFormation, KMS, Auto Scaling, Spot
    Fleet, EC2, ELB, S3, etc.
    ○ concourse/concourse was adopted.
    ■ Concourse is a pipeline-based CI system written in Go, developed by Pivotal. It treats build
    pipelines and artifacts as first-class citizens.
    ■ In ThoughtWorks' Technology Radar of November 2015, Concourse CI is listed in the
    "Assess" category for tools.
    DevOps: Improving development efficiency

  33. DevOps : Falcon Infrastructure by kube-aws
    DevOps: Falcon infrastructure with kube-aws
    ● kubelet is the primary "node agent"
    that runs on each node.
    ● kube-proxy runs on each node.
    ● The API server validates and configures data
    for the API objects, which include
    pods, services,
    replication controllers, and others.
    ● A Pod is a group of one or more
    containers, the shared storage for
    those containers, and options about
    how to run the containers.
    ● Falcon applications are deployed
    as Pods via Helm (the package
    manager for Kubernetes).

  34. DevOps : Concourse CI (1/2)
    DevOps: Concourse CI (1/2)
    ● Core concepts
    ○ The end goal of Concourse is to
    provide an expressive system
    with as few distinct moving parts
    as possible.
    ● Resources
    ○ A resource is any entity that can
    be checked for new versions,
    pulled down at a specific
    version, and/or pushed up to
    idempotently create new
    versions.
    ● Jobs
    ○ At a high level, a job describes
    some actions to perform when
    dependent resources change
    (or when manually triggered).
    (Pipeline diagram: Git Resource, Build Job, Deploy Job)

  35. DevOps : Concourse CI (2/2)
    DevOps: Concourse CI (2/2)
    ● Tasks
    ○ A task is the execution
    of a script in an
    isolated environment
    with dependent
    resources available to
    it.
    (Diagram: Build Task, Notification Task)

  36. Finally Released
    ● The final release started at midnight on December 29th, 2016 and finished 7 hours
    later. It succeeded!
    ● We are grateful for the cheering messages from the Scala community. Thank you very much!
    ● Performance after the release
    ○ As expected, Falcon achieves high throughput, low latency, and resiliency.
    ○ Improvements toward the final goal will continue.
    Finally, the release

  37. Conclusion
    ● Falcon was released after many twists and turns.
    ● Success factors
    ○ Clarification of the project strategy
    ■ The technical means of achieving the project's goal were clarified.
    ○ Risk hedging with the POC
    ■ Verifying the potential of CQRS+ES with Akka
    ○ Re-architecture from the POC
    ■ Review with operational costs in mind
    ■ Limiting the function scope
    ○ Data migration accepting downtime
    ○ Improving development efficiency with k8s and Concourse CI
    ● As a result, we succeeded in adopting an excellent architecture (CQRS+ES, Akka,
    Kafka, HBase) based on that verification.
    Summary

  38. Thank you for listening!
    Thank you for listening.
