Slide 1

Slide 1 text

The Trial and Error in Releasing GREE Chat
GREE's First Scala Product
Shun Ozaki, Takayuki Hasegawa
Scala Matsuri 2014, B-6
Copyright © GREE, Inc. All Rights Reserved.

Slide 2

Slide 2 text

Self Introduction
• Shun Ozaki
  • @wozaki
  • Joined in April 2013
  • Android Application
• Takayuki Hasegawa
  • @hase1031
  • Joined in April 2013
  • NLP, Machine Learning

Slide 3

Slide 3 text

GREE Chat
• Released on June 2

Slide 4

Slide 4 text

Agenda
How we built our system for hundreds of thousands of daily users

Slide 5

Slide 5 text

Goal
Share the knowledge we gained through the development of GREE Chat
• Why Scala?
• Team development
• Choice of frameworks
• Obstacles & workarounds

Slide 6

Slide 6 text

Outline
• Reasons for Selecting Scala
• Learning Scala in a Team
• Architecture of GREE Chat
• Obstacles
• Summary

Slide 7

Slide 7 text

Outline
• Reasons for Selecting Scala
• Learning Scala in a Team
• Architecture of GREE Chat
• Obstacles
• Summary

Slide 8

Slide 8 text

Decide the Language Based on the Requirements

Slide 9

Slide 9 text

Requirements
• Hundreds of thousands of daily users
• Real-time response
  → Connect with streaming
• Run on a small number of servers
  → Utilize server resources effectively
• Maintain for years

Slide 10

Slide 10 text

PHP
◎ GREE uses PHP heavily
  • Many libraries and much know-how in GREE
△ Streaming
  • #connections = #processes
△ Single thread, multi process
  • Overhead of spawning an OS process
△ Maintainability
  • Dynamic typing

Slide 11

Slide 11 text

Scala!
◎ Well suited to concurrent programming
◎ One process, many connections
◎ High maintainability
  • Static typing
  • Functional programming (no side effects)
◎ Encourages learning new technologies at GREE
  • Open to new languages & technologies!

Slide 12

Slide 12 text

Outline
• Reasons for Selecting Scala
• Learning Scala in a Team
• Architecture of GREE Chat
• Obstacles
• Summary

Slide 13

Slide 13 text

After adopting Scala …

Slide 14

Slide 14 text

Shortage of Scala experts
Only two Scala programmers in a team of seven
2 / 7

Slide 15

Slide 15 text

Shortage of Scala experts
Only two Scala programmers in a team of seven
2 / 7
We must build up our skills!

Slide 16

Slide 16 text

Learning Scala
1. Self study
2. Study club
3. Pair programming

Slide 17

Slide 17 text

1. Self Study
Understand the basic syntax

Slide 18

Slide 18 text

1. Self Study
Books

Slide 19

Slide 19 text

1. Self Study
Documents by Twitter
https://twitter.github.io/scala_school/
http://twitter.github.io/effectivescala/

Slide 20

Slide 20 text

1. Self Study
Source code from OSS

Slide 21

Slide 21 text

2. Study Club
Share what we learned by ourselves
Introduction to functional programming

Slide 22

Slide 22 text

2. Study Club
• Each engineer solves the problems provided by Scala experts
  • e.g., Binary Tree, Fibonacci Sequence

Slide 23

Slide 23 text

2. Study Club
Discussion with Members
• Example of a factorial algorithm
  • Use var
  • Use recursion

Slide 24

Slide 24 text

2. Study Club
Discussion with Members
• Example of a factorial algorithm
  • Use var
  • Use recursion
No side effects
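The code shown on these two slides did not survive in the transcript. As a minimal sketch of the two styles being compared (not the speakers' original code), factorial can be written with a mutable var or with recursion. The names Factorial, withVar, and withRecursion are chosen here for illustration only.

```scala
import scala.annotation.tailrec

object Factorial {
  // Imperative style: a mutable accumulator is reassigned inside a loop.
  def withVar(n: Int): BigInt = {
    var result = BigInt(1)
    for (i <- 1 to n) result *= i
    result
  }

  // Recursive style: nothing is reassigned; the accumulator is passed along.
  def withRecursion(n: Int): BigInt = {
    @tailrec
    def loop(i: Int, acc: BigInt): BigInt =
      if (i <= 1) acc else loop(i - 1, acc * i)
    loop(n, BigInt(1))
  }
}
```

Both versions return the same values. The recursive one has no reassignment, which is the "no side effects" point raised in the discussion.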

Slide 25

Slide 25 text

3. Pair Programming
Practice using the knowledge obtained from the study club

Slide 26

Slide 26 text

3. Pair Programming
• Effective learning
  • Learn about symbols (e.g., +, @), which cannot be searched easily on the Internet
• Supplement the knowledge not taught at the study club
  • Complex syntax (e.g., implicit, currying)
  • Development tools (e.g., sbt, IntelliJ)

Slide 27

Slide 27 text

Summary: Adopting Scala
• Reasons for selecting Scala
  • Well suited to concurrent programming
  • Maintainability because of static typing
• Learning Scala
  • Self Study, Study Club, Pair Programming
• Big burdens on Scala experts
  • Needed for teaching team members
  • Maintain the quality of code

Slide 28

Slide 28 text

Outline
• Reasons for Selecting Scala
• Learning Scala in a Team
• Architecture of GREE Chat
• Obstacles
• Summary

Slide 29

Slide 29 text

Architecture of Backend (simplified)
• Servers separated by Queues
  • API Server: Process users’ requests
  • EventBus Server: Process events in Queue
  • Stream Server: Supply events to connected users

Slide 30

Slide 30 text

Role of Each Server and Framework

Slide 31

Slide 31 text

API Server
Process users’ requests
• Enqueue events
  • e.g., Send message, Join/leave conversation
• Techniques to process a lot of requests
  • Delegate heavy tasks (e.g., Disk I/O) to others
  • Logic is written to run asynchronously
    • scala.concurrent.Future
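A minimal sketch of the asynchronous style described above, using only scala.concurrent.Future from the standard library. The SendMessage type, the in-memory queue, and the method names are assumptions made for illustration and are not GREE Chat's actual code.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ApiSketch {
  // Hypothetical event type and in-memory queue standing in for the real ones.
  final case class SendMessage(conversationId: Long, body: String)
  private val queue = new java.util.concurrent.ConcurrentLinkedQueue[SendMessage]()

  // The enqueue step runs on another thread; the caller gets a Future at once.
  def enqueue(event: SendMessage): Future[Boolean] = Future {
    queue.add(event)
  }

  // Asynchronous steps are composed with map instead of blocking on the result.
  def handleRequest(conversationId: Long, body: String): Future[String] =
    enqueue(SendMessage(conversationId, body)).map { accepted =>
      if (accepted) "accepted" else "rejected"
    }

  def queued: Int = queue.size
}
```

Blocking with Await.result(ApiSketch.handleRequest(1L, "hi"), 5.seconds) is reasonable in a test or at the edge of a program, but request-handling code would normally keep returning Futures.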

Slide 32

Slide 32 text

Finagle
• RPC system for the JVM based on Netty
• Helps us write asynchronous logic
• Used by large-scale web services
  • e.g., Twitter, Tumblr, Foursquare, Pinterest
• Equipped with various clients
  • Redis, Memcached
  • With retry policy, connection pools

Slide 33

Slide 33 text

EventBus Server
Process events in Queue
• Examples of tasks
  • Logging
  • Query GREE internal systems
  • Store events in DB
  • etc.
• Problems of concurrent processing
  • Deadlock, race condition, …

Slide 34

Slide 34 text

Akka
Framework to write concurrent, distributed logic more easily
• No need to handle shared resources
  • Race condition, deadlock
• Logic separation
  • Parent-Child
• Fault tolerance

Slide 35

Slide 35 text

Example of Akka

Slide 36

Slide 36 text

Example of Akka
Match by Event type

Slide 37

Slide 37 text

Example of Akka
Send copy to child

Slide 38

Slide 38 text

Example of Akka
Log
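The Akka code on these slides was an image and is not in the transcript. The sketch below is plain Scala, not the Akka API: it only mirrors the three callouts (match by event type, send a copy to a child, log) with ordinary synchronous method calls. It leaves out everything Akka adds, such as mailboxes, asynchronous message passing, and supervision. All type and method names are hypothetical.

```scala
import scala.collection.mutable.ListBuffer

object EventBusSketch {
  // Hypothetical event types; the real ones are not shown in the source.
  sealed trait Event
  final case class MessageSent(conversationId: Long, body: String) extends Event
  final case class Joined(conversationId: Long, userId: Long) extends Event

  // A "child" handles one kind of task, e.g. logging or storing events.
  final class Child(val name: String) {
    val received = ListBuffer.empty[Event]
    def handle(event: Event): Unit = { received += event }
  }

  final class Parent(logger: Child, store: Child) {
    // Match by event type, then hand the event to each interested child.
    def receive(event: Event): Unit = event match {
      case m: MessageSent =>
        logger.handle(m)       // log
        store.handle(m.copy()) // send a copy to the child that stores events
      case j: Joined =>
        logger.handle(j)       // log only
    }
  }
}
```

In an actor framework each child would process its messages one at a time in its own mailbox, which is what removes the need for locks around shared state.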

Slide 39

Slide 39 text

Stream Server
Supply events to connected users
• Connect by streaming
• Increase connections as much as possible
  • Keep users' data in memory to reduce I/O
  • Asynchronous I/O by Finagle, Akka
• Task division by Akka
  • Supply events to users
  • Update users' data
  • Send KeepAlive to users

Slide 40

Slide 40 text

Summary: Architecture
• Each server is separated by Queue
  • API, EventBus, Stream Server
• Framework is selected by considering asynchronous processing
  • Finagle, Akka
• Task is divided by Akka
  • Easy to write concurrent logic

Slide 41

Slide 41 text

Outline
• Reasons for Selecting Scala
• Learning Scala in a Team
• Architecture of GREE Chat
• Obstacles
• Summary

Slide 42

Slide 42 text

Obstacles
1. JVM
  • Full GC
2. Self-created Scala library
  • Sharding
  • ID Architecture

Slide 43

Slide 43 text

Full GC Problems

Slide 44

Slide 44 text

Full GC
• Scala runs on the JVM
• We want to prevent “stop the world” pauses
• GC in the young generation is faster than full GC
• Try to keep objects out of the old generation
[Figure: heap divided into old generation and young generation]

Slide 45

Slide 45 text

Causes of Uncollected Reference Problem
• Using methods that have side effects
• Reusing objects everywhere
• Forgetting to release used resources

Slide 46

Slide 46 text

Causes of Uncollected Reference Problem
• Using methods that have side effects
• Reusing objects everywhere
• Forgetting to release used resources
Difficult for developers to recognize

Slide 47

Slide 47 text

Short-Lived Object
Create new objects instead of updating old ones
• Use val (final variable)
• Don’t use mutable variables
• Don’t hold references to objects for too long
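A small sketch of "create new objects instead of updating old ones", assuming a hypothetical UserState value that is not taken from the GREE Chat code base.

```scala
object ShortLived {
  // Immutable value: every field is a val, so an instance is never modified.
  final case class UserState(userId: Long, unread: Int)

  // Returns a new object and leaves the old one untouched. Once the caller
  // drops its reference to the old instance, it is eligible for collection.
  def messageArrived(state: UserState): UserState =
    state.copy(unread = state.unread + 1)
}
```

Whether an object is actually reclaimed in the young generation depends on how long references to it are held and on the JVM's GC settings, so this style helps but does not guarantee it.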

Slide 48

Slide 48 text

Sharding: Scalable Storage

Slide 49

Slide 49 text

Scale up?

Slide 50

Slide 50 text

Scale out!

Slide 51

Slide 51 text

Horizontal Partitioning
• Sharding is necessary!
  • Considering capacity and #accesses
• We have Cascade, a sharding library
  • For PHP, not Scala
  • So, we created Aurora and made it OSS
• Aurora and the application were developed simultaneously
  • So it was difficult to integrate them during development
  • Code modifications had to be made in many places
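To illustrate what horizontal partitioning means in code, here is a minimal routing sketch that maps a key to one of several shards by modulo. It is not Aurora's API or algorithm, which the slides do not describe. Shard, Router, and shardFor are hypothetical names.

```scala
object ShardingSketch {
  // Hypothetical shard description; real connection details are omitted.
  final case class Shard(name: String)

  final class Router(shards: Vector[Shard]) {
    require(shards.nonEmpty, "at least one shard is required")

    // floorMod keeps the index non-negative even for negative keys.
    def shardFor(key: Long): Shard = {
      val index = java.lang.Math.floorMod(key, shards.size.toLong).toInt
      shards(index)
    }
  }
}
```

A plain modulo scheme like this remaps most keys when the number of shards changes, which is one reason production sharding libraries tend to use more elaborate strategies.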

Slide 52

Slide 52 text

ID Architecture: UUID and InnoDB
Ref: MySQL InnoDB Primary Key Choice: GUID/UUID vs Integer Insert Performance
http://kccoder.com/mysql/uuid-vs-int-insert-performance/

Slide 53

Slide 53 text

[Figure: insert time (#time) versus number of records (#record), UUID compared with AUTO_INCREMENT]
Ref: MySQL InnoDB Primary Key Choice: GUID/UUID vs Integer Insert Performance
http://kccoder.com/mysql/uuid-vs-int-insert-performance/

Slide 54

Slide 54 text

Drop UUID, Take Time-Based
• At first, we used UUID as the primary key
  • But this caused MySQL to insert at random positions
• Created a library to generate IDs instead
  • Referred to Snowflake by Twitter
  • Sequential insertion because IDs sort by creation time
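A sketch of how a time-based ID can be composed so that newer IDs sort after older ones. It follows the bit layout used by Twitter's Snowflake (41-bit millisecond timestamp, 10-bit worker ID, 12-bit sequence). The layout of GREE's own library is not given in the slides, so this is only an assumption about the general approach, and the custom epoch parameter is hypothetical.

```scala
object TimeBasedId {
  // Snowflake-style layout: timestamp in the high bits, then worker, then sequence.
  private val WorkerBits = 10
  private val SequenceBits = 12
  private val MaxWorker = (1L << WorkerBits) - 1
  private val MaxSequence = (1L << SequenceBits) - 1

  // Pure function so the layout is easy to test. epochMillis is a custom
  // epoch chosen by the application (an assumption of this sketch).
  def compose(timestampMillis: Long, epochMillis: Long,
              workerId: Long, sequence: Long): Long = {
    require(timestampMillis >= epochMillis, "timestamp before epoch")
    require(workerId >= 0 && workerId <= MaxWorker, "worker id out of range")
    require(sequence >= 0 && sequence <= MaxSequence, "sequence out of range")
    ((timestampMillis - epochMillis) << (WorkerBits + SequenceBits)) |
      (workerId << SequenceBits) |
      sequence
  }
}
```

Because the timestamp occupies the most significant bits, IDs created later compare greater, so rows are appended near the end of the InnoDB primary-key index instead of being scattered as with random UUIDs. A full generator would also track the per-millisecond sequence and handle clock rollback, which this sketch omits.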

Slide 55

Slide 55 text

Summary: Obstacles & Workaround
• JVM
  • Use short-lived objects to avoid Full GC
• Self-created Scala library
  • Sharding
    • Made Scala library OSS
    • Integrating the library into the application was difficult
  • ID Architecture
    • Adopt time-based ID for sequential insertion

Slide 56

Slide 56 text

Outline
• Reasons for Selecting Scala
• Learning Scala in a Team
• Architecture of GREE Chat
• Obstacles
• Summary

Slide 57

Slide 57 text

Summary
• Reasons for selecting Scala
  • Well suited to concurrent programming
• Plan Scala learning within the team
  • Learning cost is high
• Architecture & Frameworks
  • Each server is separated by Queue
  • Finagle, Akka
• Shortage of libraries
  • We had to build libraries ourselves
  • Needed to send pull requests to OSS repositories

Slide 58

Slide 58 text

Thanks!
@tomoyoshi_ogura, @kyo_ago, @takc923, @yoshie_777, @le_chang, @beketa, @j5ik2o
and the ScalaMatsuri staff