Realtime Database for high traffic production application

Sota Sugiura
September 01, 2018

2018/9/1
GDG DevFest Tokyo 2018

Transcript

  1. Realtime Database for high traffic production application @sota1235 2018/9/1 GDG

    DevFest Tokyo 2018
  2. None
  3. None
  4. For what?

  5. None
  6. Mercari channel • Live e-commerce • Working on Mercari app

    • You can buy/sell anything through streaming https://www.mercari.com/jp/mercari-channel/
  7. • Messages • Likes • Notifications • Item list

  8. Today • Why Realtime Database? • How to build app

    • How to run app on production
  9. Note • Today, Iʼll share some techniques • I believe

    itʼll help you to use Realtime Database
  10. Why Realtime Database?

  11. Tight schedule Easy to use Sync data in real time

    What we needed
  12. Points • Fully managed storage • Powerful SDK • Sync

    data in real time
  13. Fully managed • 1 JSON data in 1 database •

    Multiple databases per 1 Firebase Project • Make project and ship it
  14. Powerful SDK https://firebase.google.com/docs/guides/

  15. Initializing

    // Set the configuration for your app
    const config = {
      apiKey: "apiKey",
      authDomain: "projectId.firebaseapp.com",
      databaseURL: "https://databaseName.firebaseio.com",
      storageBucket: "bucket.appspot.com"
    };
    firebase.initializeApp(config);

    // Get a reference to the database service
    const database = firebase.database();
  16. const starCountRef = firebase.database().ref('posts/starCount');

    // Called once with the current value, then again on every change
    starCountRef.on('value', (snapshot) => {
      console.log(snapshot.val());
    });

  17. Sync data in real time Subscribe Publish

  18. How to build app

  19. Architecture design Schema design

  20. Architecture design Architecture design Schema design

  21. Read Write ・Using SDK to read data ・Techniques not to

    read all data ・Updating data with REST API ・Managing frequency of query
  22. Read data • Getting data through SDK • Thatʼs it

    • We need only 1 hack
  23. Comment Comment Comment Time

  24. Comment Comment Comment Time Start seeing streaming

  25. Comment Comment Comment Time Start seeing streaming SDK got this

    even though it’s unnecessary ☹
  26. Little hack • SDK fetches one existing item when connecting •

    But sometimes it's unnecessary • So we filter data by timestamp
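The timestamp filter on slide 26 could look roughly like the sketch below. It assumes each message carries a numeric `timestamp` field in milliseconds and that the handler is attached with the SDK's `on('child_added', ...)`. The field name, the `createFreshOnlyHandler` helper, and the path are hypothetical and not taken from the Mercari code.

```javascript
// Hypothetical helper: wraps a message callback so that anything written
// before the viewer joined the stream is dropped.
function createFreshOnlyHandler(joinedAt, onMessage) {
  return (snapshot) => {
    const message = snapshot.val();
    // Assumption: the server stores a millisecond `timestamp` on each message.
    if (!message || typeof message.timestamp !== 'number') return;
    if (message.timestamp < joinedAt) return; // stale item delivered on connect
    onMessage(message);
  };
}

// Wiring with the SDK (not executed here; the path is illustrative):
//   const handler = createFreshOnlyHandler(Date.now(), render);
//   firebase.database().ref('lives/1/messages').limitToLast(1)
//     .on('child_added', handler);
```

Client and server clocks can differ, so a real implementation may prefer to compare against a server-provided time rather than `Date.now()`.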
  27. Write data • Updating data through REST API • Managing

    frequency of query
  28. None
  29. None
  30. REST API • Two ways to write data • REST

    API or WebSocket(w/SDK) • In our case, REST API matched
  31. Authentication • Database token • Deprecated • Google OAuth2 •

    Recommended
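A REST write authenticated with a Google OAuth2 access token could be assembled as in this sketch. The `.json` suffix and the `Authorization: Bearer` header follow the Realtime Database REST API. The `buildWriteRequest` helper, the database name, and the path are made up for illustration.

```javascript
// Hypothetical helper: builds the URL and fetch-style options for a REST write.
// 'PUT' replaces the value at the path, 'PATCH' updates children,
// 'POST' appends under a generated key.
function buildWriteRequest(databaseUrl, path, data, accessToken, method = 'PUT') {
  const cleanBase = databaseUrl.replace(/\/+$/, '');
  const cleanPath = path.replace(/^\/+|\/+$/g, '');
  return {
    url: `${cleanBase}/${cleanPath}.json`,
    options: {
      method,
      headers: {
        Authorization: `Bearer ${accessToken}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify(data),
    },
  };
}

// Usage (not executed here):
//   const { url, options } = buildWriteRequest(dbUrl, 'lives/1/like_count', 4, token);
//   await fetch(url, options);
```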
  32. Cache auth token • Cache the auth token in a cache store

    • It keeps working for requests sent to the REST API • Otherwise OAuth2 authentication on every request becomes overhead
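One way to picture the token cache from slide 32 is the small sketch below. It is an in-memory stand-in written for illustration only. The talk does not say which cache store or token lifetime is used, so the `TokenCache` class and its safety margin are assumptions.

```javascript
// Hypothetical in-memory cache; a production setup would more likely use a
// shared cache store so that every server process reuses the same token.
class TokenCache {
  constructor(safetyMarginMs = 60000) {
    this.safetyMarginMs = safetyMarginMs; // refresh slightly before expiry
    this.token = null;
    this.expiresAt = 0;
  }

  // Store a token together with its lifetime in seconds.
  set(token, expiresInSec, now = Date.now()) {
    this.token = token;
    this.expiresAt = now + expiresInSec * 1000 - this.safetyMarginMs;
  }

  // Return the cached token, or null when it is missing or about to expire.
  get(now = Date.now()) {
    return this.token !== null && now < this.expiresAt ? this.token : null;
  }
}
```

A caller would try `get()` first and only run the OAuth2 exchange, followed by `set()`, when it returns `null`.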
  33. keep-alive with chocon • We use chocon to get keep-alive connections

    • A PHP process can't use keep-alive by itself • Realtime Database instances are in the US • 1 request means a round trip across the Pacific Ocean • https://github.com/kazeburo/chocon
  34. Keep alive with chocon (diagram: keep-alive connection)

  35. Frequency managing • Not sending all data from user

  36. Frequency managing • We donʼt need to send these kinds

    of data every time • Likes, Audience count, Notifications • So we manage frequency of sending such data
  37. Mercari API Like count=1 count=1

  38. Mercari API Like count=1 count=1 Like Like Like count=2 count=3

    count=4
  39. Mercari API Like count=1 count=1 Like Like Like count=2 count=3

    count=4
  40. Mercari API Like count=1 count=1 Like Like Like count=2 count=3

    count=4 Check frequency before sending data N query/sec
  41. Mercari API Like count=1 count=1 Like Like Like count=2 count=3

    count=4 count=4 N sec
  42. Motivation • We don't need to update the UI 1000 times

    • The most important point is the user experience • Also, throughput is limited per instance • 1000 updates per sec
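The like-count flow on slides 37 to 41 (count every like, but send only the latest total every N seconds) can be sketched as below. This is a minimal illustration under assumptions: the `LikeCountThrottler` class, its `send` callback, and the injected clock are hypothetical, and the real service manages this on its API servers in a way the slides do not detail.

```javascript
// Hypothetical throttler: counts every like, but forwards the latest total to
// the database at most once per `intervalMs`.
class LikeCountThrottler {
  constructor(intervalMs, send) {
    this.intervalMs = intervalMs;
    this.send = send;            // e.g. a function that issues the REST write
    this.count = 0;
    this.lastSentAt = -Infinity; // nothing sent yet
    this.dirty = false;          // true when the database is behind `count`
  }

  // Called for every like received by the API server.
  like(now = Date.now()) {
    this.count += 1;
    this.dirty = true;
    this.flush(now);
  }

  // Sends the latest count only if the interval has elapsed. A timer should
  // also call this so the final value is not left unsent.
  flush(now = Date.now()) {
    if (!this.dirty || now - this.lastSentAt < this.intervalMs) return false;
    this.send(this.count);
    this.lastSentAt = now;
    this.dirty = false;
    return true;
  }
}
```

With a one-second interval, four likes arriving within 300 ms produce two writes (count=1, then count=4) instead of four.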
  43. Architecture design Schema design Schema design

  44. Realtime Database is schema-less Then, how to design?

  45. Rule • Settings for 1 database • Written in JSON

    • Validation • Permission/Authorization
  46. Sample rule

    {
      "rules": {
        ".write": false,
        "lives": {
          "$live_id": {
            "messages": {
              "$message_id": {
                ".validate": "newData.hasChildren(['user', 'text'])"
              }
            },
            "notifications": {
              "buy_item": {
                ".validate": "newData.hasChildren(['text'])"
              }
            }
          }
        }
      }
    }
  47. Our strategy • Only writing data from our server •

    Controlling read permission by using flag data on database
  48. Only writing from server • Pros • No need to

    worry about write permission for clients • Cons • Server needs to handle high traffic
  49. { "rules": { ".write": false, "lives": { "$live_id": { "messages":

    { "$message_id": { ".validate": "newData.hasChildren(['user', 'text'])" } }, "notifications": { "buy_item": { ".validate": "newData.hasChildren(['text'])" } } } } } } Read only Rule
  50. { "rules": { ".write": false, "lives": { "$live_id": { "messages":

    { "$message_id": { ".validate": "newData.hasChildren(['user', 'text'])" } }, "notifications": { "buy_item": { ".validate": "newData.hasChildren(['text'])" } } } } } } Read only Rule
  51. { "rules": { ".write": false, "lives": { "$live_id": { "messages":

    { "$message_id": { ".validate": "newData.hasChildren(['user', 'text'])" } }, "notifications": { "buy_item": { ".validate": "newData.hasChildren(['text'])" } } } } } } Read only Rule Tips: Admin user can ignore all rules
  52. Controlling permission dynamically • Rule can refer data on schema

    • It means we can manage permission by data
  53. { "rules": { ".write": false, "lives": { "$live_id": { "messages":

    { "$message_id": { ".validate": "newData.hasChildren(['user', 'image', 'text'])" } }, "notifications": { "buy_item": { ".validate": "newData.hasChildren(['text'])" } } } }, "alive_lives": { ".write": false, ".read": false } } }
  54. { "rules": { ".write": false, "lives": { "$live_id": { "messages":

    { "$message_id": { ".validate": "newData.hasChildren(['user', 'image', 'text'])" } }, "notifications": { "buy_item": { ".validate": "newData.hasChildren(['text'])" } } } }, "alive_lives": { ".write": false, ".read": false } } } Data for client Data for managing permission
  55. {
      "rules": {
        ".write": false,
        "lives": {
          "$live_id": {
            ".read": "root.child('alive_lives').child($live_id).val() === true",
            "messages": {
              "$message_id": {
                ".validate": "newData.hasChildren(['user', 'image', 'text'])"
              }
            },
            "notifications": {
              "buy_item": {
                ".validate": "newData.hasChildren(['text'])"
              }
            }
          }
        },
        "alive_lives": {
          ".write": false,
          ".read": false
        }
      }
    }
  56. { "rules": { ".write": false, "lives": { "$live_id": { ".read":

    "root.child('alive_lives').child($live_id).val() === true", "messages": { "$message_id": { ".validate": "newData.hasChildren(['user', 'image', 'text'])" } }, "notifications": { "buy_item": { ".validate": "newData.hasChildren(['text'])" } } } }, "alive_lives": { ".write": false, ".read": false } } }
  57. "root.child('alive_lives').child($live_id).val() === true", Refers to data in the database

  58. "root.child('alive_lives').child($live_id).val() === true", Permission condition

  59. {
      "lives": { "1": { ... }, "2": { ... } },
      "alive_lives": { "1": true, "2": false }
    }

    alive_lives['1'] is true, so it can be read
  60. {
      "lives": { "1": { ... }, "2": { ... } },
      "alive_lives": { "1": true, "2": false }
    }

    alive_lives['2'] is false, so it can't be read
  61. Easier to control permission • Updating the rules every time would

    be overhead • So adding data to manage permission is a good technique
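To make the flag-based read rule concrete, the sketch below models its condition locally on a plain object. It is not the rules engine itself, only an illustration of the logic `root.child('alive_lives').child($live_id).val() === true`. The helper names `canReadLive` and `setLiveAlive` are hypothetical.

```javascript
// Local model of the `.read` rule:
//   root.child('alive_lives').child($live_id).val() === true
// `db` is a plain object standing in for the database root.
function canReadLive(db, liveId) {
  const flags = db.alive_lives || {};
  return flags[liveId] === true;
}

// Hypothetical server-side toggle. With the Admin SDK this would be a single
// write such as: admin.database().ref(`alive_lives/${liveId}`).set(isAlive)
function setLiveAlive(db, liveId, isAlive) {
  return { ...db, alive_lives: { ...(db.alive_lives || {}), [liveId]: isAlive } };
}
```

Opening or closing a stream for readers is then a data write to `alive_lives`, with no rule deployment involved.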
  62. How to run on production

  63. Scalability Monitoring Troubleshooting What we needed

  64. Scalability

  65. Scalability • Mercari is a high traffic service • Mercari channel

    also has to be designed for high traffic
  66. KPI of Mercari channel • Over 1500 streamers/day • Over

    30K viewers/day • 5-6M yen sales per 1 stream ※not strict numbers
  67. Can Realtime Database scale automatically?

  68. No, we need to scale it ourselves

  69. Again, limitation of database • There is limitation for each

    database • 1000 query/sec • 1M connection
  70. Vertical sharding

  71. subscribe Some data

  72. subscribe Some data

  73. For scalability • We use 15+ database instances in

    production • Clients switch DB based on data from the Mercari API • For now, Cloud Firestore can be a good choice for this problem
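The slides do not say how a stream is mapped to one of the database instances, only that the Mercari API tells the client which one to use. One possible server-side assignment is sketched below under that caveat: the stable-hash scheme, the `assignDatabaseUrl` helper, and the instance URLs are all assumptions.

```javascript
// Hypothetical shard assignment: a stable hash of the live ID picks one of
// the database instances, so the same stream always maps to the same one.
function assignDatabaseUrl(liveId, databaseUrls) {
  if (databaseUrls.length === 0) throw new Error('no database instances');
  let hash = 0;
  for (const ch of String(liveId)) {
    hash = (hash * 31 + ch.codePointAt(0)) >>> 0; // keep an unsigned 32-bit int
  }
  return databaseUrls[hash % databaseUrls.length];
}

// Client side (not executed here): the API response would carry the URL, e.g.
//   const app = firebase.initializeApp({ databaseURL }, `live-${liveId}`);
//   const database = app.database();
```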
  74. Monitoring

  75. Monitoring • In house monitoring tool • Stackdriver

  76. In house • Application logging on Kibana • Queue status

    on Mackerel
  77. Kibana

  78. Mackerel

  79. Stackdriver • Monitoring tool provided by Google • Some metrics

    of Realtime Database • Aggregating all instances data to 1 dashboard
  80. Metrics • Connection count • Database load • Network usage

    • etc… https://cloud.google.com/monitoring/api/metrics_gcp#gcp-firebasedatabase
  81. For our app • Connection count • Database load •

    REST API hit
  82. Console

  83. Troubleshooting

  84. Only one truth

  85. All services can be in trouble

  86. What we need • Expect cloud services to fail sometimes

    • Think about what we can do about it
  87. Problem ・Some incident of cloud service ・System alert

    Notice ・Notice the problem

    Operation ・Investigation ・Fix bug ・Put service into maintenance mode
  88. Problem ・Some incident of cloud service ・System alert

    Notice ・Notice the problem

    Operation ・Investigation ・Fix bug ・Put service into maintenance mode
  89. Problems reported officially • There is an official status dashboard •

    RSS/JSON feeds are supported
  90. Problem from system • Customized Stackdriver • Mackerel

  91. Problem ・Some incident of cloud service ・System alert

    Notice ・Notice the problem

    Operation ・Investigation ・Fix bug ・Put service into maintenance mode
  92. Alert channel • All information will be passed to Slack

    channel • Channel includes all team members
  93. Notification from system • Stackdriver • Mackerel

  94. Notification from official feed

  95. When recovered

  96. Small technique • Official alerts include both incident and recovery

    information • Some staff members can't read English • So we customized the feed bot
  97. Customized message Switch emoji

  98. Customized message Only mention for incident

  99. Customized message Auto translation
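The three feed-bot customizations (a different emoji per state, a mention only for incidents, and automatic translation) might combine as in the sketch below. The entry fields `resolved` and `title`, the chosen emoji, and the injected `translate` function are assumptions. The actual bot and the official feed format are not shown in the slides.

```javascript
// Hypothetical formatter for a status-feed entry posted to Slack.
// `entry.resolved` and `entry.title` are assumed field names, and
// `translate` is an injected function (identity by default).
function formatStatusMessage(entry, translate = (text) => text) {
  const emoji = entry.resolved ? ':white_check_mark:' : ':rotating_light:';
  const mention = entry.resolved ? '' : '<!channel> '; // only mention for incidents
  return `${mention}${emoji} ${translate(entry.title)}`;
}
```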

  100. Problem ・Some incident of cloud service ・System alert

    Notice ・Notice the problem

    Operation ・Investigation ・Fix bug ・Put service into maintenance mode
  101. Maintenance mode • System flag to manage maintenance mode •

    Any team member can edit it • If an incident affects customers too much, we switch it on
  102. Appendix

  103. Cloud Firestore vs Realtime Database • You SHOULD check this

    article • https://firebase.googleblog.com/2017/10/cloud-firestore-for-rtdb-developers.html
  104. For more detail https://speakerdeck.com/sota1235/realtime-messaging-with-firebase-number-phpcon2017

  105. Thank you