Reliability Engineering for Enterprise Serverless

Masashi Terui

March 11, 2018

Transcript

  1. RELIABILITY ENGINEERING
    FOR ENTERPRISE SERVERLESS
    MASASHI TERUI @ JAWS DAYS 2018

  2. SERVERWORKS CO.,LTD.
    + FREELANCER
    • Serverless Oji-san
    • Serverless Framework Plugin Developer
    • Serverlessconf Tokyo 2016/2017 speaker
    • Remote worker (in Sapporo)
    • The best Cloud Engineer in Hokkaido!! (or so I aspire to be)
    MASASHI TERUI
    ARCHITECT / DEVELOPER

  3. AGENDA
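    1. SERVERLESS: What is it, again?
    2. SERVERLESS: What does reliability mean?
    3. SERVERLESS: Common issues
    4. SERVERLESS: Grasping what it really is
    5. SERVERLESS: It is not special
    6. RELIABILITY: How to think about it
    7. RELIABILITY: Design
    8. RELIABILITY: Implementation
    9. RELIABILITY: Monitoring
    10. SUMMARY: Expanding the world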

  4. WHAT IS SERVERLESS, AGAIN?

  5. CNCF SERVERLESS WHITEPAPER V1.0
    • Serverless computing refers to the concept of building and running applications that do not require server management
    • A platform may provide one or both of the following:
      • Functions-as-a-Service (FaaS)
      • Backend-as-a-Service (BaaS)
    • Products or platforms deliver the following benefits to developers:
      • Zero Server Ops
      • No Compute Cost When Idle

    https://github.com/cncf/wg-serverless/tree/master/whitepaper

  6. SERVERLESS IS NOT GLUE
    IN ENTERPRISE APPLICATIONS
    “THE ORCHESTRATOR MANAGES THE TRADES USING A GRAPH OF STATES”

  7. SERVERLESS CLOUD NATIVE LANDSCAPE

  8. BUT I PREFER THIS ONE
    https://www.slideshare.net/acloudguru/ant-stanley-being-serverless

  9. SERVERLESS USE CASES (FROM CNCF WP)
    • Asynchronous, concurrent, easy to parallelize into independent
    units of work
    • Infrequent or has sporadic demand, with large, unpredictable
    variance in scaling requirements
    • Stateless, ephemeral, without a major need for instantaneous
    cold start time
    • Highly dynamic in terms of changing business requirements that
    drive a need for accelerated developer velocity
    • Non-HTTP-centric and non-elastic scale workloads that weren’t
    good fits for an IaaS, PaaS, or CaaS solution (Event Driven
    workloads)

  10. “There are many workloads
    that are stateful and/or not easy to parallelize”
    Is this what you are thinking?
    “Asynchronous and Event Driven processing is
    too difficult for humans”

  11. WHAT DOES RELIABILITY MEAN FOR SERVERLESS?
    Reliability

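  12. RELIABILITY (RASIS)
    Broken down, reliability has several aspects:
    • Reliability
    • Availability
    • Serviceability
    • Integrity
    • Security
    In this talk it is defined as follows:

    “Reliability = reliability in the broad sense (RASIS)”
    [Figure: Reliability, Availability, Serviceability, Integrity and Security grouped under the broad term Reliability]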

  13. IS IT DIFFICULT TO GUARANTEE
    RELIABILITY WITH SERVERLESS?
    • Strong dependence on the FaaS platform and BaaS products
      • Risk of losing business continuity (Reliability, Availability)
    • Distributed instances
      • Loss of traceability (Serviceability)
    • Hard to develop
      • Everything becomes a function (Serviceability)
      • NoSQL fits better than an RDB (Integrity)

  14. COMMON ISSUES WITH SERVERLESS
    Issues

  15. COMMON ISSUES WITH FaaS
    • How to test the functions?
    • Granularity of the functions
    • Messaging between the functions and backends
    • Handling request and response (Error Handling)
    • Log Aggregation, Traceability
    • Monitoring

  16. COMMON ISSUES WITH BaaS
    • How to choose the services?
    • Fault Tolerance
    • Monitoring

  17. GRASPING WHAT SERVERLESS REALLY IS
    Mechanism
    HINT

  18. MECHANISM OF FAAS

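  19. SERVERLESS PROCESSING MODEL
    https://github.com/cncf/wg-serverless/tree/master/whitepaper#detail-view-serverless-processing-model
    Grasp the overall picture of processing:
    event-processing requests from event sources are
    handled in a distributed way by Functions running on many instances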

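  20. THE INTERNAL FLOW OF PROCESSING
    https://github.com/apache/incubator-openwhisk/blob/master/docs/about.md#the-internal-flow-of-processing
    Grasp the flow by which an event (HTTP) is processed
    The basic pattern is distributed processing with a stream or queue in between
    Even if it looks synchronous from the outside, the inside is asynchronous and distributed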

  21. FUNCTIONS INVOKED
    ASYNCHRONOUSLY
    IN CONTAINERS
    FaaS is… functions invoked asynchronously inside containers

  22. WHAT IS BAAS?

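  23. FROM OWNERSHIP
    TO USING SERVICES
    The cloud moved servers from something you own to something you use
    Middleware and libraries are now moving the same way
    Using BaaS is a natural continuation of that trend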

  24. FULLY MANAGED AND ABSTRACTED
    MIDDLEWARE AND LIBRARIES
    BaaS is… fully managed and abstracted middleware and libraries

  25. SERVERLESS IS NOT SPECIAL

  26. MADE OF ABSTRACTED
    FUNCTIONS AND MIDDLEWARE
    Serverless is… merely abstracted; it is built from functions and middleware

  27. WE CAN BUILD RELIABLE
    SERVERLESS APPLICATIONS
    SO… reliability can be built!

  28. HOW TO THINK ABOUT RELIABILITY
    Method of thinking

  29. HOW TO THINK ABOUT RELIABILITY
    • Build reliability yourself
      • Serverless will help you, but it will not protect your business
    • Think simple
      • Apply general development and operations practices
      • If a practice cannot be applied, pay attention to the serverless mechanism
    • Keep simple
      • Don't be afraid of increasing the number of functions
      • Be afraid of complicated architecture instead
    • Change your mindset: think of it all as software
      • Everything is part of your application

  30. RELIABILITY: DESIGN
    Architecting

  31. ALL EVENTS FLOW IN
    THE SAME DIRECTION
    Events (data) flow in the same direction
    If a result is needed, do not return it synchronously;
    have the client come and fetch it

  32. ALL EVENTS FLOW IN THE SAME DIRECTION
    • Processing naturally becomes asynchronous and functional
      • Asynchronous processing is retriable
      • Functional processing is reproducible
    • Clients fetch the results themselves
      • However, polling is not a good choice...
      • Pushing is a better choice
      • Can we be happy with AppSync? (pushing via WebSocket)
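
A minimal sketch of the one-directional flow described on this slide, assuming an in-memory queue and result store in place of real services. Every name here (`submit`, `work`, `fetch`, `transform`) is hypothetical and not from the talk. The client gets back only an ID, a worker applies a pure function so that a retry is safe, and the client fetches the result later.

```python
from collections import deque

queue = deque()   # stands in for a stream or queue service
results = {}      # stands in for a result store the client reads from


def submit(event_id, payload):
    """Accept the event and return immediately; no result flows back here."""
    queue.append({"id": event_id, "payload": payload, "attempts": 0})
    return event_id


def transform(payload):
    """Pure function: the same input always gives the same output."""
    return payload * 2


def work(max_attempts=3, fail_first=False):
    """Drain the queue; a failed event is put back and retried."""
    while queue:
        event = queue.popleft()
        event["attempts"] += 1
        try:
            if fail_first and event["attempts"] == 1:
                raise RuntimeError("simulated transient failure")
            results[event["id"]] = transform(event["payload"])
        except RuntimeError:
            if event["attempts"] < max_attempts:
                queue.append(event)  # safe to retry because transform is pure


def fetch(event_id):
    """The client comes to get the result; None means not ready yet."""
    return results.get(event_id)
```

Here `fetch` is the polling variant that the slide calls a poor choice. A push channel would replace it, but the submit side would stay the same.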

  33. UNIFY THE ENDPOINTS
    BETWEEN THE SERVICES
    Consolidate the endpoints between services
    Wrap and consolidate them to gain control

  34. UNIFY THE ENDPOINTS BETWEEN
    THE SERVICES
    • Microservices
      • Separate the services by domain (each BaaS is one of your services)
      • A service does not have a single endpoint; it has an endpoint for each operation
    • Wrap the endpoints to abstract them
      • Like “MySQL Server” and “libmysql”
      • Do you call “libmysql” directly?
    • You can build a failover/failsafe mechanism
      • Like a reverse proxy
      • Do you connect to multiple read replicas from each app server?
      • Traffic control, caching
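
One way to picture the wrapping idea is a single client object that owns the list of endpoints, in the way a reverse proxy would. The sketch below is an assumption-laden illustration, not the speaker's implementation: `ServiceClient`, `broken_primary` and `healthy_replica` are hypothetical names, and plain callables stand in for real service endpoints.

```python
class ServiceClient:
    """Single entry point wrapping several backend endpoints (hypothetical)."""

    def __init__(self, endpoints):
        # endpoints: list of callables taking a key, tried in order
        self.endpoints = endpoints
        self.cache = {}

    def get(self, key):
        if key in self.cache:            # caching lives in the wrapper
            return self.cache[key]
        last_error = None
        for endpoint in self.endpoints:
            try:
                value = endpoint(key)
            except ConnectionError as exc:
                last_error = exc         # failover: try the next endpoint
                continue
            self.cache[key] = value
            return value
        raise RuntimeError("all endpoints failed") from last_error


def broken_primary(key):
    raise ConnectionError("primary is down")


def healthy_replica(key):
    return f"value-for-{key}"
```

Callers only ever see `ServiceClient.get`, so failover order, caching and traffic control can change without touching the functions that use it.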

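  35. ALL SERIES OF EVENTS
    HAVE THE SAME ID
    Assign the same ID to every step in a series of processing
    Passing it along makes the series traceable by ID
    The ID can also be used for various kinds of control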

  36. ALL SERIES OF EVENTS HAVE
    THE SAME ID
    • Log aggregation
      • A series of events can be traced by the ID
      • Monitor the progress
      • Log all event messages
    • Execution control
      • At least once -> exactly once
      • e.g. DynamoDB Conditional Writes
    • Make it easy to implement with something like a decorator
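
The decorator idea could look roughly like the sketch below. It assumes a hypothetical in-memory table whose `put_if_absent` mimics a conditional write. With DynamoDB the same check would presumably be a `put_item` call carrying a `ConditionExpression` such as `attribute_not_exists(...)`. The names `exactly_once`, `InMemoryTable`, `trace_id` and `charge` are illustrative, not from the talk.

```python
import functools


class ConditionalCheckFailed(Exception):
    """Raised when the conditional write finds the ID already recorded."""


class InMemoryTable:
    """Hypothetical stand-in for a table that supports conditional writes."""

    def __init__(self):
        self.items = {}

    def put_if_absent(self, key):
        if key in self.items:
            raise ConditionalCheckFailed(key)
        self.items[key] = True


def exactly_once(table):
    """Skip the handler when this step has already run for the trace ID."""
    def decorator(handler):
        @functools.wraps(handler)
        def wrapper(event):
            key = (event["trace_id"], handler.__name__)
            try:
                table.put_if_absent(key)
            except ConditionalCheckFailed:
                return None  # duplicate delivery: do nothing
            # Every log line carries the same ID, so the series is traceable.
            print(f"trace_id={event['trace_id']} step={handler.__name__}")
            return handler(event)
        return wrapper
    return decorator


table = InMemoryTable()
charged = []


@exactly_once(table)
def charge(event):
    charged.append(event["amount"])
    return event["amount"]
```

The sketch records the step before running it, so a crash in the middle of the handler would leave the step marked as done. A real implementation would need to store a status or release the marker on failure.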

  37. DATA MODELING
    • Make friends with DynamoDB
      • Data is distributed by Partition Key and indexed (B-tree) by Sort Key and LSI
      • A GSI is a projection of sorted (indexed) data
    • Consistency can be guaranteed without ACID transactions
      • Denormalization
      • Strongly consistent reads, conditional writes
    • There are some difficult situations
      • Write asynchronously to an RDB
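
As an illustration of keeping an item consistent with a read followed by a conditional write, here is a sketch of optimistic locking on a version attribute. The `Table` class is a hypothetical in-memory stand-in and does not show the DynamoDB API; `add_stock` and the `version` attribute are assumptions made for this example.

```python
class VersionConflict(Exception):
    """Raised when the stored version differs from the expected one."""


class Table:
    """Hypothetical in-memory table; a write succeeds only if the version matches."""

    def __init__(self):
        self.items = {}

    def get(self, key):
        return dict(self.items[key])  # copy, like a strongly consistent read

    def put(self, key, item, expected_version):
        current = self.items.get(key)
        current_version = current["version"] if current else 0
        if current_version != expected_version:
            raise VersionConflict(key)
        self.items[key] = dict(item, version=expected_version + 1)


def add_stock(table, key, amount, retries=3):
    """Read, modify, conditionally write; re-read and retry on conflict."""
    for _ in range(retries):
        item = table.get(key)
        updated = dict(item, stock=item["stock"] + amount)
        try:
            table.put(key, updated, expected_version=item["version"])
            return updated["stock"]
        except VersionConflict:
            continue
    raise VersionConflict(key)
```

A writer holding a stale version is rejected instead of silently overwriting newer data, which covers single-item consistency without a transaction.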

  38. RELIABILITY: IMPLEMENTATION

  39. GRANULARITY OF THE
    FUNCTIONS
    • Testable!!
      • Unit testing is justice in the serverless world
    • Make dependencies on other services replaceable
      • They can then be replaced with mocks
      • Failover/failsafe becomes easy
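
A small sketch of what "replaceable dependencies" can mean in practice: the function receives its backend as an argument, so a unit test can pass a mock. It uses Python's standard `unittest.mock`. The names `make_handler` and `storage.save` are hypothetical.

```python
from unittest import mock


def make_handler(storage):
    """Build the function with its backend passed in, not hard-coded."""
    def handler(event):
        name = event["name"].strip().title()   # business logic under test
        storage.save(event["id"], name)
        return {"id": event["id"], "name": name}
    return handler


def test_handler_normalizes_name():
    storage = mock.Mock()                      # replaces the real service
    handler = make_handler(storage)
    result = handler({"id": "u1", "name": "  masashi terui "})
    assert result == {"id": "u1", "name": "Masashi Terui"}
    storage.save.assert_called_once_with("u1", "Masashi Terui")
```

The same seam that accepts a mock can accept a fallback implementation, which is how replaceability also helps with failover.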

  40. HOW TO TEST THE FUNCTIONS
    • Unit testing is justice in the serverless world (second time)
    • Deploy a new environment if mocks are not enough for integration testing
      • This is easy with frameworks (e.g. Serverless, SAM)
      • Services outside of AWS need to be easy to deploy (via API)
    • Continuous E2E testing with a traceable ID
      • It doubles as monitoring

  41. RELIABILITY: MONITORING

  42. RELIABILITY MONITORING
    • The best monitoring is notifications from the application itself
      • Be sure to catch errors and send notifications for them
    • Collect metrics from the services
      • CloudWatch
      • This is a criterion for choosing services outside of AWS
    • Continuous E2E testing with a traceable ID
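
"Catch the errors and notify them" can be sketched as a decorator around each handler. This is an illustration under assumptions: `notify_errors` is a hypothetical name, the `notify` callable stands in for whatever channel is used, and a plain list plays that role here.

```python
import functools
import traceback


def notify_errors(notify):
    """Wrap a handler so every uncaught error is reported, then re-raised."""
    def decorator(handler):
        @functools.wraps(handler)
        def wrapper(event):
            try:
                return handler(event)
            except Exception as exc:
                notify({
                    "function": handler.__name__,
                    "trace_id": event.get("trace_id"),
                    "error": repr(exc),
                    "traceback": traceback.format_exc(),
                })
                raise  # keep the failure visible to the platform's retry logic
        return wrapper
    return decorator


sent = []  # stands in for a notification channel


@notify_errors(sent.append)
def resize(event):
    return 100 // event["width"]
```

Including the trace ID in the notification ties the alert back to the logs of the whole series of events.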

  43. SUMMARY

  44. “SERVERLESS IS NOT SPECIAL”
    “BUILD RELIABILITY YOURSELF”
    “THINK SIMPLE, KEEP SIMPLE”
    “EVERYTHING IS PART OF YOUR APPLICATION”
    “LET'S EXPAND THE SERVERLESS WORLD”
    THANKS!!

  45. bit.ly/jd2018-sls
    #jawsdays #jd2018_a
    PLEASE TAKE A SURVEY
