Reliability Engineering for Enterprise Serverless

Masashi Terui

March 11, 2018

Transcript

  1. MASASHI TERUI, ARCHITECT / DEVELOPER
    • SERVERWORKS CO.,LTD. + FREELANCER
    • Serverless Oji-san
    • Serverless Framework Plugin Developer
    • Serverlessconf Tokyo 2016/2017 speaker
    • Remote worker (in Sapporo)
    • The best Cloud Engineer in Hokkaido!! (or so I'd like to be)
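  2. AGENDA
    • What was serverless again?
    • What is reliability in serverless?
    • Common challenges of serverless
    • Grasping what serverless really is
    • Reliability: how to think about it
    • Reliability: design
    • Reliability: implementation
    • Reliability: monitoring
    • Summary
    • Expanding the world
    • Serverless is not special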
  3. CNCF SERVERLESS WHITEPAPER V1.0
    • Serverless computing refers to the concept of building and running applications that do not require server management
    • A platform may provide one or both of the following:
        • Functions-as-a-Service (FaaS)
        • Backend-as-a-Service (BaaS)
    • Products or platforms deliver the following benefits to developers:
        • Zero Server Ops
        • No Compute Cost When Idle
    https://github.com/cncf/wg-serverless/tree/master/whitepaper
  4. SERVERLESS USE CASES (FROM CNCF WP)
    • Asynchronous, concurrent, easy to parallelize into independent units of work
    • Infrequent or has sporadic demand, with large, unpredictable variance in scaling requirements
    • Stateless, ephemeral, without a major need for instantaneous cold start time
    • Highly dynamic in terms of changing business requirements that drive a need for accelerated developer velocity
    • Non-HTTP-centric and non-elastic scale workloads that weren’t good fits for an IaaS, PaaS, or CaaS solution (Event Driven workloads)
  5. Do you find yourself thinking this?
    “There are many workloads that are stateful and/or not easy to parallelize”
    “Asynchronous and Event Driven processing is too difficult for humans”
  6. RELIABILITY (RASIS)
    Broken down, reliability has several aspects:
    • Reliability
    • Availability
    • Serviceability
    • Integrity
    • Security
    Here it is defined as follows: “Reliability = reliability in the broad sense (RASIS)”
  7. IS IT DIFFICULT TO GUARANTEE RELIABILITY WITH SERVERLESS?
    • Strongly depends on the FaaS platform and BaaS products
    • Business continuity can be lost (Reliability, Availability)
    • Distributed instances
    • Traceability can be lost (Serviceability)
    • Hard to develop
    • Everything becomes a function (Serviceability)
    • NoSQL fits better than an RDB (Integrity)
  8. COMMON FaaS CHALLENGES
    • How to test the functions?
    • Granularity of the functions
    • Messaging between the functions and backends
    • Handling requests and responses (error handling)
    • Log aggregation, traceability
    • Monitoring
  9. HOW TO THINK ABOUT RELIABILITY
    • Build reliability yourself
    • Serverless will help you, but it will not protect your business
    • Think simple
    • Apply general development and operations practices
    • If a practice cannot be applied, pay attention to the serverless mechanism involved
    • Keep simple
    • Don't be afraid of increasing the number of functions
    • Be afraid of a complicated architecture instead
    • Change your mindset: think of it all as software
    • Everything is part of your application
  10. ALL EVENTS FLOW IN THE SAME DIRECTION
    • They will naturally be asynchronous and functional
    • Asynchronous processing is retriable
    • Functional processing is reproducible
    • Clients fetch the results themselves
    • However, polling is not a good choice...
    • Pushing is a better choice
    • Can we be happy with AppSync? (Pushing via WebSocket)
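A minimal sketch of why one-directional, functional event handling is easy to retry. The names `price_order`, `process_with_retry`, and `flaky_deliver` are hypothetical and not from the talk, and `ConnectionError` stands in for whatever a platform would treat as a transient failure. Because the handler is a pure function of the event, running it again after a failed delivery reproduces the same result.

```python
import copy


def price_order(event):
    """Pure step: the output depends only on the input event (reproducible)."""
    total = sum(item["price"] * item["qty"] for item in event["items"])
    return {"order_id": event["order_id"], "total": total}


def process_with_retry(handler, event, deliver, max_attempts=3):
    """Run a pure handler, push the result downstream, retry transient failures."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            # Each attempt works on its own copy, so retries share no mutable state.
            result = handler(copy.deepcopy(event))
            deliver(result)  # events only flow forward
            return attempt
        except ConnectionError as exc:  # treated as transient in this sketch
            last_error = exc
    raise RuntimeError(f"gave up after {max_attempts} attempts") from last_error


# Hypothetical downstream that fails once, then succeeds.
delivered = []
failures = iter([True, False])


def flaky_deliver(result):
    if next(failures):
        raise ConnectionError("transient downstream failure")
    delivered.append(result)


event = {"order_id": "o-1", "items": [{"price": 5, "qty": 2}, {"price": 3, "qty": 1}]}
attempts = process_with_retry(price_order, event, flaky_deliver)
```

Here the second attempt succeeds and delivers the same total the first attempt computed, which is what makes blind retries safe for this kind of step.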
  11. UNIFY THE ENDPOINTS BETWEEN THE SERVICES
    • Microservices
    • Separate the services by domain (each BaaS is one of your services)
    • A service does not have a single endpoint; it has an endpoint for each operation
    • Wrap the endpoints to abstract them
    • Like “MySQL Server” and “libmysql”
    • Do you call “libmysql” directly?
    • You can build a failover/failsafe mechanism
    • Like a reverse proxy
    • Do you connect to multiple read replicas from each app server?
    • Traffic control, caching
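The wrapping idea above can be sketched as a small client class. This is an illustration under assumptions, not the speaker's implementation: `ServiceClient`, `primary`, and `counting_replica` are hypothetical names, endpoints are modelled as plain callables, and `ConnectionError` represents an unavailable endpoint. Callers talk to one wrapper, which handles failover and caching the way a reverse proxy would.

```python
class ServiceClient:
    """Single entry point for one service; callers never pick an endpoint themselves."""

    def __init__(self, endpoints, cache=None):
        self._endpoints = list(endpoints)  # callables, tried in order
        self._cache = {} if cache is None else cache

    def get(self, key):
        if key in self._cache:
            return self._cache[key]  # served without touching any endpoint
        errors = []
        for endpoint in self._endpoints:
            try:
                value = endpoint(key)
            except ConnectionError as exc:
                errors.append(exc)  # fail over to the next endpoint
                continue
            self._cache[key] = value
            return value
        raise ConnectionError(f"all {len(self._endpoints)} endpoints failed: {errors}")


def primary(key):
    raise ConnectionError("primary is down")


calls = []


def counting_replica(key):
    calls.append(key)
    return f"value-for-{key}"


client = ServiceClient([primary, counting_replica])
first = client.get("user:1")   # primary fails, replica answers
second = client.get("user:1")  # answered from the cache
```

Because the failover order and the cache live in one place, they can be changed or replaced with a test double without touching the functions that call the service.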
  12. ALL EVENTS IN A SERIES SHARE THE SAME ID
    • Log aggregation
    • A series of events can be traced by the ID
    • Monitor the progress
    • Log all event messages
    • Execution control
    • At least once -> Exactly once
    • e.g. DynamoDB Conditional Writes
    • Make it easy to implement with something like a decorator
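One way to turn at-least-once delivery into effectively-once processing with a decorator is sketched below. `ConditionalStore` is a hypothetical in-memory stand-in for a table with put-if-absent semantics, and `exactly_once`, `charge`, and the `event_id` field are illustrative names. With DynamoDB, the same claim could be made by a `PutItem` call carrying a condition expression such as `attribute_not_exists(event_id)`, which fails when the ID is already recorded.

```python
import functools


class ConditionalStore:
    """In-memory stand-in for a table that supports put-if-absent."""

    def __init__(self):
        self._items = {}

    def put_if_absent(self, key, value):
        if key in self._items:
            return False  # condition failed: the key was already written
        self._items[key] = value
        return True


def exactly_once(store):
    """Skip the wrapped handler when the event's ID has already been claimed."""

    def decorator(handler):
        @functools.wraps(handler)
        def wrapper(event):
            claimed = store.put_if_absent(event["event_id"], "claimed")
            if not claimed:
                return None  # duplicate delivery: do nothing
            return handler(event)

        return wrapper

    return decorator


store = ConditionalStore()
charges = []


@exactly_once(store)
def charge(event):
    charges.append(event["amount"])
    return "charged"


event = {"event_id": "evt-42", "trace_id": "trace-7", "amount": 100}
results = [charge(event), charge(event)]  # the same event delivered twice
```

This sketch claims the ID before running the handler, so a crash in between would leave the event unprocessed; a fuller version would probably record a status alongside the claim and let stale claims be retried.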
  13. DATA MODELING
    • Become friends with DynamoDB
    • Distributed by Partition Key and indexed (B-tree) by Sort Key, LSI
    • A GSI is a projection of sorted (indexed) data
    • Consistency can be guaranteed without ACID transactions
    • Denormalization
    • Strongly consistent reads, conditional writes
    • There are some difficult situations
    • Write asynchronously to an RDB
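To make the partition key and sort key model concrete, here is a toy in-memory table. It is only a mental model, not how DynamoDB is implemented: `TinyTable` and the key formats are hypothetical, and a sorted list with `bisect` stands in for the B-tree index. Items are grouped by partition key, ordered by sort key within a partition, and denormalized so one range query returns everything needed.

```python
import bisect
from collections import defaultdict


class TinyTable:
    """Toy model: items grouped by partition key, kept ordered by sort key."""

    def __init__(self):
        # partition key -> list of (sort_key, item), always sorted by sort_key
        self._partitions = defaultdict(list)

    def put(self, pk, sk, item):
        rows = self._partitions[pk]
        keys = [k for k, _ in rows]
        i = bisect.bisect_left(keys, sk)
        if i < len(rows) and rows[i][0] == sk:
            rows[i] = (sk, item)  # same primary key: overwrite
        else:
            rows.insert(i, (sk, item))

    def query(self, pk, sk_from, sk_to):
        """Range query inside one partition, inclusive on both ends."""
        rows = self._partitions.get(pk, [])
        keys = [k for k, _ in rows]
        lo = bisect.bisect_left(keys, sk_from)
        hi = bisect.bisect_right(keys, sk_to)
        return [item for _, item in rows[lo:hi]]


table = TinyTable()
# Denormalized: the customer name is copied into every order item.
table.put("customer#1", "order#2018-03-01", {"total": 30, "customer_name": "Aiko"})
table.put("customer#1", "order#2018-01-15", {"total": 10, "customer_name": "Aiko"})
table.put("customer#2", "order#2018-02-01", {"total": 99, "customer_name": "Ben"})

recent = table.query("customer#1", "order#2018-02-01", "order#2018-12-31")
all_for_customer = table.query("customer#1", "order#", "order#~")
```

Queries never cross partitions in this model, which is why the access patterns have to be decided before the keys are.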
  14. GRANULARITY OF THE FUNCTIONS
    • Testable!!
    • Unit testing is justice in the serverless world
    • Make the dependencies on other services replaceable
    • So they can be replaced with mocks
    • Easy to fail over / fail safe
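A small sketch of a function whose dependency is replaceable, assuming a hypothetical `make_handler` factory and a `save_user` callable that would wrap a real storage service in production. In the unit test the storage call is swapped for a fake that just records what it was given.

```python
def make_handler(save_user):
    """Build a handler whose storage dependency is passed in, not hard-coded."""

    def handler(event):
        name = event.get("name", "").strip()
        if not name:
            return {"status": 400, "error": "name is required"}
        save_user({"name": name})
        return {"status": 201}

    return handler


# In a unit test, the real storage call is replaced by a fake that records calls.
saved = []
handler = make_handler(saved.append)

ok = handler({"name": " Terui "})
bad = handler({})
```

The same seam that accepts the fake could accept a fallback implementation, which is one reading of why replaceable dependencies also make failover easier.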
  15. HOW TO TEST THE FUNCTIONS
    • Unit testing is justice in the serverless world (second time)
    • Deploy a new environment if mocks are not enough for integration testing
    • This is easy with some frameworks (e.g. Serverless, SAM)
    • Services outside AWS need to be easy to deploy (via API)
    • Continuous E2E testing with a traceable ID
    • It becomes a form of monitoring
  16. RELIABILITY MONITORING
    • The best monitoring is notifications from the application itself
    • Be sure to catch errors and send notifications for them
    • Collect the metrics of the services
    • CloudWatch
    • This is a criterion for choosing services outside AWS
    • Continuous E2E testing with a traceable ID
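Catching errors and notifying from inside the application might look like the decorator below. The names `notify_on_error` and `parse_amount` and the `trace_id` field are hypothetical, and the notifier is just a callable; in a real deployment it could publish to whatever alerting channel is in use.

```python
import functools
import traceback


def notify_on_error(notify):
    """Report any exception from the handler, then let it propagate."""

    def decorator(handler):
        @functools.wraps(handler)
        def wrapper(event):
            try:
                return handler(event)
            except Exception as exc:
                notify({
                    "function": handler.__name__,
                    "trace_id": event.get("trace_id"),  # ties the alert to the event series
                    "error": repr(exc),
                    "stack": traceback.format_exc(),
                })
                raise  # re-raise so the platform still sees the failure

        return wrapper

    return decorator


alerts = []


@notify_on_error(alerts.append)
def parse_amount(event):
    return int(event["amount"])


try:
    parse_amount({"trace_id": "trace-7", "amount": "abc"})
except ValueError:
    pass  # the caller still receives the original exception
```

Including the shared ID in the notification lets an alert be matched against the aggregated logs for the same series of events.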
  17. THANKS!!
    “SERVERLESS IS NOT SPECIAL”
    “BUILD RELIABILITY YOURSELF”
    “THINK SIMPLE, KEEP SIMPLE”
    “EVERYTHING IS PART OF YOUR APPLICATION”
    “LET'S EXPAND THE SERVERLESS WORLD”