leveraging large amounts of data are challenging to make trustworthy, reliable and scalable
• Data privacy and reliability are essential due to the sensitivity of the data
• Scalability is essential due to the size of the data or user base
• Reliability and scalability require constructing digital services as distributed systems
  • Tolerating faults (of compute nodes or the network)
  • Scaling compute and storage resources on demand
2023-10-19
distributed systems is challenging
• Fault tolerance requires sophisticated distributed algorithms
• Scalability requires complex concurrent programming
• Data consistency may increase latency and reduce availability
• Data-privacy legislation like GDPR is difficult to enforce
• Dynamic updates or enhancements challenging
architectures and programming frameworks have been proposed
• The latest developments include Serverless Computing
  • Applications decomposed into functions that are invoked in response to events (FaaS: Function-as-a-Service)
  • Key innovation: allocate compute resources automatically → scale up and down (even to zero) according to demand
  • No manual provisioning of virtual machines → “serverless”
• Partially addresses fault tolerance, scalability and dynamic updates
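The "scale up and down (even to zero)" idea can be illustrated with a small sizing rule. This is only a sketch under stated assumptions: the function name `desired_instances` and its parameters are hypothetical and do not correspond to any particular serverless platform's API.

```python
import math


def desired_instances(pending_events: int,
                      per_instance_capacity: int,
                      max_instances: int) -> int:
    """Hypothetical sizing rule: how many function instances to run.

    Assumes each instance can handle `per_instance_capacity` pending
    events and that the platform caps the fleet at `max_instances`.
    """
    if per_instance_capacity <= 0:
        raise ValueError("per_instance_capacity must be positive")
    if pending_events <= 0:
        return 0  # no demand: scale to zero
    needed = math.ceil(pending_events / per_instance_capacity)
    return min(max_instances, needed)
```

With no pending events the rule returns zero instances, which is the property that removes the need to provision virtual machines up front.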
responsibilities of applications:
• Challenge: enforcing strong data consistency on top of weak consistency provided by serverless framework
• Challenge: data privacy
  • Added challenge: enforcing data-privacy legislation automatically
• These are essential for trust and privacy
are to develop:
• A decentralized serverless system architecture
  • Enabling deployment of “workflows” on the edge and in the cloud
• A programming model to support a wide range of applications
  • Flexible inter-workflow communication via “portals”
  • Principled data processing via “atomic streams”
• A formalization and a proof of execution guarantees and correctness
• A prototype implementation enabling experimental evaluation and extension
(transactional) steps:
1. Consume an atom (“batch of events”) from the input stream
2. Process the entire atom
3. Produce side effects (new events, state updates)
[Figure: Workflow[T, U] with src, tasks and sink, between AtomicStream[T] and AtomicStream[U]]
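The three steps above can be sketched as an all-or-nothing loop over atoms. This is a minimal illustration, not the Portals API: the `Workflow` class, the `step` method, the `running_sum` task and the use of plain lists as streams are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Atom = List[int]  # assumption: an atom is a batch of integer events


@dataclass
class Workflow:
    """Hypothetical stand-in for a workflow; not the Portals API."""
    task: Callable[[int, Dict[str, int]], List[int]]
    state: Dict[str, int] = field(default_factory=dict)

    def step(self, input_stream: List[Atom], output_stream: List[Atom]) -> bool:
        """Run one step; return True only if the atom was committed."""
        if not input_stream:
            return False
        atom = input_stream[0]            # 1. consume: look at the next atom
        staged_state = dict(self.state)   # work on a private copy of the state
        staged_events: List[int] = []
        try:
            for event in atom:            # 2. process the entire atom
                staged_events.extend(self.task(event, staged_state))
        except Exception:
            return False                  # abort: no side effect becomes visible
        # 3. produce side effects together: new events and state updates
        input_stream.pop(0)
        self.state = staged_state
        output_stream.append(staged_events)
        return True


def running_sum(event: int, state: Dict[str, int]) -> List[int]:
    """Example task: emit the running sum; reject negative events."""
    if event < 0:
        raise ValueError("negative event")
    state["sum"] = state.get("sum", 0) + event
    return [state["sum"]]
```

In this sketch a failure in the middle of an atom leaves the input stream, the state and the output stream untouched, so the atom can be retried as a whole.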
a service via bidirectional communication
[Figure: a Requesting Dataflow and a Responding Dataflow (src, tasks, sink) exchange Requests and Replies through a Portal Service; labels include Access Operator]
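The request/reply exchange in the figure can be sketched with two queues and a table of pending continuations. This is a hedged illustration only: the `Portal` class and its `ask`, `serve` and `deliver` methods are hypothetical names chosen for this example and are not taken from the Portals framework.

```python
from collections import deque
from typing import Any, Callable, Deque, Dict, Tuple


class Portal:
    """Hypothetical portal: requests flow one way, replies flow back."""

    def __init__(self) -> None:
        self.requests: Deque[Tuple[int, Any]] = deque()
        self.replies: Deque[Tuple[int, Any]] = deque()
        self._continuations: Dict[int, Callable[[Any], None]] = {}
        self._next_id = 0

    def ask(self, payload: Any, on_reply: Callable[[Any], None]) -> int:
        """Requesting side: enqueue a request and remember its continuation."""
        request_id = self._next_id
        self._next_id += 1
        self._continuations[request_id] = on_reply
        self.requests.append((request_id, payload))
        return request_id

    def serve(self, handler: Callable[[Any], Any]) -> None:
        """Responding side: handle every pending request and enqueue replies."""
        while self.requests:
            request_id, payload = self.requests.popleft()
            self.replies.append((request_id, handler(payload)))

    def deliver(self) -> None:
        """Requesting side: route each reply to the matching continuation."""
        while self.replies:
            request_id, result = self.replies.popleft()
            self._continuations.pop(request_id)(result)


portal = Portal()
received = []
portal.ask("ping", received.append)   # requesting dataflow sends a request
portal.serve(lambda s: s.upper())     # responding dataflow computes replies
portal.deliver()                      # replies reach the requester
```

Matching replies to requests by identifier is one simple design choice; it lets the requesting side continue with other work between `ask` and `deliver`.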
under active development
• Open source, Apache 2.0 license
• Written in Scala 3, a high-level language combining functional and object-oriented programming
• Repository on GitHub: https://github.com/portals-project/portals
Portals-based applications in the web browser: portals-project.org/playground/
• Made possible by compiling the Portals framework to JavaScript using Scala.js, the Scala-to-JavaScript compiler
Extension of Dataflow Streaming for Stateful Serverless. ACM SIGPLAN Onward! 2022. https://doi.org/10.1145/3563835.3567664
• Spenger, Huang, Haller, Carbone. Portals: A Showcase of Multi-Dataflow Stateful Serverless. PVLDB 16(12): 4054-4057 (2023). https://doi.org/10.14778/3611540.3611619
• Spenger, Carbone, Haller. A Survey of Actor-Like Programming Models for Serverless Computing. Active Object Languages: Current Research Trends, Springer LNCS, to appear
• Project website with all papers and talks: https://www.portals-project.org/