design IDL better, not to do networking better!
• Support only the client-server model
  – Backend.AI requires bi-directional request-reply
• Difficult to extend networking layers
  – Supporting asyncio
  – Multiple asynchronous request-reply pairs
• Often missing large data streaming (mostly small messages)
• Requirements of additional features
  – Connection bundling & tunneling proxies for cluster federation
PyCon KR 2018
cerebral hemispheres
§ If it gets disconnected...
• Split-brain!
• Alien hand syndrome
• Sometimes intentionally cut to mitigate epileptic seizures
Image credit: Henry Vandyke Carter - Henry Gray (1918) Anatomy of the Human Body
• Focus on networking and leave IDL for existing RPC libs!
• Make lower networking & upper IDL layers replaceable!
• Simplify more than aiozmq.rpc
• Support encryption & authentication natively
§ Development
• Let's keep the MVP (minimum viable product) in mind
• Start from a simple and working example!
[Figure: layer diagram between User Apps. Upper layer (Thrift, protobuf, JSON): IDL supports (e.g., type checks, serialization). Middle layer: Transport-layer extensions (e.g., async scheduling, streaming, etc.). Lower layer (ZeroMQ, HTTP): Abstraction of raw TCP sockets as message-based communication.]
        '''
        Return the identity of the server.
        Only used by the binder.
        '''
        raise NotImplementedError

    @abc.abstractmethod
    async def check_client(self, client_id: Identity) -> AuthResult:
        '''
        Check if the given domain and client public key is a valid one or not.
        Only used by the binder.
        '''
        raise NotImplementedError

    @abc.abstractmethod
    async def server_public_key(self) -> bytes:
        '''
        Return the public key of the server.
        Only used by the connector.
        '''
        raise NotImplementedError

    @abc.abstractmethod
    async def client_identity(self) -> Identity:
        '''
        Return the identity of the client.
        Only used by the connector.
        '''
        raise NotImplementedError

    @abc.abstractmethod
    async def client_public_key(self) -> bytes:
        '''
        Return the public key of the client.
        Only used by the connector.
        '''
        raise NotImplementedError

Server-side Interface: the methods used by the binder
Client-side Interface: the methods used by the connector
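To make the interface above concrete, here is a minimal sketch of one possible implementation backed by an in-memory allowlist. The names `AbstractAuthenticator`, `StaticAuthenticator`, and `server_identity`, as well as the fields of `Identity` and `AuthResult`, are assumptions made for illustration; the slide shows only the method docstrings and type names, so the real Callosum definitions may differ.

```python
import abc
import asyncio
import hmac
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Identity:
    # Hypothetical shape: the slide only names this type.
    domain: str
    public_key: bytes


@dataclass(frozen=True)
class AuthResult:
    # Hypothetical shape: the slide only names this type.
    success: bool
    user_id: Optional[str] = None


class AbstractAuthenticator(abc.ABC):
    # Hypothetical base-class name. The name server_identity is also an
    # assumption, because its signature is cut off on the slide.

    @abc.abstractmethod
    async def server_identity(self) -> Identity: ...

    @abc.abstractmethod
    async def check_client(self, client_id: Identity) -> AuthResult: ...

    @abc.abstractmethod
    async def server_public_key(self) -> bytes: ...

    @abc.abstractmethod
    async def client_identity(self) -> Identity: ...

    @abc.abstractmethod
    async def client_public_key(self) -> bytes: ...


class StaticAuthenticator(AbstractAuthenticator):
    """Toy authenticator that checks clients against a fixed allowlist."""

    def __init__(self, server: Identity, client: Identity,
                 allowed: Dict[str, bytes]) -> None:
        self._server = server
        self._client = client
        self._allowed = allowed  # maps domain -> expected client public key

    # Server side (used by the binder)
    async def server_identity(self) -> Identity:
        return self._server

    async def check_client(self, client_id: Identity) -> AuthResult:
        expected = self._allowed.get(client_id.domain)
        # compare_digest avoids leaking key contents through timing.
        if expected is not None and hmac.compare_digest(expected, client_id.public_key):
            return AuthResult(success=True, user_id=client_id.domain)
        return AuthResult(success=False)

    # Client side (used by the connector)
    async def server_public_key(self) -> bytes:
        return self._server.public_key

    async def client_identity(self) -> Identity:
        return self._client

    async def client_public_key(self) -> bytes:
        return self._client.public_key
```

A binder would only call the first two methods and a connector only the last three, which is why one class can serve both sides in this sketch.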
via request-keys
• Integration with aiojobs to limit maximum concurrency
• Each peer may have different scheduling policies!
[Figure: "Type 0: No ordering". Client and server timelines for requests 1, 2, 3 over time. Legend: same color = same request key; number = global request index; yellow = request begins; red = response returns.]
Each request may call a completely different RPC method!
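The idea of limiting maximum concurrency can be sketched with the standard library alone. The example below uses `asyncio.Semaphore` as a stand-in for aiojobs, so it is not the aiojobs API and not necessarily how Callosum does it; `run_limited` and `ConcurrencyProbe` are hypothetical names introduced only for this illustration.

```python
import asyncio
import functools
from typing import Awaitable, Callable, List


async def run_limited(handlers: List[Callable[[], Awaitable[int]]],
                      limit: int) -> List[int]:
    """Run all handlers, but let at most `limit` of them execute at once."""
    sem = asyncio.Semaphore(limit)

    async def guarded(handler: Callable[[], Awaitable[int]]) -> int:
        async with sem:  # waits here while `limit` handlers are already running
            return await handler()

    # gather() returns results in submission order,
    # regardless of the order in which handlers finish.
    return await asyncio.gather(*(guarded(h) for h in handlers))


class ConcurrencyProbe:
    """Records how many handlers were running at the same time."""

    def __init__(self) -> None:
        self.active = 0
        self.peak = 0

    async def handler(self, index: int) -> int:
        self.active += 1
        self.peak = max(self.peak, self.active)
        try:
            await asyncio.sleep(0.01)  # pretend to do some RPC work
            return index
        finally:
            self.active -= 1


if __name__ == "__main__":
    probe = ConcurrencyProbe()
    jobs = [functools.partial(probe.handler, i) for i in range(6)]
    print(asyncio.run(run_limited(jobs, limit=2)), "peak:", probe.peak)
```

Since each peer owns its own limiter in this sketch, two peers could pick different limits, which matches the point that each peer may have a different scheduling policy.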
[Figure: two client/server timelines for requests 1, 2, 3 over time, one titled "Return ordered by request-keys" and one titled "Type 2: Execution serialization by request-keys". Legend: same color = same request key; number = global request index; yellow = request begins; red = response returns.]
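Execution serialization by request-keys can be illustrated with one lock per key: requests sharing a key run one after another, while requests with different keys still overlap. This is only a sketch under that assumption; `KeySerializedScheduler` and `demo` are hypothetical names, and Callosum's actual scheduler may work differently.

```python
import asyncio
from collections import defaultdict
from typing import Awaitable, Callable, DefaultDict, List, Tuple


class KeySerializedScheduler:
    """Serializes handler execution per request key."""

    def __init__(self) -> None:
        # One lock per request key, created lazily on first use.
        self._locks: DefaultDict[str, asyncio.Lock] = defaultdict(asyncio.Lock)

    async def run(self, key: str, handler: Callable[[], Awaitable[int]]) -> int:
        # asyncio.Lock wakes waiters in FIFO order, so requests with the
        # same key execute in the order they were submitted.
        async with self._locks[key]:
            return await handler()


async def demo() -> List[Tuple[str, int]]:
    log: List[Tuple[str, int]] = []
    scheduler = KeySerializedScheduler()

    def make_handler(index: int, delay: float) -> Callable[[], Awaitable[int]]:
        async def handler() -> int:
            log.append(("start", index))
            await asyncio.sleep(delay)
            log.append(("end", index))
            return index
        return handler

    # Requests 1 and 3 share key "a"; request 2 uses key "b".
    await asyncio.gather(
        scheduler.run("a", make_handler(1, 0.03)),
        scheduler.run("b", make_handler(2, 0.01)),
        scheduler.run("a", make_handler(3, 0.01)),
    )
    return log


if __name__ == "__main__":
    for event in asyncio.run(demo()):
        print(event)
```

In this run, request 3 cannot start until request 1 has finished, while request 2 starts right away because it uses a different key.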
easy to integrate with Callosum.
• Why: aiothrift + Thrift's runtime IDL loading scheme
§ How about others? Does it apply generally?
• What are the requirements for IDL libraries?
  – Making an IDL library asynchronous often requires IDL compiler changes.
• I hope to integrate with nirum!
& de-serialization
• How to simplify the validation logic on the Python side?
• How to reduce the number of memory copies from/to buffers?
• How to reduce the network bandwidth usage?
§ Solution
• msgpack
• snappy
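A rough sketch of how msgpack and snappy could be combined: serialize a message with msgpack, then compress the bytes with snappy before sending. The helpers `encode` and `decode` are hypothetical, and the message layout is invented for the example. So that the sketch runs even without the third-party packages (`msgpack`, `python-snappy`), it falls back to the standard library's `json` and `zlib`, which are stand-ins and not what the slide proposes.

```python
import json
import zlib

try:
    import msgpack  # third-party package "msgpack"
except ImportError:
    msgpack = None

try:
    import snappy  # third-party package "python-snappy"
    if not hasattr(snappy, "compress"):
        snappy = None  # an unrelated module with the same name
except ImportError:
    snappy = None


def encode(message: dict) -> bytes:
    """Serialize a message, then compress it for the wire."""
    if msgpack is not None:
        raw = msgpack.packb(message, use_bin_type=True)
    else:
        raw = json.dumps(message).encode("utf-8")  # stdlib stand-in
    if snappy is not None:
        return snappy.compress(raw)
    return zlib.compress(raw)  # stdlib stand-in


def decode(payload: bytes) -> dict:
    """Reverse of encode(): decompress, then deserialize."""
    if snappy is not None:
        raw = snappy.decompress(payload)
    else:
        raw = zlib.decompress(payload)
    if msgpack is not None:
        return msgpack.unpackb(raw, raw=False)
    return json.loads(raw.decode("utf-8"))


if __name__ == "__main__":
    request = {"method": "echo", "args": [1, 2, 3], "body": "x" * 1000}
    wire = encode(request)
    print(len(wire), decode(wire) == request)
```

Both ends of a connection must agree on the same serializer and compressor, so a real deployment would fix the choice instead of probing for installed packages as this sketch does.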
Rust
§ Application to Backend.AI
• Replace the current manager-agent communication to eliminate the necessity of VPNs for hybrid-cloud & inter-cluster setups (e.g., )
• Extend Callosum to make a bundle of Callosum peerings ("bundle of bundles")
library!
§ Writing my own one vs. Maintaining an existing one vs. Contributing to an existing one
§ Tried to integrate with nirum (PyCon KR 2017), but did not have enough time and had difficulties extending its networking layer
On March 21st, aio-libs Gitter
Thrift over ZeroMQ
§ Difficulties when integrating external IDL libs
• "API contamination": async forces all others to be async
• To write async APIs without "async def"...
  – Live in the hell of callbacks
  – Rewrite entire Python on top of an event loop (node)
  – Monkey-patch standard networking functions (gevent)
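The "API contamination" point can be seen in a few lines of plain asyncio. All function names below are made up for the illustration. Calling an `async def` function from synchronous code yields only a coroutine object, so every caller up the chain has to become `async` too, or a synchronous entry point has to start the event loop itself.

```python
import asyncio
import inspect


async def fetch_reply(payload: str) -> str:
    """Stands in for an async RPC call."""
    await asyncio.sleep(0)  # yield to the event loop once
    return payload.upper()


def sync_caller(payload: str) -> bool:
    """A plain function cannot await: it only gets a coroutine object."""
    coro = fetch_reply(payload)
    got_coroutine = inspect.iscoroutine(coro)
    coro.close()  # discard it to avoid a "never awaited" warning
    return got_coroutine


async def async_caller(payload: str) -> str:
    """To use the result, the caller itself has to become async."""
    return await fetch_reply(payload)


def sync_entry_point(payload: str) -> str:
    """The only synchronous way in is to run an event loop at the top."""
    return asyncio.run(async_caller(payload))


if __name__ == "__main__":
    print(sync_caller("ping"))
    print(sync_entry_point("ping"))
```

This is the reason an IDL library that generates synchronous stubs cannot simply call into an asyncio transport without changes to the generated code.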