SAC 2020 - Presentation Slides

Francisco Neves

March 18, 2020

Transcript

  1. Black-box inter-application traffic monitoring for adaptive container placement. Francisco Neves, Ricardo Vilaça and José Pereira. HASLab, INESC TEC and University of Minho, Braga, Portugal. ACM/SIGAPP Symposium on Applied Computing
  2. Introduction (slide 2)
     • Distributed software components are managed with containers, container pods and orchestrators
     • Container placement is important for achieving good performance
     • Inter-application traffic is a key factor in determining performance
  3. Problem (slide 3)
     • Cloud-based environments are unable to accurately monitor inter-application traffic in an application-independent way
     • Existing tracing tools that provide detailed information about data flow within a cloud application require instrumentation
     • Capturing network traffic is possible, but it incurs a large overhead and demands high computational resources during peaks
  4. Problem (slide 4)
     • How can the container placement of a system's deployment be optimized without application knowledge and with only negligible overhead?
     [Figure: containers distributed across Host 1, Host 2, Host 3, …, Host N]
  5. Monitoring at Kernel Layer (slide 5)
     • The kernel is the common low-level layer of virtualized environments
     • Observing system calls provides useful insight into which software components interact and how
     • Network communication between processes (even in containers) involves system calls for managing connections and for reading and writing messages to network channels
  6. Network Communication in Kernel (slide 6)
     • There are plenty of system calls for the same purpose (illustrated below)
     • System calls do not always provide relevant information
       ◦ file descriptors are meaningless out of the process context
     Write system calls: write(fd, buf, size), sendto(socket, …), sendmsg(socket, …), sendfile(to_fd, from_fd, …)
     Read system calls: read(fd, buf, size), recvfrom(socket, …), recvmsg(socket, …)
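
A small user-space illustration (not from the slides) of the point above: the same TCP payload can reach the kernel through several different system calls, and the file descriptor number is only meaningful inside the sending process.

    import os
    import socket

    # Loopback TCP connection, purely for demonstration purposes.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    client = socket.create_connection(server.getsockname())
    peer, _ = server.accept()

    client.send(b"via send\n")                 # send family of system calls
    os.write(client.fileno(), b"via write\n")  # plain write(2) on the socket fd
    client.sendmsg([b"via sendmsg\n"])         # sendmsg(2)
    client.shutdown(socket.SHUT_WR)

    # Receive side: recv/recvfrom/recvmsg all observe the same byte stream.
    data = b""
    while chunk := peer.recv(1024):
        data += chunk
    print(data.decode(), end="")
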
  7. Network Communication in Kernel (slide 7)
     • The write and read system calls on sockets reach, in kernel space, the routines:
       int sock_sendmsg(struct socket *sock, struct msghdr *msg)
       int sock_recvmsg(struct socket *sock, struct msghdr *msg, int flags)
  8. Network Communication in Kernel (slide 8)
     • struct socket contains connection details:
       ◦ local and remote IP addresses
         ▪ the sender's local and remote addresses are, respectively, the remote and local addresses at the receiver side
       ◦ local and remote ports
       ◦ socket family (AF_INET, AF_INET6) and socket type (SOCK_STREAM)
  9. Network Communication in Kernel (slide 9)
     • The return value indicates the amount of data actually sent/received
       ◦ or whether an error occurred
  10. Monitoring using eBPF (slide 10)
     • eBPF is a popular technology that permits efficient attachment of custom programs to the entry and exit points of kernel functions
     • Probes collect data in kernel space and publish events to ring buffers, which are consumed by a frontend program in user space (see the sketch below)
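
A minimal sketch of this approach, assuming the bcc toolkit (a Python frontend with an embedded eBPF C program). The probe points and struct fields follow the upstream kernel and bcc conventions; this is an illustration of the technique, not the authors' implementation.

    from socket import AF_INET, inet_ntop, ntohs
    from struct import pack
    from bcc import BPF

    bpf_text = r"""
    #include <linux/net.h>
    #include <net/sock.h>

    struct event_t {
        u32 pid;
        u32 saddr, daddr;
        u16 lport, dport;
        u64 bytes;
        u8  is_send;
    };
    BPF_PERF_OUTPUT(events);
    /* Remember the struct socket* seen at function entry, keyed by thread id,
     * so the return probe can pair it with the returned byte count. */
    BPF_HASH(entry_sock, u64, struct socket *);

    static __always_inline int on_entry(struct pt_regs *ctx, struct socket *sock) {
        u64 tid = bpf_get_current_pid_tgid();
        entry_sock.update(&tid, &sock);
        return 0;
    }

    static __always_inline int on_return(struct pt_regs *ctx, u8 is_send) {
        u64 tid = bpf_get_current_pid_tgid();
        struct socket **sockp = entry_sock.lookup(&tid);
        if (sockp == 0)
            return 0;
        entry_sock.delete(&tid);

        int ret = PT_REGS_RC(ctx);        /* bytes transferred, or a negative error */
        if (ret <= 0)
            return 0;

        struct sock *sk = (*sockp)->sk;
        struct event_t ev = {};
        ev.pid = tid >> 32;
        ev.is_send = is_send;
        ev.bytes = ret;
        /* Connection details live in struct sock_common (IPv4 case). */
        ev.saddr = sk->__sk_common.skc_rcv_saddr;
        ev.daddr = sk->__sk_common.skc_daddr;
        ev.lport = sk->__sk_common.skc_num;
        ev.dport = sk->__sk_common.skc_dport;   /* network byte order */
        events.perf_submit(ctx, &ev, sizeof(ev));
        return 0;
    }

    int kprobe__sock_sendmsg(struct pt_regs *ctx, struct socket *sock) { return on_entry(ctx, sock); }
    int kprobe__sock_recvmsg(struct pt_regs *ctx, struct socket *sock) { return on_entry(ctx, sock); }
    int kretprobe__sock_sendmsg(struct pt_regs *ctx) { return on_return(ctx, 1); }
    int kretprobe__sock_recvmsg(struct pt_regs *ctx) { return on_return(ctx, 0); }
    """

    b = BPF(text=bpf_text)   # kprobe__/kretprobe__ names are auto-attached by bcc

    def handle_event(cpu, data, size):
        ev = b["events"].event(data)
        print("pid=%d %s %d bytes  %s:%d -> %s:%d" % (
            ev.pid, "send" if ev.is_send else "recv", ev.bytes,
            inet_ntop(AF_INET, pack("I", ev.saddr)), ev.lport,
            inet_ntop(AF_INET, pack("I", ev.daddr)), ntohs(ev.dport)))

    b["events"].open_perf_buffer(handle_event)
    while True:
        b.perf_buffer_poll()

The (pid, addresses, ports) tuple is what lets the frontend attribute traffic to containers, which is the information the placement decisions later in the deck rely on.
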
  11. Monitoring using eBPF - Caveats (slide 11)
     • Reading kernel structures requires copying them first
     • The stack size of each probe is limited to 512 bytes
     • The ring buffer size is limited, and a high event throughput causes new events to overwrite the oldest ones (see the snippet below)
     • Processing events in the frontend program incurs CPU usage
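
Continuing the sketch above (reusing its b and handle_event), bcc's standard open_perf_buffer parameters can soften the ring-buffer caveat: a larger per-CPU buffer plus an explicit callback that counts events the kernel overwrote before the frontend could read them.

    lost_events = 0

    def on_lost(count):
        # Invoked by bcc with the number of samples dropped from the ring buffer.
        global lost_events
        lost_events += count

    # 256 pages per CPU instead of the default, plus a loss counter.
    b["events"].open_perf_buffer(handle_event, page_cnt=256, lost_cb=on_lost)
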
  12. Monitoring using eBPF - Overhead (slide 12)
     • Probes attached to the entry and exit points of the kernel routines sock_sendmsg and sock_recvmsg collect connection details and the number of bytes sent/received
     • Worst-case stress scenario set up with the iperf tool
     • Two versions implemented (the second is sketched below):
       ◦ one event for each read/write, aggregated in user space (UserAgg)
       ◦ events with already-aggregated statistics sent to user space (KernelAgg)
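
The KernelAgg variant named on the slide can be sketched as follows (again assuming bcc; illustrative, not the authors' code): instead of submitting one event per read/write, the return probe accumulates byte counts in a BPF hash map keyed by connection, and the user-space frontend only polls the aggregated counters at a low rate.

    import time
    from socket import AF_INET, inet_ntop, ntohs
    from struct import pack
    from bcc import BPF

    bpf_text = r"""
    #include <linux/net.h>
    #include <net/sock.h>

    struct flow_t { u32 saddr, daddr; u16 lport, dport; };

    BPF_HASH(tx_bytes, struct flow_t, u64);      /* per-connection byte counters */
    BPF_HASH(entry_sock, u64, struct socket *);

    int kprobe__sock_sendmsg(struct pt_regs *ctx, struct socket *sock) {
        u64 tid = bpf_get_current_pid_tgid();
        entry_sock.update(&tid, &sock);
        return 0;
    }

    int kretprobe__sock_sendmsg(struct pt_regs *ctx) {
        u64 tid = bpf_get_current_pid_tgid();
        struct socket **sockp = entry_sock.lookup(&tid);
        if (sockp == 0)
            return 0;
        entry_sock.delete(&tid);

        int ret = PT_REGS_RC(ctx);
        if (ret <= 0)
            return 0;

        struct sock *sk = (*sockp)->sk;
        struct flow_t flow = {};
        flow.saddr = sk->__sk_common.skc_rcv_saddr;
        flow.daddr = sk->__sk_common.skc_daddr;
        flow.lport = sk->__sk_common.skc_num;
        flow.dport = sk->__sk_common.skc_dport;

        u64 zero = 0;
        u64 *total = tx_bytes.lookup_or_try_init(&flow, &zero);
        if (total)
            *total += ret;                       /* aggregate in kernel space */
        return 0;
    }
    /* The receive side (sock_recvmsg) is symmetric and omitted for brevity. */
    """

    b = BPF(text=bpf_text)
    while True:
        time.sleep(5)        # the frontend wakes up rarely, so far fewer events cross into user space
        for flow, total in b["tx_bytes"].items():
            print("%s:%d -> %s:%d  %d bytes" % (
                inet_ntop(AF_INET, pack("I", flow.saddr)), flow.lport,
                inet_ntop(AF_INET, pack("I", flow.daddr)), ntohs(flow.dport),
                total.value))
        b["tx_bytes"].clear()
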
  13. Case Study (slide 14)
     • Example of layered and distributed data processing
       ◦ combination of Apache Cassandra and Apache Spark
     • Four n1-standard-4 Google Compute Engine instances
     • Docker containers orchestrated by Kubernetes
       ◦ 4 replicas of Apache Cassandra
       ◦ 4 Spark Worker replicas and 1 Spark Master
     • Populated with 2 million rows of ~2 KiB each
  14. Case Study - Default Placement (slide 15)
     • Traditional resource monitoring of two queries, Q1 and Q2
       ◦ CPU time in seconds
       ◦ other metrics in KiB
  15. Case Study - Default Placement (slide 16)
     • Traditional resource monitoring of two queries, Q1 and Q2
       ◦ CPU time in seconds
       ◦ other metrics in KiB
  16. Case Study - Default Placement (slide 17)
     • Traditional resource monitoring of two queries, Q1 and Q2
       ◦ CPU time in seconds
       ◦ other metrics in KiB
  17. Case Study - Default Placement (slide 19)
     • Which instances contribute to such traffic?
     [Figure: per-instance traffic matrix, with annotations for high intra-host traffic and high inter-host traffic]
  18. Case Study - Automatic Placement (slide 22)
     • The black-box approach is compatible with automatic techniques for optimizing container placement
     • The Pyevolve utility optimizes the placement, given the initial set of containers and servers, each with its corresponding processes
     • Three optimization factors (sketched below):
       ◦ the result is optimal for each server whose CPU cores are expected to be fully used
       ◦ the result is optimal for each server whose RAM is expected to be fully used
       ◦ the result is optimal when there is no cross-server communication
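
A hedged sketch of the objective the slide describes, in plain Python rather than the authors' Pyevolve setup; the container names, capacities and traffic figures are hypothetical. A placement scores higher when each server's CPU and RAM are close to fully used, and it is penalised for every byte that crosses server boundaries.

    from itertools import product

    SERVERS = {"host-1": {"cpu": 4, "ram": 15}, "host-2": {"cpu": 4, "ram": 15}}
    CONTAINERS = {
        "cassandra-0": {"cpu": 2, "ram": 7}, "cassandra-1": {"cpu": 2, "ram": 7},
        "spark-w-0":   {"cpu": 2, "ram": 7}, "spark-w-1":   {"cpu": 2, "ram": 7},
    }
    # Bytes exchanged per container pair, as reported by the kernel-level monitor.
    TRAFFIC = {("spark-w-0", "cassandra-0"): 900, ("spark-w-0", "cassandra-1"): 100,
               ("spark-w-1", "cassandra-1"): 800, ("spark-w-1", "cassandra-0"): 150}

    def fitness(placement):
        """placement maps container name -> server name; higher is better."""
        score = 0.0
        for server, caps in SERVERS.items():
            cpu = sum(CONTAINERS[c]["cpu"] for c, s in placement.items() if s == server)
            ram = sum(CONTAINERS[c]["ram"] for c, s in placement.items() if s == server)
            if cpu > caps["cpu"] or ram > caps["ram"]:
                return float("-inf")                 # infeasible placement
            # Factors 1 and 2: reward servers whose CPU and RAM are fully used.
            score += cpu / caps["cpu"] + ram / caps["ram"]
        # Factor 3: penalise traffic that has to cross server boundaries.
        cross = sum(v for (a, c), v in TRAFFIC.items() if placement[a] != placement[c])
        return score - cross / sum(TRAFFIC.values())

    # Exhaustive search stands in for the genetic algorithm used in the paper.
    best = max((dict(zip(CONTAINERS, combo))
                for combo in product(SERVERS, repeat=len(CONTAINERS))),
               key=fitness)
    print(best)   # co-locates each Spark worker with its busiest Cassandra peer
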
  19. Case Study - Automatic Placement (slide 23)
     • Q1: place two Spark workers and two Cassandra servers on each server
     • Q2: place three Cassandra servers on one instance, and the remaining Cassandra server together with all Spark workers on a second instance
     • Overall decrease in exchanged inter-host network traffic
  20. Case Study - Manual Placement (slide 24)
     • The collected data can also be used for manual placement and configuration
     • We manually placed and configured containers based on the observed network traffic
     • Q1: each Spark worker together with a Cassandra server
       ◦ improves locality
     • Q2: only one Spark worker, with 4x as many resources assigned
       ◦ avoids shuffling
  21. Conclusions (slide 27)
     • Monitoring at the kernel layer provides useful insights, in a black-box fashion, into system performance
     • Quantifying the amount of data exchanged between software components is key to improving performance
     • Monitoring network connections is feasible with low overhead and without application knowledge
  22. Black-box inter-application traffic monitoring for adaptive container placement. Francisco Neves, Ricardo Vilaça and José Pereira. HASLab, INESC TEC and University of Minho, Braga, Portugal. ACM/SIGAPP Symposium on Applied Computing
  23. Case Study - Default Placement (slide 30)
     • Which processes contribute to such traffic?
     [Figure: per-process traffic, with annotations for Cassandra-Spark data transfer and Spark-Spark shuffling]