
Evaluation of Container Virtualized MEGADOCK System in Distributed Computing Environment


SIGBIO49 @ Japan Advanced Institute of Science and Technology

We evaluate the performance of the MEGADOCK system under container virtualization in a distributed computing environment on the cloud.
MEGADOCK is a protein-protein interaction prediction system for heterogeneous supercomputers.
We chose Docker as the Linux container implementation and evaluated parallel execution performance while increasing the number of virtual machines.

metaVariable

March 23, 2017

Transcript

  1. Evaluation of Container Virtualized MEGADOCK System in Distributed Computing Environment

    March 23rd, 2017, SIGBIO 49 @ Japan Advanced Institute of Science and Technology. Kento Aoyama 1,2, Yuki Yamamoto 1,2, Masahito Ohue 1,3, Yutaka Akiyama 1,2,3. 1) Department of Computer Science, School of Computing, Tokyo Institute of Technology; 2) Education Academy of Computational Life Sciences (ACLS), Tokyo Institute of Technology; 3) Advanced Computational Drug Discovery Unit, Institute of Innovative Research, Tokyo Institute of Technology
  2. Docker and Bioinformatics

    P. Di Tommaso, A. B. Ramirez, E. Palumbo, C. Notredame, and D. Gruber, “Benchmark Report: Univa Grid Engine, Nextflow, and Docker for running Genomic Analysis Workflows.” Docker integration benchmark report @ Centre for Genomic Regulation (Barcelona, Spain) • Univa Grid Engine (job scheduler) • Nextflow (workflow manager) • Docker (Linux container) • Reproducibility • Portability
  3. Research Purpose

    To develop a container-native HPC bioinformatics application using Linux containers, one which has: • low dependency on the environment • high performance (parallel execution performance, low virtualization overhead) • dynamic scaling
  4. Today’s Report

    • To evaluate the performance of Docker container virtualization in a bioinformatics application. Target application: MEGADOCK [1] • FFT-grid-based protein-protein docking software • multi-threading, multi-node, multi-GPU (OpenMP, MPI, GPU) • extremely compute-intensive workloads. [1] Masahito Ohue, et al. “MEGADOCK 4.0: an ultra-high-performance protein-protein docking software for heterogeneous supercomputers”, Bioinformatics, 30(22): 3281-3283, 2014.
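FFT-grid docking evaluates, for every relative translation of two voxelized proteins, a correlation score between their grids. The toy 1-D sketch below shows only that underlying correlation idea, brute force rather than FFT, with hypothetical grid values; it is not MEGADOCK's actual score function.

```python
# Toy illustration of grid-correlation docking scoring (NOT MEGADOCK's actual
# algorithm): proteins are voxelized onto grids, and the score of a ligand
# translation is the correlation of the two grids at that shift.  MEGADOCK
# evaluates all 3-D shifts at once with FFTs; this brute-force 1-D version
# only shows the correlation idea.

def correlation_scores(receptor, ligand):
    """Score every cyclic shift of `ligand` against `receptor`."""
    n = len(receptor)
    return [
        sum(receptor[(i + shift) % n] * ligand[i] for i in range(n))
        for shift in range(n)
    ]

def best_pose(receptor, ligand):
    """Return the (shift, score) of the highest-scoring translation."""
    scores = correlation_scores(receptor, ligand)
    shift = max(range(len(scores)), key=scores.__getitem__)
    return shift, scores[shift]

receptor = [0, 0, 3, 5, 3, 0, 0, 0]   # hypothetical receptor grid values
ligand   = [3, 5, 3, 0, 0, 0, 0, 0]   # hypothetical ligand grid values
shift, score = best_pose(receptor, ligand)   # best overlap at shift 2
```

An FFT computes all these shift scores in O(n log n) instead of O(n²), which is what makes the all-vs-all grid search tractable.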
  5. Linux Container

    Kernel-shared virtualization • Lightweight: small size, fast deployment, easy sharing • Performance: little virtualization overhead, faster than VMs. (Figure: containers running directly on the host Linux kernel vs. virtual machines, each with a guest OS, on a hypervisor)
  6. Container-based Virtualization

    Linux Container • virtualizes host resources as containers • filesystem, hostname, IPC, PID, network, user, etc. • can be used like virtual machines. Linux kernel features • containers share the same host kernel • namespaces [1], chroot, cgroups, SELinux, etc. (Figure: multiple containers as process groups sharing one Linux kernel) [1] E. W. Biederman. “Multiple instances of the global Linux namespaces.”, In Proceedings of the 2006 Ottawa Linux Symposium, 2006.
  7. Linux Container – Performance [1]

    (Chart: performance ratio relative to native for PXZ [MB/s], Linpack [GFLOPS], and Random Access [GUPS], comparing Native, Docker, KVM, and KVM-tuned) [1] W. Felter, A. Ferreira, R. Rajamony, and J. Rubio, “An updated performance comparison of virtual machines and Linux containers,” IEEE International Symposium on Performance Analysis of Systems and Software, pp. 171-172, 2015. (IBM Research Report, RC25482 (AUS1407-001), 2014.)
  8. Linux Container Management Tools

    Docker [1] • the most popular Linux container management platform • many useful components and services. Related HPC container runtimes: Shifter [2], Singularity [3]. [1] Solomon Hykes and others. “What is Docker?” - https://www.docker.com/what-docker [2] W. Bhimji, S. Canon, D. Jacobsen, L. Gerhardt, M. Mustafa, and J. Porter, “Shifter: Containers for HPC,” Cray User Group, pp. 1-12, 2016. [3] “Singularity” - http://singularity.lbl.gov/
  9. Easy container sharing – Docker Hub

    Portability & reproducibility • easy to share an application environment via Docker Hub • containers can be executed on other host machines. (Figure: a Dockerfile (apt-get install …, wget …, make) generates an image on an Ubuntu host; the image is pushed to Docker Hub and pulled to a CentOS host, where it runs as an identical container)
  10. Docker – Filesystem

    AUFS (Advanced multi-layered unification filesystem) [1] • Docker’s default filesystem is AUFS • layers can be reused by other container images • AUFS helps software reproducibility. (Figure: tagged container images (beta, version-1.0, version-1.0.2, version-1.2, latest) built as stacks of layers on top of the ubuntu:16.04 base image, with per-layer sizes) [1] Advanced multi layered unification filesystem. http://aufs.sourceforge.net, 2014.
  11. For Bioinformatics Apps: 1

    Why in the field of bioinformatics? • Types of applications • data analysis, machine learning • MD simulation, docking calculation, etc. • Data-centric workloads • compute: large • data I/O: case by case • communication: small • Containers perform well on compute-intensive workloads [1]. [1] W. Felter, et al. “An updated performance comparison of virtual machines and Linux containers,” IEEE International Symposium on Performance Analysis of Systems and Software, pp. 171-172, 2015.
  12. For Bioinformatics Apps: 2

    Reproducibility • different library versions can produce different results • e.g. a genomic analysis pipeline [Di Tommaso, 2016]. Dependency conflict • different applications can require different versions of the same library. Containers provide dependency isolation and application reproducibility. (Figure: Application A produces result A with library version 1.2 but result A' with version 1.3; applications with conflicting version requirements run in separate containers)
  13. Features for Bioinformatics Apps

    Performance • little performance overhead. Reproducibility • dependency isolation from other applications/libraries. Portability, generality • sharing/porting to other environments. Feature (Native / VM / Container): Performance, scalability: Great / Bad / Good; Reproducibility: Bad / Good / Great; Portability, generality: Bad / Great / Great
  14. MEGADOCK

    High-performance protein-protein interaction prediction • FFT-grid-based docking software • extremely compute-intensive • OpenMP/MPI/GPU support • great HPC performance. Masahito Ohue, et al. “MEGADOCK 4.0: an ultra-high-performance protein-protein docking software for heterogeneous supercomputers”, Bioinformatics, 30(22): 3281-3283, 2014.
  15. Container-based Application Distribution

    • All application dependencies exist in the container • easy to test the application • easy to scale the size of resources. (Figure: an application layer of MEGADOCK containers added to or removed from a compute-resource layer, in both test and production environments)
  16. Experiments

    Experiment I: evaluate container virtualization overhead on a physical machine • physical machine (single-node) + Docker • physical machine (single-node, GPU) + NVIDIA Docker. Experiment II: evaluate container virtualization overhead in a cloud environment • virtual machines (multi-node) + Docker • virtual machines (multi-node, GPU) + NVIDIA Docker
  17. Overview of Experiment I

    Measurement • megadock-gpu execution time • time command (6 runs, median). Dataset • 100 PDB pairs (KEGG pathway). Options (OpenMP, OpenMPI) • MPI: 12 threads / 4 MPI processes / 1 node • GPU: 1 GPU / 1 process / 1 node. Test cases: CPU (MPI): (a) native, (b) Docker; GPU: (c) native, (d) NVIDIA Docker
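The measurement protocol above (time each run, repeat, take the median) can be sketched as follows; the trivial `python -c "pass"` command is only a stand-in for the actual megadock/mpirun invocation:

```python
# Sketch of the timing protocol: run the target command several times and
# report the median wall-clock time, as with the `time` command in the slides.
import statistics
import subprocess
import sys
import time

def median_runtime(cmd, runs=6):
    """Run `cmd` `runs` times and return the median wall-clock time [sec]."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True)   # raises if the run fails
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)

# Stand-in workload; the experiments instead time e.g. the megadock-gpu or
# mpirun invocation with 4 MPI processes and 12 OpenMP threads per node.
elapsed = median_runtime([sys.executable, "-c", "pass"], runs=6)
```

The median (rather than the mean) is robust against a single slow outlier run, e.g. from cold caches on the first execution.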
  18. Hardware/Software Specification

    Software env. (Physical Machine / Docker / NVIDIA Docker (GPU)): OS (image): CentOS 7.2.1511 / ubuntu:14.04 / nvidia/cuda:8.0-devel; Linux kernel: 3.10.0 / 3.10.0 / 3.10.0; GCC: 4.8.5 / 4.8.4 / 4.8.4; FFTW: 3.3.5 / 3.3.5 / 3.3.5; OpenMPI: 1.10.0 / 1.6.5 / N/A; Docker Engine: 1.12.3 / N/A / N/A; NVCC: 8.0.44 / N/A / 8.0.44; NVIDIA Docker: 1.0.0 rc.3 / N/A / N/A; NVIDIA driver: 367.48 / N/A / 367.48. Hardware: CPU: Intel Xeon E5-1630, 3.7 GHz × 8 cores; memory: 32 GB; local SSD: 128 GB; GPU: NVIDIA Tesla K40
  19. Execution time

    CPU (MPI): native 7353.80 sec, Docker 7850.57 sec (+6.32 % slower). GPU: native 1646.09 sec, Docker 1638.05 sec (comparable)
  20. Profile Result (CPU time)

    Process: native [sec] / docker [sec] / diff / ratio (all): FFT3D: 7.40E+04 / 7.63E+04 / +3.01% / 76.84%; MPIDP-Master: 8010.98 / 8325.9 / +3.78% / 8.38%; Create Voxel: 3743.7 / 3993.29 / +6.25% / 4.02%; FFT Convolution: 3551.08 / 3576.43 / +0.71% / 3.60%; Score Sort: 2462.61 / 2459.7 / -0.12% / 2.48%; Output Detail: 2139.94 / 2225.96 / +3.86% / 2.24%; Ligand Preparation: 1035.51 / 1849.11 / +44.00% / 1.86%; MPI_Barrier: 236.95 / 231.05 / -2.55% / 0.23%; MPI_Init: 0.94 / 4.54 / +79.30% / 0.00%; …
  21. Overview of Experiment II-(a)

    (a) MEGADOCK-Azure [2]. Measurement • megadock-dp execution time • time command (3 runs, median). Dataset • ZDOCK benchmark 1.0 [1] (59 × 59 = 3481 pairs). Options (OpenMP, OpenMPI) • MPI: 12 threads / 4 MPI processes / 1 node. All file input/output on local SSD. (Figure: a master process and worker processes spread across multiple virtual machines, 4 MPI ranks per VM) [1] R. Chen, et al. “A protein-protein docking benchmark,” Proteins: Structure, Function and Genetics, vol. 52, no. 1, pp. 88-91, 2003. [2] Masahito Ohue, et al. “MEGADOCK-Azure: High-performance protein-protein interaction prediction system on Microsoft Azure HPC”, IIBMP2016.
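The 59 × 59 all-vs-all dataset yields 3,481 docking jobs for the master process to hand out to workers. A minimal sketch of that task decomposition, using round-robin for simplicity (the real megadock-dp master assigns jobs dynamically, and the IDs here are placeholders):

```python
# Minimal sketch of spreading the 59 x 59 = 3,481 docking pairs over worker
# processes.  megadock-dp actually uses dynamic master-worker scheduling over
# MPI; plain round-robin is shown only to illustrate the task decomposition.
from itertools import product

receptors = [f"R{i:02d}" for i in range(59)]   # hypothetical protein IDs
ligands   = [f"L{i:02d}" for i in range(59)]
pairs = list(product(receptors, ligands))      # all-vs-all docking jobs

def round_robin(tasks, n_workers):
    """Deal tasks to workers in turn, so bucket sizes differ by at most 1."""
    buckets = [[] for _ in range(n_workers)]
    for k, task in enumerate(tasks):
        buckets[k % n_workers].append(task)
    return buckets

buckets = round_robin(pairs, n_workers=28)     # e.g. 7 VMs x 4 MPI processes
```

Because each docking pair is independent, the workload is embarrassingly parallel at this level, which is why the later scalability results track the ideal line so closely.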
  22. Overview of Experiment II-(b)

    (b) MEGADOCK + Docker on Microsoft Azure. Measurement • megadock-dp execution time • time command (3 runs, median). Dataset • ZDOCK benchmark 1.0 [1] (59 × 59 = 3481 pairs). Options (OpenMP, OpenMPI) • MPI: 12 threads / 4 MPI processes / 1 node. All file input/output on local SSD. Docker Swarm • all containers in one overlay network. (Figure: one Docker container per virtual machine, 4 MPI ranks each, joined by a Docker Swarm overlay network) [1] R. Chen, J. Mintseris, J. Janin, and Z. Weng, “A protein-protein docking benchmark,” Proteins: Structure, Function and Genetics, vol. 52, no. 1, pp. 88-91, 2003.
  23. VM Instance/Software Specification

    Software env. (Virtual Machine / Docker): OS (image): SUSE Linux Enterprise Server 12 / ubuntu:14.04; Linux kernel: 3.12.43 / 3.12.43; GCC: 4.8.3 / 4.8.4; FFTW: 3.3.4 / 3.3.5; OpenMPI: 1.10.2 / 1.6.5; Docker Engine: 1.12.6 / N/A. VM instance: Standard_D14_v2; CPU: Intel Xeon E5-2673, 2.40 GHz × 16 cores; memory: 112 GB; local SSD: 800 GB
  24. Execution time

    Time [sec] by # of VMs (1 / 5 / 10 / 20 / 30): VM: 145,534 / 25,515 / 13,132 / 6,006 / 4,098; Docker on VM: 117,219 / 25,145 / 12,331 / 6,344 / 3,971. (Chart annotation: “May be a measurement mistake”)
  25. Scalability (Strong Scaling, based on VM=1)

    (Chart: speed-up vs. number of worker cores for Ideal, VM, and Docker on VM at VM = 1, 5, 10, 20, 30; the two series show comparable scalability)
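The speed-up series behind this chart follow directly from the execution times on the previous slide, taking the VM = 1 run as the baseline (T_1 / T_N). The "may be a measurement mistake" note on that slide would explain why the VM series comes out slightly super-linear.

```python
# Strong-scaling speed-up T_1 / T_N from the execution times [sec] on the
# previous slide, keyed by number of VMs.
vm_times     = {1: 145534, 5: 25515, 10: 13132, 20: 6006, 30: 4098}
docker_times = {1: 117219, 5: 25145, 10: 12331, 20: 6344, 30: 3971}

def speedups(times):
    """Speed-up relative to the 1-VM baseline run."""
    base = times[1]
    return {n: base / t for n, t in times.items()}

vm_speedup     = speedups(vm_times)       # slightly super-linear (see note)
docker_speedup = speedups(docker_times)   # ~29.5x at 30 VMs
```

With both series close to the ideal line, the Docker layer evidently adds no scaling penalty at this node count.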
  26. Result & Discussion

    Experiment I • MEGADOCK + Docker on a physical machine showed 6.32% lower performance. • Docker can cause 0-4% compute-performance degradation [1] • communication goes through Docker NAT (Network Address Translation) • MEGADOCK (GPU) + NVIDIA Docker on a physical machine showed performance comparable to native. • GPU computation is independent of container virtualization • container virtualization adds little overhead on memory bandwidth. Experiment II • MEGADOCK + Docker on Microsoft Azure showed comparable scalability. • container virtualization overhead is smaller than other factors in the cloud environment. [1] W. Felter, A. Ferreira, R. Rajamony, and J. Rubio, “An updated performance comparison of virtual machines and Linux containers”, IEEE International Symposium on Performance Analysis of Systems and Software, pp. 171-172, 2015. (IBM Research Report, RC25482 (AUS1407-001), 2014.)
  27. Conclusion

    • The performance overhead of Docker container virtualization is small • suitable for GPU-accelerated applications and cloud environments • Container virtualization can isolate the application environment from the host environment • the same container image can be used on various machines • physical machines in a local environment • virtual machines in a cloud environment • Docker is useful for computational research work
  28. Future Work

    Multi-node & multi-GPU evaluation on cloud • NVIDIA Docker is not available in Docker Swarm mode • Kubernetes [1] officially supports 1 GPU per node • (experimental feature: multi-GPU support). Container-based task distribution • web-service-like container-based distribution • easy to scale computing resources • easy to extend to multiple tasks (e.g. GHOST-MP, MEGADOCK). [1] B. Burns, B. Grant, D. Oppenheimer, E. Brewer, and J. Wilkes, “Borg, Omega, and Kubernetes,” acmqueue, vol. 14, no. 1, p. 24, 2016.