Cloud OS

atustw
October 23, 2011


Transcript

  1. Cloud Operating Systems

    Patrick Fu
    Cloud Computing Research Center for Mobile Applications (CCMA, 雲端運算行動應用研究中心)
  2. Top 10 CIO Priorities (Source: Gartner)

    2010
    1. Virtualization
    2. Cloud computing
    3. Web 2.0
    4. Networking, voice, & data communications
    5. Business Intelligence (BI)
    6. Mobile technologies
    7. Document mgmt & Storage
    8. Service oriented applications & architecture
    9. Security technologies
    10. IT management

    2011
    1. Cloud computing
    2. Virtualization
    3. Mobile technologies
    4. IT Management
    5. Business Intelligence (BI)
    6. Networking, voice, & data communications
    7. Enterprise applications
    8. Collaboration technologies
    9. Infrastructure
    10. Web 2.0

    Copyright 2010 ITRI/CCMA. All rights reserved. No part of this confidential report may be reproduced in any form or by any means without written permission from ITRI.
  3. CCMA Mission

    • Enable Taiwan to be the major supplier of global integrated cloud system solutions
    • Enable Taiwan industry to enter the global Cloud OS market

    Diagram labels: Integrated Cloud System Solutions; Cloud OS and Cloud Application/Service; Cloud Service; Operation Know-how; Cloud System; Cloud Datacenter; System Integration; IT, non-IT; S/W Design.

    ITRI Confidential
  4. CCMA Technologies Focus

    Diagram layers: SaaS, PaaS, IaaS
    • Integrated Applications: Surveillance Service (Snake Eyes)
    • Cloud Application Middleware Platform: Surveillance Engine (Snake Eyes)
    • Cloud OS: integrated data center software stack
    • Modular Cloud Computer: manageable container computer

    Annotations: integrated cloud platform and application; AWS-like IaaS service.
  5. Container Computer - A Commodity-only Modular Design

    Major Features
    • Commodity H/W
    • All-layer-2 data center network architecture
    • Touch cooling-based thermal management
    • Lights-out management
    • Fast deployment
  6. Container Computer 1.0 Architecture

    Diagram labels:
    • Physical Server hosting VM0, VM1, … VMn
    • Compute Server Rack
    • Storage Server
    • Layer-2-Only Data Center Network
    • Layer-3 Border Routers
    • Load Balancing, Traffic Shaping, Intrusion Detection, NAT
  7. What is Cloud OS?

    Multiplexing multiple VDCs in a physical data center
    • Virtual Data Center Management
      – Service Provider: Cloud Application Developer, Cloud Service Provider
      – Provision and Deploy
      – Monitor and Configure Virtual Resources
      – Example VDCs: Photo Sharing VDC, Video Streaming VDC, Web Conference VDC
    • Physical Data Center Management
      – DataCenter Operator: Cloud Service Infrastructure Administrator, Carrier
      – Monitor, Diagnose and Configure Physical Resources
      – Physical Cluster
  8. Physical Resource Mgmt

    • Physical Layout
    • Topology
    • Remote configuration
    • Monitoring
    • Trouble ticketing
    • Root cause analysis
    • Centralized Data Center Log analysis

    Diagram labels: Unified Logger, Zenoss, GLPI (monitoring, event, ticket); PDCM monitoring via SNMP; Cloud OS components; Switches & physical machines …
  9. Virtual Resource Mgmt

    • VDC, VC, VM provisioning
    • Resource scheduling
    • Image Repository
    • Load Balancing
    • Failover
    • Live Migration
    • Auto-scaling
    • Monitoring
    • Usage Statistics

    Diagram: Virtual Data Centers made up of VClusters, whose VMs (OS plus APs) run on PMs.
  10. Why Cloud “OS”?

    Traditional OS
    • Scheduling, placement, and migration
    • High availability and scalability
    • Inter-process protection
    • L3/L7 firewall
    • Turn off unnecessary HW

    Cloud OS
    • Scheduling, placement, and migration
    • High availability and scalability
    • Inter-VM/VDC protection
    • L3/L7 firewall
    • Turn off unnecessary servers
  11. Cloud OS Service Model

    • A virtual data center consists of one or multiple virtual clusters, each of which comprises one or multiple VMs
    • Users provide a Virtual Cluster specification
      – No. of VM instances, each with CPU performance and memory size requirement
      – Per-VM storage space requirement
      – External network bandwidth requirement
      – Security policy
      – Backup policy
      – Load balancing policy
      – Network configuration, e.g. public IP address and private IP address range
      – OS image and application image
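The Virtual Cluster specification listed on this slide could be modelled roughly as follows. This is a minimal sketch, not the Cloud OS data model: every class name, field name, unit, and default value here is an assumption made for illustration; only the list of items comes from the slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VMRequirement:
    """Per-VM requirement (hypothetical field names and units)."""
    cpu_ghz: float
    memory_mb: int
    storage_gb: int


@dataclass
class VirtualClusterSpec:
    """One virtual cluster (VC) as requested by a user."""
    name: str
    vm_count: int
    vm: VMRequirement
    external_bandwidth_mbps: int
    security_policy: str = "default"           # assumed placeholder value
    backup_policy: str = "none"                # assumed placeholder value
    load_balancing_policy: str = "round-robin"  # assumed placeholder value
    public_ips: List[str] = field(default_factory=list)
    private_ip_range: str = "10.0.0.0/24"
    os_image: str = ""
    app_image: str = ""

    def total_memory_mb(self) -> int:
        # Aggregate memory the scheduler would have to find for this VC.
        return self.vm_count * self.vm.memory_mb


@dataclass
class VirtualDataCenter:
    """A VDC is one or more virtual clusters."""
    name: str
    clusters: List[VirtualClusterSpec] = field(default_factory=list)

    def total_vms(self) -> int:
        return sum(c.vm_count for c in self.clusters)


web = VirtualClusterSpec("web", 4, VMRequirement(2.0, 2048, 20), 100)
db = VirtualClusterSpec("db", 2, VMRequirement(2.5, 8192, 200), 50)
vdc = VirtualDataCenter("photo-sharing", [web, db])
```

Keeping the per-VM requirement separate from the cluster-level policies mirrors the slide's split between per-VM items (CPU, memory, storage) and cluster-wide items (bandwidth, security, backup, load balancing).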
  12. Mapping Virtual to Physical

    • Users create VDC, VC, VM according to their needs
    • On-demand resource provisioning
    • VRM maintains a PM pool
    • Each PM registers to VRM upon startup
    • VRM schedules VM onto PM per request
    • Static provisioning model
      – Round-robin, worst-fit allocation
      – To balance the workload between PMs
      – The scheduler finds PMs that can host the capacity requirement of the VM
      – Among those PMs, allocate one PM that has most residual capacity after the allocation

    Diagram: VDCs contain VCs, whose VMs (OS plus APs) are placed on physical nodes.
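The worst-fit step described on this slide can be sketched as below. It is an illustration under simplifying assumptions: capacity is a single integer per PM (the slide does not say how capacity is measured), and the function and variable names are hypothetical.

```python
from typing import Dict, Optional


def worst_fit(pm_free: Dict[str, int], vm_demand: int) -> Optional[str]:
    """Place one VM on the PM that keeps the most residual capacity.

    pm_free maps a PM name to its free capacity and is updated in place.
    Returns the chosen PM name, or None if no PM can host the VM.
    """
    # Step 1: find the PMs that can host the VM's capacity requirement.
    candidates = {pm: free for pm, free in pm_free.items() if free >= vm_demand}
    if not candidates:
        return None
    # Step 2: among those, pick the PM with the most capacity left afterwards.
    chosen = max(candidates, key=lambda pm: candidates[pm] - vm_demand)
    pm_free[chosen] -= vm_demand
    return chosen


pool = {"pm1": 30, "pm2": 80, "pm3": 50}
first = worst_fit(pool, 40)    # pm2 has the most room
second = worst_fit(pool, 45)   # only pm3 still fits 45
third = worst_fit(pool, 100)   # nothing fits
```

Because each VM goes to the emptiest eligible machine, load spreads across PMs, which is the balancing goal the slide states.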
  13. Auto-scaling

    [Chart: average workload (0% to 100%) over time (0 to 200), annotated with the high watermark, low watermark, break points, breach durations, and the resulting scale-up and scale-down actions.]
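The chart's labels (high watermark, low watermark, breach duration, scale up, scale down) suggest a rule of the following shape: act only after the average workload has stayed beyond a watermark for a full breach duration. The sketch below is one plausible reading, not the Cloud OS policy; the threshold values, the sample-count breach duration, and all names are assumptions.

```python
from typing import List, Sequence, Tuple


def autoscale_decisions(
    workload: Sequence[float],
    high: float = 0.8,          # assumed high watermark
    low: float = 0.2,           # assumed low watermark
    breach_duration: int = 3,   # assumed: consecutive samples in breach
) -> List[Tuple[int, str]]:
    """Return (sample index, action) pairs for a series of workload samples."""
    decisions: List[Tuple[int, str]] = []
    above = below = 0
    for t, load in enumerate(workload):
        # Count consecutive samples beyond each watermark; reset otherwise.
        above = above + 1 if load > high else 0
        below = below + 1 if load < low else 0
        if above >= breach_duration:
            decisions.append((t, "scale up"))
            above = 0
        elif below >= breach_duration:
            decisions.append((t, "scale down"))
            below = 0
    return decisions


samples = [0.5, 0.9, 0.9, 0.5, 0.9, 0.9, 0.9, 0.1, 0.1, 0.1]
actions = autoscale_decisions(samples)
```

In the sample series the first two high readings do not trigger anything because the breach is interrupted. Requiring a sustained breach is what keeps short spikes from causing scaling churn.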
  14. VM Migration and Consolidation

    • Consolidation for power efficiency
    • Dynamic consolidation and load balancing
      – Compute the possible consolidation plan
        • 2-D vector bin packing
      – Apply the configuration through VM live migration
    • Power management
      – Turn off idle PMs
      – Prediction for avoiding oscillation
    • Underlying technology: live migration
      – Hypervisor provides internal migration
      – External migration by VRM
        • Network: handle GARP; notify SLB; set up VM network (e.g. IP) at the target
        • Security: set up security on target PM for new VM; set up cluster-level security policy
        • Storage: detach source volume, attach target volume
        • Meta information: remove from source PM, restore to target PM

    Diagram: VMs A=65, B=20, C=60, D=25 and E=20 placed on PM1, PM2 and PM3, with one PM marked as a candidate to be turned off.
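A consolidation plan based on 2-D vector bin packing could be computed with a simple first-fit-decreasing heuristic, sketched below. The slide names the problem but not the algorithm, so the heuristic is a stand-in. The CPU loads reuse the figures from the slide's diagram; the memory values, the capacity of (100, 100), and all names are invented for illustration.

```python
from typing import Dict, List, Tuple

Vec = Tuple[int, int]  # (cpu, memory) demand or capacity, arbitrary units


def consolidate(vms: Dict[str, Vec], capacity: Vec) -> List[Dict[str, Vec]]:
    """Pack VMs onto as few identical PMs as this heuristic can manage.

    First-fit decreasing: sort VMs by their largest normalized dimension,
    then put each VM on the first PM where both dimensions still fit.
    """
    order = sorted(
        vms,
        key=lambda n: max(vms[n][0] / capacity[0], vms[n][1] / capacity[1]),
        reverse=True,
    )
    plan: List[Dict[str, Vec]] = []   # VMs assigned to each PM
    used: List[List[int]] = []        # [cpu, memory] consumed on each PM
    for name in order:
        cpu, mem = vms[name]
        for i, u in enumerate(used):
            if u[0] + cpu <= capacity[0] and u[1] + mem <= capacity[1]:
                u[0] += cpu
                u[1] += mem
                plan[i][name] = vms[name]
                break
        else:
            # No existing PM fits: open a new one.
            used.append([cpu, mem])
            plan.append({name: vms[name]})
    return plan


vms = {"A": (65, 30), "B": (20, 20), "C": (60, 40), "D": (25, 30), "E": (20, 10)}
plan = consolidate(vms, capacity=(100, 100))
```

With these numbers the five VMs fit on two PMs, so a third PM would become idle and could be turned off. The resulting moves would then be applied through live migration, as the slide describes.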
  15. Why VM consolidation?

    [Pie chart: Data Center Costs. Segments labeled Servers, Power distribution & Cooling, Power Draw (utility), and Network; values shown: 45%, 25%, 15%, 15%.]

    Source: Cost of Power in Large-Scale Data Center, James Hamilton Blog, 11/28/2008
  16. Fail-over & Load Balancing

    Diagram components: Virtual Machine Manager, Monitor, Hypervisor
    1. One VM dies
       1.1 Restart the dead VM (“I am the new one!”)
    2. System is busy
       2.1 Migrate to meet load balancing
  17. VM Failover

    • Status monitoring
      – VRM monitors both VM and PM
      – PM agent reflects VM status to VRM
      – Invalidate a PM if it fails the health check
      – Invalidate a VM if it disconnects for 60 seconds
    • VM failover
      – Persistent VM data, stored in shared Cloud storage
      – VM level
        • Automatically restart a crashed VM
        • Provided by the hypervisor (currently Xen 3.1)
      – PM level
        • PDCM notifies VRM upon detection of defective PM
        • VRM reallocates VMs on a defective PM to other PMs
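The two rules on this slide, invalidating a VM after 60 seconds without contact and moving VMs off a defective PM, can be sketched as follows. Only the 60-second figure comes from the slide. The class, the heartbeat mechanism, and the round-robin choice of target PM are assumptions made for illustration and are not the VRM's actual interface.

```python
from typing import Dict, List

DISCONNECT_TIMEOUT_S = 60  # from the slide: invalidate after 60 seconds


class StatusMonitor:
    """Tracks the last time each VM was reported alive (hypothetical design)."""

    def __init__(self) -> None:
        self.last_seen: Dict[str, float] = {}

    def heartbeat(self, vm_id: str, now: float) -> None:
        self.last_seen[vm_id] = now

    def invalid_vms(self, now: float) -> List[str]:
        # A VM is invalid once it has been silent for the full timeout.
        return sorted(
            vm for vm, seen in self.last_seen.items()
            if now - seen >= DISCONNECT_TIMEOUT_S
        )


def reallocate(
    placement: Dict[str, str], failed_pm: str, healthy_pms: List[str]
) -> Dict[str, str]:
    """Return a new VM-to-PM placement with failed_pm's VMs moved elsewhere."""
    victims = sorted(vm for vm, pm in placement.items() if pm == failed_pm)
    if victims and not healthy_pms:
        raise ValueError("no healthy PM available for reallocation")
    new_placement = dict(placement)
    for i, vm in enumerate(victims):
        # Assumed policy: spread the displaced VMs round-robin.
        new_placement[vm] = healthy_pms[i % len(healthy_pms)]
    return new_placement


monitor = StatusMonitor()
monitor.heartbeat("vm1", now=0.0)
monitor.heartbeat("vm2", now=30.0)
moved = reallocate({"vm1": "pm1", "vm2": "pm2", "vm3": "pm1"}, "pm1", ["pm2", "pm3"])
```

Reallocation is only safe because, as the slide notes, persistent VM data lives in shared cloud storage rather than on the failed machine.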
  18. Cloud Storage System

    • Cloud Storage aims at cloud-scale data centers, and is designed to be scalable, available and low-cost
    • Key Features
      – Storage Virtualization: Thin Provisioning
      – Reliability - Data is always protected
      – Scalability - up to 1,000 ~ 1,500 disks, several petabytes of storage space
      – Dynamic storage tiering
      – Manageability
      – Lower TCO
  19. Cross Data Center Storage System Architecture

    Diagram labels:
    • Container Computer with Rack-1 … Rack-N; each rack holds Computing Servers, Controller-1 and Controller-2, and SSD
    • Layer 2 Switches (10G links)
    • Distributed Storage server
    • Layer 3 Routers
    • VTL server and DR server, each with Controller-1 and Controller-2 (10G links), SSD, tape
    • Main and Secondary Storage are integrated together
  20. Scalable Load Balancer

    SLB distributes site traffic among several servers.

    Diagram: Client & Server; End User → SLB → Web servers.
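As a toy illustration of distributing site traffic among several servers, the sketch below hands out back-end servers in round-robin order. The slide does not state which scheduling policy the SLB uses, so round-robin is an assumption, as are the class name and server names.

```python
from itertools import cycle
from typing import Iterable


class RoundRobinBalancer:
    """Picks back-end servers in a fixed rotating order (assumed policy)."""

    def __init__(self, servers: Iterable[str]) -> None:
        server_list = list(servers)
        if not server_list:
            raise ValueError("at least one server is required")
        self._rotation = cycle(server_list)

    def pick(self) -> str:
        # Each call returns the next server, wrapping around at the end.
        return next(self._rotation)


slb = RoundRobinBalancer(["web1", "web2", "web3"])
picks = [slb.pick() for _ in range(5)]
```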
  21. Software SLB appliance

    Diagram labels:
    • SLB Server Node
      – User space: SLB daemon, VC/VM logic, VC/VM metadata
      – Kernel space: IPVS module, MIM Kernel Agent (packet’s VDC ID), RTNETLINK
      – Set up LB scheduling rules; add/remove VIP
    • Virtual Machine Manager (VMM): VC/VM logic, VC/VM metadata; set up proxy
    • Repository Server (RS)
    • Compute node
      – User space: Dom0 daemon, VM loading, performance data
      – VMs on the Xen hypervisor, libvirt, kernel
    • WAF(M)
  22. All Layer-2 Network

    [Diagram only. Slide footer: Delta Visit.]
  23. Cloud OS Security Architecture

    A. Inter-VDC Isolation
       1. Virtual Machine Packet Filter
    B. Virtual Appliance
       1. Host-based Intrusion Detection System
       2. Layer 7 Filter
       3. Security Policy (Firewall)
       4. WAF
       5. Authentication Services
  24. Multi-Dimensional Load Balancing

    [Diagram only.]
  25. ITRI Cloud OS Summary

    • Turnkey IaaS solution for Data Center Operators
    • Integrated data center software stack that provides
      – Virtual resource management, storage management, network management, load balancing, security, and virtual/physical data center management
      – Targets cloud-scale data centers and supports virtual data center-based IaaS
    • Currently evaluated by Taiwan telecom providers
    • Roadmap
      – Hybrid cloud
      – Federated architecture
      – Traffic shaping and QoS consideration
  26. Feature Set for Cloud OS 2.0
  27. DMS/DSS 2.0

    • DMS
      – Automatic data redundancy allocation
        • Parity vs. N-way replication based on data access patterns
      – Virtual disk cloning: metadata copying and copy-on-write
      – Shared virtual disk support for cluster file system and for cluster DBMS: cache consistency
    • DSS
      – Parallel/sequential data de-duplication
      – Wide-area data backup
      – File-level restore: agent in guest VM
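Virtual disk cloning by "metadata copying and copy-on-write" can be illustrated with a small in-memory model: a clone copies only the block map, shares the underlying data blocks, and gets its own block when it writes. This is a conceptual sketch, not the DMS implementation; the class and method names are hypothetical.

```python
from typing import Dict, Optional


class VirtualDisk:
    """A disk as a map from block number to an immutable data block."""

    def __init__(self, block_map: Optional[Dict[int, bytes]] = None) -> None:
        # Copy only the map (the metadata); the bytes objects are shared.
        self._map: Dict[int, bytes] = dict(block_map or {})

    def clone(self) -> "VirtualDisk":
        # Cloning costs one metadata copy, independent of the data size.
        return VirtualDisk(self._map)

    def write(self, block_no: int, data: bytes) -> None:
        # Copy-on-write: point this disk's map at a new block and leave the
        # old block untouched for any other disk that still references it.
        self._map[block_no] = bytes(data)

    def read(self, block_no: int) -> bytes:
        return self._map.get(block_no, b"")

    def shares_block_with(self, other: "VirtualDisk", block_no: int) -> bool:
        return block_no in self._map and self._map[block_no] is other._map.get(block_no)


base = VirtualDisk()
base.write(0, b"boot")
base.write(1, b"data")
copy = base.clone()
shared_before = copy.shares_block_with(base, 1)
copy.write(1, b"new")
```

After the write, block 1 is private to the clone while block 0 is still shared, so storage is consumed only for blocks that actually diverge.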
  28. Virtualization Management 2.0

    • Federation of geographically distributed data centers
      – Across multiple container computers
      – Across multiple geographically distributed sites
    • Dynamic Virtual Machine Management
      – Reduces power consumption through consolidation
      – Inter-physical-machine load balancing
      – Thermal load balancing
    • Application performance-driven auto-scaling
      – Fractional auto-scaling
  29. Security 2.0

    • Tool for incremental and non-disruptive upgrades of WAF rules and modules and log-based intrusion detection rules
    • API design and implementation for third-party security virtual appliances
    • Security event management (SEM) console
    • Distributed DDoS attack remediation
  30. Internet Edge Logic 2.0

    • Multi-homing load balancing
    • Distributed traffic shaping
    • Support for Hybrid Cloud
      – Private IP Address Reuse
      – Virtual private network (VPN)
      – Virtual SNMP for management virtualization
      – Global load balancing for cloud bursting
  31. PDCM 2.0

    • Physical server node locationing
    • Clustering implementation
    • Inverse map between VDC/VC/VM and physical resources
    • Application-level dependency map discovery
    • Informed diagnosis of root cause
  32. VDCM 2.0

    • Support for federated data centers
    • Application dependency topology visualization
    • Centralized application-level log collection and access
      – Type-aware deep analysis of log messages
    • Application performance management using PDCM’s support for root cause analysis
    • VDC/VC/VM configuration management