Slide 1

Slide 1 text

Performance Management for Cloud-based Applications STC 2012

Slide 2

Slide 2 text

Agenda
• Context
• Problem Statement
• Cloud Architecture
• Need for Performance in Cloud
• Performance Challenges in Cloud
  • Generic
  • IaaS / PaaS / SaaS - specific
• Best Practices / Remedies / Pointers

Slide 3

Slide 3 text

Context

Cloud Computing has gained significance mainly because it reduces CapEx and OpEx. Characteristics such as Elasticity, On-demand resource provisioning and Pay-per-Use drive organizations to migrate some of their applications, data and infrastructure to Cloud Architectures.

A few concerns of organizations that plan for cloud adoption:
• How is performance management different for applications in Cloud compared to current architectures?
• What are the typical application performance challenges in Cloud Service Models (IaaS, PaaS, and SaaS)?
• How to address these performance management challenges?

The objective is to highlight significant performance challenges for Cloud-based applications and to share best practices / pointers to address them.

Slide 4

Slide 4 text

Generic Cloud Architecture

Performance of any IT System depends on:
• Application (Code, Design, Architecture, Software and External Systems)
• Hardware (H/W) Infrastructure
• Software (S/W) Configuration

Additional layers that impact Performance: Hypervisor Layer & Virtual Machines

Fig. 1. Cloud Architecture & Components (layers shown in the diagram: Virtual Machines (VMs), Virtualization Layer, Host Operating System, Physical Servers with CPU, Memory, Storage and Network)

Slide 5

Slide 5 text

Need for Performance in Cloud

Figure 2 shows that both 'Cost Overhead' (due to CapEx and OpEx) and 'Control over Application Performance Management (APM)' decrease from IaaS to SaaS for Cloud Consumers.

If APM is not planned in Cloud, the cost of managing application performance offsets the cost benefits offered by Cloud Service Models.

Fig. 2. Cost Overhead vs. Control over APM (service model stack shown in the diagram: SaaS: Applications; PaaS: IDEs, Runtime Environment; IaaS: CPU, Storage, VM)

Slide 6

Slide 6 text

Performance Management Challenges in Cloud

Shared Physical Environment
• Bursty load of an Application robs resources from other Applications sharing hardware infrastructure

Hypervisor Layer
• Hypervisor Layer has certain overhead due to resource virtualization
• 'Timekeeping' issue impacts time-based perf metrics

Elasticity & Scalability
• Elasticity is not a substitute for 'Application Scalability' - App should be scalable first

'Stateful' Workloads
• n-Way Session Replication in VMs impacts performance & scalability

Slide 7

Slide 7 text

Performance Management Challenges

Category: Hypervisor Layer

Challenge:
• Time-based perf. metrics in a virtualized environment will be inaccurate (timer interrupts get consumed by the Hypervisor Layer due to VM scheduling & de-scheduling, causing a drift effect)
• Time measurements of apps will get impacted (significant with more VMs & heavy load)

Recommendation / Best Practice:
• Architects/developers should use Hypervisor-specific APIs when designing routines to capture latency at application code level, to overcome the 'Timekeeping' problem
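The latency-capture routine mentioned above can be sketched in a portable way. The sketch below does not use any hypervisor-specific API (none is named on the slide); as a stand-in it times a call with Python's monotonic performance counter and, separately, reports how far the wall clock and the monotonic clock disagree over a short interval. The function names `measure_latency` and `clock_disagreement` are hypothetical, and inside a guest VM both clocks can still be skewed by the hypervisor, so the result is only a rough indicator.

```python
import time


def measure_latency(func, *args, **kwargs):
    """Run func and return (result, elapsed_seconds) using a monotonic clock."""
    # perf_counter_ns() does not jump when the wall clock is adjusted.
    start_ns = time.perf_counter_ns()
    result = func(*args, **kwargs)
    elapsed_ns = time.perf_counter_ns() - start_ns
    return result, elapsed_ns / 1e9


def clock_disagreement(duration_s=0.05):
    """Return wall-clock elapsed minus monotonic elapsed over a short sleep."""
    wall_start, mono_start = time.time(), time.monotonic()
    time.sleep(duration_s)
    wall_elapsed = time.time() - wall_start
    mono_elapsed = time.monotonic() - mono_start
    return wall_elapsed - mono_elapsed


if __name__ == "__main__":
    _, latency = measure_latency(sum, range(1_000_000))
    print(f"latency: {latency:.6f} s")
    print(f"wall vs monotonic disagreement: {clock_disagreement():+.6f} s")
```

A disagreement that grows under load would be one hint that time-based metrics collected inside the VM need to be cross-checked against a source outside it.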

Slide 8

Slide 8 text

Performance Management Challenges

Category: Hypervisor Layer

Challenge:
• Virtualizing a physical NIC into multiple 'Virtual NICs' puts more concurrent network traffic on it, thereby reducing the bandwidth available to each application

Recommendation / Best Practice:
• A few VMs should be assigned dedicated physical NICs, depending on criticality of workload & performance SLAs
• Appropriate sizing of physical NICs should be considered

Slide 9

Slide 9 text

Performance Management Challenges

Category: Shared Physical Environment

Challenge:
• Sudden and unpredictable load on one application might, due to Elasticity, take away more computing resources, leaving other applications short of resources and thereby impacting their Performance

Recommendation / Best Practice:
• Review and analyze:
  • # of tenants sharing the underlying physical hardware
  • Load pattern, and MIN and MAX capacities for each App/Tenant
  • Resource Sharing model b/w VMs (Shared / Dedicated / Shared-Cap)
• Capture and analyze the mapping b/w Virtual and Physical resources (E.g.: a VM that has 4 CPUs (logical) might be assigned only 0.5 Physical CPU)
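The virtual-to-physical mapping in the example above (4 logical CPUs backed by 0.5 physical CPU) can be turned into two simple figures. This is a minimal sketch; the function names are hypothetical and the only inputs are the two numbers from the slide.

```python
def physical_share_per_vcpu(logical_cpus, physical_cpus_assigned):
    """Physical CPU capacity backing each logical CPU of a VM."""
    if logical_cpus <= 0:
        raise ValueError("logical_cpus must be positive")
    return physical_cpus_assigned / logical_cpus


def overcommit_ratio(logical_cpus, physical_cpus_assigned):
    """How many logical CPUs share one physical CPU."""
    if physical_cpus_assigned <= 0:
        raise ValueError("physical_cpus_assigned must be positive")
    return logical_cpus / physical_cpus_assigned


if __name__ == "__main__":
    # The slide's example: 4 logical CPUs, 0.5 physical CPU.
    print(physical_share_per_vcpu(4, 0.5))  # 0.125
    print(overcommit_ratio(4, 0.5))         # 8.0
```

Read this way, the VM in the example can at best deliver one eighth of the CPU capacity its logical CPU count suggests.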

Slide 10

Slide 10 text

Performance Management Challenges

Category: Stateful Workload

Challenge:
• For stateful workloads, session management and session replication across multiple VMs is costly due to n-way replication (store and retrieval operations)

Recommendation / Best Practice:
• Usage of Distributed Caching solutions is a must in Cloud for Stateful Apps (E.g.: Oracle Coherence, Memcached, WebSphere eXtreme Scale)
• Ensure that only minimum data is stored in HTTP Sessions
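The cost difference between n-way replication and a shared cache can be illustrated with a rough count of writes. The sketch below is a simplification under stated assumptions: every session update is copied to all other VMs in the n-way case, the cache case counts one write per update and ignores any backup copies the cache keeps internally, and `SharedSessionStore` is a hypothetical in-memory stand-in, not the client API of any of the products named above.

```python
def replication_writes(session_updates, vm_count):
    """Copies sent to peer VMs when every update is replicated n-way."""
    return session_updates * (vm_count - 1)


def cache_writes(session_updates):
    """Writes when each update goes once to a shared cache instead."""
    return session_updates


class SharedSessionStore:
    """In-memory stand-in for a distributed cache (hypothetical interface)."""

    def __init__(self):
        self._data = {}

    def put(self, session_id, state):
        # Store a copy so callers cannot mutate the cached state by accident.
        self._data[session_id] = dict(state)

    def get(self, session_id):
        return dict(self._data.get(session_id, {}))


if __name__ == "__main__":
    print(replication_writes(1000, 8))  # 7000
    print(cache_writes(1000))           # 1000

    store = SharedSessionStore()
    # Keep only a minimal key in the session; look up the rest on demand.
    store.put("s-1", {"user_id": 42})
    print(store.get("s-1"))
```

The replication count grows with the number of VMs, while the cache count does not, which is the reason the slide pairs distributed caching with small session payloads.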

Slide 11

Slide 11 text

Performance Management Challenges

Category: Elasticity Vs. Application Scalability

Challenge:
• Elasticity benefits are realized if and only if a given 'Application' is 'Scalable' first

Recommendation / Best Practice:
• Assess the application's scalability prior to deploying in Cloud
• Employ performance engineering activities (Monitoring, Profiling, Tuning, Design and Architecture Optimization) to make the application highly scalable
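One rough way to see why elasticity cannot replace scalability is Amdahl's law, which is brought in here as an illustration and is not part of the original slides. It assumes a fixed fraction of the work can be spread across instances while the rest stays serial; `amdahl_speedup` is a hypothetical helper name.

```python
def amdahl_speedup(parallel_fraction, instances):
    """Upper bound on speedup when only part of the work scales out."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if instances < 1:
        raise ValueError("instances must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / instances)


if __name__ == "__main__":
    for n in (1, 2, 4, 8, 16, 64):
        # With 10% serial work the speedup can never reach 10x.
        print(n, round(amdahl_speedup(0.9, n), 2))
```

Under this model, provisioning more instances yields diminishing returns until the serial part of the application is reduced, which is what the performance engineering activities above aim at.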

Slide 12

Slide 12 text

Performance Management Challenges - IaaS

Category: IaaS

Challenge:
• Cloud Consumer has control only over the OS and the applications deployed on top of it, but not over the underlying hardware infrastructure

Recommendation / Best Practice:
• Understand the mapping between Virtual and Physical Resources of VMs
• Analyze and review the VM Profile w.r.t. resource sharing model (Shared / Dedicated / Shared-Cap)
• Understand 'Automation Rules' for Resource Management of VMs
• Get the Physical Host's Utilization besides the VMs' Utilization
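When the provider does not expose physical host utilization, a Linux guest can at least observe CPU steal time, which is the time its virtual CPUs waited while the hypervisor ran something else. The sketch below is an assumption-laden illustration and not something the slides prescribe: it assumes a Linux guest whose `/proc/stat` aggregate `cpu` line includes the steal field and a hypervisor that reports it. It works on two sample lines instead of reading the file, and the sample numbers are made up.

```python
CPU_FIELDS = ["user", "nice", "system", "idle",
              "iowait", "irq", "softirq", "steal"]


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into tick counters."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    if len(fields) < 1 + len(CPU_FIELDS):
        raise ValueError("steal field not present")
    values = [int(v) for v in fields[1:1 + len(CPU_FIELDS)]]
    return dict(zip(CPU_FIELDS, values))


def steal_percent(before, after):
    """Share of CPU time stolen by the hypervisor between two samples."""
    deltas = {name: after[name] - before[name] for name in CPU_FIELDS}
    total = sum(deltas.values())
    return 100.0 * deltas["steal"] / total if total else 0.0


if __name__ == "__main__":
    # Made-up samples taken some seconds apart.
    first = parse_cpu_line("cpu 100 0 50 800 10 0 0 40")
    second = parse_cpu_line("cpu 200 0 100 1500 20 0 0 180")
    print(steal_percent(first, second))  # 14.0
```

A persistently high steal percentage suggests contention on the physical host that VM-level utilization figures alone would not show.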

Slide 13

Slide 13 text

Performance Management Challenges - PaaS

Category: PaaS

Challenge:
• No access to the Platform/Runtime's performance metrics
• No access to modify / tune the platform runtime configuration (E.g.: JVM Heap size or GC Algorithms cannot be tuned)
• Performance bottleneck identification using profiling tools (JProbe / JProfiler / .NET Profiler) is restricted

Recommendation / Best Practice:
• Application should be designed to have custom instrumentation (AOP, Log4J Frameworks) to identify code-level hotspots
• Define contractual agreements with the Vendor to get OS-level performance metrics
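Custom instrumentation of the kind recommended above can be sketched without any platform support. The slide names AOP and Log4J, which are Java-side techniques; the Python decorator below is an analogous stand-in that wraps a function, measures its duration and writes it to a logger. The names `timed` and `build_report` are hypothetical.

```python
import functools
import logging
import time

logger = logging.getLogger("perf")


def timed(func):
    """Log the wall time of each call to func, even when it raises."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            logger.info("%s took %.3f ms", func.__qualname__, elapsed_ms)
    return wrapper


@timed
def build_report(n):
    """Example workload standing in for an application method."""
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    build_report(100_000)
```

Because the timing lives in the application code and its own log output, it keeps working when profilers and runtime metrics are unavailable on the platform.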

Slide 14

Slide 14 text

Performance Management Challenges - PaaS

Category: PaaS

Challenge:
• Usage of Enterprise Performance tools (DynaTrace, HP Diagnostics, CA Introscope et al.) is restricted by the Platform's support and compatibility, thereby limiting performance monitoring and profiling activities

Recommendation / Best Practice:
• Use native monitoring tools such as jvmstat (JVM) or the .NET equivalents
• Review the support provided by various Platform vendors (Google, Force.com) for the Monitoring and Profiling tools required for performance management

Slide 15

Slide 15 text

Performance Management Challenges - SaaS

Category: SaaS

Challenge:
• No control over Application Code, Platform and Hardware Infrastructure
• Application performance completely depends on how the Cloud Vendor manages it

Recommendation / Best Practice:
• The ONLY OPTION is to clearly define the contractual agreement and penalty clauses with the Cloud Provider for end-to-end application performance SLAs
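An end-to-end performance SLA is only enforceable if the consumer can check it. As a hedged sketch, the code below tests measured response times against a percentile-based target using the nearest-rank method. The sample values, the 90th/95th percentiles and the 1000 ms threshold are all made-up assumptions, and `percentile` and `meets_sla` are hypothetical helper names; actual SLA terms come from the contract.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def meets_sla(samples_ms, pct, threshold_ms):
    """True if the pct-th percentile response time is within threshold."""
    return percentile(samples_ms, pct) <= threshold_ms


if __name__ == "__main__":
    # Made-up end-to-end response times in milliseconds.
    samples = [120, 180, 200, 250, 300, 310, 400, 450, 900, 1500]
    print(percentile(samples, 90))       # 900
    print(meets_sla(samples, 90, 1000))  # True
    print(meets_sla(samples, 95, 1000))  # False
```

Measuring from the consumer's side in this way gives independent evidence when a penalty clause has to be invoked.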

Slide 16

Slide 16 text

Thank You