
Elastic vSphere: Now with More Stretch

Scott Lowe
September 27, 2011

Discusses the design considerations around building stretched VMware vSphere clusters

Transcript

  1. Before we start
     • Get involved!
     • If you use Twitter, feel free to tweet about this session (use hashtag #tovmug)
     • I encourage you to take photos or videos of today’s session and share them online
     • This presentation will be made available online after the event
  2. Elastic vSphere: Now with More Stretch
     Examining the design considerations for building stretched VMware vSphere clusters
     Scott Lowe, VCDX 39 — vExpert, Author, Blogger, Geek
     http://blog.scottlowe.org / Twitter: @scott_lowe
  3. Agenda
     • A quick review of stretched clusters
     • Stretched cluster considerations
     • Looking forward
     • Questions and answers
  4. Stretched Cluster Review
     • Sometimes referred to as a “split cluster”
     • A stretched cluster is a cluster with ESX/ESXi hosts in different physical locations, usually different geographic sites
     • Stretched clusters are typically built as a way to create “active/active” data centers in order to:
       • Provide high availability across sites
       • Do dynamic workload balancing across sites
  5. Stretched Cluster Review
     • HA/DRS are not required in a stretched cluster, although they are typically deployed
     • Many pros and cons of stretched clusters stem directly from the use of HA/DRS
     • Stretched clusters are not a requirement for long-distance vMotion
  6. Stretched Cluster Considerations: vMotion
     • Stretched clusters are not a prerequisite for long-distance vMotion
     • However, vMotion does derive benefit from stretched clusters:
       • Intra-cluster vMotions are highly parallelized
       • Inter-cluster vMotions are serial
     • Using a stretched cluster could offer benefits to disaster avoidance use cases
  7. Stretched Cluster Considerations: Storage
     • Active/active data centers require read/write storage at both ends
       • Storage performance suffers otherwise
       • Storage vMotion is then required to fix the performance hit
     • There are a couple of different ways of addressing this requirement; each approach has its benefits/drawbacks
     • Solutions are generally limited to synchronous-replication distances (~100 km)
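The ~100 km limit for synchronous solutions falls out of light propagation in fiber: every synchronous write must complete a round trip before it is acknowledged. A quick back-of-the-envelope sketch (the propagation figure is an approximation, not from the deck):

```python
# Rough model of why synchronous replication is distance-limited.
# Light in fiber travels at roughly 2/3 the speed of light in vacuum,
# i.e. about 200 km per millisecond (approximate assumption).

FIBER_KM_PER_MS = 200.0

def sync_write_rtt_ms(distance_km: float) -> float:
    """Added round-trip propagation latency for one synchronous write."""
    return 2 * distance_km / FIBER_KM_PER_MS

for km in (10, 100, 400):
    print(f"{km:>4} km -> +{sync_write_rtt_ms(km):.1f} ms per write")
# At ~100 km each write already pays an extra millisecond, before any
# switching or array latency is added, which is why synchronous designs
# rarely stretch much further.
```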
  8. Stretched Cluster Considerations: Storage
     • One solution is the stretched SAN
       • Stretching the SAN fabric between locations will usually involve multiple VSANs and inter-VSAN routing
       • Typically read/write in one location (read-only in the other)
       • A cross-connect topology allows hosts on both sides to access read/write storage but introduces multipathing considerations
       • Implementations with only a single storage controller at each location create a single point of failure (SPoF)
  9. Stretched Cluster Considerations: Storage
     • Another solution is distributed virtual storage
       • Distributes storage in a read/write fashion across multiple sites
       • Uses data locality algorithms to maximize cache benefits
       • Typically uses multiple controllers in a scale-out fashion
       • Needs a clustered file system for simultaneous host access
       • Must address “split brain” scenarios
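“Split brain” is typically avoided by requiring a partition to see a strict majority of voting members (often including a third-site witness as tie-breaker between two equal data sites) before it keeps serving writes. A minimal sketch of that quorum logic, with illustrative vote counts:

```python
# Minimal quorum check of the kind distributed virtual storage platforms
# use to avoid "split brain": a partition may keep serving read/write I/O
# only if it can see a strict majority of the voting members.
# (Vote counts and the witness site are illustrative assumptions.)

def has_quorum(visible_votes: int, total_votes: int) -> bool:
    """True if this partition holds a strict majority of votes."""
    return visible_votes > total_votes // 2

# Two data sites (1 vote each) plus a third-site witness (1 vote).
TOTAL_VOTES = 3

print(has_quorum(2, TOTAL_VOTES))  # site + witness: keeps serving I/O
print(has_quorum(1, TOTAL_VOTES))  # isolated site: must stop writes
```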
  10. Stretched Cluster Considerations: HA
     • vSphere HA can provide cross-site failover
     • However, vSphere HA is not currently “site aware”
       • You can’t control the failover destination
       • You can’t designate or define things like:
         • Per-site failover capacity
         • Per-site failover hosts
         • Per-site admission controls
  11. Stretched Cluster Considerations: HA
     • In vSphere 4.x, HA-enabled stretched clusters should have no more than 8 hosts, or you’ll run afoul of HA primary node limitations
       • Deploy no more than 4 hosts per site per cluster
       • This ensures distribution of HA primary nodes across sites
       • There is no supported method to increase the number of primary nodes or to specify which hosts become HA primary nodes
     • vSphere 5 has a new HA architecture that eliminates this consideration
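The 8-host / 4-per-site rule is pigeonhole arithmetic: vSphere 4.x HA elects 5 primary nodes and their placement cannot be controlled, so capping each site at 4 hosts guarantees at least one primary survives a full site failure. A small sketch of the worst case:

```python
# Why the 4-hosts-per-site guideline works in vSphere 4.x: HA elects 5
# primary nodes on uncontrollable hosts. If no site holds more than 4
# hosts, at most 4 of the 5 primaries can share a site, so the other
# site always retains at least one primary.

HA_PRIMARIES = 5  # fixed number of HA primary nodes in vSphere 4.x

def surviving_primaries(failed_site_hosts: int) -> int:
    """Worst-case primaries left after losing a site of the given size."""
    return HA_PRIMARIES - min(failed_site_hosts, HA_PRIMARIES)

print(surviving_primaries(4))  # 4-host site fails: 1 primary survives
print(surviving_primaries(5))  # 5-host site fails: possibly 0 survive
```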
  12. Stretched Cluster Considerations: DRS
     • You can balance load across sites using vSphere DRS
     • However, vSphere DRS (like HA) is not “site aware”
       • DRS host affinity rules can mimic “site awareness”
       • DRS host affinity rules are not dynamic
       • DRS host affinity rules create administrative overhead
       • DRS host affinity rules are defined and managed on a per-cluster basis
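The “mimic site awareness” pattern groups each site’s hosts into a DRS host group and pins VMs to a group with a VM-to-host affinity rule. A sketch of the placement constraint this creates (host names, group names, and the filter are illustrative, not actual DRS internals):

```python
# Illustrative model of DRS host affinity rules mimicking site awareness:
# each site's hosts form a host group, and a per-VM rule restricts which
# group DRS may place the VM on. All names are hypothetical examples.

hosts = {"esx01": "site-a", "esx02": "site-a",
         "esx03": "site-b", "esx04": "site-b"}

host_groups = {
    "site-a-hosts": [h for h, site in hosts.items() if site == "site-a"],
    "site-b-hosts": [h for h, site in hosts.items() if site == "site-b"],
}

# "Should run on hosts in group" rules, one per VM -- these are static,
# which is exactly the administrative-overhead problem the slide notes.
vm_rules = {"app01": "site-a-hosts", "db01": "site-b-hosts"}

def placement_candidates(vm: str) -> list:
    """Hosts DRS may consider for this VM under its affinity rule."""
    return host_groups[vm_rules[vm]]

print(placement_candidates("app01"))  # only site-a hosts are eligible
```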
  13. Stretched Cluster Considerations: Storage DRS
     • vSphere 5 introduces Storage DRS
     • Like vSphere DRS, Storage DRS is not “site aware”
       • Manually align datastore clusters with the storage topology to avoid introducing unnecessary latency
       • User-defined storage capabilities and profile-driven storage could be used to help mimic site awareness
       • Watch for impact on storage replication/synchronization
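“Manually align datastore clusters with the storage topology” boils down to a simple invariant the administrator must enforce: no datastore cluster should mix datastores backed by different sites, or Storage DRS may migrate a VMDK across the inter-site link. A sketch of that check (datastore names and site labels are hypothetical):

```python
# Illustrative sanity check for Storage DRS in a stretched design:
# because Storage DRS is not site aware, each datastore cluster should
# contain only datastores backed by storage at a single site.
# Datastore names and site labels below are hypothetical.

datastore_sites = {"ds-a1": "site-a", "ds-a2": "site-a", "ds-b1": "site-b"}

def cluster_is_site_aligned(member_datastores: list) -> bool:
    """True if every member datastore lives at the same site."""
    return len({datastore_sites[ds] for ds in member_datastores}) == 1

print(cluster_is_site_aligned(["ds-a1", "ds-a2"]))  # safe: one site
print(cluster_is_site_aligned(["ds-a1", "ds-b1"]))  # cross-site: latency risk
```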
  14. Stretched Cluster Considerations: Networking
     • Stretched clusters add some networking complexity
       • More complex network configuration is required to provide Layer 2 adjacency (or its equivalent)
       • More complex networking is required to address routing issues created by VM mobility
       • Technologies to address these concerns are new (OTV, LISP) and require networking expertise to configure and maintain
     • What about the virtual network design?
  15. Stretched Cluster Considerations: Operations
     • Movement of VMs between sites (perhaps due to HA/DRS) could impact other areas:
       • Backups
       • Personnel/support
       • Disaster recovery/replication
       • Performance of multi-tier applications
  16. The Future of Stretched Clusters
     • HA/DRS are simultaneously the “greatest strength and greatest weakness” of stretched clusters
     • vSphere 5 improves the situation significantly
     • Stretched clusters would directly benefit from improvements to HA/DRS in these areas:
       • “Site awareness”
       • More scalable/dynamic DRS host affinity rule management (policy-based placement)
  17. The Future of Stretched Clusters
     • Stretched clusters will benefit from further networking developments such as:
       • LISP (or an equivalent) to decouple network routing from network identity
       • OTV, EoMPLS, or other equivalents to enable Layer 2 adjacency
       • It’s not yet clear how VXLAN will play in this space
     • Longer term, the requirement for Layer 2 adjacency itself needs to be addressed and resolved
  18. The Future of Stretched Clusters
     • Improvements in storage functionality will help stretched clusters:
       • Active/active read/write storage at greater distances
       • Better handling of “split brain” scenarios
       • Better/more direct integration with replication for topologies with >2 sites (Sync-Sync-Async, for example)