
Cisco's Journey to Cloud Native

Elastic Co
March 07, 2017

Cisco's Journey to Cloud Native

Cisco’s Commerce Platform, a suite of 35 applications and 300+ services, powers product configuration, pricing, quoting, export compliance, credit checks, and order booking across all Cisco product lines. With a high average financial value of each transaction, the Platform is a critical part of Cisco’s business.

In order to improve customer experience and business agility, Cisco decided to transition the Platform to cloud-native technologies, starting by modernizing the underlying database layer. In this session, Dharmesh shares how the team implemented a 180+ node Elasticsearch deployment for querying reference data, and the significant gains in resiliency, zero-downtime deployments, and performance they are now seeing.

Dharmesh Panchmatia | Director, IT | Cisco



Transcript

  1. Cisco Systems Inc | 7th March 2017
     Cisco Commerce – Journey to Cloud Native
     Dharmesh Panchmatia, Director IT
  2. Cisco Commerce Architecture
     • Cisco Commerce (New, Follow-on and Renewal): Configure Price Quote, Deal Approval, Order Capture, Order Orchestration
     • User Experience: Web, Mobile (iOS, Android), B2B (Web Services, EDI/XML, Punch-outs)
     • Delivery Platform / Customer Experience: Akamai Caching Services; Real User Experience / App Performance Mgmt (AppDynamics, Akamai, ELK)
     • Caching/Security: Oracle Access Manager / OAuth; Event Management Framework
     • Integrated Capabilities/Workflow (MDM, Services, Biz Rules & Policies): Config Engine, Deal Routing Rules, Party Service, Pricing Engine, Eligibility Service, Offer Mgmt, Notification Engine
     • Data Center / Virtualization (MVDC Capabilities):
       – Richardson (DC1) – UCS-B200 M2: Web, App Server – Active; DB – Active
       – Allen (DC2) – UCS-B200 M2: Web, App Server – Active; DB – Passive*
  3. Cisco Commerce by the Numbers
     • 40+B order bookings
     • 4-6M hits per day
     • 140K unique users
     • 300+ services
     • 36 applications
     • 4M lines of code
  4. Pain Points
     1. Multiple platform outages due to infrastructure, storage and network failures
     2. 10-12 hour downtimes for code deployments
     3. 6+ month effort for major software upgrades
     4. Lack of agility due to monolithic code base
  5. Objectives
     1. Zero downtime for software upgrades and code deployments
     2. Increased agility – move towards API-based architecture
     3. 5X performance improvement & scalability
     4. Fault tolerance for high application availability
  6. Journey to Cloud Native (architecture diagram)
     • Reference data source: DMPRD (RDBMS)
     • Transaction data store: primary + secondary nodes (N1-N5) spanning DC1, DC2 and DC3, with transaction sync between data centers
     • Order Capture on Tomcat in DC1 and DC2
     • Transaction data: Estimate, Quote, Order, Invoice, Exceptions, Preferences, List price, Logging
     • Downstream publish; X-functional services (73)
  7. Steps towards Cloud Native Design
     1. Zero Downtime Deployment
        • Leverage Blue-Green deployment to deploy front end / middle tier code
        • Use Consul as Service Registry
     2. Decouple dependency on other systems
        • Migrate from DB access to APIs
        • Influence other systems to provide fault-tolerant APIs
     3. Reference Data
        • Migrate all reference data to Elasticsearch
        • Build APIs to leverage reference data
     4. Transaction Data
        • MongoDB as primary database for transactions
        • Rewrite SQL code in middle tier (~800,000 lines of code)
     5. Downstream Publish
        • Leverage Kafka for publishing data to downstream systems
     6. Backward Compatibility
        • Feature flag based approach to switch back to the RDBMS in case of issues
        • Phased rollout to mitigate functional parity risk
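The feature-flag fallback in step 6 can be sketched as follows. This is a minimal illustration, not Cisco's implementation: the flag store and the two lookup functions are hypothetical stand-ins for the real Elasticsearch reference-data API and the legacy SQL path.

```python
# Sketch of the feature-flag fallback described in step 6: reads go to
# Elasticsearch by default, but a flag flips them back to the RDBMS if
# issues arise. Backend names and lookup functions are illustrative.

FLAGS = {"ref_data_backend": "elasticsearch"}  # set to "rdbms" on issues


def query_elasticsearch(key):
    # placeholder for the real Elasticsearch reference-data API call
    return {"source": "elasticsearch", "key": key}


def query_rdbms(key):
    # placeholder for the legacy SQL lookup kept for backward compatibility
    return {"source": "rdbms", "key": key}


def get_reference_data(key):
    """Route the lookup based on the feature flag; if the Elasticsearch
    path raises, degrade to the RDBMS for this and subsequent calls."""
    if FLAGS["ref_data_backend"] == "rdbms":
        return query_rdbms(key)
    try:
        return query_elasticsearch(key)
    except Exception:
        FLAGS["ref_data_backend"] = "rdbms"
        return query_rdbms(key)
```

Because the flag is checked per call, operators can flip one value to drain traffic back to the old backend without a redeploy, which is what makes a phased rollout reversible.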
  8. Migration by the Numbers
     • X-Functional Dependencies: changed 110 backend integrations to REST API calls
     • Delivery Timeline: 7 months without additional head count
     • Brownfield: delivered 26 business capabilities (10 H, 7 M, 9 L) along with this change
     • Code: rewrote 120K lines of back-end code; touched 600K lines of existing Java code
     • Risk Mitigation: backward compatibility; extensive shadow testing for B2B orders; phased rollout
  9. Benefits & Post Go-Live Status
     • Fault Tolerance: completed two quarters without issues; 3 infrastructure outages since go-live with no business impact
     • Zero Downtime: multiple upgrades without any downtime; software upgrades in minutes
     • 5x Performance Improvement: response times reduced to sub-second from 3-5 seconds earlier
     • Architecture Optimization: eliminated ~120,000 lines of BE code and 20 backend jobs
  10. Commerce Elasticsearch Foundation
     • Applications: Cisco Commerce Workspace, List Price, Build and Price, Quoting, Order, Accuprice, Configurator
     • Transaction and reference data sources: CG1 PROD, ODS PROD, DM PROD, CSM PROD, PDB PROD, MongoDB (Ref Data), Logging
     • Data sets: LEAD, Estimate, Quote, Order, Invoice, SW Subscription, CPR, Exceptions, SA/IB, Preferences, Profiles, List price, Disputes, Recommendations, CR, Logging
     • Eliminated 1.5TB of logging data
     • Elasticsearch footprint: 12 clusters in 2 data centers; 90 nodes; Elasticsearch version 2.2; VMs + SAN storage; Shield, Logstash, Kibana, Marvel; 120 indices, 414 shards, 2TB storage
  11. Elasticsearch Layout (two data centers)
     • Application JVMs route through a Global Site Selector to a per-data-center Load Balancer and Gateway
     • Data Center 1 and Data Center 2 each host Tomcat servers and an Elasticsearch cluster backed by SAN storage
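The failover behavior this two-data-center layout enables can be sketched as follows. This is a simplified model of what the global site selector does at the DNS/load-balancer level; the endpoints and health probe are hypothetical, not from the deck.

```python
# Sketch of cross-data-center failover implied by the layout above:
# prefer the primary data center's entry point, fall back to the
# second if it is unhealthy. Endpoints and probe are illustrative.

ENDPOINTS = ["https://dc1.example.com", "https://dc2.example.com"]


def probe(endpoint):
    # placeholder for a real health check (HTTP ping, cluster health API);
    # here it simply reports DC1 as healthy
    return endpoint.startswith("https://dc1")


def select_endpoint(endpoints, healthy=probe):
    """Return the first healthy endpoint in preference order, mirroring
    a global site selector steering traffic between data centers."""
    for ep in endpoints:
        if healthy(ep):
            return ep
    raise RuntimeError("no data center reachable")
```

With DC1 down, the same call with a probe that only passes DC2 returns the second endpoint, which is the "no business impact during infrastructure outages" property claimed on the benefits slide.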
  12. Elasticsearch Use Cases
     • Discounting Rule Engine: ability to add attributes dynamically; complex business capabilities such as bundle discounts; sub-second response
     • Reference Data Search: search for data sourced from multiple DBs; rank-based & type-ahead search on 30-40 attributes; sub-second response
     • Transaction Portal: single pane of glass for all transaction data; global search; ability to thread data (Deal – Order – Invoices); sub-second response
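The rank-based, type-ahead search over many attributes could be expressed as an Elasticsearch query along these lines. The field names are hypothetical (the deck only says 30-40 attributes are searched); this sketch just builds the query body, it does not call a cluster.

```python
# Sketch of a ranked, type-ahead search across multiple attributes,
# as an Elasticsearch query body: match_phrase_prefix gives the
# type-ahead behavior, and relevance scoring ranks the hits.
# Field names are illustrative, not from the deck.

def type_ahead_query(prefix, fields, size=10):
    """Build a bool/should query matching the user's prefix against
    each searchable attribute; any single field match qualifies."""
    return {
        "size": size,
        "query": {
            "bool": {
                "should": [
                    {"match_phrase_prefix": {field: prefix}}
                    for field in fields
                ],
                "minimum_should_match": 1,
            }
        },
    }


body = type_ahead_query("cat 65", ["product_name", "product_id", "description"])
```

For the constant-scoring case mentioned in the lessons slide, the same clauses could be wrapped in `constant_score` filters instead, trading relevance ranking for cheaper, cache-friendly queries.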
  13. Lessons Learned
     • Separate master nodes from data nodes
     • SAN/SSD is a must
     • Beware of noisy neighbors
     • Do not use multiple shards for small indices
     • Decide on constant scoring versus ranked queries based on use case
     • Remove indexing for non-searchable fields
     • Keep heap size below 32GB to reduce long GC times
     • For timestamp-based incremental indexing, use an offset
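The last lesson can be illustrated with a small sketch: when indexing incrementally by timestamp, query from a point slightly before the last watermark rather than exactly at it, so rows committed late (clock skew, in-flight transactions) are not missed. Names and the offset value here are illustrative assumptions, not from the deck.

```python
# Sketch of timestamp-based incremental indexing with an offset:
# re-read a small overlap window each pass so late-committing rows
# are picked up. Re-indexed duplicates are harmless when documents
# are upserted by ID. The 5-minute offset is an assumed example.

from datetime import datetime, timedelta

OFFSET = timedelta(minutes=5)  # overlap window; tune to commit latency


def incremental_window(last_run, now):
    """Return the [start, end) time range for the next indexing pass,
    starting OFFSET before the previous watermark on purpose."""
    start = last_run - OFFSET
    return start, now


start, end = incremental_window(
    datetime(2017, 3, 7, 12, 0), datetime(2017, 3, 7, 12, 15)
)
```

The overlap makes the pipeline idempotent-by-design: correctness no longer depends on every source row being committed before the watermark is read.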