Performance Automation Tools in Orion Health

This deck covers automation used to support performance testing in general, and specific tools developed and used at Orion Health. It details test data generation and test results analysis, and touches on nightly automated performance testing runs.

Viktoriia Kuznetcova

June 28, 2018

Transcript

  1. Agenda
     • Automation in Testing
     • Specifics of Automation in Performance Testing
     • Overview of automation tools used in Orion Health
     • Dive into test data generation
     • Dive into test data analysis
     • Dive into SWAT
  2. Test Automation
     • “In software testing, test automation is the use of special software (separate from the software being tested) to control the execution of tests and the comparison of actual outcomes with predicted outcomes.” © Wikipedia
     • Exploratory testing – cannot be automated
     • “Check” automation – checking that outcomes are as expected
     • Works sometimes for unit testing, sometimes for regression testing
  3. Automation in Testing
     We automate anything that helps in testing and does not require human intelligence:
     – Preparing and validating the test environment
     – Generating test data
     – Monitoring the system under test
     – Gathering information about production: from underlying data to user workflows and any issues
     – Analyzing raw test results, producing meaningful human-readable output
     – …
  4. Challenges in Performance Testing
     • Complex production-like, often disposable, test environments
     • Test data that is production-like in volume, complexity and variability
     • Complex user workflows that need to be simulated with automation at high volumes
     • Monitoring can be complicated – lots of nodes and metrics to gather
     • Results analysis – too much information means it is easy to miss important details
  5. What do we automate?
     • Spinning up a test environment: AWS, Puppet, Ansible, bash
     • Generating test data: Data Pot, other in-house tools, PL/SQL
     • Generating user load: Apache JMeter, Gatling, WebPageTest, in-house tools
     • Monitoring: AWS CloudWatch, sar, Capt. Morgan, Elasticsearch, perfmon, etc.
     • Processing and analyzing test results: R, Scala, Splunk
     • Automating simplified performance testing for nightly builds: Ansible, bash, AWS CLI
  6. Test Environment Automation
     • Infrastructure-level automation: AWS CloudFormation (see the sketch below)
     • Hardware-level automation: AWS EC2 instances, RDS instances
     • OS-level automation: AMIs come with EC2 instances; Puppet and/or Ansible can install and configure the rest
     • Application-level automation: the in-house tool Graviton glues together and drives automation for deploying and configuring the specific applications that make up the system under test
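A minimal sketch of the infrastructure-level step, assuming a hypothetical CloudFormation template perf-env.yaml, stack name and Ansible playbook; the application-level work that Graviton does is not shown:

#!/usr/bin/env bash
set -euo pipefail

STACK=perf-env-nightly     # hypothetical stack name
TEMPLATE=perf-env.yaml     # hypothetical template describing the EC2/RDS nodes

# Create the stack and block until AWS reports it is fully provisioned
aws cloudformation create-stack \
  --stack-name "$STACK" \
  --template-body "file://$TEMPLATE" \
  --parameters ParameterKey=EnvSize,ParameterValue=small   # hypothetical template parameter
aws cloudformation wait stack-create-complete --stack-name "$STACK"

# Hand the provisioned hosts over to configuration management (Ansible here)
ansible-playbook -i inventory/perf-env configure-apps.yml   # hypothetical inventory and playbook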
  7. Test Data – Problem Statement
     • Clinical data – complex, rich
     • One way to get good data: data from production – rarely applicable, because it is hard to anonymize it, and legally impossible to use as is
     • Another way is to generate data resembling production data:
       – Similar volumes for all relevant data types
       – Similar variability for all relevant data types and fields
       – Similar data distributions
       – Realistic values, where the behavior of the system is data-driven
  8. Test Data – Solution: Data Pot
     • In-house tool, but the principles can be applied in a wider context
     • “Cooks” data in the internal format inside an Oracle database, using PL/SQL and reference tables
     • The data is then transformed into the format the system expects via Orion Health Rhapsody
     • Resulting dishes are fed to the system, which populates internal databases as necessary
     • Schema dumps are taken and reused
     (Diagram on slide: Oracle → Rhapsody → System under test; a command-line sketch follows)
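A rough sketch of how such a pipeline can be glued together from the command line (script, schema and credential names are hypothetical; the real Data Pot and the Rhapsody integration are more involved):

#!/usr/bin/env bash
set -euo pipefail

# 1. "Cook" the raw data inside Oracle using PL/SQL packages and reference tables
sqlplus -s datapot/secret@ORCL @generate_patients.sql      # hypothetical generation script
sqlplus -s datapot/secret@ORCL @generate_lab_results.sql   # layered: lab results depend on patients

# 2. Rhapsody picks the generated rows up and feeds messages to the system under test
#    (configured in the Rhapsody engine itself, not shown here)

# 3. Once the system has ingested the data, dump the populated schema for reuse
expdp system/secret@ORCL schemas=CLINICAL_DATA \
      directory=DATA_PUMP_DIR dumpfile=clinical_data.dmp logfile=clinical_data.log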
  9. Data Pot: Features
     • Full control over major data distributions and volumes
     • Randomized values for all of the relevant fields
     • Easy to extend and customize data before and after each stage
     • Layered data generation: allows for complex logic where there are inter-data dependencies (e.g. lab results depend on the type of lab tests)
     • Data content is de-coupled from data format
     • Fast generation of huge data volumes: performance is mostly limited by the system under test, everything else can be scaled
  10. Data Pot: Basic Principles
      • Start with understanding production data
      • Design a data model that accounts for all properties of the production data you want to cover
      • Generate data in layers
      • Start simple, add complexity as you go
      • Remember about the performance of data generation itself
  11. Test Data: Additional Considerations
      • Production data changes over time; the test environment should reflect that
      • The test data actually used during a test run matters: it needs to represent various users and workflows to give good test coverage
      • Use your understanding of production data to decide how to slice test data
      • Use SQL to find representative users/test data sets to actually use in the testing (see the sketch below)
      • It doesn’t matter if it’s performance testing or functional testing – the principles stand
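A small sketch of the “use SQL to find representative users” idea, run through sqlplus (the table, columns and credentials are hypothetical):

#!/usr/bin/env bash
set -euo pipefail

# Bucket patients into light/average/heavy by record count, so the chosen
# test data set covers different user and workflow profiles.
sqlplus -s clinical/secret@ORCL <<'SQL'
SET PAGESIZE 0 FEEDBACK OFF
SELECT patient_id, COUNT(*) AS records,
       NTILE(3) OVER (ORDER BY COUNT(*)) AS load_bucket   -- 1 = light, 3 = heavy
FROM   clinical_documents
GROUP  BY patient_id
ORDER  BY load_bucket, records;
SQL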
  12. Test Load
      • For web applications we use JMeter, Gatling and WebPageTest (see the JMeter sketch below)
      • JMeter and Gatling generate server load at the protocol level; they do not emulate a browser
      • WebPageTest uses real browsers, but doesn’t scale well
      • Testing is not automated – creating the load and measuring the results is!
      • To understand what to model, one can use production access logs
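A minimal sketch of driving a protocol-level load test with JMeter’s non-GUI mode (the test plan name and the users/duration properties are hypothetical and must match properties the .jmx actually reads, e.g. via __P(users)):

#!/usr/bin/env bash
set -euo pipefail

mkdir -p results logs

# Run the test plan headless; -J passes properties into the plan,
# -l writes the raw results log, -j the JMeter run log.
jmeter -n -t clinical_portal_load.jmx \
       -Jusers=200 -Jduration=3600 \
       -l "results/portal_$(date +%Y%m%d_%H%M).jtl" \
       -j logs/jmeter.log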
  13. Monitoring
      • There are many tools for all levels of monitoring (an OS-level sar sketch follows)
      • Feeding monitoring output into a tool like ELK/Prometheus/New Relic/etc. makes it easier to see patterns and to dig into metrics retrospectively and during the test
      • Real User Monitoring is very useful, but needs to be built into the code. Alternative – Captain Morgan
      • Another alternative is monitoring the Apache access.log or something similar
      • Building good logging into the application greatly improves testability
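A small sketch of OS-level monitoring with sar for the duration of a test run (interval, duration and output paths are arbitrary choices):

#!/usr/bin/env bash
set -euo pipefail

DURATION=3600   # seconds the test will run
INTERVAL=10     # sampling interval in seconds
SAMPLES=$((DURATION / INTERVAL))

mkdir -p monitoring
# CPU, memory and network utilisation sampled for the whole test,
# collected in the background on each node under test.
sar -u "$INTERVAL" "$SAMPLES" > monitoring/cpu.log &
sar -r "$INTERVAL" "$SAMPLES" > monitoring/memory.log &
sar -n DEV "$INTERVAL" "$SAMPLES" > monitoring/network.log &
wait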
  14. Processing Test Results
      • Types of test results we see in performance testing:
        – JMeter and Gatling logs with response times, response codes, server errors, etc.
        – Application logs (we have 3 types of logs with various information)
        – Access logs
        – GC logs
        – AWR reports
      • Analysis we do: aggregation, finding high resource utilization items, correlating events from different logs with each other, making sure the test load was as designed
      • Tools: fast grep and awk sometimes, Excel sometimes, Splunk sometimes, but mostly R and Scala/Java in-house apps (see the awk sketch below)
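One of the “fast grep and awk” style aggregations as a rough sketch: 95th-percentile response time per request label from a JMeter results file (assumes GNU awk for arrays of arrays and the default JMeter CSV column order timeStamp,elapsed,label,…):

gawk -F, 'NR > 1 { n[$3]++; t[$3][n[$3]] = $2 + 0 }   # skip the CSV header row
END {
  for (label in n) {
    delete sorted
    asort(t[label], sorted)                 # numeric sort of elapsed times
    idx = int(0.95 * n[label]); if (idx < 1) idx = 1
    printf "%-40s p95=%d ms (samples=%d)\n", label, sorted[idx], n[label]
  }
}' results.jtl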
  15. Examples of Data Processing
      • Cleaning up access logs (remove PHI, enable aggregation) – see the sketch below
      • Aggregating JMeter/Gatling results (percentiles)
      • Analyzing application logs to find issues
      • Analyzing HAR files to find issues with caching and gzip
      • Analyzing application configuration to find issues (best-practices adherence)
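A rough sketch of the access-log clean-up step (the URL pattern for patient identifiers is hypothetical; real logs need one pattern per PHI-bearing field):

#!/usr/bin/env bash
set -euo pipefail

# Replace patient identifiers in request URLs with a fixed token, so the log
# can be shared safely and aggregated by endpoint rather than by patient.
sed -E 's#/patients/[0-9]+#/patients/{id}#g' access.log > access.clean.log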
  16. Pritcel
      • Pritcel – a performance test results report that highlights:
        – Slow requests (from JMeter/Gatling logs) and events (from app logs)
        – Slow pages (from WebPageTest)
        – Slow SQL queries (from AWR reports)
        – Long GC pauses (from GC logs)
        – Internal resource contention – DB connection pools, caches (from app logs)
        – Error rates (from JMeter/Gatling logs and from app logs)
        – Concurrency levels (from JMeter logs)
        – Response times, aggregated and detailed throughout the test
  17. SWAT – CI Performance Automation
      • Meant to help developers measure performance for each new build and get quick feedback
      • Does the whole workflow in a simplified form, from creating the environment and preparing test data to running the tests and processing results (see the sketch below)
      • Version 1 uses bash as the glue; version 2 uses Ansible
      • PEU owns the automation; developers own using it for their specific project and monitoring the results
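A simplified sketch of what such bash glue can look like for a nightly run (the step scripts are hypothetical stand-ins for the real SWAT steps; version 2 replaces this glue with Ansible playbooks):

#!/usr/bin/env bash
set -euo pipefail

BUILD_ID=${1:?usage: $0 <build-id>}

./create_environment.sh   "$BUILD_ID"   # CloudFormation + Ansible, as sketched earlier
./load_test_data.sh       "$BUILD_ID"   # restore the reusable Data Pot schema dump
./run_load_test.sh        "$BUILD_ID"   # JMeter non-GUI run against the new build
./process_results.sh      "$BUILD_ID"   # aggregate and compare with the previous build
./teardown_environment.sh "$BUILD_ID"   # environments are disposable, so delete the stack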
  18. Contact Details
      • https://testinglass.blogspot.com
      • https://twitter.com/miss-hali
      • [email protected]
      • [email protected]
      Questions?