Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Virtual mutation analysis of relational database schemas

Virtual mutation analysis of relational database schemas

Interested in learning more about this topic? Visit this web site to read the paper: https://www.gregorykapfhammer.com/research/papers/McMinn2016a/

Gregory Kapfhammer

May 14, 2016
Tweet

More Decks by Gregory Kapfhammer

Other Decks in Technology

Transcript

  1. Virtual Mutation Analysis of Relational Database Schemas Phil McMinn
 Gregory

    M. Kapfhammer
 Chris J. Wright University of Sheffield
 Allegheny College
 University of Sheffield
  2. Relational Databases – 
 Why Should We (Still) Care? A

    vital component of many software systems Despite the wave of interest in “NoSQL” technologies, Relational Databases are still popular (and faster) For developers: schemas provide self-documentation
  3. Relational Databases – 
 Why Should We (Still) Care? Relational

    Databases are still 
 important, popular and relevant A vital component of many software systems Despite the wave of interest in “NoSQL” technologies, Relational Databases are still popular (and faster) For developers: schemas provide self-documentation
  4. Testing the Schema Correctly accepted by the schema Correctly rejected

    by the schema Correctly rejected by the schema
  5. Testing the Schema Correctly accepted by the schema Correctly rejected

    by the schema Correctly rejected by the schema
  6. Testing the Schema Correctly accepted by the schema Correctly rejected

    by the schema Correctly rejected by the schema
  7. Testing the Schema Correctly accepted by the schema Correctly rejected

    by the schema Correctly rejected by the schema
  8. Why Do We Need to Do This? To trap common

    errors when designing a schema
 For example: lack of uniqueness property on usernames, out of range values
  9. Why Do We Need to Do This? To trap common

    errors when designing a schema
 For example: lack of uniqueness property on usernames, out of range values To test development behaviour vs deployment
 DBMSs have subtly different behaviors
  10. Why Do We Need to Do This? To trap common

    errors when designing a schema
 For example: lack of uniqueness property on usernames, out of range values To test development behaviour vs deployment
 DBMSs have subtly different behaviors Nobody throws away a database of data
 To test the success of database migrations
  11. Why Do We Need to Do This? To trap common

    errors when designing a schema
 For example: lack of uniqueness property on usernames, out of range values To test development behaviour vs deployment
 DBMSs have subtly different behaviors Nobody throws away a database of data
 To test the success of database migrations Industry advice
 Destroying database consistency can have huge cost implications
  12. Mutation Analysis Once a test suite has been created, its

    fault finding capability can be estimated with mutation analysis. For relational database schema testing, mutants are created by making small changes to the schema
  13. Mutation Analysis Once a test suite has been created, its

    fault finding capability can be estimated with mutation analysis. For relational database schema testing, mutants are created by making small changes to the schema
  14. Mutation Analysis Once a test suite has been created, its

    fault finding capability can be estimated with mutation analysis. For relational database schema testing, mutants are created by making small changes to the schema
  15. Mutation Analysis Once a test suite has been created, its

    fault finding capability can be estimated with mutation analysis. For relational database schema testing, mutants are created by making small changes to the schema
  16. Mutation Analysis is Costly schema mutants mutant test suite database

    Mutant killed / alive + SchemaAnalyst High cost of communicating with the DBMS and executing SQL queries on it
  17. Reducing the Cost schema mutants mutant test suite database Mutant

    killed / alive + SchemaAnalyst Local, no communication overhead
  18. Reducing the Cost schema mutants mutant test suite database Mutant

    killed / alive + SchemaAnalyst model of Virtual Mutation
 Analysis
  19. Reducing the Cost schema mutants mutant test suite database Mutant

    killed / alive + SchemaAnalyst model of Virtual Mutation
 Analysis Lower execution overhead
  20. The Model Form an acceptance predicate for the table: icp2

    icp3 icp4 icp5 ap = icp1 /\ icp2 /\ icp3 /\ icp4 /\ icp5 Integrity constraint predicate icp1
  21. The Model Form an acceptance predicate for the table: icp2

    icp3 icp4 icp5 ap = icp1 /\ icp2 /\ icp3 /\ icp4 /\ icp5 Integrity constraint predicate icp1 True when DBMS would accept the data False otherwise
  22. Empirical Study RQ1. What is the relative efficiency of the

    virtual approach? RQ2. What are the time savings? RQ3. How do mutation scores compare when the standard approach is run for as long as the virtual one?
  23. RQ1: Efficiency Standard Virtual 1000 100000 1000 100000 1000 100000

    HyperSQL PostgreSQL SQLite C offeeO rders Em ployee Inventory Iso3166 JW hoisServer M ozillaPerm issions N istW eather Person Products C offeeO rders Em ployee Inventory Iso3166 JW hoisServer M ozillaPerm issions N istW eather Person Products Database Schema Mutation Analysis Time (Log Transformed)
  24. RQ1: Efficiency Standard Virtual 1000 100000 1000 100000 1000 100000

    HyperSQL PostgreSQL SQLite C offeeO rders Em ployee Inventory Iso3166 JW hoisServer M ozillaPerm issions N istW eather Person Products C offeeO rders Em ployee Inventory Iso3166 JW hoisServer M ozillaPerm issions N istW eather Person Products Database Schema Mutation Analysis Time (Log Transformed) Virtual Mutation Analysis is significantly more efficient for Postgres and HyperSQL, but not SQLite
  25. RQ2: Time Savings −100 −50 0 50 100 50 100

    150 Number of Mutants Percentage of Mean Time Saved HyperSQL PostgreSQL SQLite
  26. RQ2: Time Savings −100 −50 0 50 100 50 100

    150 Number of Mutants Percentage of Mean Time Saved HyperSQL PostgreSQL SQLite Virtual Mutation Analysis yields large time savings for Postgres and HyperSQL but not always with SQLite, leading to an average time saving of 51% overall
  27. HyperSQL PostgreSQL SQLite 0 50 100 150 C offeeO rders

    Em ployee Inventory Iso3166 JW hoisServer M ozillaPerm issions N istW eather Person Products C offeeO rders Em ployee Inventory Iso3166 JW hoisServer M ozillaPerm issions N istW eather Person Products C offeeO rders Em ployee Inventory Iso3166 JW hoisServer M ozillaPerm issions N istW eather Person Products Database Schema Total Number of Mutants Selective Virtual RQ3: Comparison
  28. HyperSQL PostgreSQL SQLite 0 50 100 150 C offeeO rders

    Em ployee Inventory Iso3166 JW hoisServer M ozillaPerm issions N istW eather Person Products C offeeO rders Em ployee Inventory Iso3166 JW hoisServer M ozillaPerm issions N istW eather Person Products C offeeO rders Em ployee Inventory Iso3166 JW hoisServer M ozillaPerm issions N istW eather Person Products Database Schema Total Number of Mutants Selective Virtual RQ3: Comparison Virtual Mutation Analysis evaluates more mutants
  29. RQ3: Comparison Selective Virtual 0.00 0.25 0.50 0.75 1.00 0.00

    0.25 0.50 0.75 1.00 0.00 0.25 0.50 0.75 1.00 HyperSQL PostgreSQL SQLite C offeeO rders Em ployee Inventory Iso3166 JW hoisServer M ozillaPerm issions N istW eather Person Products C offeeO rders Em ployee Inventory Iso3166 JW hoisServer M ozillaPerm issions N istW eather Person Products Database Schema Mutation Score
  30. RQ3: Comparison Selective Virtual 0.00 0.25 0.50 0.75 1.00 0.00

    0.25 0.50 0.75 1.00 0.00 0.25 0.50 0.75 1.00 HyperSQL PostgreSQL SQLite C offeeO rders Em ployee Inventory Iso3166 JW hoisServer M ozillaPerm issions N istW eather Person Products C offeeO rders Em ployee Inventory Iso3166 JW hoisServer M ozillaPerm issions N istW eather Person Products Database Schema Mutation Score Virtual Mutation Analysis is the best option when highly accurate scores are needed under a time constraint
  31. Conclusions Virtual Mutation Analysis Technique: Removes the need to use

    a real DBMS for relational database schema mutation testing More cost-effective while still being accurate: • More efficient for 22 of 27 configurations studied • Yields time savings of 13 to 99%