Slide 1

Slide 1 text

Intramorphic Testing: A New Approach to the Test Oracle Problem
Manuel Rigger, Zhendong Su
https://nus-test.github.io/
https://ast.ethz.ch/

Slide 2

Slide 2 text

2 This Talk
• Comparison with differential testing and metamorphic testing
• Broad vision that might inspire future research
• Illustrative examples
• Intramorphic Testing as a general white-box methodology for tackling the test oracle problem

Slide 3

Slide 3 text

3 Motivating Example
Is this program correct?

def bubble_sort(arr):
    length = len(arr)
    for i in range(length):
        for j in range(0, length - i - 1):
            if arr[j] > arr[j+1]:
                arr[j], arr[j+1] = arr[j+1], arr[i]

Slide 5

Slide 5 text

5 Unit Testing
Let’s write a unit test!

arr = [3, 1, 2]
bubble_sort(arr)
assert arr == [1, 2, 3]

Slide 6

Slide 6 text

6 Unit Testing
Let’s write a unit test!

arr = [3, 1, 2]
bubble_sort(arr)
assert arr == [1, 2, 3]

AssertionError: [1, 2, 1]
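The failing assertion can be traced to the swap on the last line of bubble_sort, which reads arr[i] where the neighbouring element arr[j] is needed. The following is a minimal sketch of a corrected version, assuming in-place sorting as in the unit test; the name bubble_sort_fixed is introduced here for illustration and does not appear in the slides.

```python
def bubble_sort_fixed(arr):
    """In-place bubble sort; hypothetical corrected variant of the slide's bubble_sort."""
    length = len(arr)
    for i in range(length):
        for j in range(0, length - i - 1):
            if arr[j] > arr[j + 1]:
                # Swap the two neighbours; the slide's version reads arr[i] here.
                arr[j], arr[j + 1] = arr[j + 1], arr[j]


arr = [3, 1, 2]
bubble_sort_fixed(arr)
assert arr == [1, 2, 3]
```

With this change the unit test from the slide passes, but a single hand-written test says little about other inputs, which is what motivates automating the oracle.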

Slide 7

Slide 7 text

7 Unit Testing
Can we automate the testing process?

arr = [3, 1, 2]
bubble_sort(arr)
assert arr == [1, 2, 3]

Slide 8

Slide 8 text

8 Automated Testing Challenges
• Test Case
• Test Oracle

Slide 9

Slide 9 text

9 Test Oracle
“a test oracle (or just oracle) is a mechanism for determining whether a test has passed or failed”
Incorrect result!

Slide 10

Slide 10 text

10 Automated Testing
Can we automate the testing process?

arr = [3, 1, 2]              # Test Case
bubble_sort(arr)
assert arr == [1, 2, 3]

Slide 11

Slide 11 text

11 Automated Testing
Can we automate the testing process?

arr = [3, 1, 2]
bubble_sort(arr)
assert arr == [1, 2, 3]      # Test Oracle

Slide 12

Slide 12 text

12 Intramorphic Testing
[Diagram: input I is passed to program P and to a modified program P'; their outputs O and O'' are oracles for each other]
We propose Intramorphic Testing as a general white-box methodology to tackle the test oracle problem

Slide 13

Slide 13 text

13 Background: General Methodologies
• Differential Testing: input I is passed to interchangeable programs P1, P2, and P3, whose outputs O1, O2, and O3 are oracles for each other: P1 (I) = P2 (I) = P3 (I)
• Metamorphic Testing: a follow-up input I' is derived from I; the outputs O and O' of program P are oracles for each other
• Intramorphic Testing: program P is modified into P'; the outputs O and O'' are oracles for each other
These are general methodologies whose concrete realizations require domain-specific insights

Slide 16

Slide 16 text

16 Background: Differential Testing Many sorting algorithms and implementations exist, so we can validate that they produce the same results

Slide 17

Slide 17 text

17 Background: Differential Testing
[Diagram: the input [3, 1, 2] is passed to bubble_sort, merge_sort, and insertion_sort; each is expected to output [1, 2, 3], so the outputs are oracles for each other]
Interchangeable programs: P1 (I) = P2 (I) = P3 (I)
Differential testing is a black-box technique that works well when systems implement the same behavior

Slide 18

Slide 18 text

18 Background: Differential Testing

sorting_algorithms = [bubble_sort, merge_sort, insertion_sort]
while True:
    arr = get_random_array()  # e.g., [3, 1, 2]
    # Each algorithm gets its own copy so that in-place sorting cannot affect the others
    sorted_arrays = [alg(arr.copy()) for alg in sorting_algorithms]
    all_same = all(sorted_arr == sorted_arrays[0] for sorted_arr in sorted_arrays)
    assert all_same

Slide 22

Slide 22 text

22 Background: Differential Testing

sorting_algorithms = [bubble_sort, merge_sort, insertion_sort]
while True:
    arr = get_random_array()  # e.g., [3, 1, 2]
    # Each algorithm gets its own copy so that in-place sorting cannot affect the others
    sorted_arrays = [alg(arr.copy()) for alg in sorting_algorithms]
    all_same = all(sorted_arr == sorted_arrays[0] for sorted_arr in sorted_arrays)
    assert all_same

AssertionError: [[1, 2, 1], [1, 2, 3], [1, 2, 3]]
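The loop above leaves get_random_array, merge_sort, and insertion_sort undefined. The following is a self-contained sketch of the same differential-testing idea. It makes several assumptions that are not in the slides: every sorting function returns its result, the generator takes an explicit random.Random instance, the loop is bounded, and the helper find_disagreement is a hypothetical name.

```python
import random


def bubble_sort(arr):
    """Buggy bubble sort from the motivating example; returns the list for convenience."""
    length = len(arr)
    for i in range(length):
        for j in range(0, length - i - 1):
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[i]  # bug kept on purpose
    return arr


def insertion_sort(arr):
    """In-place insertion sort; returns the list."""
    for i in range(1, len(arr)):
        key = arr[i]
        j = i - 1
        while j >= 0 and arr[j] > key:
            arr[j + 1] = arr[j]
            j -= 1
        arr[j + 1] = key
    return arr


def merge_sort(arr):
    """Top-down merge sort that returns a new sorted list."""
    if len(arr) <= 1:
        return list(arr)
    mid = len(arr) // 2
    left, right = merge_sort(arr[:mid]), merge_sort(arr[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]


def get_random_array(rng, max_len=6):
    """Hypothetical generator: up to max_len random digits."""
    return [rng.randint(0, 9) for _ in range(rng.randint(0, max_len))]


def find_disagreement(algorithms, rng, attempts=1000):
    """Return (input, outputs) for the first input on which the outputs differ, else None."""
    for _ in range(attempts):
        arr = get_random_array(rng)
        outputs = [alg(arr.copy()) for alg in algorithms]
        if any(out != outputs[0] for out in outputs):
            return arr, outputs
    return None
```

Calling find_disagreement([bubble_sort, merge_sort, insertion_sort], random.Random(0)) returns an input together with the three outputs, which makes it easy to see which implementation deviates from the majority.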

Slide 23

Slide 23 text

23 Background: Differential Testing Applications
• C/C++ compilers [Yang et al., PLDI 2011]
• Java Virtual Machines (JVMs) [Chen et al., PLDI 2016 and ICSE 2019]
• Database Engines [Slutz, VLDB 1998]
• Debuggers [Lehmann et al., ESEC/FSE 2018]
• Code coverage tools [Yang et al., ICSE 2019]
• SMT solvers [Winterer, Zhang, et al., OOPSLA 2020]
• Object Relational Mappers (ORMs) [Sotiropoulos et al., ICSE 2021]
• …

Slide 24

Slide 24 text

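24 Background: General Methodologies
• Differential Testing: input I is passed to interchangeable programs P1, P2, and P3, whose outputs O1, O2, and O3 are oracles for each other: P1 (I) = P2 (I) = P3 (I)
• Metamorphic Testing: a follow-up input I' is derived from I; the outputs O and O' of program P are oracles for each other
• Intramorphic Testing: program P is modified into P'; the outputs O and O'' are oracles for each other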

Slide 25

Slide 25 text

25 Background: Metamorphic Testing The relative order of sorted elements is maintained when an element is removed from an input array

Slide 26

Slide 26 text

26 Background: Metamorphic Testing
• [3, 1, 2] -> Sort -> [1, 2, 3] -> Remove 2 -> [1, 3]
• [3, 1, 2] -> Remove 2 -> [3, 1] -> Sort -> [1, 3]
The two results are oracles for each other

Slide 27

Slide 27 text

27 Background: Metamorphic Testing

while True:
    arr = get_random_array()
    if len(arr) >= 1:
        # Sort a copy, because arr itself is modified below
        sorted_arr = bubble_sort(arr.copy())
        random_elem = random.choice(sorted_arr)
        arr.remove(random_elem)
        sorted_smaller_arr = bubble_sort(arr)
        sorted_arr.remove(random_elem)
        assert sorted_arr == sorted_smaller_arr

Slide 28

Slide 28 text

28 Background: Metamorphic Testing

while True:
    arr = get_random_array()
    if len(arr) >= 1:
        # Sort a copy, because arr itself is modified below
        sorted_arr = bubble_sort(arr.copy())
        random_elem = random.choice(sorted_arr)
        arr.remove(random_elem)
        sorted_smaller_arr = bubble_sort(arr)
        sorted_arr.remove(random_elem)
        assert sorted_arr == sorted_smaller_arr

AssertionError: [2, 1] [2, 3]
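The reported failure can be reproduced deterministically for the input [3, 1, 2] when the removed element is 1. The following is a minimal sketch, assuming that bubble_sort returns the list it sorted; the helper removal_relation_holds is a hypothetical name introduced for illustration.

```python
def bubble_sort(arr):
    """Buggy bubble sort from the motivating example; returns the list for convenience."""
    length = len(arr)
    for i in range(length):
        for j in range(0, length - i - 1):
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[i]  # bug kept on purpose
    return arr


def removal_relation_holds(sort, arr, elem):
    """Metamorphic relation: sort-then-remove equals remove-then-sort.

    elem must occur both in arr and in the output of sort(arr).
    """
    sorted_arr = sort(arr.copy())
    sorted_arr.remove(elem)             # remove from the sorted output
    smaller_arr = arr.copy()
    smaller_arr.remove(elem)            # derived follow-up input
    sorted_smaller_arr = sort(smaller_arr)
    return sorted_arr == sorted_smaller_arr


# For [3, 1, 2] and elem 1 the two sides are [2, 1] and [2, 3], as on the slide.
assert not removal_relation_holds(bubble_sort, [3, 1, 2], 1)
# Python's built-in sorted satisfies the relation on the same input.
assert removal_relation_holds(sorted, [3, 1, 2], 1)
```

Neither side of the comparison needs to know the correct sorted output; the relation between the two runs is the oracle.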

Slide 29

Slide 29 text

29 Background: Metamorphic Testing
The original tech report illustrated the technique using small examples; it inspired the intramorphic testing paper and its content

Slide 30

Slide 30 text

30 Background: Metamorphic Testing Applications
• Compilers [Le et al., PLDI 2014]
• Database engines [Rigger et al., ESEC/FSE 2020 and OOPSLA 2020]
• SMT solvers [Winterer, Zhang, et al., PLDI 2020]
• Android apps [Su et al., OOPSLA 2021]
• Object detection systems [Wang et al., ASE 2020]
• …

Slide 31

Slide 31 text

31 Problem: No Whitebox Technique
• Differential Testing: input I is passed to interchangeable programs P1, P2, and P3, whose outputs O1, O2, and O3 are oracles for each other: P1 (I) = P2 (I) = P3 (I)
• Metamorphic Testing: a follow-up input I' is derived from I; the outputs O and O' of program P are oracles for each other
• Intramorphic Testing: program P is modified into P'; the outputs O and O'' are oracles for each other
?

Slide 32

Slide 32 text

32 Intramorphic Testing
[Diagram: input I is passed to program P and to a modified program P'; their outputs O and O'' are oracles for each other]
We propose Intramorphic Testing as a white-box methodology aiming to complement differential testing and metamorphic testing

Slide 33

Slide 33 text

33 Intramorphic Testing
[Diagram: program P, made up of components C1, ..., Ci, ..., Cn-1, Cn, produces output O]
It is intuitive to think of a program P as consisting of individual components C1 to Cn

Slide 34

Slide 34 text

34 Intramorphic Testing
Replacing a specific component in a system might have an anticipated effect on the overall system
[Diagram: the intramorphic transformation Ci'/Ci replaces component Ci of P (components C1, ..., Ci, ..., Cn-1, Cn; output O) with Ci', giving P' = P[Ci'/Ci] with output O'. The outputs O and O' are test oracles for each other]

Slide 35

Slide 35 text

35 Intramorphic Testing
After implementing a reverse sorting function, reversing the output of either function should yield the output of the other

Slide 36

Slide 36 text

36 Intramorphic Testing
• [3, 1, 2] -> bubble_sort -> [1, 2, 3]
• [3, 1, 2] -> bubble_sort_reverse -> [3, 2, 1] -> Reverse -> [1, 2, 3]
The two results are oracles for each other

Slide 37

Slide 37 text

37 Intramorphic Testing

while True:
    arr = get_random_array()
    sorted_arr = bubble_sort(arr.copy())
    reverse_sorted_arr = bubble_sort_reverse(arr.copy())
    # list.reverse() works in place and returns None, so compare against a reversed copy
    assert sorted_arr[::-1] == reverse_sorted_arr

Slide 38

Slide 38 text

38 Intramorphic Testing

while True:
    arr = get_random_array()
    sorted_arr = bubble_sort(arr.copy())
    reverse_sorted_arr = bubble_sort_reverse(arr.copy())
    # list.reverse() works in place and returns None, so compare against a reversed copy
    assert sorted_arr[::-1] == reverse_sorted_arr

AssertionError: [1, 2, 1] [3, 2, 3]
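The slides do not show bubble_sort_reverse. One way to obtain it is to treat the comparison as the component being replaced, so that both variants share the rest of the code, including the bug. The following is a sketch under that assumption; the shared helper _bubble_sort and the function intramorphic_check are hypothetical names, and the functions are assumed to return the list they sorted.

```python
import operator


def _bubble_sort(arr, out_of_order):
    """Shared body; out_of_order is the comparison component that gets replaced."""
    length = len(arr)
    for i in range(length):
        for j in range(0, length - i - 1):
            if out_of_order(arr[j], arr[j + 1]):
                arr[j], arr[j + 1] = arr[j + 1], arr[i]  # bug kept on purpose
    return arr


def bubble_sort(arr):
    """Original program P: ascending order."""
    return _bubble_sort(arr, operator.gt)


def bubble_sort_reverse(arr):
    """Modified program P': the comparison > is replaced by <."""
    return _bubble_sort(arr, operator.lt)


def intramorphic_check(arr):
    """Reversing the ascending result should give the descending result."""
    sorted_arr = bubble_sort(arr.copy())
    reverse_sorted_arr = bubble_sort_reverse(arr.copy())
    return sorted_arr[::-1] == reverse_sorted_arr


assert bubble_sort([3, 1, 2]) == [1, 2, 1]
assert bubble_sort_reverse([3, 1, 2]) == [3, 2, 3]
assert not intramorphic_check([3, 1, 2])
```

The check fails for [3, 1, 2] with the same two lists that the slide reports, even though both variants contain the same faulty swap.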

Slide 39

Slide 39 text

39 Examples

Slide 40

Slide 40 text

40 Supporting Artifact https://zenodo.org/record/7229326

Slide 41

Slide 41 text

41 Example 1: AST Printing
[Diagram: AST with root *, whose children are + (with children a and 3) and 2]

class Constant:
    def __init__(self, value):
        self.value = value

class Variable:
    def __init__(self, name):
        self.name = name

class Operation:
    def __init__(self, operator, left, right):
        self.operator = operator
        self.left = left
        self.right = right

tree = Operation('*', Operation('+', Variable('a'), Constant(3)), Constant(2))

Slide 42

Slide 42 text

42 Example 1: AST Printing
[Diagram: the AST with root *, whose children are + (with children a and 3) and 2, is printed as (a + 3) * 2]

Slide 43

Slide 43 text

43 Example 1: AST Printing
[Diagram: AST with root *, whose children are + (with children a and 3) and 2]

class Operation:
    # ...
    def as_string(self):
        left = self.left.as_string()
        right = self.right.as_string()
        if self.operator == '*':
            if isinstance(self.left, Operation) and self.left.operator == '+':
                left = '(' + left + ')'
            if isinstance(self.right, Operation) and self.right.operator == '+':
                right = '(' + right + ')'
        return '%s %s %s' % (left, self.operator, right)

Slide 44

Slide 44 text

44 Example 1: AST Printing
Provide alternative (simpler) print implementations, and check that their outputs contain the same operands and operators
[Diagram: AST with root *, whose children are + (with children a and 3) and 2]
• Infix: (a + 3) * 2
• Prefix: * + a 3 2
• Postfix: a 3 + 2 *

Slide 45

Slide 45 text

45 Example 1: AST Printing

class Operation:
    # ...
    def as_string_prefix(self):
        return '%s %s %s' % (self.operator, self.left.as_string_prefix(), self.right.as_string_prefix())

    def as_string_postfix(self):
        return '%s %s %s' % (self.left.as_string_postfix(), self.right.as_string_postfix(), self.operator)

• Infix: (a + 3) * 2
• Prefix: * + a 3 2
• Postfix: a 3 + 2 *

Slide 46

Slide 46 text

46 Example 1: AST Printing
Notation   Output        Strip parentheses   Sort
Infix      (a + 3) * 2   a + 3 * 2           * + 2 3 a
Prefix     * + a 3 2     * + a 3 2           * + 2 3 a
Postfix    a 3 + 2 *     a 3 + 2 *           * + 2 3 a
The sorted results are oracles for each other
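The strip-and-sort comparison can be sketched end to end as follows. The leaf methods on Constant and Variable and the helper normalized_tokens are assumptions added for illustration; the slides only show the Operation methods.

```python
class Constant:
    def __init__(self, value):
        self.value = value

    def as_string(self):
        return str(self.value)

    # A leaf prints the same way in every notation (assumed; not shown on the slides).
    as_string_prefix = as_string_postfix = as_string


class Variable:
    def __init__(self, name):
        self.name = name

    def as_string(self):
        return self.name

    as_string_prefix = as_string_postfix = as_string


class Operation:
    def __init__(self, operator, left, right):
        self.operator = operator
        self.left = left
        self.right = right

    def as_string(self):
        left = self.left.as_string()
        right = self.right.as_string()
        if self.operator == '*':
            if isinstance(self.left, Operation) and self.left.operator == '+':
                left = '(' + left + ')'
            if isinstance(self.right, Operation) and self.right.operator == '+':
                right = '(' + right + ')'
        return '%s %s %s' % (left, self.operator, right)

    def as_string_prefix(self):
        return '%s %s %s' % (self.operator, self.left.as_string_prefix(),
                             self.right.as_string_prefix())

    def as_string_postfix(self):
        return '%s %s %s' % (self.left.as_string_postfix(),
                             self.right.as_string_postfix(), self.operator)


def normalized_tokens(text):
    """Strip parentheses, split on whitespace, and sort the tokens."""
    return sorted(text.replace('(', '').replace(')', '').split())


tree = Operation('*', Operation('+', Variable('a'), Constant(3)), Constant(2))
infix = normalized_tokens(tree.as_string())
prefix = normalized_tokens(tree.as_string_prefix())
postfix = normalized_tokens(tree.as_string_postfix())
assert infix == prefix == postfix == ['*', '+', '2', '3', 'a']
```

This oracle only checks that no operand or operator is lost or duplicated; it would not notice misplaced parentheses, which illustrates that an intramorphic relation need not be complete to be useful.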

Slide 47

Slide 47 text

47 Example 2: Monte Carlo Simulation

def get_pi_approximation():
    inside = 0
    for _ in range(1000000):
        x = random.random()
        y = random.random()
        if x**2+y**2 <= 1:
            inside += 1
    pi = 4*inside/1000000
    return pi

Slide 48

Slide 48 text

48 Example 2: Monte Carlo Simulation Fewer samples result in a worse approximation

Slide 49

Slide 49 text

49 Example 2: Monte Carlo Simulation

def get_pi_approximation():
    inside = 0
    for _ in range(1000000):
        x = random.random()
        y = random.random()
        if x**2+y**2 <= 1:
            inside += 1
    pi = 4*inside/1000000
    return pi

Add a parameter n

def get_pi_approximation(n):
    inside = 0
    for _ in range(n):
        x = random.random()
        y = random.random()
        if x**2+y**2 <= 1:
            inside += 1
    pi = 4*inside/n
    return pi

Slide 50

Slide 50 text

50 Example 2: Monte Carlo Simulation
• get_pi_approximation(10) -> Execute -> 3.6
• get_pi_approximation(1000000) -> Execute -> 3.142748
The two results are oracles for each other
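The slides do not say how "worse" is measured. One possible reading, sketched below as an assumption, compares the spread of repeated runs: approximations computed from few samples should vary more than approximations computed from many samples, and this can be checked without knowing the true value of pi. The helpers spread and fewer_samples_spread_more, the explicit random.Random argument, and the chosen sample counts are all hypothetical.

```python
import random
import statistics


def get_pi_approximation(n, rng):
    """The slide's estimator with the sample count n exposed; rng is a random.Random."""
    inside = 0
    for _ in range(n):
        x = rng.random()
        y = rng.random()
        if x**2 + y**2 <= 1:
            inside += 1
    return 4 * inside / n


def spread(n, runs, rng):
    """Population variance of `runs` approximations that use n samples each."""
    return statistics.pvariance([get_pi_approximation(n, rng) for _ in range(runs)])


def fewer_samples_spread_more(small_n=10, large_n=10_000, runs=30, seed=0):
    """Intramorphic check: the small-n variant should scatter more than the large-n one."""
    rng = random.Random(seed)
    return spread(small_n, runs, rng) > spread(large_n, runs, rng)


assert fewer_samples_spread_more()
```

Because the relation is statistical, a check like this can in principle raise a false alarm; repeating runs and keeping the two sample counts far apart makes that very unlikely.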

Slide 51

Slide 51 text

51 Example 3: Knapsack Problem
Given a knapsack with a given capacity, the goal is to pack items to maximize the value of items in the knapsack.
• Capacity: 5 kg
• Item: value 10, weight 0.1 kg
• Item: value 1500, weight 2.2 kg

Slide 52

Slide 52 text

52 Example 3: Knapsack Problem
• Value: 3000, weight: 4.4 kg
• Value: 60, weight: 0.6 kg
• Overall value: 3060, overall weight: 5.0 kg

Slide 53

Slide 53 text

53 Example 3: Knapsack Problem

vals = [
    ('Microphone', 10, 1),
    ('Laptop', 1500, 22)
]

def knapsack_greedy(objects, capacity):
    packed = []
    cum_value = 0
    cum_weight = 0
    objects.sort(key=lambda triple : float(triple[1]) / triple[2], reverse=True)
    for (item, value, weight) in objects:
        while cum_weight + weight <= capacity:
            cum_weight += weight
            cum_value += value
            packed.append(item)
    return (packed, cum_value, cum_weight)

Slide 54

Slide 54 text

54 Example 3: Knapsack Problem A greedy algorithm should never produce a better result than an optimal algorithm
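This relation can be sketched by pairing the greedy implementation with an exact one. The slides do not show the optimal algorithm; the dynamic-programming function knapsack_optimal_value below is an assumed stand-in for the unbounded variant (items may be packed repeatedly, as in the greedy code) and requires positive integer weights. The capacity of 50 assumes weights in units of 0.1 kg, and the greedy function sorts a copy rather than the caller's list.

```python
def knapsack_greedy(objects, capacity):
    """Greedy heuristic from the slide: pack by value-to-weight ratio."""
    packed = []
    cum_value = 0
    cum_weight = 0
    # Sort a copy so that the caller's list is left untouched.
    by_ratio = sorted(objects, key=lambda triple: triple[1] / triple[2], reverse=True)
    for (item, value, weight) in by_ratio:
        while cum_weight + weight <= capacity:
            cum_weight += weight
            cum_value += value
            packed.append(item)
    return (packed, cum_value, cum_weight)


def knapsack_optimal_value(objects, capacity):
    """Best achievable value for the unbounded knapsack (assumed replacement component)."""
    best = [0] * (capacity + 1)  # best[c] = maximum value within capacity c
    for c in range(1, capacity + 1):
        for (_, value, weight) in objects:
            if weight <= c:
                best[c] = max(best[c], best[c - weight] + value)
    return best[capacity]


def greedy_not_better_than_optimal(objects, capacity):
    """Intramorphic check: the greedy value must not exceed the optimal value."""
    _, greedy_value, _ = knapsack_greedy(objects, capacity)
    return greedy_value <= knapsack_optimal_value(objects, capacity)


vals = [('Microphone', 10, 1), ('Laptop', 1500, 22)]
assert knapsack_greedy(vals, 50)[1:] == (3060, 50)
assert greedy_not_better_than_optimal(vals, 50)
```

The check is one-sided: it flags a greedy result that exceeds the optimum, which can only happen if one of the two implementations is wrong, but it tolerates inputs where the greedy result is legitimately worse.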

Slide 55

Slide 55 text

55 Discussion: Characteristics
• Granularity (operator, system, …)
• Format (source code, binary, …)
• Transformation realization (adding new component, replacing it, …)
• Completeness (applicable to any input)
• False alarms
• …
Future research could explore their strengths and weaknesses

Slide 56

Slide 56 text

56 Discussion: Domain
• Compilers
• Database systems
• SMT solvers
• …
Similar to differential testing and metamorphic testing, domain-specific insights will be necessary to realize effective approaches

Slide 57

Slide 57 text

57 Discussion: Practicality Users Developers

Slide 58

Slide 58 text

58 Discussion: Practicality
Users Developers
“I can apply metamorphic testing and differential testing without deep knowledge of the system’s internals”

Slide 59

Slide 59 text

59 Discussion: Practicality
Users Developers
“I can use intramorphic testing to incorporate my application knowledge into the testing process”

Slide 60

Slide 60 text

60 Discussion: Practicality
Developers
• How should I maintain multiple versions?
• How can the IDE support me?
• …
We hope that future research will propose approaches to lower the cost of using intramorphic testing

Slide 61

Slide 61 text

61 Summary