Challenges
▪ Evaluation metrics are different and limited
▪ Significant manual effort
▪ No guidelines for manual inspection
▪ One-off solutions evaluated as a whole
▪ Different benchmarks
Slide 8
Metrics
1. Evaluation metrics are different and limited
50% != 50%
Accuracy != Good
Address Challenge #1 Metrics
1. Evaluation metrics are different and limited
Slide 16
Address Challenge #1 Metrics
1. Evaluation metrics are different and limited
7 Fidelity Metrics
Slide 17
Address Challenge #1 Metrics
1. Evaluation metrics are different and limited
Effort = Levenshtein Distance (transEvents, gtEvents)
2 Utility Metrics
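The Effort metric above can be sketched in a few lines. This is a minimal illustration, assuming that transEvents and gtEvents are ordered lists of event labels and that every insertion, deletion, or substitution costs 1; the function name and the sample event lists are hypothetical, not taken from the slides.

```python
from typing import Hashable, Sequence


def levenshtein(a: Sequence[Hashable], b: Sequence[Hashable]) -> int:
    """Edit distance between two sequences, with unit cost per edit."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty sequence
        for j, y in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if x == y else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


# Hypothetical example: the transferred test skips the password step.
trans_events = ["email", "sign in"]
gt_events = ["email", "password", "sign in"]

effort = levenshtein(trans_events, gt_events)
print(effort)
```

Under these assumptions, a lower value means fewer manual edits are needed to turn the transferred event sequence into the ground-truth one.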
Slide 18
Address Challenges #2 & #3
2. Significant manual effort
3. No guidelines for manual inspection
Slide 19
Slide 20
Uniform Representation
Slide 21
Uniform Representation
Slide 22
Slide 23
Diagram 1: screen a1 (widgets a1-1, a1-2, a1-3) alongside screens b1 (widget b1-1) and b2 (widget b2-1)
Diagram 2: the same screens, plus screen b3 (widgets b3-1, b3-2, b3-3)
Wish Canonical Map
a1-1 → email
a1-2 → password
a1-3 → sign in
Etsy Canonical Map
b3-1 → email
b3-2 → password
b3-3 → sign in
Canonical Map
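One plausible reading of the canonical maps above is that each app's widget IDs are labeled with a shared canonical name, so that widgets in different apps can be matched through that name. The sketch below assumes this reading; the dictionary layout and the `align` helper are hypothetical and only illustrate the idea.

```python
# Canonical maps copied from the slide: widget ID -> canonical label.
wish_map = {"a1-1": "email", "a1-2": "password", "a1-3": "sign in"}
etsy_map = {"b3-1": "email", "b3-2": "password", "b3-3": "sign in"}


def align(source_map: dict[str, str], target_map: dict[str, str]) -> dict[str, str]:
    """Match source widgets to target widgets that share a canonical label."""
    # Invert the target map so a label can be looked up directly.
    # Assumes each label appears at most once per app.
    target_by_label = {label: widget for widget, label in target_map.items()}
    return {
        widget: target_by_label[label]
        for widget, label in source_map.items()
        if label in target_by_label
    }


wish_to_etsy = align(wish_map, etsy_map)
for wish_widget, etsy_widget in wish_to_etsy.items():
    print(f"{wish_widget} ({wish_map[wish_widget]}) matches {etsy_widget}")
```

With this layout, writing the per-app maps is the manual step, and the cross-app matching follows mechanically from the shared labels.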
Slide 24
Wish Canonical Map
a1-1 → email
a1-2 → password
a1-3 → sign in
Etsy Canonical Map
b3-1 → email
b3-2 → password
b3-3 → sign in
Address Challenge #2
ONLY manual effort