Reading Circle
Self-supervised Equivariant Attention Mechanism
for Weakly Supervised Semantic Segmentation
[Wang+, CVPR20]
Semantic Segmentation
Assign a semantic category to each pixel.
Supervised
- Dataset: images with pixel-level class labels
- ✗ huge annotation cost
↓
Weakly-supervised
- Dataset: images with image-level class labels only
Weakly-Supervised SS Methods
Three steps to train with image-level labels only:
1. predict an initial category-wise response map (CAM) to localize the objects
2. refine the initial response map into pseudo ground truth (GT)
3. train the segmentation network on the pseudo labels
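The three-step pipeline above can be sketched as follows. All component names (`classifier_cam`, `refine_cam`) are hypothetical stand-ins for a real classification network, a refinement method (e.g. a CRF or pixel-affinity propagation), and a segmentation network; the CAM here is random, purely for illustration.

```python
import numpy as np

def classifier_cam(image, num_classes):
    """Step 1 (stand-in): predict a category-wise response map (CAM).
    A real network produces this from a classifier; here it is random."""
    rng = np.random.default_rng(0)
    return rng.random((num_classes, *image.shape))

def refine_cam(cam, threshold=0.5):
    """Step 2 (stand-in): refine the response map into pseudo GT labels.
    Real methods use CRFs or affinity propagation; here we just take the
    argmax and mark low-confidence pixels as 'ignore' (255)."""
    labels = cam.argmax(axis=0)
    labels[cam.max(axis=0) < threshold] = 255
    return labels

image = np.zeros((8, 8))
cam = classifier_cam(image, num_classes=3)   # step 1
pseudo_gt = refine_cam(cam)                  # step 2
# step 3: train a segmentation network on (image, pseudo_gt) pairs
print(cam.shape, pseudo_gt.shape)  # (3, 8, 8) (8, 8)
```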
What’s New
Introduce a self-supervised equivariant attention mechanism (SEAM).
- Narrow the supervision gap between fully and weakly supervised semantic segmentation
- Focus on equivariance under affine transformations
  - Previous: the CAM varies depending on the size of the input image
  - Proposed: consistent CAM regardless of the size of the input image
Slide 9
Slide 9 text
Network Architecture of SEAM
Uses a Siamese network and three kinds of losses.
Self Attention [Wang+, CVPR18]
Non-local mean operation:
  y_i = (1 / C(x)) * Σ_j f(x_i, x_j) g(x_j)
- x : input signal
- y : output signal
- g : representation function
- f : similarity function (scalar), with four variants: Gaussian, embedded Gaussian, dot product, concatenation
- C(x) : normalization factor
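The non-local operation above (embedded Gaussian variant, where the softmax over j plays the role of f/C(x)) can be sketched in NumPy. The projection matrices `W_theta`, `W_phi`, `W_g` stand in for learned 1x1 convolutions and are random here purely for illustration:

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def non_local(x, W_theta, W_phi, W_g):
    """Embedded-Gaussian non-local operation.

    x: (N, C) array of N positions with C channels.
    W_theta, W_phi, W_g: (C, C') projections (learned in practice).
    Returns y with y_i = sum_j softmax_j(theta(x_i) . phi(x_j)) * g(x_j).
    """
    theta = x @ W_theta                      # queries, (N, C')
    phi = x @ W_phi                          # keys,    (N, C')
    g = x @ W_g                              # values,  (N, C')
    attn = softmax(theta @ phi.T, axis=-1)   # (N, N); rows sum to 1
    return attn @ g                          # (N, C')

rng = np.random.default_rng(0)
x = rng.standard_normal((6, 4))              # 6 positions, 4 channels
W = [rng.standard_normal((4, 4)) for _ in range(3)]
y = non_local(x, *W)
print(y.shape)  # (6, 4)
```

SEAM's pixel correlation module builds on this operation to refine the CAM with inter-pixel similarities.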
Loss Design of SEAM
1. Class Loss : multi-label soft margin loss, computed on the image-level class scores aggregated from the original CAM
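A minimal NumPy sketch of the multi-label soft margin loss (the same formula as PyTorch's `nn.MultiLabelSoftMarginLoss`): per-class binary cross-entropy on sigmoid(logits), averaged over classes and the batch. The toy logits and targets are illustrative only.

```python
import numpy as np

def multilabel_soft_margin_loss(logits, targets):
    """Multi-label soft margin loss.

    logits:  (N, K) raw class scores (e.g. pooled from the CAM).
    targets: (N, K) binary image-level labels.
    """
    # log(sigmoid(x)) and log(1 - sigmoid(x)), written stably
    log_sig = -np.logaddexp(0.0, -logits)        # log s(x)
    log_one_minus = -np.logaddexp(0.0, logits)   # log(1 - s(x))
    per_class = -(targets * log_sig + (1 - targets) * log_one_minus)
    return per_class.mean()

logits = np.array([[2.0, -1.0, 0.5]])
targets = np.array([[1.0, 0.0, 1.0]])
print(multilabel_soft_margin_loss(logits, targets))  # ~0.3048
```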
2. Equivariant Regularization (ER) Loss : enforces consistency between the CAM before and after the affine transformation
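The ER idea can be sketched as follows (assuming an L1 distance; the transformation here is a 2x average-pooling downscale used as a stand-in for a general affine op). The function and variable names are illustrative, not the paper's implementation:

```python
import numpy as np

def er_loss(cam_of_transformed, cam_of_original, affine):
    """Equivariance sketch: the CAM of the transformed image should
    match the transformed CAM of the original image (L1 distance)."""
    return np.abs(cam_of_transformed - affine(cam_of_original)).mean()

def downscale2(cam):
    """Stand-in affine op: 2x downscale by average pooling."""
    h, w = cam.shape
    return cam.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

cam_full = np.arange(16.0).reshape(4, 4)
cam_small = downscale2(cam_full)  # pretend CAM of the resized image
print(er_loss(cam_small, cam_full, downscale2))  # 0.0 for a perfectly equivariant CAM
```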
3. Equivariant Cross Regularization (ECR) Loss : further improves the network's ability to learn equivariance
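A sketch of the cross-regularization idea, assuming an L1 form: each branch's attention-refined CAM is regularized against the *other* branch's plain CAM (assumed here to be already aligned to a common frame), so the refinement module cannot trivially copy its own input. Names and the simplified pairing are assumptions for illustration:

```python
import numpy as np

def ecr_loss(cam_a, refined_a, cam_b, refined_b):
    """Cross-regularization sketch (L1).
    cam_a / cam_b:         plain CAMs of the two Siamese branches.
    refined_a / refined_b: the corresponding attention-refined CAMs.
    Each refined CAM is pulled toward the other branch's plain CAM."""
    return (np.abs(refined_a - cam_b).mean()
            + np.abs(refined_b - cam_a).mean())

a = np.ones((4, 4))
b = np.ones((4, 4))
print(ecr_loss(a, a, b, b))  # 0.0 when both branches already agree
```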
Dataset
PASCAL VOC 2012 semantic segmentation benchmark
- 21 categories (20 object classes + background)
- each image contains one or multiple object classes
- 1,464 training / 1,449 validation / 1,456 test images
Result -CAM-
[Figure: qualitative CAM comparison. Rows: Original, GT, Baseline, Proposed]
Result -Semantic Segmentation-
[Figure: qualitative segmentation results. Rows: Input, GT, Ours]
Quantitative Comparison
Evaluation of the affine transformations used for equivariant regularization
- evaluation metric : mIoU (%), higher is better
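For reference, mIoU averages the per-class intersection-over-union between the predicted and ground-truth label maps. A minimal sketch (classes absent from both maps are skipped, one common convention among several):

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    """Mean intersection-over-union over classes.
    pred, gt: integer label maps of the same shape."""
    ious = []
    for c in range(num_classes):
        p = pred == c
        g = gt == c
        union = np.logical_or(p, g).sum()
        if union == 0:
            continue  # class absent from both maps
        ious.append(np.logical_and(p, g).sum() / union)
    return float(np.mean(ious))

pred = np.array([[0, 0], [1, 1]])
gt = np.array([[0, 0], [1, 0]])
print(mean_iou(pred, gt, num_classes=2))  # (2/3 + 1/2) / 2 = 0.5833...
```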