Dynamic Graph-Cuts for Efficient Mobile
Image and Video Segmentation
Middle East Technical University
Interactive Image and Video Segmentation
Interaction, Segmentation Mask, Application for Rest of the Video
[Bounding Box + Color GMM + Min-Cut/Max-Flow] = [Rother, SIGGRAPH 05]
[Approx. Bound. + Color and Spatial Distance + Dynamic Alg.] = [Li Y, SIGGRAPH 95]
[Scribble + Color Histogram + Min-Cut/Max-Flow] = [Boykov Y, IJCV 06]
Path (Boundary) Cost
Min-Cut / Max-Flow
Interaction + Model Definition + Minimization
Mobile Touch Screen Devices?
User Centered Interaction
More Interaction Errors
Low Computational Power
Gesture of Coloring a Coloring Book
Some Details of Method
Color GMMs are used as models
Iterative EM is used [Rother, SIGGRAPH 05]
The image is initially over-segmented into superpixels for efficiency via
the SLIC algorithm [Achanta, PAMI 2012]
Min-cut/Max-flow is used for energy minimization
[Boykov, PAMI 2004]
Min-Cut / Max-Flow Review
Graph Structure, Augmenting Path Algorithm
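As a reminder of how the augmenting path algorithm labels a graph, here is a minimal sketch in plain Python. It is not the implementation used in this work: the function name `min_cut_labels`, the dictionary-based residual graph, and the breadth-first (Edmonds-Karp style) path search are all assumptions made for illustration. Each node stands for a superpixel, the source side is read as foreground, and the sink side as background.

```python
from collections import deque


def min_cut_labels(fg_cost, bg_cost, pairwise):
    """Label n nodes by s-t min-cut. True means foreground (source side).

    fg_cost[i] / bg_cost[i]: cost of giving node i that label.
    pairwise: {(i, j): w}, cost paid when i and j get different labels.
    """
    n = len(fg_cost)
    s, t = n, n + 1
    res = [dict() for _ in range(n + 2)]  # residual capacities

    def add_edge(u, v, c_uv, c_vu):
        res[u][v] = res[u].get(v, 0) + c_uv
        res[v][u] = res[v].get(u, 0) + c_vu

    for i in range(n):
        add_edge(s, i, bg_cost[i], 0)  # cut if i ends up background
        add_edge(i, t, fg_cost[i], 0)  # cut if i ends up foreground
    for (i, j), w in pairwise.items():
        add_edge(i, j, w, w)

    def search():
        # Breadth-first search over edges with spare capacity.
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    if v == t:
                        return parent
                    queue.append(v)
        return parent

    parent = search()
    while t in parent:
        # Walk back from the sink to collect the augmenting path.
        path = []
        v = t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(res[u][v] for u, v in path)
        for u, v in path:
            res[u][v] -= push
            res[v][u] += push
        parent = search()
    # Nodes still reachable from the source form the foreground side.
    return [i in parent for i in range(n)]


labels = min_cut_labels([1, 2, 8, 9], [9, 8, 2, 1],
                        {(0, 1): 3, (1, 2): 3, (2, 3): 3})
print(labels)  # [True, True, False, False]
```

The toy costs are made up; they describe a chain of four nodes in which the first two prefer foreground and the last two prefer background.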
If the graph structure does not change, previous flows can be
reused in the minimization [Kohli, PAMI 2007].
Throughout the interaction, the graph structure does not change
at all, but min-cut/max-flow is solved many times.
The only problem that can arise is a change in the edge weights,
which can be handled by pushing additional flow.
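The flow reuse described here can be sketched as follows. This is an illustrative reading, not the authors' code: the class `DynamicMinCut`, its method names, and the toy costs are assumptions. The solver keeps its residual graph between calls. When a terminal capacity drops below the flow already routed through it, the same constant is added to both terminal edges of that node, which shifts every cut by that constant and therefore leaves the minimizing cut unchanged.

```python
from collections import deque


class DynamicMinCut:
    """s-t min-cut that keeps its residual graph between solves."""

    def __init__(self, n, pairwise):
        self.n, self.s, self.t = n, n, n + 1
        self.res = [dict() for _ in range(n + 2)]
        for i in range(n):
            for u, v in ((self.s, i), (i, self.t)):
                self.res[u][v] = 0
                self.res[v][u] = 0
        for (i, j), w in pairwise.items():
            self.res[i][j] = self.res[i].get(j, 0) + w
            self.res[j][i] = self.res[j].get(i, 0) + w

    def set_unary(self, i, bg_cost, fg_cost):
        s, t, res = self.s, self.t, self.res
        # res[i][s] and res[t][i] store the flow already sent on s->i, i->t.
        res[s][i] = bg_cost - res[i][s]
        res[i][t] = fg_cost - res[t][i]
        # Repair negative residuals with a constant on both terminal edges.
        shift = max(0, -res[s][i], -res[i][t])
        res[s][i] += shift
        res[i][t] += shift

    def _search(self):
        parent = {self.s: None}
        queue = deque([self.s])
        while queue:
            u = queue.popleft()
            for v, c in self.res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    if v == self.t:
                        return parent
                    queue.append(v)
        return parent

    def solve(self):
        """Continue augmenting from the current residual graph."""
        augmentations = 0
        parent = self._search()
        while self.t in parent:
            path = []
            v = self.t
            while parent[v] is not None:
                path.append((parent[v], v))
                v = parent[v]
            push = min(self.res[u][v] for u, v in path)
            for u, v in path:
                self.res[u][v] -= push
                self.res[v][u] += push
            augmentations += 1
            parent = self._search()
        return [i in parent for i in range(self.n)], augmentations


cut = DynamicMinCut(4, {(0, 1): 3, (1, 2): 3, (2, 3): 3})
for i, (bg, fg) in enumerate(zip([9, 8, 2, 1], [1, 2, 8, 9])):
    cut.set_unary(i, bg, fg)
first, _ = cut.solve()
cut.set_unary(2, 20, 0)   # a new stroke marks node 2 as foreground
second, _ = cut.solve()   # only the additional flow is computed
```

In this toy run `second` matches what a solver built from scratch with the updated costs would return; how much work is saved depends on how much of the old flow stays valid.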
Temporally + Spatially Dynamic
Can we extend this concept to the spatial dimension?
At any stage, only the part of the whole graph containing the
foreground object needs to be solved. But what is the size
of this sub-graph?
If the external flow that can pass through the edges of the
sub-graph cannot change the solution, there is no need to
enlarge it any further. However, this is hard to achieve.
The clustering supplied by the GMM is generally reliable;
however, the labeling can be wrong.
Temporally + Spatially Dynamic (cont'd)
Our Claim: if the labels of the GMM clusters cannot be
changed via external flows, there is no need to enlarge the
sub-graph.
Algorithm: Start with the bounding box of the interaction
and enlarge it until the condition in the claim
is satisfied for all GMM clusters.
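The enlargement loop can be sketched as below. The stopping condition itself is not reproduced here, so it is passed in as a caller-supplied predicate; `grow_box`, `is_settled`, and the fixed `step` in pixels are hypothetical names and choices made only for this illustration.

```python
def grow_box(box, image_size, is_settled, step=8):
    """Enlarge box = (x0, y0, x1, y1) until is_settled(box) holds.

    image_size = (width, height). is_settled stands in for the test that
    no GMM cluster label can still be changed by flow from outside the box.
    """
    width, height = image_size
    current = tuple(box)
    while not is_settled(current):
        x0, y0, x1, y1 = current
        grown = (max(0, x0 - step), max(0, y0 - step),
                 min(width, x1 + step), min(height, y1 + step))
        if grown == current:
            break  # the box already covers the whole image
        current = grown
    return current


# Stand-in predicate for the demo: stop once the box is 40 pixels wide.
box = grow_box((50, 50, 60, 60), (100, 100),
               lambda b: b[2] - b[0] >= 40)
print(box)  # (34, 34, 76, 76)
```

Growing by a fixed margin on every side is only one possible schedule; the loop structure stays the same if the box is enlarged adaptively.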
(a): The blue rectangle is the bounding box of the current interaction;
the red rectangle is the computed bounding box
(b): Result of Min-Cut/Max-Flow for blue rectangle in (a)
(c): Result of Min-Cut/Max-Flow for red rectangle in (a)
Dynamic Graph-Cut in Action
Redefinition of Video Segmentation
Assume MRF energy for the initial frame is
The MRF energy of any other frame is linearly
dependent on that of the previous frame. (All superpixels
are model assumption)
A spatio-temporal distance metric should be used
for robust video segmentation.
Geodesic distance is the best candidate, but it has high
computational complexity, O(n^3).
A bi-exponential smoother is used as a high-performance
O(n) approximation [Unser M, TIP 2011]
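To show why a bi-exponential smoother runs in O(n), here is a minimal one-dimensional sketch. It covers only the linear core: one forward and one backward recursive pass whose combination is a convolution with the kernel lam**|k|. The function name and the parameter `lam` are assumptions, and any edge-preserving weighting that the cited smoother may apply is omitted.

```python
def biexponential_smooth(x, lam):
    """Smooth a 1-D sequence with the kernel lam**|k| using two passes.

    0 <= lam < 1 controls the decay: larger values smooth more.
    Cost is O(n) regardless of how wide the effective kernel is.
    """
    n = len(x)
    if n == 0:
        return []
    a = 1.0 - lam
    fwd = [0.0] * n
    bwd = [0.0] * n
    # Causal pass: each output depends on the previous output only.
    fwd[0] = x[0]
    for k in range(1, n):
        fwd[k] = a * x[k] + lam * fwd[k - 1]
    # Anti-causal pass: same recursion, run from the other end.
    bwd[n - 1] = x[n - 1]
    for k in range(n - 2, -1, -1):
        bwd[k] = a * x[k] + lam * bwd[k + 1]
    # The centre sample is counted in both passes, so remove it once,
    # then normalise so that a constant signal is left unchanged.
    return [(fwd[k] + bwd[k] - a * x[k]) / (1.0 + lam) for k in range(n)]


impulse = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
smoothed = biexponential_smooth(impulse, 0.5)
```

For the impulse above the response is symmetric around index 3 and halves with every step away from it, which is the bi-exponential shape.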
Dynamic Graph-Cut for Linear Filtering
Computation Time Improvement via
Time vs Performance
On the SegTrack dataset [Tsai, BMVC 2010]:
Thank you for your attention.
Energy minimization can tolerate some level of
error if hard labels are replaced with soft labels.
Question: Hard Labels vs Error Tolerance
Solution: Solve errors before they occur.
The idea is to keep a single RGB Gaussian model for the color of the currently interacted
region. If a new superpixel does not conform to the color model, wait for it to come back or accept the
Finding a path means minimizing
False positives are handled via path finding.
False negatives require a restart.
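The single RGB Gaussian model for the currently interacted region can be sketched as a running estimate with a conformance test. Everything specific here is an assumption for illustration: the class name `RunningColorModel`, the per-channel (diagonal) variance, the variance floor, and the distance threshold. The sketch covers only the check of whether a new superpixel conforms to the model, not the waiting or path-finding steps.

```python
class RunningColorModel:
    """Running per-channel Gaussian over the mean colors of superpixels."""

    def __init__(self, threshold=9.0, var_floor=25.0):
        self.count = 0
        self.mean = [0.0, 0.0, 0.0]
        self.m2 = [0.0, 0.0, 0.0]   # running sums of squared deviations
        self.threshold = threshold  # hypothetical acceptance threshold
        self.var_floor = var_floor  # keeps early estimates from collapsing

    def add(self, rgb):
        # Welford's update, applied independently to each channel.
        self.count += 1
        for c in range(3):
            delta = rgb[c] - self.mean[c]
            self.mean[c] += delta / self.count
            self.m2[c] += delta * (rgb[c] - self.mean[c])

    def distance(self, rgb):
        # Variance-normalised squared distance to the current mean.
        if self.count < 2:
            return 0.0
        total = 0.0
        for c in range(3):
            var = max(self.m2[c] / (self.count - 1), self.var_floor)
            total += (rgb[c] - self.mean[c]) ** 2 / var
        return total

    def conforms(self, rgb):
        return self.distance(rgb) <= self.threshold


model = RunningColorModel()
for color in [(200, 10, 10), (205, 12, 8), (198, 9, 11)]:
    model.add(color)
print(model.conforms((202, 11, 10)))  # True
print(model.conforms((20, 200, 30)))  # False
```

A superpixel that fails the test would not be added to the model, so a brief slip of the finger does not pollute the color estimate.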
Error Tolerance in Action
Method                Performance  Easiness  Entertainment  Overall
Proposed Method       5:1:.45      4:0:.86   5:1:.74        4:1:.45
GrabCut               3:2:.92      4:1:.75   2:1:.61        3:1:.75
Intelligent Scissors  3:1:.51      2:1:.74   3:2:.89        2:1:.76
15 Subjects (Undergraduate Level Engineering Students)
4 random images out of 10
Grading on a scale of 1-5 for 4 different metrics
Results are given in the format Median:IQR:STD
P-Values (via dependent ANOVA test): 0.0005