Slide 1

The Nemhauser-Trotter Reduction and Lifted Message Passing for the Weighted CSP
Hong Xu, T. K. Satish Kumar, Sven Koenig
[email protected], [email protected], [email protected]
June 8, 2017
University of Southern California
The 14th International Conference on Integration of Artificial Intelligence and Operations Research Techniques in Constraint Programming (CPAIOR 2017), Padova, Italy

Slide 2

Agenda
• The Weighted Constraint Satisfaction Problem (WCSP)
• The Constraint Composite Graph (CCG)
• Computational Techniques Facilitated by the CCG
  • The Nemhauser-Trotter (NT) Reduction
  • Min-Sum Message Passing (MSMP)
• Conclusion

Slide 3

Executive Summary
Using the Constraint Composite Graph (CCG) of a WCSP,
• the Nemhauser-Trotter (NT) Reduction, a polynomial-time procedure, can solve about 1/8 of the benchmark instances without search, and
• the Min-Sum Message Passing (MSMP) algorithm, widely used in the probabilistic reasoning community, produces significantly better solutions on the CCG than on the WCSP's original form.
This further bridges the probabilistic reasoning and CP communities.

Slide 4

Agenda
• The Weighted Constraint Satisfaction Problem (WCSP)
• The Constraint Composite Graph (CCG)
• Computational Techniques Facilitated by the CCG
  • The Nemhauser-Trotter (NT) Reduction
  • Min-Sum Message Passing (MSMP)
• Conclusion

Slide 5

The Weighted Constraint Satisfaction Problem: Motivation
Many real-world problems can be solved using the WCSP:
• RNA motif localization (Zytnicki et al. 2008)
• Communication through noisy channels using error-correcting codes in information theory (Yedidia et al. 2003)
• Medical and mechanical diagnostics (Milho et al. 2000; Muscettola et al. 1998)
• Energy minimization in computer vision (Kolmogorov 2005)
• ...

Slide 6

Weighted Constraint Satisfaction Problem (WCSP)
• N variables x = {X_1, X_2, ..., X_N}.
• Each variable X_i has a discrete-valued domain D_i.
• M weighted constraints {E_{s_1}, E_{s_2}, ..., E_{s_M}}.
• Each constraint E_s specifies the weight for each combination of assignments of values to a subset s of the variables.
• Find an optimal assignment of values to these variables so as to minimize the total weight: E(x) = ∑_{i=1}^{M} E_{s_i}(x_{s_i}).
• Known to be NP-hard.
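
To make the definition concrete, here is a minimal brute-force sketch (illustrative code, not from the talk; the variable names and constraint tables below are made up) that evaluates E(x) as a sum of constraint weights and enumerates all assignments, which is exactly the exponential-time search the later example slides allude to.

```python
# Minimal brute-force WCSP solver sketch; the constraint tables are
# hypothetical illustrations, not the ones on the slides.
from itertools import product

variables = ["X1", "X2", "X3"]           # Boolean domains {0, 1}
constraints = [                          # (scope, weight table keyed by the scope's values)
    (("X1",), {(0,): 0.7, (1,): 0.2}),
    (("X1", "X2"), {(0, 0): 0.3, (0, 1): 0.5, (1, 0): 0.7, (1, 1): 0.6}),
    (("X2", "X3"), {(0, 0): 0.6, (0, 1): 1.3, (1, 0): 1.1, (1, 1): 1.0}),
]

def total_weight(assignment):
    """E(x) = sum of E_{s_i}(x_{s_i}) over all constraints."""
    return sum(table[tuple(assignment[v] for v in scope)]
               for scope, table in constraints)

# Enumerate all 2^N assignments (exponential in the number of variables).
best = min((dict(zip(variables, values))
            for values in product((0, 1), repeat=len(variables))),
           key=total_weight)
print(best, total_weight(best))
```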

Slide 7

WCSP Example on Boolean Variables
[Figure: weight tables for the unary constraints E1, E2, E3 (e.g., E1(X1=1)=0.2, E1(X1=0)=0.7) and 2×2 weight tables for the binary constraints E12, E13, E23 over the Boolean variables X1, X2, X3.]
E(X1, X2, X3) = E1(X1) + E2(X2) + E3(X3) + E12(X1, X2) + E13(X1, X3) + E23(X2, X3)

Slide 8

WCSP Example: Evaluate the Assignment X1 = 0, X2 = 0, X3 = 1
[Figure: the same six constraint tables as on the previous slide.]
E(X1 = 0, X2 = 0, X3 = 1) = 0.7 + 0.3 + 1.0 + 0.5 + 1.3 + 0.9 = 4.7
(This is not an optimal solution.)

Slide 9

WCSP Example: Evaluate the Assignment X1 = 1, X2 = 0, X3 = 0
[Figure: the same six constraint tables as on the previous slide.]
E(X1 = 1, X2 = 0, X3 = 0) = 0.2 + 0.3 + 0.1 + 0.7 + 0.6 + 0.7 = 2.6
This is an optimal solution. Finding it by brute force requires exponential time.

Slide 10

Agenda
• The Weighted Constraint Satisfaction Problem (WCSP)
• The Constraint Composite Graph (CCG)
• Computational Techniques Facilitated by the CCG
  • The Nemhauser-Trotter (NT) Reduction
  • Min-Sum Message Passing (MSMP)
• Conclusion

Slide 11

Two Forms of Structure in WCSP
[Figure: a constraint network over X1, X2, X3, X4 illustrating the graphical structure, and the weight table of one constraint illustrating the numerical structure.]
• Graphical: Which variables are in which constraints?
• Numerical: How does each constraint relate the variables in it?
How can we exploit both forms of structure computationally?

Slide 12

Minimum Weighted Vertex Cover (MWVC)
[Figure: panels (a)-(d) show a vertex-weighted graph (weights 1, 2, 2, 0, 1, 1) and candidate vertex covers.]
Each vertex is associated with a non-negative weight. The MWVC is a vertex cover that minimizes the sum of the weights of the vertices in it.
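
As a definition check, here is a minimal brute-force sketch of MWVC (illustrative only; the 4-cycle and its weights below are hypothetical, not the graph in the figure).

```python
# Brute-force minimum weighted vertex cover; exponential in the number of
# vertices, shown only to pin down the definition.
from itertools import product

def mwvc(weights, edges):
    n = len(weights)
    best_cover, best_weight = None, float("inf")
    for selection in product((0, 1), repeat=n):      # 1 = vertex is in the cover
        if all(selection[u] or selection[v] for u, v in edges):   # every edge covered
            w = sum(weights[v] for v in range(n) if selection[v])
            if w < best_weight:
                best_cover, best_weight = selection, w
    return best_cover, best_weight

# Hypothetical example: a 4-cycle with weights 1, 2, 2, 0.
print(mwvc([1, 2, 2, 0], [(0, 1), (1, 2), (2, 3), (3, 0)]))
```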

Slide 13

Projection of Minimum Weighted Vertex Cover onto an Independent Set
[Figure (Kumar 2008, Fig. 2): a vertex-weighted graph over X1, ..., X7. Projecting the MWVC onto the independent set {X1, X4} yields a 2×2 weight table over X1 and X4 (entries 5, 4, 7, 6); a value of 1 for a variable means it is necessarily present in the vertex cover, and 0 means it is necessarily absent from the vertex cover.]
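
What "projecting an MWVC onto an independent set" means computationally can be sketched in a few lines (my own illustration, not code from (Kumar 2008)): for every 0/1 assignment to the independent-set vertices (1 = in the cover), record the weight of the best vertex cover consistent with that assignment; the resulting table plays the role of a weighted constraint.

```python
# For each {0,1}-assignment to the independent-set vertices (1 = in the cover),
# compute the minimum weight of a vertex cover that agrees with it; the
# resulting table is the projection of the MWVC onto the independent set.
from itertools import product

def project_mwvc(weights, edges, indep_set):
    n = len(weights)
    rest = [v for v in range(n) if v not in indep_set]
    table = {}
    for fixed in product((0, 1), repeat=len(indep_set)):
        best = float("inf")
        for free in product((0, 1), repeat=len(rest)):
            x = [0] * n
            for v, b in zip(indep_set, fixed):
                x[v] = b
            for v, b in zip(rest, free):
                x[v] = b
            if all(x[u] or x[v] for u, v in edges):   # x is a vertex cover
                best = min(best, sum(weights[v] for v in range(n) if x[v]))
        table[fixed] = best
    return table

# Hypothetical example: a path 0-1-2 with unit weights, projected onto {0, 2}.
print(project_mwvc([1, 1, 1], [(0, 1), (1, 2)], indep_set=[0, 2]))
```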

Slide 14

Projection of MWVC onto an Independent Set
Assuming Boolean variables in WCSPs:
• Observation: The projection of an MWVC onto an independent set looks similar to a weighted constraint.
• Question 1: Can we build the lifted graphical representation for any given weighted constraint? This is answered by (Kumar 2008).
• Question 2: What is the benefit of doing so?

Slide 15

Lifted Representations: Example
[Figure: the six constraint tables of the earlier WCSP example over X1, X2, X3.]
E(X1, X2, X3) = E1(X1) + E2(X2) + E3(X3) + E12(X1, X2) + E13(X1, X3) + E23(X2, X3)

Slide 16

Lifted Representations: Example
[Figure: each constraint table is replaced by its lifted graphical representation, a small vertex-weighted graph gadget; auxiliary vertices A1-A6 are introduced alongside the variable vertices X1, X2, X3.]

Slide 17

Constraint Composite Graph (CCG)
[Figure: the CCG obtained by merging the lifted representations of all constraints, a vertex-weighted graph over the variable vertices X1, X2, X3 and the auxiliary vertices A1-A6.]

Slide 18

MWVC on the Constraint Composite Graph (CCG)
[Figure: the same CCG as on the previous slide.]
An MWVC of the CCG encodes an optimal solution of the original WCSP!

Slide 19

Agenda
• The Weighted Constraint Satisfaction Problem (WCSP)
• The Constraint Composite Graph (CCG)
• Computational Techniques Facilitated by the CCG
  • The Nemhauser-Trotter (NT) Reduction
  • Min-Sum Message Passing (MSMP)
• Conclusion

Slide 20

Agenda
• The Weighted Constraint Satisfaction Problem (WCSP)
• The Constraint Composite Graph (CCG)
• Computational Techniques Facilitated by the CCG
  • The Nemhauser-Trotter (NT) Reduction
  • Min-Sum Message Passing (MSMP)
• Conclusion

Slide 21

The Nemhauser-Trotter (NT) Reduction
[Figure: a vertex-weighted graph with vertices A, B, C, D (weights w1, w2, w3, w4), and the bipartite graph built from two copies A, B, C, D and A', B', C', D' of its vertices, each keeping its original weight. Solving the MWVC problem on this bipartite graph classifies the original vertices:]
• A is in the minimum weighted VC.
• B is not in the minimum weighted VC.
• C and D are in the kernel.
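
The NT reduction is commonly described through the LP relaxation of MWVC, whose basic optimal solutions are half-integral; the bipartite construction in the figure is one way to compute such a solution. Below is a minimal sketch of the LP view (my own code, assuming SciPy is available; the example graph is hypothetical): vertices with LP value 1 go into the cover, vertices at 0 are excluded, and vertices at 1/2 form the kernel that still requires search.

```python
# Sketch of the LP-relaxation view of the NT reduction for MWVC (assumes SciPy):
#   min  sum_v w_v x_v   s.t.  x_u + x_v >= 1 for every edge {u, v},  0 <= x_v <= 1
# Basic optimal solutions are half-integral: x_v in {0, 1/2, 1}.
import numpy as np
from scipy.optimize import linprog

def nt_classify(weights, edges):
    n = len(weights)
    A_ub = np.zeros((len(edges), n))
    for i, (u, v) in enumerate(edges):
        A_ub[i, u] = A_ub[i, v] = -1.0           # -x_u - x_v <= -1
    res = linprog(c=weights, A_ub=A_ub, b_ub=-np.ones(len(edges)),
                  bounds=[(0.0, 1.0)] * n, method="highs")
    in_cover = [v for v in range(n) if res.x[v] > 0.75]               # x_v = 1
    excluded = [v for v in range(n) if res.x[v] < 0.25]               # x_v = 0
    kernel   = [v for v in range(n) if 0.25 <= res.x[v] <= 0.75]      # x_v = 1/2
    return in_cover, excluded, kernel

# Hypothetical graph: a path 0-1-2 (weights 1, 10, 1) plus a unit-weight triangle 3-4-5.
# The path is resolved completely; the triangle is the kernel left for search.
print(nt_classify([1, 10, 1, 1, 1, 1],
                  [(0, 1), (1, 2), (3, 4), (4, 5), (3, 5)]))
# -> ([0, 2], [1], [3, 4, 5])
```

On the CCG, this classification fixes the corresponding WCSP variables; the talk reports that it resolves all variables, and hence the whole instance, for roughly 1/8 of the benchmark instances.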

Slide 22

Experimental Evaluation: Instances
• The UAI 2014 Inference Competition: PR and MMAP benchmark instances (up to 10 thousand variables and constraints)
  • Converted to WCSP instances by taking negative logarithms (with normalization).
• WCSP instances from (Hurley et al. 2016) (up to nearly 1 million variables and millions of constraints)
  • The Probabilistic Inference Challenge 2011
  • The Computer Vision and Pattern Recognition OpenGM2 benchmark
  • The Weighted Partial MaxSAT Evaluation 2013
  • The MaxCSP 2008 Competition
  • The MiniZinc Challenge 2012 & 2013
  • The CFLib (a library of cost function networks)
• Only instances in which variables have binary domains are used.
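
A sketch of the stated conversion from probabilistic factors to weighted constraints (the slide does not spell out the normalization, so the exact form below is an assumption): weights are negative logarithms of the factor entries, shifted so that the smallest weight is 0.

```python
# Convert a (hypothetical) probability factor into a weighted constraint by
# taking negative logarithms; the shift keeps the smallest weight at 0.
import numpy as np

phi = np.array([[0.50, 0.20],      # hypothetical factor over two Boolean variables
                [0.25, 0.05]])
weights = -np.log(phi)
weights -= weights.min()           # normalization (assumed form)
print(np.round(weights, 3))
```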

Slide 23

Experimental Evaluation: Results
[Histogram: "Fraction" (horizontal axis, 0.0-1.0) versus "Number of Instances" (vertical axis, 0-125).]
Benchmark instances from the UAI 2014 Inference Competition: 19 out of 160 benchmark instances were solved by the NT reduction.

Slide 24

Experimental Evaluation: Results
[Histogram: "Fraction" (horizontal axis, 0.0-1.0) versus "Number of Instances" (vertical axis, 0-250).]
Benchmark instances from (Hurley et al. 2016): 53 out of 410 benchmark instances were solved by the NT reduction.

Slide 25

Agenda
• The Weighted Constraint Satisfaction Problem (WCSP)
• The Constraint Composite Graph (CCG)
• Computational Techniques Facilitated by the CCG
  • The Nemhauser-Trotter (NT) Reduction
  • Min-Sum Message Passing (MSMP)
• Conclusion

Slide 26

Min-Sum Message Passing (MSMP) Algorithms
• Min-Sum Message Passing algorithms
  • are variants of belief propagation,
  • are widely used, and
  • pass information locally between variables and constraints.
• Original MSMP algorithm
  • Performs MSMP on the WCSP directly.
  • Messages are passed between variables and constraints.
• Lifted MSMP algorithm
  • Performs MSMP on the MWVC problem instance of the CCG.
  • Messages are passed between adjacent vertices.

Slide 27

Operations on Tables: Min
Minimizing a table over a variable eliminates that variable, keeping the smallest weight for each remaining assignment. Minimizing the table below over X2:

        X2=0  X2=1
X1=0     1     2
X1=1     4     3

yields the unary table over X1: X1=0 -> 1, X1=1 -> 3.

Slide 28

Operations on Tables: Sum
Adding a unary table over X1 (X1=0 -> 5, X1=1 -> 6) to the binary table

        X2=0  X2=1
X1=0     1     2
X1=1     4     3

adds the matching unary entry to each row:

        X2=0         X2=1
X1=0   1 + 5 = 6    2 + 5 = 7
X1=1   4 + 6 = 10   3 + 6 = 9
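
Both operations are plain array operations; the following sketch (assuming NumPy) reproduces the numbers from the two slides above.

```python
# The two table operations used by MSMP. Tables over (X1, X2) are stored as
# 2-D arrays indexed [x1, x2]; unary tables are 1-D arrays.
import numpy as np

t12 = np.array([[1, 2],    # row X1 = 0
                [4, 3]])   # row X1 = 1

# Min: minimizing the table over X2 leaves a unary table over X1.
print(t12.min(axis=1))     # -> [1 3]

# Sum: adding a unary table over X1 broadcasts it across the X2 axis.
t1 = np.array([5, 6])      # X1 = 0 -> 5, X1 = 1 -> 6
print(t12 + t1[:, None])   # -> [[ 6  7] [10  9]]
```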

Slide 29

Original MSMP Algorithm: Message Passing for the WCSP
(Xu et al. 2017, Fig. 1)
• A message is a table over a single variable: the sending variable or the receiving variable.
• A vertex with k neighbors
  1. applies sum to the messages from k − 1 of its neighbors and its internal constraint table, and
  2. applies min to the summation result and sends the resulting table to its k-th neighbor.

Slide 30

Original MSMP Algorithm: Example
Chain of variables and constraints: X1 - C12 - X2 - C23 - X3. All messages are initially (0, 0):
ν_{X1→C12} = (0, 0), ν_{X3→C23} = (0, 0), ν̂_{C12→X2} = (0, 0), ν_{X2→C23} = (0, 0), ν̂_{C23→X3} = (0, 0), ν̂_{C23→X2} = (0, 0), ν_{X2→C12} = (0, 0), ν̂_{C12→X1} = (0, 0)

(a) Constraint C12:
        X2=0  X2=1
X1=0     2     3
X1=1     1     2

(b) Constraint C23:
        X3=0  X3=1
X2=0     1     4
X2=1     2     2

Slide 31

Original MSMP Algorithm: Example
Same chain and constraint tables as before; the constraint-to-variable message ν̂_{C12→X2} is updated to (0, 1), while all other messages are still (0, 0).

Slide 32

Original MSMP Algorithm: Example
The variable-to-constraint message ν_{X2→C23} is then updated to (0, 1).

Slide 33

Original MSMP Algorithm: Example
The constraint-to-variable message ν̂_{C23→X3} is then updated to (0, 2).

Slide 34

Original MSMP Algorithm: Example
In the opposite direction, the constraint-to-variable message ν̂_{C23→X2} is updated to (0, 1).

Slide 35

Original MSMP Algorithm: Example
The variable-to-constraint message ν_{X2→C12} is then updated to (0, 1).

Slide 36

Original MSMP Algorithm: Example
Finally, the constraint-to-variable message ν̂_{C12→X1} is updated to (1, 0).

Slide 37

Original MSMP Algorithm: Example
All messages have now been computed: ν_{X1→C12} = (0, 0), ν_{X3→C23} = (0, 0), ν̂_{C12→X2} = (0, 1), ν_{X2→C23} = (0, 1), ν̂_{C23→X3} = (0, 2), ν̂_{C23→X2} = (0, 1), ν_{X2→C12} = (0, 1), ν̂_{C12→X1} = (1, 0).
• X1 = 1 minimizes ν̂_{C12→X1}(X1)
• X2 = 0 minimizes ν̂_{C12→X2}(X2) + ν̂_{C23→X2}(X2)
• X3 = 0 minimizes ν̂_{C23→X3}(X3)
• Optimal solution: X1 = 1, X2 = 0, X3 = 0
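
The whole example can be reproduced in a few lines. The following sketch (my own code, assuming NumPy, and hard-wired to this particular chain rather than written as a general message-passing engine) performs one forward and one backward sweep with messages normalized so that their smallest entry is 0, as in the slides.

```python
# Min-sum message passing on the chain X1 - C12 - X2 - C23 - X3,
# using the constraint tables from the example above.
import numpy as np

C12 = np.array([[2, 3],   # rows: X1 = 0, 1; columns: X2 = 0, 1
                [1, 2]])
C23 = np.array([[1, 4],   # rows: X2 = 0, 1; columns: X3 = 0, 1
                [2, 2]])

def normalize(m):
    return m - m.min()    # shift so that the smallest message entry is 0

# Forward sweep (left to right).
nu_X1_C12 = np.zeros(2)                                              # leaf variable: (0, 0)
nuhat_C12_X2 = normalize((C12 + nu_X1_C12[:, None]).min(axis=0))     # (0, 1)
nu_X2_C23 = normalize(nuhat_C12_X2)                                  # (0, 1)
nuhat_C23_X3 = normalize((C23 + nu_X2_C23[:, None]).min(axis=0))     # (0, 2)

# Backward sweep (right to left).
nu_X3_C23 = np.zeros(2)                                              # leaf variable: (0, 0)
nuhat_C23_X2 = normalize((C23 + nu_X3_C23[None, :]).min(axis=1))     # (0, 1)
nu_X2_C12 = normalize(nuhat_C23_X2)                                  # (0, 1)
nuhat_C12_X1 = normalize((C12 + nu_X2_C12[None, :]).min(axis=1))     # (1, 0)

# Decode each variable from its incoming constraint-to-variable messages.
x1 = int(np.argmin(nuhat_C12_X1))                                    # 1
x2 = int(np.argmin(nuhat_C12_X2 + nuhat_C23_X2))                     # 0
x3 = int(np.argmin(nuhat_C23_X3))                                    # 0
print(x1, x2, x3)   # -> 1 0 0, the optimal solution from the slides
```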

Slide 38

Lifted MSMP Algorithm: Finding an MWVC on the CCG
• Treat the MWVC problem on the CCG as a WCSP and apply the MSMP algorithm to it.
• Messages are simplified and are passed between adjacent vertices.

Slide 39

Experimental Evaluation: Setup
• Use the same benchmark instances as before.
• Solutions are reported even if the MSMP algorithms do not terminate within 5 minutes.
• Optimal solutions are computed using toulbar2 (Hurley et al. 2016) or integer linear programming.
• Experiments were performed on a GNU/Linux workstation with an Intel Xeon E3-1240 v3 processor (8 MB cache, 3.4 GHz) and 16 GB of RAM.

Slide 40

Experimental Evaluation: Results — Solution Quality
[Scatter plots of the lifted MSMP solution quality (horizontal axis, log scale) versus the original MSMP solution quality (vertical axis, log scale).]
(a) Benchmark instances from the UAI 2014 Inference Competition: 126/9/18 above/below/close to the diagonal dashed line.
(b) Benchmark instances from (Hurley et al. 2016): 222/68/19 above/below/close to the diagonal dashed line.

Slide 41

Experimental Evaluation: Results — Solution Quality
[Histogram: number of instances, grouped by the relative gap (MSMP solution − optimal solution) / optimal solution into the bins 0, <10%, ≥10% and <20%, ≥20% and <30%, >30%, for the lifted and original MSMP algorithms.]
UAI 2014 Inference Competition: comparing the quality of the MSMP solutions with the optimal solutions.

Slide 42

Experimental Evaluation: Results — Solution Quality
[Histogram: number of instances, grouped by the relative gap (MSMP solution − optimal solution) / optimal solution into the bins 0, <10%, ≥10% and <20%, ≥20% and <30%, >30%, for the lifted and original MSMP algorithms.]
Benchmark instances from (Hurley et al. 2016): comparing the quality of the MSMP solutions with the optimal solutions.

Slide 43

Experimental Evaluation: Results — Convergence
(Xu et al. 2017, Tab. 1)

Benchmark Instance Set            Neither   Both   Original   Lifted
UAI 2014 Inference Competition      25        4      124        0
(Hurley et al. 2016)               258        7       44        0

• Neither: neither of the MSMP algorithms terminates in 5 minutes.
• Both: both of the MSMP algorithms terminate in 5 minutes.
• Original: only the original MSMP algorithm terminates in 5 minutes.
• Lifted: only the lifted MSMP algorithm terminates in 5 minutes.

Slide 44

Agenda
• The Weighted Constraint Satisfaction Problem (WCSP)
• The Constraint Composite Graph (CCG)
• Computational Techniques Facilitated by the CCG
  • The Nemhauser-Trotter (NT) Reduction
  • Min-Sum Message Passing (MSMP)
• Conclusion

Slide 45

Conclusion
• The NT reduction on the CCG is effective for many benchmark instances.
  • The NT reduction could determine the optimal values of all variables, without search, for about 1/8 of the benchmark instances.
• We revived the MSMP algorithm for solving the WCSP by applying it to the WCSP's CCG instead of its original form.
  • The lifted MSMP algorithm generally produced significantly better solutions than the original MSMP algorithm.
  • The lifted MSMP algorithm produced solutions close to optimal for a large fraction of the benchmark instances.
  • However, the lifted MSMP algorithm is less advantageous in terms of convergence.
• (Future work) Both MSMP algorithms can easily be adapted to distributed settings.

Slide 46

References I
Barry Hurley, Barry O'Sullivan, David Allouche, George Katsirelos, Thomas Schiex, Matthias Zytnicki, and Simon de Givry. "Multi-language evaluation of exact solvers in graphical model discrete optimization". In: Constraints 21.3 (2016), pp. 413-434.
Vladimir Kolmogorov. Primal-dual Algorithm for Convex Markov Random Fields. Tech. rep. MSR-TR-2005-117. Microsoft Research, 2005.
T. K. Satish Kumar. "A framework for hybrid tractability results in Boolean weighted constraint satisfaction problems". In: the International Conference on Principles and Practice of Constraint Programming. Springer, 2008, pp. 282-297.
Isabel Milho, Ana Fred, Jorge Albano, Nuno Baptista, and Paulo Sena. "An Auxiliary System for Medical Diagnosis Based on Bayesian Belief Networks". In: Portuguese Conference on Pattern Recognition. 2000.
Nicola Muscettola, P. Pandurang Nayak, Barney Pell, and Brian C. Williams. "Remote Agent: to boldly go where no AI system has gone before". In: Artificial Intelligence 103.1-2 (1998), pp. 5-47.

Slide 47

References II
Hong Xu, T. K. Satish Kumar, and Sven Koenig. "The Nemhauser-Trotter Reduction and Lifted Message Passing for the Weighted CSP". In: the 14th International Conference on Integration of Artificial Intelligence and Operations Research Techniques in Constraint Programming (CPAIOR). 2017.
Jonathan S. Yedidia, William T. Freeman, and Yair Weiss. "Understanding belief propagation and its generalizations". In: Exploring Artificial Intelligence in the New Millennium 8 (2003), pp. 236-239.
Matthias Zytnicki, Christine Gaspin, and Thomas Schiex. "DARN! A Weighted Constraint Solver for RNA Motif Localization". In: Constraints 13.1 (2008), pp. 91-109.