Slide 1

A New Approximation Guarantee for Monotone Submodular Function Maximization via Discrete Convexity
Tasuku Soma (UTokyo), joint work with Yuichi Yoshida (NII)
ICALP '18 @ Prague

Slide 2

Overview

Slide 3

Monotone submodular maximization
f : 2^V → R+ monotone submodular, f(∅) = 0
max f(X) sub. to |X| ≤ k

Slide 4

Monotone submodular maximization
f : 2^V → R+ monotone submodular, f(∅) = 0
max f(X) sub. to |X| ≤ k

Previous work
• NP-hard
• (1 − 1/e)-approx by greedy [Nemhauser–Wolsey '78]
• (1 − 1/e)-approx is tight [Feige '98]
• many applications in ML and marketing
• extensions to matroid, knapsack, etc.
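The greedy algorithm behind the (1 − 1/e) guarantee above fits in a few lines. A minimal sketch; the coverage function and its sets here are an illustrative toy of ours, not an example from the talk:

```python
# Plain greedy for max f(X) s.t. |X| <= k: repeatedly add the element with
# the largest marginal gain. For monotone submodular f this is a
# (1 - 1/e)-approximation [Nemhauser-Wolsey '78].
def greedy(f, V, k):
    X = frozenset()
    for _ in range(k):
        # element maximizing f(X + i) - f(X)
        best = max((i for i in V if i not in X), key=lambda i: f(X | {i}) - f(X))
        X = X | {best}
    return X

# f(X) = number of ground elements covered by the chosen sets (monotone submodular)
sets = {0: {1, 2, 3}, 1: {3, 4}, 2: {5}, 3: {1, 5}}
f = lambda X: len(set().union(*(sets[i] for i in X)) if X else set())
print(sorted(greedy(f, sets.keys(), 2)))
```

On this tiny instance greedy happens to reach the optimal coverage value.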

Slide 5

Curvature
curvature [Conforti–Cornuéjols '84]
c = 1 − min_{i∈V} f(i | V − i)/f(i), where f(i | V − i) = f(V) − f(V − i)
Note: c = 0 ⇐⇒ f is linear

Slide 6

Curvature
curvature [Conforti–Cornuéjols '84]
c = 1 − min_{i∈V} f(i | V − i)/f(i), where f(i | V − i) = f(V) − f(V − i)
Note: c = 0 ⇐⇒ f is linear

Previous work
• (1 − e^{−c})/c-approx [Conforti–Cornuéjols '84]
• (1 − c/e)-approx [Sviridenko–Vondrák–Ward '17]
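The curvature definition above can be evaluated by brute force. A sketch on a toy coverage function of our own choosing, printing c and the two curvature-dependent guarantees just cited:

```python
import math

# Total curvature c = 1 - min_i f(i | V - i) / f(i) of a monotone submodular f.
def curvature(f, V):
    V = frozenset(V)
    return 1 - min((f(V) - f(V - {i})) / f(frozenset({i})) for i in V)

# Toy coverage function (our own illustrative instance).
sets = {0: {1, 2}, 1: {2, 3}, 2: {4}}
f = lambda X: len(set().union(*(sets[i] for i in X)) if X else set())
c = curvature(f, sets.keys())
print(c)                          # curvature of this instance
print((1 - math.exp(-c)) / c)     # Conforti-Cornuejols-style guarantee
print(1 - c / math.e)             # Sviridenko-Vondrak-Ward-style guarantee
```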

Slide 7

Is curvature unsatisfactory?
ex. f(X) = √|X| ... c = 1 − O(1/√n)
ex. f(X) = min{|X|, n − 1} ... c = 1

Slide 8

Is curvature unsatisfactory?
ex. f(X) = √|X| ... c = 1 − O(1/√n)
ex. f(X) = min{|X|, n − 1} ... c = 1
curvature predicts (1 − 1/e)-approx, but greedy finds an optimal solution!
curvature is pessimistic
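The two examples above can be checked numerically. A small sketch (with n = 100 as an arbitrary choice); both functions depend only on |X|, and greedy is trivially optimal for them, yet their curvature is close to or exactly 1:

```python
import math

def curvature(f, V):
    # total curvature c = 1 - min_i f(i | V - i) / f(i)
    V = frozenset(V)
    return 1 - min((f(V) - f(V - {i})) / f(frozenset({i})) for i in V)

n = 100
V = range(n)
f_sqrt = lambda X: math.sqrt(len(X))      # concave in |X|: greedy is optimal
f_trunc = lambda X: min(len(X), n - 1)    # truncation: greedy is optimal too
print(curvature(f_sqrt, V))   # 1 - (sqrt(n) - sqrt(n-1)), i.e. 1 - O(1/sqrt(n))
print(curvature(f_trunc, V))  # exactly 1, the worst possible value
```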

Slide 9

Towards better approx guarantee
curvature = “closeness” to linear functions
(figure: linear func ⊂ submod func f)

Slide 10

Towards better approx guarantee
curvature = “closeness” to linear functions
(figure: linear func ⊂ M♮-concave func ⊂ submod func f)

Slide 11

M♮-concave function [Murota–Shioura '99]
f : 2^V → R is M♮-concave ⟺ (def) ∀X, Y ⊆ V, i ∈ X − Y, either
f(X) + f(Y) ≤ f(X − i) + f(Y + i), or
∃j ∈ Y − X s.t. f(X) + f(Y) ≤ f(X − i + j) + f(Y + i − j)

Slide 12

M♮-concave function [Murota–Shioura '99]
f : 2^V → R is M♮-concave ⟺ (def) ∀X, Y ⊆ V, i ∈ X − Y, either
f(X) + f(Y) ≤ f(X − i) + f(Y + i), or
∃j ∈ Y − X s.t. f(X) + f(Y) ≤ f(X − i + j) + f(Y + i − j)
• Tractable subclass of submod func
• Greedy finds a maximizer
• (weighted) matroid rank func, gross substitutes, etc.
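The exchange axiom above can be verified by exhaustive search on a tiny ground set. A sketch of ours: a uniform-matroid rank function (M♮-concave, as matroid rank functions are) passes, while a small coverage function we constructed violates the axiom (submodular but not M♮-concave):

```python
from itertools import combinations

def is_Mnat_concave(f, V):
    """Brute-force check of the M♮-concave exchange axiom (exponential: tiny V only):
    for all X, Y and i in X - Y, either f(X)+f(Y) <= f(X-i)+f(Y+i), or
    there is j in Y - X with f(X)+f(Y) <= f(X-i+j)+f(Y+i-j)."""
    V = frozenset(V)
    subsets = [frozenset(S) for r in range(len(V) + 1) for S in combinations(V, r)]
    for X in subsets:
        for Y in subsets:
            for i in X - Y:
                if f(X) + f(Y) <= f(X - {i}) + f(Y | {i}):
                    continue
                if any(f(X) + f(Y) <= f((X - {i}) | {j}) + f((Y | {i}) - {j})
                       for j in Y - X):
                    continue
                return False
    return True

rank2 = lambda X: min(len(X), 2)     # uniform-matroid rank: M♮-concave
covers = {0: {'a', 'b'}, 1: {'a'}, 2: {'b'}}
cover3 = lambda X: len(set().union(*(covers[i] for i in X)) if X else set())
print(is_Mnat_concave(rank2, range(4)), is_Mnat_concave(cover3, range(3)))
```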

Slide 13

Our result
• Define a new quantity γ for closeness to M♮-concave func
• γ ≤ c (always better than curvature)
• (under some cond) (1 − γ/e)-approx algorithm

Slide 14

Curvature and M♮-concave curvature

Slide 15

Curvature and decomposition
ℓ(X) := Σ_{i∈X} f(i | V − i) =⇒ f = g + ℓ (g: monotone nonnegative submod)
g: hard part; ℓ: easy part

Slide 16

Curvature and decomposition
ℓ(X) := Σ_{i∈X} f(i | V − i) =⇒ f = g + ℓ (g: monotone nonnegative submod)
g: hard part; ℓ: easy part
c: curvature =⇒ (1 − c)·f(X) ≤ ℓ(X) ≤ f(X)
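The decomposition and its bound can be checked exhaustively on a small instance. A sketch on a toy coverage function of our own (the symbol `ell` stands for the linear part ℓ above):

```python
from itertools import combinations

# Decomposition f = g + ell with ell(X) = sum_{i in X} f(i | V - i)
# (easy, linear part) and g = f - ell (hard, still submodular part).
sets = {0: {1, 2}, 1: {2, 3}, 2: {4}}   # toy coverage function of ours
V = frozenset(sets)
f = lambda X: len(set().union(*(sets[i] for i in X)) if X else set())

ell = lambda X: sum(f(V) - f(V - {i}) for i in X)
g = lambda X: f(X) - ell(X)

c = 1 - min((f(V) - f(V - {i})) / f(frozenset({i})) for i in V)
nonempty = [frozenset(S) for r in range(1, len(V) + 1) for S in combinations(V, r)]
# the bound (1 - c) * f(X) <= ell(X) <= f(X), plus nonnegativity of the hard part
ok = all((1 - c) * f(X) <= ell(X) <= f(X) and g(X) >= 0 for X in nonempty)
print(c, ok)
```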

Slide 17

The M♮-concave curvature
Definition. Given a decomposition f = g + h, where
g: monotone nonnegative submod (hard part),
h: nonnegative M♮-concave (easy part),
γ(g, h) := 1 − min_{X⊆V} h(X)/f(X)

Slide 18

The M♮-concave curvature
Definition. Given a decomposition f = g + h, where
g: monotone nonnegative submod (hard part),
h: nonnegative M♮-concave (easy part),
γ(g, h) := 1 − min_{X⊆V} h(X)/f(X)

Remarks
• γ(g, h) depends on the decomposition
• γ(g, h) may be difficult to compute
• No obvious way to find a nontrivial decomposition
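The definition of γ above can be evaluated by brute force on small instances. A sketch of ours (the minimum skips the empty set, where f(∅) = 0); taking h to be the linear part ℓ recovers γ(g, ℓ) ≤ c:

```python
from itertools import combinations

def gamma(f, h, V):
    """gamma(g, h) = 1 - min over nonempty X of h(X)/f(X), by brute force
    (exponential in |V|; the empty set is skipped since f(empty) = 0)."""
    V = frozenset(V)
    subsets = (frozenset(S) for r in range(1, len(V) + 1) for S in combinations(V, r))
    return 1 - min(h(X) / f(X) for X in subsets)

sets = {0: {1, 2}, 1: {2, 3}, 2: {4}}   # toy coverage function of ours
V = frozenset(sets)
f = lambda X: len(set().union(*(sets[i] for i in X)) if X else set())

# with the trivial linear part h = ell, gamma(g, ell) <= c always holds
ell = lambda X: sum(f(V) - f(V - {i}) for i in X)
c = 1 - min((f(V) - f(V - {i})) / f(frozenset({i})) for i in V)
print(gamma(f, ell, V), c)
```

A richer M♮-concave easy part h can only make γ smaller, which is exactly the leverage the new guarantee exploits.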

Slide 19

Algorithm

Slide 20

Algorithm
Theorem. g: nonnegative monotone submod, h: nonnegative M♮-concave, f = g + h. Then ∃ polytime alg s.t. for any ε > 0, it finds a (1 − γ/e − ε)-approx solution.
Remarks
• Generalization of the (1 − c/e)-approx [Sviridenko–Vondrák–Ward '17], where h(X) = Σ_{i∈X} f(i | V − i)
• continuous greedy + ellipsoid
• requires value oracles for g, h

Slide 21

Algorithm overview
original: max {f(X) : |X| ≤ k}

Slide 22

Algorithm overview
original: max {f(X) : |X| ≤ k}
↓ cont relax
relaxation: max {F(x) : x ∈ P(k)}

Slide 23

Algorithm overview
original: max {f(X) : |X| ≤ k}
↓ cont relax
relaxation: max {F(x) : x ∈ P(k)}
↓ improved cont greedy [SVW '17]
x ∈ P(k) s.t. G(x) ≥ (1 − 1/e)g(O), h̄(x) ≥ h(O) (O: opt)

Slide 24

Algorithm overview
original: max {f(X) : |X| ≤ k}
↓ cont relax
relaxation: max {F(x) : x ∈ P(k)}
↓ improved cont greedy [SVW '17]
x ∈ P(k) s.t. G(x) ≥ (1 − 1/e)g(O), h̄(x) ≥ h(O) (O: opt)
↓ rounding
X : |X| ≤ k s.t. g(X) ≥ (1 − 1/e)g(O), h(X) ≥ h(O)

Slide 25

Two extensions of set functions f = g + h

Slide 26

Two extensions of set functions f = g + h
hard part: use the multilinear extension G(x) = E_{X∼D(x)}[g(X)]
• g: submod =⇒ G: concave along nonnegative directions

Slide 27

Two extensions of set functions f = g + h
hard part: use the multilinear extension G(x) = E_{X∼D(x)}[g(X)]
• g: submod =⇒ G: concave along nonnegative directions
easy part: use the concave closure h̄(x) = max_{µ: E[1_X]=x} E_{X∼µ}[h(X)]
• h̄ is polyhedral concave
• h: M♮-concave =⇒ h̄ is polytime computable, and a polytime separation oracle for h̄(x) ≥ β exists [Shioura '09]

Slide 28

Algorithm (cont. form)
Guess α = g(O), β = h(O).
dx(t)/dt = v(t), x(0) = 0, where v(t) is a solution of
v⊤∇G(x(t)) ≥ α, h̄(v) ≥ β, Σ_{i∈V} v(i) ≤ k
(figure: trajectory x(0) → x(t) → x(1) moving in direction v(t))

Slide 29

Algorithm (cont. form)
Guess α = g(O), β = h(O).
dx(t)/dt = v(t), x(0) = 0, where v(t) is a solution of
v⊤∇G(x(t)) ≥ α, h̄(v) ≥ β, Σ_{i∈V} v(i) ≤ k
(figure: trajectory x(0) → x(t) → x(1) moving in direction v(t))
Remarks
• v must exist (v = 1_O is always a solution)
• Can solve in polytime (we have a separation oracle for h̄(v) ≥ β)
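The continuous dynamics above can be discretized. What follows is a heavily simplified sketch of ours: it maximizes the multilinear relaxation G over {x : Σ x(i) ≤ k} with a sampled gradient, but omits the LP/ellipsoid step that constrains the direction v by v⊤∇G ≥ α and h̄(v) ≥ β, so it shows only the SVW-style continuous-greedy skeleton, not the talk's full algorithm:

```python
import random

def continuous_greedy(f, V, k, steps=50, samples=100):
    # Discretized continuous greedy over {x : sum_i x(i) <= k}.
    V = list(V)
    x = {i: 0.0 for i in V}

    def grad(i):
        # dG/dx(i) = E[f(R + i) - f(R)], R sampled with marginals x
        total = 0.0
        for _ in range(samples):
            R = frozenset(j for j in V if j != i and random.random() < x[j])
            total += f(R | {i}) - f(R)
        return total / samples

    for _ in range(steps):
        # move by 1/steps along the vertex given by the k largest gradients
        for i in sorted(V, key=grad, reverse=True)[:k]:
            x[i] = min(1.0, x[i] + 1.0 / steps)
    return x

random.seed(0)
sets = {0: {1, 2, 3}, 1: {3, 4}, 2: {5}, 3: {1, 5}}   # toy coverage function
f = lambda X: len(set().union(*(sets[i] for i in X)) if X else set())
x = continuous_greedy(f, sets.keys(), 2)
print(sorted(x.items()))
```

The resulting fractional x stays in the cardinality polytope P(k) and would then be handed to the rounding stage.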

Slide 30

Algorithm overview
original: max {f(X) : |X| ≤ k}
↓ cont relax
relaxation: max {F(x) : x ∈ P(k)}
↓ improved cont greedy [SVW '17]
x ∈ P(k) s.t. G(x) ≥ (1 − 1/e)g(O), h̄(x) ≥ h(O) (O: opt)
↓ rounding
X : |X| ≤ k s.t. g(X) ≥ (1 − 1/e)g(O), h(X) ≥ h(O)

Slide 31

Rounding algorithm
Goal: round x ∈ [0, 1]^V to X ⊆ V (|X| ≤ k) s.t.
• g(X) ≥ G(x) (preserving multilinear extension)
• h(X) ≥ h̄(x) (preserving concave extension)

Slide 32

Rounding algorithm
Goal: round x ∈ [0, 1]^V to X ⊆ V (|X| ≤ k) s.t.
• g(X) ≥ G(x) (preserving multilinear extension)
• h(X) ≥ h̄(x) (preserving concave extension)
Extend swap rounding [Chekuri–Vondrák–Zenklusen '10] to the M♮-concave constraint:
1: Find a convex decomposition h̄(x) = Σ_{i=1}^n λi h(Yi) (|Yi| = k)
2: Apply “swap operations” to Y1, ..., Yn and obtain X

Slide 33

Proof of main theorem
We obtain X s.t. |X| ≤ k, E[g(X)] ≥ (1 − 1/e)g(O), h(X) ≥ h(O) by rounding.
E[f(X)] ≥ (1 − 1/e)g(O) + h(O)
= (1 − 1/e)f(O) + h(O)/e
≥ (1 − 1/e)f(O) + (1 − γ)/e · f(O)
= (1 − γ/e)f(O).

Slide 34

Proof of main theorem
We obtain X s.t. |X| ≤ k, E[g(X)] ≥ (1 − 1/e)g(O), h(X) ≥ h(O) by rounding.
E[f(X)] ≥ (1 − 1/e)g(O) + h(O)
= (1 − 1/e)f(O) + h(O)/e
≥ (1 − 1/e)f(O) + (1 − γ)/e · f(O)
= (1 − γ/e)f(O).
Note
• the cont greedy part applies to any solvable, down-closed polytope as well
• h can be nonmonotone

Slide 35

Other results
Require a nontrivial decomposition f = g + h to beat the curvature guarantee. Unfortunately, we cannot say much in the general case ...
• Some heuristics for finding decompositions of quadratic submod func f(X) = 1_X⊤ A 1_X + b⊤ 1_X using ultrametric fitting.
• Some examples with nontrivial decompositions.

Slide 36

Our result
• Define a new quantity γ for closeness to M♮-concave func
• γ ≤ c (always better than curvature)
• (under some cond) (1 − γ/e)-approx algorithm
Future work
• Matroid constraint
• Interesting applications in which M♮-concavity significantly improves approximation

Slide 37

Lemma. x := x(1) satisfies G(x) ≥ (1 − 1/e)g(O), h̄(x) ≥ h(O).

Slide 38

Lemma. x := x(1) satisfies G(x) ≥ (1 − 1/e)g(O), h̄(x) ≥ h(O).
Proof.
Analysis of cont greedy: G(x) ≥ (1 − 1/e)α = (1 − 1/e)g(O).
Jensen's inequality: h̄(x(1)) = h̄(∫₀¹ v(t) dt) ≥ ∫₀¹ h̄(v(t)) dt ≥ β = h(O).

Slide 39

Swap operation
Let h̄(x) = λ1 h(Y1) + λ2 h(Y2). For i ∈ Y1 − Y2, ∃j s.t.
h(Y1) + h(Y2) ≤ h(Y1 − i + j) + h(Y2 + i − j)

Slide 40

Swap operation
Let h̄(x) = λ1 h(Y1) + λ2 h(Y2). For i ∈ Y1 − Y2, ∃j s.t.
h(Y1) + h(Y2) ≤ h(Y1 − i + j) + h(Y2 + i − j)
p = λ1/(λ1 + λ2)
W.p. p: Y2 ← Y2 + i − j; w.p. 1 − p: Y1 ← Y1 − i + j

Slide 41

Swap operation
Let h̄(x) = λ1 h(Y1) + λ2 h(Y2). For i ∈ Y1 − Y2, ∃j s.t.
h(Y1) + h(Y2) ≤ h(Y1 − i + j) + h(Y2 + i − j)
p = λ1/(λ1 + λ2)
W.p. p: Y2 ← Y2 + i − j; w.p. 1 − p: Y1 ← Y1 − i + j
h̄ preserved:
E[h̄(x′)] − h̄(x) = λ1λ2/(λ1 + λ2) · [h(Y1 − i + j) − h(Y1) + h(Y2 + i − j) − h(Y2)] ≥ 0.
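The expectation computation above can be checked numerically on one swap step. A sketch with toy values of our own, taking h to be a uniform-matroid rank function (M♮-concave):

```python
# One swap step for x = lam1 * 1_{Y1} + lam2 * 1_{Y2}; h, the sets, and the
# weights below are illustrative choices, not values from the talk.
h = lambda X: min(len(X), 2)               # uniform-matroid rank: M♮-concave
Y1, Y2 = frozenset({0, 1}), frozenset({2, 3})
lam1, lam2 = 0.6, 0.4
i = 0                                      # some i in Y1 - Y2
# find j in Y2 - Y1 satisfying the exchange inequality
j = next(j for j in Y2 - Y1
         if h(Y1) + h(Y2) <= h((Y1 - {i}) | {j}) + h((Y2 | {i}) - {j}))
p = lam1 / (lam1 + lam2)
before = lam1 * h(Y1) + lam2 * h(Y2)       # value of the decomposition before
# w.p. p:     Y2 <- Y2 + i - j;  w.p. 1 - p: Y1 <- Y1 - i + j
after = (p * (lam1 * h(Y1) + lam2 * h((Y2 | {i}) - {j}))
         + (1 - p) * (lam1 * h((Y1 - {i}) | {j}) + lam2 * h(Y2)))
print(before, after)                       # the expected value does not decrease
```

Expanding `after - before` gives exactly the λ1λ2/(λ1 + λ2) factor from the slide, which is why the exchange inequality makes the step an increase in expectation.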

Slide 42

Swap rounding
G is also preserved!
zt = Σi λi 1_{Yi}: vector after t swaps (t = 0, 1, ...).
z0 = x; zt+1 − zt has at most 1 positive coordinate and at most 1 negative.
E[zt+1 | zt] = zt.

Slide 43

Swap rounding
G is also preserved!
zt = Σi λi 1_{Yi}: vector after t swaps (t = 0, 1, ...).
z0 = x; zt+1 − zt has at most 1 positive coordinate and at most 1 negative.
E[zt+1 | zt] = zt.
Lemma ([Chekuri–Vondrák–Zenklusen '10])
zt: vector random process satisfying the above conditions,
G: multilinear extension of a monotone submod func
=⇒ E[G(zt)] ≥ G(x) (t = 0, 1, ...).