Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Filip Elvander

S³ Seminar
November 07, 2024

Filip Elvander

(Aalto University)

Title — Optimal transport for linear inverse problems: spectral estimation and system identification.

Abstract — In this talk, I will present recent developments on our work on optimal transport for linear inverse problems. Optimal transport (OT) offers a powerful framework for measuring distances between distributions of mass, displaying remarkable robustness to perturbations and with attractive ability to induce interpolants. Although theory and algorithms for OT have predominantly been designed for the direct case, i.e., when one has access to the distributions themselves, there has been recent work on extending the OT ideas also to inverse problems. This opens up for opportunities in leveraging OT in signal processing applications. In particular, we will in this talk see how OT can be applied to problems of broad-band spatio-temporal spectral problems and system identification.

Bio
Filip Elvander (Member, IEEE) received the M.Sc. degree in industrial engineering and management and the Ph.D. degree in mathematical statistics from Lund University, Lund, Sweden, in 2015 and 2020, respectively. He has been a Postdoctoral Research Fellow with the Stadius Center for Dynamical Systems, Signal Processing and Data Analytics, KU Leuven, Leuven, Belgium, and with the Research Foundation – Flanders (FWO). He is currently an Assistant Professor of signal processing with the Department of Information and Communications Engineering, Aalto University, Finland. He is a Member of the EURASIP Technical Area Committee on Signal and Data Analytics for Machine Learning. His research interests include inverse problems, robust estimation, and convex modeling and approximation techniques in statistical signal processing and spectral analysis.

S³ Seminar

November 07, 2024
Tweet

More Decks by S³ Seminar

Other Decks in Research

Transcript

  1. 0. Structured and Stochastic Modeling Group My background • PhD

    in mathematical statistics from Lund University 2020. • 2020-2021, postdoc at KU Leuven. • 2021-2022, postdoctoral fellow with FWO (Belgian research council) and KU Leuven. • Assistant prof. in signal processing at Aalto University since 2022. Theme of group • Statistical signal processing (in very wide sense) • Applications in localization, audio processing, spectroscopy. PhD students at Aalto University • Anton Bj¨ orkman, • Linda Fabiani, • Rumeshika Pallewela, • Yuyang Liu. PhD student at Lund University • David Sundstr¨ om (main supervisor prof. Andreas Jakobsson).
  2. 0. Distance measures in learning • The standard distance measures

    we use are based on the ”y-axis”. • Works great in most applications, but troublesome in some. • Spectral estimation: what are sensible ways of describing spectral perturbations? • Φ(ω)+ε: no problem. • Φ(ω +ε): immediate breakdown. • Vanishing gradients due to saturation, robust recovery impossible,... Optimal (mass) transport offers an alternative with many powerful properties. Shifting a narrow-band spectrum -3 -2 -1 0 1 2 3 100 0.4 0.45 0.5 0.55 0.6 c 100 L 2 distance KL divergence
  3. 0. Optimal mass transport: ingredients • Two spaces, X, Y,

    and two distributions µX ∈ M+(X) and µY ∈ M+(Y). • What is the cheapest way of rearranging µX to µY ? • The effort of rearrangement is measured in terms of a ”ground cost”, c : X ×Y → R+. • Example: X = Y = R3 and c(x,y) = ∥x−y∥2 2 . Image from C. Villani, Optimal Transport - Old and New (Springer, 2009)
  4. 0. Optimal mass transport: Monge Define the distance between µX

    and µY as S(µX ,µY ) ≜ min T:X→Y ˆ X c(x,T(x))dµX (x) , s.t. T#µX = µY . • x ∈ X is mapped to T(x) ∈ Y with ground cost c(x,T(x)). • The amount of mass at x is dµX (x). • T rearranges µX into µY (T#µX = µY ). • Drawbacks: (1) difficult nonlinear problem, (2) Cannot split masses Image from C. Villani, Optimal Transport - Old and New (Springer, 2009)
  5. 0. Kantorovich formulation Instead of a mapping T : X

    → Y, look for a coupling m ∈ M+(X ×Y): minimize m ˆ c(x,y)dm(x,y) , s.t. ˆ dm(·,y) = dµX , ˆ dm(x,·) = dµY . • Equivalent to Monge problem in many interesting cases. • Allows for mass splitting • Convex problem! Figure: m = µX ⊗ µY . Figure: Minimizing m for c(x,y) = (x−y)2.
  6. 0. Induced geometry • The space of distributions (e.g., spectra)

    M+(X) inherits properties of X via the ”ground cost” c. • Possible to define geodesics on M+(X). 1 0 0.5 3 2 1 0.02 0 0 -1 -2 -3 0.04 0.06 0.08
  7. 0. Linear inverse problems In most setups, we only have

    imperfect knowledge: y = A(µ)+w , G linear operator. • Temporal spectral estimation: y covariance function, G Fourier transform. • Spatial spectral estimation: y array covariance, G incorporating array steering vectors. • System identification: y output, G convolution operator, (µ not non-negative in this case). Solving inverse problems with OT regularization: minimize µ 1 2 ∥y−G(µ)∥2 2 +S(µ) , S modeling transport. • In many useful situations, S is convex in µ. • Induce smoothness (in support) for sequential observations, • Averaging for ensembles of observations, • Shrinkage towards a prior µ0, • ...
  8. 0. Tracking with dynamics (rad) 5 4 3 t (s)

    2 1 0 0 6 4 5 2 3 1 Point-wise Capon. 0 1 2 3 4 5 t (s) 2 4 6 (rad) 0 0.5 1 0 1 2 3 4 5 t (s) -1 0 1 v (rad/s) 0 0.5 1 OT regularization.
  9. 0. Spatial barycenter -2 -1.5 -1 -0.5 0 0.5 1

    1.5 x-position -2 -1.5 -1 -0.5 0 0.5 1 1.5 y-position array 1 array 2 array 1, orientation array 1, assumed orientation sources 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 MUSIC. -2 -1.5 -1 -0.5 0 0.5 1 1.5 x-position -2 -1.5 -1 -0.5 0 0.5 1 1.5 y-position array 1 array 2 array 1, orientation array 1, assumed orientation sources 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 OT barycenter.
  10. 0. Clustered transport • If target sources are broad-band, the

    corresponding to spatio-temporal spectra are sparse in location but not frequency. • Easy to include in transport formulation: promote group-sparse flow of mass. • Potential benefit in information sharing between carriers. 0.5 1 1.5 2 2.5 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 Temporal spectra. t = 1 0.5 1 1.5 2 2.5 -80 -60 -40 -20 0 20 40 60 80 t = 2 0.5 1 1.5 2 2.5 -80 -60 -40 -20 0 20 40 60 80 t = 3 0.5 1 1.5 2 2.5 -80 -60 -40 -20 0 20 40 60 80 t = 4 0.5 1 1.5 2 2.5 -80 -60 -40 -20 0 20 40 60 80 t = 5 0.5 1 1.5 2 2.5 -80 -60 -40 -20 0 20 40 60 80 Group-sparse OT. t = 1 0.5 1 1.5 2 2.5 -80 -60 -40 -20 0 20 40 60 80 t = 2 0.5 1 1.5 2 2.5 -80 -60 -40 -20 0 20 40 60 80 t = 3 0.5 1 1.5 2 2.5 -80 -60 -40 -20 0 20 40 60 80 t = 4 0.5 1 1.5 2 2.5 -80 -60 -40 -20 0 20 40 60 80 t = 5 0.5 1 1.5 2 2.5 -80 -60 -40 -20 0 20 40 60 80 Group lasso.
  11. 0. Tracking with hydrophone data Temporal spectrum. 0.1 0.2 0.3

    0.4 0.5 0.6 0.7 0.8 0.9 1 Spatial spectrum.
  12. 0. Multi-marginal formulation All these problems are extensions of the

    multi-marginal transport formulation minimize m∈M+(X) ˆ X m(x)c(x)dx+ 1 2 T ∑ t=1 ∥G (Pt(m))−yt∥2 2 • x = (x1,x2,...,xT ). • There are T margins, each being an element of M+(X), with ground-space X. • X = XT := X ×X ×...×X • Projection on each margin, Pt : M+(X) → M+(X). • The way that the ground-cost c is defined determines the interpretation. • In practice, we discretize the space X.
  13. 0. Discretization and entropy regularization Primal problem minimize M ⟨M,C⟩+

    1 2 T ∑ t=1 ∥GPt(M)−yt∥2 2 +εD(M) • With n discretization points in each margin, M and C are tensors in RnT := Rn×n×...×n. • D(M) = ∑i1 ,...,iT [M]i1 ,...,iT log[M]i1 ,...,iT −1 +1. • Impossible to solve directly, but the entropy term leads to a nice dual problem. Dual problem maximize λ λ λt ∈RN ,t=1,...,T −ε⟨U,K⟩− 1 2 T ∑ t=1 ∥λ λ λt∥2 2 + T ∑ t=1 ⟨λ λ λt,yt⟩ • Here U = u1 ⊗u2 ⊗...⊗uT , with ut = exp 1 ε GT λ λ λt . K = exp −1 ε C . • The primal solution is M = U⊙K. • Dimension of dual is considerably smaller than that of the primal. • The dual can be solved efficiently, with computational bottleneck being computation of projections Pt (U⊙K). Often, structure in K can be exploited.
  14. 0. Identification of acoustic systems Input-output relationship for acoustic systems:

    y = x∗h+w • h ∈ RNh room impulse response (RIR). • h depends on relative source-receiver positions, room geometry, air temperature, etc. • Perturbations in the environment manifest as perturbations in the delay path, i.e., support of h. • Seeing an RIR as a set of tuples, h” = ” (h2 k ,τk) k , transport can be used to re-arrange the energy distribution over time. • For example, we can use a crude geometry-based h0 as a ”prior” when estimating h from y.
  15. 0. RIR estimation Letting h0 be the prior RIR, we

    can define S(·,h0) : RNh → R as S(h,h0) = min M∈R Nh×Nh + ⟨C,M⟩+εD(M) s.t. M1Nh = h2 0 , MT 1Nh ≥ h2. • D(M) = ∑i,j[M]i,j log[M]i,j −1 +1, entropic term. • S convex in its first argument. We can estimate h as ˆ h = arg min h∈RN h 1 2 ||y−h∗x||2 2 +ηS(h,h0). • Minimization by forward-backward splitting. • Entropy term D(M) makes for (fairly) easy proximal operator. • Accuracy can be tuned by ε, but with price of slower convergence.
  16. 0. RIR estimation Regularizing with S(h,h0), we expect more robustness

    than with standard metrics. • E.g., ∥h−h0∥p only causes bias when support has changed. 16 18 20 22 24 Temperature, °C -12 -10 -8 -6 -4 -2 0 2 NMSE (dB) Simulated RIR with temperature perturbations. 70 80 90 100 110 120 0 5 10 Amplitude 10-3 OT. 70 80 90 100 110 120 0 5 10 Amplitude 10-3 ℓ1 .
  17. 0. Summary • OT offers a fundamentally different way of

    performing comparison and quantifying distances, as compared to standard metrics. • Bulk of work done has been in cases where the marginals/distributions are known or directly observable ⇒ many opportunities from a signal processing perspective. • Very flexible and modular framework. • Challenges: computational complexity and extensions to transport between non-positive objects.