
Ami Wiesel (Hebrew University of Jerusalem)

S³ Seminar, October 17, 2022

Title — Deep learning solutions to estimation and detection

Abstract — In this talk, we will discuss the use of deep learning in statistical signal processing. We will address settings in which the classical solutions are intractable and will propose modern approaches based on neural networks. We will begin with parameter estimation and focus on learning non-linear minimum variance unbiased estimators (MVUE). Next, we will switch to detection theory and focus on learning classifiers with constant false alarm rates (CFAR). In both settings, we provide deep learning methods that achieve these goals in practice, as well as theory that highlights the relations to the classical likelihood-based solutions.

References
Learning to estimate without bias https://arxiv.org/pdf/2110.12403.pdf

CFARnet: deep learning for target detection with constant false alarm rate https://arxiv.org/pdf/2208.02474.pdf

Biography — Ami Wiesel received the B.Sc. and M.Sc. degrees in electrical engineering from Tel-Aviv University, Tel-Aviv, Israel, in 2000 and 2002, respectively, and the Ph.D. degree in electrical engineering from the Technion – Israel Institute of Technology, Haifa, Israel, in 2007. He was a postdoctoral fellow at the University of Michigan, Ann Arbor, USA, from 2007 to 2009. He is currently an Associate Professor at the Rachel and Selim Benin School of Computer Science and Engineering, Hebrew University of Jerusalem, Israel.

Transcript

  1. Deep learning solutions to estimation and detection
     Ami Wiesel, The Hebrew University of Jerusalem (HUJI), October 17, 2022

  2. Thanks
     ▶ Tzvi Diskin ▶ Yiftach Beer ▶ Yoav Wald ▶ Uri Ukon ▶ Yonina Eldar ▶ Google

  3. 2002 vs 2022
     2002: ▶ Model ▶ Parameter estimation ▶ Hypothesis testing ▶ Algorithms
     2022: ▶ (synthetic) Data ▶ Regression ▶ Classification ▶ Neural networks

  4. Parameter estimation: 2002 vs 2022
     2002: ▶ Model ▶ Maximum Likelihood ▶ Inference was slow ▶ Asymptotically unbiased ▶ Cramér–Rao bound for all parameter values
     2022: ▶ (synthetic) Data ▶ Regression ▶ Inference is fast ▶ Fitted on training set ▶ Best if train = test

  5. Estimation metrics
     ▶ Classical metrics:
       $\mathrm{BIAS}_{\hat{y}}(y) = E[\hat{y}(x) \mid y] - y$
       $\mathrm{VAR}_{\hat{y}}(y) = E\left[\|\hat{y}(x) - E[\hat{y}(x)]\|^2 \mid y\right]$
       $\mathrm{MSE}_{\hat{y}}(y) = E\left[\|\hat{y}(x) - y\|^2 \mid y\right] = \mathrm{VAR}_{\hat{y}}(y) + \|\mathrm{BIAS}_{\hat{y}}(y)\|^2$
     ▶ Bayesian metric:
       $\mathrm{BMSE}_{\hat{y}} = E[\mathrm{MSE}_{\hat{y}}(y)]$

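The bias–variance decomposition above is easy to verify numerically. A minimal Monte Carlo sanity check, using a toy shrinkage estimator of my own choosing (the estimator and all names here are illustrative, not from the talk):

```python
# Check MSE(y) = VAR(y) + ||BIAS(y)||^2 for y_hat(x) = 0.8*x, x ~ N(y, 1).
import numpy as np

rng = np.random.default_rng(0)
y = 2.0                                  # true parameter
x = rng.normal(y, 1.0, size=1_000_000)   # observations x ~ N(y, 1)
y_hat = 0.8 * x                          # a deliberately biased estimator

bias = y_hat.mean() - y                  # BIAS(y) = E[y_hat] - y   (about -0.4)
var = y_hat.var()                        # VAR(y)  = E||y_hat - E[y_hat]||^2
mse = np.mean((y_hat - y) ** 2)          # MSE(y)  = E||y_hat - y||^2

print(mse, var + bias ** 2)              # agree up to Monte Carlo error
```
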
  6. Parameter estimation
     ▶ Classical approaches:
       $\min_{\hat{y}(\cdot)} \mathrm{MSE}_{\hat{y}}(y)$ (crossed out on the slide: no estimator minimizes the MSE uniformly in $y$)
     ▶ Minimum Variance Unbiased Estimation (MVUE): minimize the variance subject to $\mathrm{BIAS}_{\hat{y}}(y) = 0 \;\; \forall y$
     ▶ Maximum Likelihood is asymptotically MVUE
     ▶ Bayesian approach: minimize the BMSE, $\min_{\hat{y}(\cdot)} E[\mathrm{MSE}_{\hat{y}}(y)]$
     Learning is Bayesian with respect to the training set.

  7. Bias Constrained Estimation (BCE)
     ▶ Standard learning with $D_N = \{y_i, x_i\}_{i=1}^N$:
       $\min_{\hat{y} \in \mathcal{H}} \hat{E}_N\left[\|\hat{y}(x) - y\|^2\right]$
     ▶ BCE: penalize the average squared bias, with $D_{NM} = \left\{y_i, \{x_{ij}\}_{j=1}^M\right\}_{i=1}^N$:
       $\min_{\hat{y} \in \mathcal{H}} \hat{E}_{NM}\left[\|\hat{y}(x) - y\|^2\right] + \lambda \, \hat{E}_N\left[\left\|\hat{E}_M\left[\hat{y}(x) - y \mid y\right]\right\|^2\right]$

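The BCE objective is a one-line change to a standard regression loss. A minimal PyTorch sketch of my reading of the slide's formula (the `model` interface, shapes, and variable names are assumptions, not the authors' code):

```python
# BCE loss: empirical MSE plus lambda times the average squared per-y bias.
import torch

def bce_loss(model, x, y, lam):
    """x: (N, M, d), M observations per parameter y_i; y: (N, p)."""
    N, M, d = x.shape
    y_hat = model(x.reshape(N * M, d)).reshape(N, M, -1)  # (N, M, p)
    err = y_hat - y.unsqueeze(1)                          # y_hat(x_ij) - y_i
    mse = (err ** 2).sum(dim=-1).mean()                   # E_NM ||y_hat - y||^2
    bias = err.mean(dim=1)                                # E_M [y_hat - y | y_i]
    bias_penalty = (bias ** 2).sum(dim=-1).mean()         # E_N ||bias(y_i)||^2
    return mse + lam * bias_penalty
```
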
  8. Collecting a BCE dataset $D_{NM}$
     Synthetic data / data augmentation:
     ▶ Choose a fictitious prior $p_{\mathrm{fake}}(y)$
     ▶ Generate $\{y_i\}_{i=1}^N$
     ▶ For each $y_i$ generate $\{x_j(y_i)\}_{j=1}^M$
     [Khobahi, Gabrielli, Naimipour, Dreifuerst, ...]

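A sketch of this data-collection recipe, instantiated on a generic linear-Gaussian model; the prior, the model, and the dimensions are illustrative placeholders I chose, not the talk's settings:

```python
# Generate a BCE dataset: N parameters from a fake prior, M observations each.
import numpy as np

rng = np.random.default_rng(0)
N, M, p, d = 1000, 16, 3, 8
H = rng.normal(size=(d, p))                      # a fixed synthetic model matrix

ys = rng.uniform(-1.0, 1.0, size=(N, p))         # y_i ~ p_fake(y)
xs = np.stack([
    y @ H.T + rng.normal(size=(M, d))            # M draws x_j(y_i) ~ p(x; y_i)
    for y in ys
])                                               # xs.shape == (N, M, d)
```
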
  9. Minimum Variance Unbiased Estimator (MVUE)
     Theorem: Under technical conditions, BCE is asymptotically MVUE.
     ▶ Maximum Likelihood is also asymptotically MVUE; BCE approximates it using deep learning.
     ▶ Asymptotically in everything!
     ▶ Note that we penalize the average bias (rather than the max).
     ▶ Asymptotically, achieves the Cramér–Rao bound for any parameter value.

 10. BCE with linear architecture
     Theorem: $\hat{y} = Ax$ with
     $A = \hat{E}_{NM}\left[yx^T\right] \left( \frac{1}{\lambda+1} \hat{E}_{NM}\left[xx^T\right] + \left(1 - \frac{1}{\lambda+1}\right) R \right)^{-1}$,
     $R = \hat{E}_N\left[ \hat{E}_M\left[x \mid y\right] \hat{E}_M\left[x^T \mid y\right] \right]$.
     Compare to the Bayesian linear MMSE (linear regression):
     $A = E_{NM}\left[yx^T\right] \left( E_{NM}\left[xx^T\right] \right)^{-1}$.

 11. BCE with linear architecture and linear model
     Theorem: $\hat{y} = Ax$ with $x = Hy + n$ and
     $A = \left( H^T \hat{\Sigma}_x^{-1} H + \frac{1}{\lambda+1} \hat{\Sigma}_y^{-1} \right)^{-1} H^T \hat{\Sigma}_x^{-1}$.
     Compare to the Weighted Least Squares estimator (= MVUE, by the Gauss–Markov theorem):
     $A = \left( H^T \hat{\Sigma}_x^{-1} H \right)^{-1} H^T \hat{\Sigma}_x^{-1}$.

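Both matrices are in closed form, so the theorem can be checked numerically: as $\lambda$ grows, the prior term vanishes and the linear BCE matrix converges to the WLS/MVUE matrix. A minimal sketch with synthetic matrices of my choosing:

```python
# Linear BCE vs WLS: the gap shrinks as the bias penalty lambda grows.
import numpy as np

rng = np.random.default_rng(0)
d, p = 8, 3
H = rng.normal(size=(d, p))
Sx = np.eye(d) * 0.5          # noise covariance estimate (Sigma_x)
Sy = np.eye(p) * 2.0          # covariance of y (Sigma_y)

def A_bce(lam):
    return np.linalg.solve(H.T @ np.linalg.inv(Sx) @ H
                           + np.linalg.inv(Sy) / (lam + 1.0),
                           H.T @ np.linalg.inv(Sx))

A_wls = np.linalg.solve(H.T @ np.linalg.inv(Sx) @ H, H.T @ np.linalg.inv(Sx))
for lam in (0.0, 10.0, 1e6):
    print(lam, np.abs(A_bce(lam) - A_wls).max())   # max gap shrinks with lambda
```
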
 12. Experiment: SNR estimation
     My MSc with Messer in 2002.
     $x_i = a_i h + n_i$, with $a_i = \pm 1$ w.p. $\frac{1}{2}$, $n_i \sim \mathcal{N}(0, \sigma^2)$, and $\rho = \frac{h^2}{\sigma^2}$.
     MMSE is best on the training distribution; BCE is always near the MLE (EM).

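A sketch of this SNR-estimation model as a data generator (function and parameter names are illustrative):

```python
# x_i = a_i*h + n_i with random signs a_i; the target SNR is rho = h^2/sigma^2.
import numpy as np

rng = np.random.default_rng(0)

def sample_snr_batch(h, sigma, n_samples):
    a = rng.choice([-1.0, 1.0], size=n_samples)       # a_i = +/-1 w.p. 1/2
    x = a * h + sigma * rng.normal(size=n_samples)    # observations
    return x, h ** 2 / sigma ** 2                     # samples and true rho

x, rho = sample_snr_batch(h=1.0, sigma=0.5, n_samples=256)
```
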
 13. Experiment: covariance estimation
     Structured covariance [Chaudhuri], $x \sim \mathcal{N}(0, \Sigma)$ with
     $$\Sigma = \begin{pmatrix}
       1+y_1 & 0 & 0 & \frac{1}{2}y_6 & 0 \\
       0 & 1+y_2 & 0 & \frac{1}{2}y_7 & 0 \\
       0 & 0 & 1+y_3 & 0 & \frac{1}{2}y_8 \\
       \frac{1}{2}y_6 & \frac{1}{2}y_7 & 0 & 1+y_4 & \frac{1}{2}y_9 \\
       0 & 0 & \frac{1}{2}y_8 & \frac{1}{2}y_9 & 1+y_5
     \end{pmatrix}$$
     MMSE is best on the training distribution; BCE is always near the MVUE.

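A sketch constructing this structured covariance from the nine-dimensional parameter $y$; the index mapping follows the matrix above, and the small parameter range is my assumption to keep $\Sigma$ diagonally dominant and hence positive definite:

```python
# Build Sigma(y): diagonal 1 + y_1..y_5, sparse off-diagonals y_6..y_9 over 2.
import numpy as np

def sigma_of(y):
    S = np.diag(1.0 + y[:5])                        # diagonal entries
    for (i, j), k in {(0, 3): 5, (1, 3): 6, (2, 4): 7, (3, 4): 8}.items():
        S[i, j] = S[j, i] = 0.5 * y[k]              # symmetric off-diagonals
    return S

rng = np.random.default_rng(0)
y = rng.uniform(0.0, 0.5, size=9)                   # keeps Sigma positive definite
x = rng.multivariate_normal(np.zeros(5), sigma_of(y))
```
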
 14. BCE for averaging at test time
     ▶ Example: sensor networks.
     ▶ Example: test-time augmentation.
     Unbiasedness is necessary for consistent averaging: zero-mean errors cancel in the average, while a systematic bias survives it (see the sketch below). BCE is asymptotically unbiased.

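A toy NumPy illustration of this point, with numbers of my own choosing: averaging many unbiased estimates converges to the truth, while a shared bias persists no matter how many estimates are averaged.

```python
# Averaging K estimates: unbiased errors cancel, a systematic bias does not.
import numpy as np

rng = np.random.default_rng(0)
y, K = 1.0, 10_000
unbiased = y + rng.normal(0.0, 1.0, size=K)        # zero-mean errors
biased = 0.8 * y + rng.normal(0.0, 1.0, size=K)    # systematic 20% shrinkage

print(unbiased.mean())   # -> approximately 1.0
print(biased.mean())     # -> approximately 0.8, for any K
```
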
 15. Experiment: augmentation at test time
     ▶ CIFAR10
     ▶ Random cropping and flipping
     ▶ Soft labels via distillation
     ▶ Both in train and in test
     ▶ BCE outperforms MMSE
     [Krizhevsky, Simonyan, Han, ...]

 16. Fairness literature
     ▶ Related to "fairness" and "out-of-distribution generalization".
     ▶ Invariant Risk Minimization (IRM) by Arjovsky
     ▶ Calibration and OOD by Wald, and more...
     ▶ We protect the labels themselves rather than the environments.

 17. Hypothesis testing: 2002 vs 2022
     2002: ▶ Model ▶ Likelihood Ratio Test ▶ Neyman–Pearson ▶ Constant false alarm rate
     2022: ▶ (synthetic) Data ▶ Classification ▶ Minimum probability of error ▶ Works well if train = test

 18. Simple hypothesis testing
     Goal: $x \sim p(x; y)$, $y \in \{0, 1\}$. Design a detector $T(x) \gtrless \gamma$ that maximizes
     $P_{\mathrm{TPR}} = P(T(x) > \gamma; y = 1)$ subject to a false alarm constraint on
     $P_{\mathrm{FPR}} = P(T(x) > \gamma; y = 0)$.
     LRT = classifier: optimal and easy to learn,
     $T_{\mathrm{LRT}}(x) = 2 \log \frac{p(x; y=1)}{p(x; y=0)}$.
     Easy to learn as a Bayes optimal classifier (see the sketch below). Can also optimize AUC-ROC, e.g., Herschtal, Brefeld, etc.

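A minimal PyTorch sketch of "the LRT is easy to learn": with balanced classes and a cross-entropy loss, the optimal logit of a classifier equals the log-likelihood ratio. The toy model $p(x; y) = \mathcal{N}(2y, 1)$ and the network are my assumptions for illustration:

```python
# Train a classifier; its logit should approximate log p(x;1)/p(x;0).
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

for _ in range(2000):
    y = torch.randint(0, 2, (256, 1)).float()      # balanced labels
    x = torch.randn(256, 1) + 2.0 * y              # x | y ~ N(2y, 1)
    opt.zero_grad()
    loss_fn(net(x), y).backward()
    opt.step()

# For N(2y, 1), the true log-likelihood ratio is 2x - 2, so the learned
# logit net(x) should be close to 2x - 2 on typical inputs.
```
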
 19. Composite hypothesis testing
     $x \sim p(x; z)$, with
     $y = 0: z \in \mathcal{Z}_0$ (noise only), $y = 1: z \in \mathcal{Z}_1$ (target).
     (Ill-posed) Goal: design a detector $T(x) \gtrless \gamma$ that maximizes
     $P_{\mathrm{TPR}}(z) = P(T(x) > \gamma; z \in \mathcal{Z}_1)$ subject to a constant false alarm rate (CFAR) constraint on
     $P_{\mathrm{FPR}}(z) = P(T(x) > \gamma; z)$ for all $z \in \mathcal{Z}_0$.

 20. Generalized Likelihood Ratio Test (GLRT)
     ▶ GLRT is the standard approach:
       $T_{\mathrm{GLRT}}(x) = 2 \log \frac{\max_{z \in \mathcal{Z}_1} p(x; z)}{\max_{z \in \mathcal{Z}_0} p(x; z)}$
     ▶ Pros: under regular asymptotic conditions,
       $T_{\mathrm{GLRT}}(x) \overset{\mathrm{asymp}}{\sim} \chi_r^2(0)$ under $y = 0$ and $\chi_r^2(\lambda)$ under $y = 1$,
       and it has a constant false alarm rate (CFAR).
     ▶ Cons: requires a likelihood, inner optimizations, and asymptotic arguments.

 21. Learning to detect targets
     Learning detectors (a code sketch follows this slide):
     ▶ Choose $p_{\mathrm{fake}}(y)$ and $p_{\mathrm{fake}}(z; y)$.
     ▶ For each $i = 1, \ldots, N$: generate $y_i$, generate $z_i$ given $y_i$, generate $x_i$ given $z_i$.
     ▶ Solve $\min_{\hat{T} \in \mathcal{T}} \frac{1}{N} \sum_{i=1}^N L(\hat{T}(x_i), y_i)$.
     References: Ziemann, Kucer and Theiler, Girard, De La Mata-Moya, and many more...

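A minimal PyTorch sketch of this recipe, instantiated on the unknown-mean, unknown-variance example of the next slide; the choices of $p_{\mathrm{fake}}$, the network, and all names are my assumptions, not the authors' settings:

```python
# Draw y, then nuisance z given y, then x given z; fit a detector by ERM.
import torch
import torch.nn as nn

def sample_batch(n):
    y = torch.randint(0, 2, (n, 1)).float()          # y ~ p_fake(y)
    A = (1.0 + torch.rand(n, 1)) * y                 # target amplitude, z | y
    sigma = 0.5 + torch.rand(n, 1)                   # unknown noise level, z | y
    x = A + sigma * torch.randn(n, 16)               # x | z (16 samples per trial)
    return x, y

T_hat = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(T_hat.parameters(), lr=1e-3)
for _ in range(1000):
    x, y = sample_batch(256)
    opt.zero_grad()
    nn.functional.binary_cross_entropy_with_logits(T_hat(x), y).backward()
    opt.step()
```
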
 22. Learning to detect targets is easy
     ▶ Also in composite hypotheses (unlike estimation).
     ▶ Example: target detection in Gaussian noise with unknown variance,
       $x_i = A + \sigma n_i$, $i = 1, \ldots, N$,
       where $A \neq 0$ and $\sigma$ are deterministic and unknown.

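For comparison with the learned detector, the GLRT of slide 20 has a closed form in this example. A short sketch of a standard derivation (not spelled out on the slide): the $H_0$ MLE of the variance is $\hat{\sigma}_0^2 = \frac{1}{N}\sum_i x_i^2$, the $H_1$ MLEs are $\hat{A} = \bar{x}$ and $\hat{\sigma}_1^2 = \frac{1}{N}\sum_i (x_i - \bar{x})^2$, and plugging into the Gaussian likelihood gives $T_{\mathrm{GLRT}}(x) = N \log(\hat{\sigma}_0^2 / \hat{\sigma}_1^2)$:

```python
# Closed-form GLRT for detecting a nonzero mean in noise of unknown variance.
import numpy as np

def t_glrt(x):
    s0 = np.mean(x ** 2)   # H0 variance MLE (mean forced to zero)
    s1 = np.var(x)         # H1 variance MLE around the sample mean
    return len(x) * np.log(s0 / s1)
```
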
 23. Learning CFAR detectors
     CFARnet:
     ▶ Choose $p_{\mathrm{fake}}(y)$ and $p_{\mathrm{fake}}(z; y)$.
     ▶ For each $i = 1, \ldots, N$: generate $y_i$, generate $z_i$ given $y_i$, generate $x_i$ given $z_i$.
     ▶ Solve $\min_{\hat{T} \in \mathcal{T}} \frac{1}{N} \sum_{i=1}^N L(\hat{T}(x_i), y_i) + \alpha \hat{R}(\hat{T})$, where
       $\hat{R}(\hat{T}) = \sum_{i, \tilde{i} \text{ under } y=0} d\left( \{\hat{T}(x_{ij})\}_{j=1}^M ;\; \{\hat{T}(\tilde{x}_{\tilde{i}j})\}_{j=1}^M \right)$
       ensures that $\hat{T}$ has the same distribution under all $z_i$.

 24. Learning CFAR detectors II
     CFAR penalty:
     $\hat{R}(\hat{T}) = \sum_{i, \tilde{i} \text{ under } y=0} d\left( \{\hat{T}(x_{ij})\}_{j=1}^M ;\; \{\hat{T}(\tilde{x}_{\tilde{i}j})\}_{j=1}^M \right)$
     ▶ $d$ is a differentiable distance between distributions.
     ▶ We use the MMD of Gretton et al.:
       $d_{\mathrm{MMD}} = \frac{1}{N^2} \sum_{i,j} k(X_i, X_j) + \frac{1}{N^2} \sum_{i,j} k(Y_i, Y_j) - \frac{2}{N^2} \sum_{i,j} k(X_i, Y_j)$
     ▶ Can also use a GAN-like loss.

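A minimal sketch of this penalty between detector outputs under two noise-only parameter values, assuming a Gaussian kernel and 1-D outputs (the function name and fixed bandwidth are my choices):

```python
# Biased MMD^2 between two 1-D samples, with an RBF kernel.
import torch

def mmd2(a, b, bandwidth=1.0):
    def k(u, v):
        return torch.exp(-(u[:, None] - v[None, :]) ** 2 / (2 * bandwidth ** 2))
    return k(a, a).mean() + k(b, b).mean() - 2 * k(a, b).mean()

# Penalize distribution mismatch of T_hat under two z's with y = 0, e.g.:
# r = mmd2(T_hat(x_i).squeeze(-1), T_hat(x_j).squeeze(-1))
```
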
 25. Detection in i.i.d. noise with unknown variance
     ▶ Gaussian noise ▶ non-Gaussian noise
     [Performance plots shown on the slide.]

 26. Detection in correlated noise
     ▶ Gaussian noise covariance estimated using secondary data.
     ▶ Adaptive Matched Filter (AMF):
       $x = As + w_0$, $x_i = w_i$, $i = 1, \ldots, n$, $w_0, w_i \sim \mathcal{N}(0, \Sigma)$,
       $T_{\mathrm{AMF}}(x) = \frac{\left| s^T \hat{\Sigma}^{-1} x \right|^2}{s^T \hat{\Sigma}^{-1} s}$, $\hat{\Sigma} = \frac{1}{n} \sum_{i=1}^n w_i w_i^T$.
     ▶ Diagonally loaded version (LAMF) for regularization: $\hat{\Sigma} + \lambda I$.

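A NumPy sketch of the AMF statistic above, with diagonal loading giving the LAMF variant (function and argument names are illustrative):

```python
# AMF statistic |s' Sigma^-1 x|^2 / (s' Sigma^-1 s) from secondary data.
import numpy as np

def t_amf(x, s, W, diag_load=0.0):
    """x: (d,) test vector, s: (d,) steering vector, W: (n, d) secondary data."""
    Sigma = W.T @ W / W.shape[0] + diag_load * np.eye(len(s))  # Sigma_hat (+ lambda*I)
    Si_x = np.linalg.solve(Sigma, x)
    Si_s = np.linalg.solve(Sigma, s)
    return (s @ Si_x) ** 2 / (s @ Si_s)
```
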
 27. CFARnet in correlated noise
     ▶ LAMF, NET, and CFARnet are better than AMF.
     ▶ Unlike CFARnet, LAMF and NET are highly non-CFAR.

 28. Real hyperspectral data
     ▶ Pavia University dataset.
     ▶ 10 labeled materials.
     ▶ Partial AUC over false alarm rates in (0, 0.05):

     material    net    CFARnet
     unlabeled   0.49   0.47
     1           0.31   0.38
     2           0.74   0.77
     3           0.33   0.35
     4           0.69   0.73
     5           0.27   0.34
     6           0.49   0.53
     7           0.47   0.72
     8           0.41   0.49
     9           0.88   0.90

 29. How is this related to the classics?
     Roughly speaking:
     ▶ Simple tests: LRT = Bayes optimal classifier.
     ▶ Composite tests: GLRT = Bayes + CFAR.
     $\mathrm{BayesCFAR}: \min_{\hat{T}, \gamma} \Pr\left(\mathbb{1}_{\hat{T} \geq \gamma} \neq y\right) \ \text{s.t.} \ \hat{T} \ \text{is CFAR}$
     Exact equivalence requires assumptions...

 30. GLRT solves BayesCFAR
     $\mathrm{BayesCFAR}: \min_{\hat{T}, \gamma} \Pr\left(\mathbb{1}_{\hat{T} \geq \gamma} \neq y\right) \ \text{s.t.} \ \hat{T} \ \text{is CFAR}$
     Theorem: Consider an asymptotic linear Gaussian model with a large enough $\sigma_r^2$. Then there exists a threshold $\gamma$ such that the GLRT solves BayesCFAR.
     ▶ Linear model $x = H z_r + n$.
     ▶ The noise covariance is parameterized arbitrarily by $z_n$.
     ▶ CFARnet approximates it using deep learning.

 31. Fairness literature
     ▶ CFARnet is very similar to methods in the fairness literature.
     ▶ The setting is slightly different.
     ▶ CFARnet is non-symmetric.
     ▶ CFARnet is cheaper in our settings (1D MMD).

 32. Conclusions
     ▶ Everyone is switching to deep learning.
     ▶ But don't forget the classics.
     ▶ To make a regressor closer to the MLE/MVUE, add a bias penalty.
     ▶ To make a classifier closer to the GLRT, add a CFAR penalty.
     ▶ Thank you!