Slide 1

Slide 1 text

ON THE EDGE: STATE TOMOGRAPHY, BOUNDARIES, AND MODEL SELECTION
Travis L. Scholten, Center for Computing Research, Sandia National Laboratories
CQuIC Talk, University of New Mexico, December 2, 2015
Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000.

Slide 2

Slide 2 text

I am interested in tomography, the characterization of quantum systems.

Slide 3

Slide 3 text

Characterizing a system means estimating/inferring something about it. Tomography is a statistical inference problem: an experiment with a POVM $\{E_j\}$ produces data, and inference turns that data into estimates $\hat{\rho}$ via an estimator (estimates are the "things with hats").

Slide 4

Slide 4 text

The “best” estimator is very accurate. We seek high accuracy relative to an unknown truth, and there are many measures of accuracy: quantum fidelity, relative entropy, trace distance, Hilbert-Schmidt distance.

Slide 5

Slide 5 text

The “ideal” estimator would be very accurate… and would not fit noise in the data. “Ideal” is impossible to achieve!

Slide 6

Slide 6 text

We do not have the truth, only data. How can we be accurate and fit well? Model selection: find the best model, then fit the parameters in that model.

Slide 7

Slide 7 text

People already use model selection in quantum information… but are they justified in doing so?
Schwarz & van Enk (2013), "Error models in quantum computation: an application of model selection"
Guta et al. (2012), "Rank-based model selection for multiple ions quantum tomography"
van Enk & Blume-Kohout (2013), "When quantum tomography goes wrong: drift of quantum sources and other errors"

Slide 8

Slide 8 text

Model selection techniques currently used in quantum tomography, such as loglikelihood ratio tests or Akaike’s AIC, may have problems.

Slide 9

Slide 9 text

Quantum information makes connections to statistical inference in many ways. A model is a parametrized family of probability distributions; a hypothesis is a point in the model. Probabilities come from the Born rule, $\Pr(E) = \mathrm{tr}(\rho E)$, so a hypothesis corresponds to a state, $H \leftrightarrow \rho$, and a model to a set of states, $M \leftrightarrow \{\rho_1, \rho_2, \cdots\}$.

Slide 10

Slide 10 text

Given some data, the plausibility of a model or hypothesis is quantified by its likelihood: what probability does it assign to the data actually seen? For a hypothesis, just compute it: $\mathcal{L}(H) = \Pr(\mathrm{Data}\,|\,H)$. For a model, just maximize: $\mathcal{L}(M) = \max_{H \in M} \Pr(\mathrm{Data}\,|\,H)$. We use likelihoods to compare models/hypotheses and to make estimates.
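As a toy illustration of "just compute it" versus "just maximize", here is a minimal sketch; the binomial experiment, the counts, and the names are hypothetical, not from the talk.

```python
# Minimal sketch: likelihood of a hypothesis vs. a model, for a single
# two-outcome measurement repeated N times (hypothetical numbers).
import numpy as np
from scipy.optimize import minimize_scalar

n_clicks, n_shots = 430, 1000          # observed data (illustrative)

def log_likelihood(p):
    """log Pr(Data | outcome probability p), binomial model."""
    return n_clicks * np.log(p) + (n_shots - n_clicks) * np.log(1 - p)

# Hypothesis = a single point: just compute it.
L_H = log_likelihood(0.5)

# Model = the whole family p in (0, 1): just maximize.
res = minimize_scalar(lambda p: -log_likelihood(p),
                      bounds=(1e-9, 1 - 1e-9), method="bounded")
L_M = -res.fun

print(f"log L(H) = {L_H:.2f}, log L(M) = {L_M:.2f}")
```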

Slide 11

Slide 11 text

Quantum information makes connections to statistical inference in many ways. State discrimination is an instance of simple hypothesis testing: which state is it, $\rho$ or $\sigma$? Form the LLRS $\lambda(\rho, \sigma) = -2\log\left(\mathcal{L}(\rho)/\mathcal{L}(\sigma)\right)$ and choose the highest likelihood: $\sigma$ if $\lambda \geq 0$, $\rho$ if $\lambda < 0$. The Neyman-Pearson lemma tells us this is the most powerful test.
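A minimal sketch of this decision rule, assuming two fixed candidate qubit states and hypothetical counts from a single two-outcome POVM $\{E, I - E\}$; the states and numbers are illustrative only, with probabilities given by the Born rule $\Pr(E) = \mathrm{tr}(\rho E)$.

```python
# Sketch of simple hypothesis testing between two known qubit states.
import numpy as np

E = np.array([[1, 0], [0, 0]], dtype=complex)          # projector onto |0>
rho   = np.array([[0.9, 0.0], [0.0, 0.1]], dtype=complex)
sigma = np.array([[0.6, 0.0], [0.0, 0.4]], dtype=complex)

counts = {"E": 880, "not_E": 120}                       # hypothetical data

def log_L(state):
    p = np.real(np.trace(state @ E))                    # Born rule Pr(E) = tr(rho E)
    return counts["E"] * np.log(p) + counts["not_E"] * np.log(1 - p)

lam = -2 * (log_L(rho) - log_L(sigma))                  # lambda(rho, sigma)
print("choose rho" if lam < 0 else "choose sigma")      # highest likelihood wins
```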

Slide 12

Slide 12 text

Quantum information makes connections to statistical inference in many ways. Entanglement verification is an instance of composite hypothesis testing: which region is the state in, separable ($H_A$) or entangled ($H_B$)? Form $\lambda(H_A, H_B) = -2\log\left(\mathcal{L}(H_A)/\mathcal{L}(H_B)\right)$ and choose the highest likelihood: $H_B$ if $\lambda \geq 0$, $H_A$ if $\lambda < 0$.

Slide 13

Slide 13 text

Quantum information makes connections to statistical inference in many ways. State tomography is an instance of model fitting: which parameters are best? Maximum likelihood estimation chooses the highest likelihood: $\hat{\rho} = \mathrm{argmax}_{\rho}\, \mathcal{L}(\rho)$.
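A minimal MLE sketch, assuming a single qubit measured in the X, Y, and Z bases with hypothetical counts. The Bloch-vector parametrization and the SLSQP solver are illustrative choices, not the method used in the study described later in the talk.

```python
# Minimal MLE sketch: a qubit measured in the X, Y, Z bases (made-up counts),
# parametrized by its Bloch vector r with the positivity constraint |r| <= 1.
import numpy as np
from scipy.optimize import minimize

counts = {"X": (480, 520), "Y": (300, 700), "Z": (900, 100)}   # (+1, -1) outcomes

def neg_log_likelihood(r):
    total = 0.0
    for r_i, (n_plus, n_minus) in zip(r, [counts["X"], counts["Y"], counts["Z"]]):
        p = np.clip((1 + r_i) / 2, 1e-12, 1 - 1e-12)           # Born-rule probability
        total -= n_plus * np.log(p) + n_minus * np.log(1 - p)
    return total

positivity = {"type": "ineq", "fun": lambda r: 1 - np.dot(r, r)}   # rho >= 0
result = minimize(neg_log_likelihood, x0=np.zeros(3),
                  constraints=[positivity], method="SLSQP")
r_hat = result.x

pauli = [np.array([[0, 1], [1, 0]]), np.array([[0, -1j], [1j, 0]]),
         np.array([[1, 0], [0, -1]])]
rho_hat = 0.5 * (np.eye(2) + sum(r_i * s for r_i, s in zip(r_hat, pauli)))
print(np.round(rho_hat, 3))
```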

Slide 14

Slide 14 text

Quantum information makes connections to statistical inference in many ways. Testing "Z-diagonal state vs. not" is an instance of (nested) model selection: is the true state on the line of Z-diagonal states ($M_A$), or anywhere in the full state space ($M_B \supset M_A$)? Form $\lambda(M_A, M_B) = -2\log\left(\mathcal{L}(M_A)/\mathcal{L}(M_B)\right)$ and choose the highest likelihood… but for nested models the LLRS is never negative, so $\lambda \geq 0$ always points to $M_B$???

Slide 15

Slide 15 text

How will we investigate
 model selection
 in state tomography?

Slide 16

Slide 16 text

I am studying a paradigmatic problem: tomography of continuous-variable systems, i.e., optical modes of light, represented as Wigner functions or as density matrices $\rho$.

Slide 17

Slide 17 text

The models I consider are subspaces of an infinite-dimensional Hilbert space: $\mathcal{H}_d = \mathrm{Span}(|0\rangle, |1\rangle, \cdots, |d-1\rangle)$, with $M_d = \{\rho \mid \rho \in \mathcal{B}(\mathcal{H}_d),\ \mathrm{Tr}(\rho) = 1,\ \rho \geq 0\}$. These models are based on a low-energy approximation and on smoothness of the Wigner function. Other models are possible (e.g., by rank).
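A small sketch of what membership in $M_d$ means in practice: supported on the first $d$ Fock states, unit trace, and positive semidefinite. The helper name and tolerance are illustrative assumptions.

```python
# Sketch of the model M_d: density matrices on the span of the first d Fock
# states, with trace 1 and rho >= 0. Here we just test membership.
import numpy as np

def in_model(rho, d, tol=1e-9):
    """Is rho a valid state in M_d = {rho in B(H_d) : tr(rho) = 1, rho >= 0}?"""
    rho = np.asarray(rho)
    if rho.shape != (d, d):
        return False
    trace_ok  = abs(np.trace(rho) - 1) < tol
    hermitian = np.allclose(rho, rho.conj().T, atol=tol)
    positive  = np.min(np.linalg.eigvalsh((rho + rho.conj().T) / 2)) > -tol
    return trace_ok and hermitian and positive

print(in_model(np.diag([0.7, 0.3]), d=2))       # True
print(in_model(np.diag([1.2, -0.2]), d=2))      # False: negative eigenvalue
```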

Slide 18

Slide 18 text

The models I consider are nested inside one another: $M_d \subset M_{d+1}$. How can we use likelihoods to compare them? We have to tackle nested model selection.

Slide 19

Slide 19 text

We rethink using the LLRS for nested model selection, based on its null expected value, where $\lambda(M_d, M_{d'}) = -2\log\left(\mathcal{L}(M_d)/\mathcal{L}(M_{d'})\right)$ for nested models $M_d \subset M_{d'}$. Case One: the smaller model is false. (Figure: $\langle\lambda\rangle$ versus the number of samples $N$.)

Slide 20

Slide 20 text

We rethink using the LLRS for nested model selection, based on its null expected value, $\lambda(M_d, M_{d'}) = -2\log\left(\mathcal{L}(M_d)/\mathcal{L}(M_{d'})\right)$. Case Two: both models are true (the null case). (Figure: $\langle\lambda\rangle$ versus the number of samples $N$.)

Slide 21

Slide 21 text

We rethink using the LLRS for nested model selection, based on its null expected value. Devise a threshold to compare the observed value against and rule out the smaller model: $\lambda_{\mathrm{thresh}} \gtrsim \langle\lambda\rangle$.

Slide 22

Slide 22 text

We rethink using the LLRS for nested model selection, based on its null expected value. We compare the observed LLRS, $\lambda(M_d, M_{d'}) = -2\log\left(\mathcal{L}(M_d)/\mathcal{L}(M_{d'})\right)$, to its threshold value: choose $M_{d'}$ if $\lambda \geq \lambda_{\mathrm{thresh}}$, and keep $M_d$ otherwise.
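A sketch of how such a threshold test might be wired up, assuming we can simulate the null case by Monte Carlo. The 95th-percentile cutoff, the stand-in null distribution, and the observed value are purely illustrative assumptions; the talk's point is precisely that the real null distribution must come from simulating quantum fits, not from a textbook assumption.

```python
# Sketch of a threshold test for the LLRS built from simulated null-case samples.
import numpy as np

rng = np.random.default_rng(0)

def simulate_null_llrs(n_trials):
    # Placeholder: in practice, simulate data from a state in M_d, fit both
    # models, and record the LLRS. The chi-squared draw below is only a
    # stand-in so this snippet runs.
    return rng.chisquare(df=3, size=n_trials)

null_samples = simulate_null_llrs(10_000)
threshold = np.percentile(null_samples, 95)     # illustrative cutoff choice

lambda_observed = 11.2                          # hypothetical observed LLRS
reject_smaller_model = lambda_observed > threshold
print(f"threshold = {threshold:.2f}, reject M_d: {reject_smaller_model}")
```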

Slide 23

Slide 23 text

Asymptotic convergence of the LLRS is a consequence of the Wilks Theorem. In 1938, Wilks gave the distribution of the LLRS: $\lambda(M_d, M_{d'}) \sim \chi^2_{p_{d'} - p_d}$. The theorem rests on a Gaussian distribution of estimates, so one fluctuating parameter contributes one unit of LLRS.
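For reference, the Wilks prediction is easy to compute. This sketch assumes $p_d = d^2 - 1$ real parameters for a $d$-dimensional density matrix; the function name and the 95% level are illustrative choices.

```python
# What the Wilks Theorem would predict for nested state-tomography models.
from scipy.stats import chi2

def wilks_prediction(d_small, d_large, alpha=0.05):
    dof = (d_large**2 - 1) - (d_small**2 - 1)        # p_d' - p_d
    return {"dof": dof,
            "expected_llrs": dof,                    # mean of chi-squared_dof
            "threshold": chi2.ppf(1 - alpha, dof)}   # e.g. 95% cutoff

print(wilks_prediction(d_small=2, d_large=3))        # dof = 5
```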

Slide 24

Slide 24 text

Another model selection technique relies on this result. Information criteria trade off between fitting the data well and having high accuracy. Use of Akaike's AIC is common; it relies on the Wilks Theorem (through the bias of an estimator of the KL divergence). My work helps us create a quantum information criterion.
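A sketch of AIC-based selection over the nested models, assuming hypothetical maximized log-likelihood values and $p_d = d^2 - 1$ parameters per model; the `aic` helper and the numbers are illustrative, not output from this study.

```python
# Sketch of model selection with Akaike's AIC over nested models M_d.
import numpy as np

def aic(max_log_L, n_params):
    return -2 * max_log_L + 2 * n_params

# Hypothetical maximized log-likelihoods for M_2 ... M_5:
log_likelihoods = {2: -1523.4, 3: -1502.1, 4: -1500.8, 5: -1500.2}

scores = {d: aic(logL, d**2 - 1) for d, logL in log_likelihoods.items()}
best_d = min(scores, key=scores.get)
print(scores, "-> choose M_%d" % best_d)
```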

Slide 25

Slide 25 text

We now have potential tools for nested model selection
 in tomography.
 How do they perform?

Slide 26

Slide 26 text

I performed a Monte Carlo study of the LLRS and its behavior. Studied: - 17 true states (supported on low-energy subspace)

Slide 27

Slide 27 text

I performed a Monte Carlo study of the LLRS and its behavior. Studied: - 17 true states - 100 random datasets for each state (coherent state POVM), $\mathrm{Data} = \{\alpha_j \mid \alpha_j \in \mathbb{C},\ \Pr(\alpha_j) = \langle\alpha_j|\rho_{\mathrm{true}}|\alpha_j\rangle/\pi\}$

Slide 28

Slide 28 text

I performed a Monte Carlo study of the LLRS and its behavior. Studied: - 17 true states - 100 random datasets for each state (coherent state POVM) - 10K to 100K samples for each dataset

Slide 29

Slide 29 text

I performed a Monte Carlo study of the LLRS and its behavior. Studied: - 17 true states - 100 random datasets for each state (coherent state POVM) - 10K to 100K samples for each dataset - MLE over {2…10}-dimensional Hilbert spaces, $\{M_2, M_3, \cdots, M_{10}\}$. Lots of supercomputer time!
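A sketch of how such datasets could be generated: rejection sampling from the Husimi Q function $\Pr(\alpha) = \langle\alpha|\rho_{\mathrm{true}}|\alpha\rangle/\pi$. The disk radius, the 3-dimensional example state, and the helper names are assumptions, and truncating proposals to a disk neglects the exponentially small tails of Q; this is not the code used in the study.

```python
# Sketch: draw samples alpha_j from Pr(alpha) = <alpha|rho|alpha>/pi for a
# true state supported on the first d Fock states, via rejection sampling.
import numpy as np
from math import factorial

rng = np.random.default_rng(42)

def husimi_overlap(rho, alpha):
    """<alpha|rho|alpha> for rho supported on the first d Fock states."""
    d = rho.shape[0]
    fock_amps = np.array([np.exp(-abs(alpha)**2 / 2) * alpha**n / np.sqrt(factorial(n))
                          for n in range(d)])            # <n|alpha>
    return np.real(fock_amps.conj() @ rho @ fock_amps)

def sample_q(rho, n_samples, R=6.0):
    samples = []
    while len(samples) < n_samples:
        # propose alpha uniformly on a disk of radius R
        r, theta = R * np.sqrt(rng.uniform()), rng.uniform(0, 2 * np.pi)
        alpha = r * np.exp(1j * theta)
        if rng.uniform() < husimi_overlap(rho, alpha):    # accept w.p. <a|rho|a>
            samples.append(alpha)
    return np.array(samples)

rho_true = np.diag([0.6, 0.3, 0.1])                       # hypothetical true state
data = sample_q(rho_true, n_samples=1000)
print(len(data), "heterodyne-style samples")
```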

Slide 30

Slide 30 text

The results were puzzling.

Slide 31

Slide 31 text

I checked four predictions of the Wilks Theorem on the behavior of the LLRS. Only one matched. Predictions: asymptotic convergence; distribution independent of the truth; a particular expected value; distribution depends on the reconstruction dimension.

Slide 32

Slide 32 text

When the truth is in the smaller model, we observe asymptotic convergence. Wilks: expected value asymptotes. Reality: expected value asymptotes.

Slide 33

Slide 33 text

Monte Carlo averages and the Wilks expectation values do not agree at all. Wilks: expected value increases with reconstruction dimension. Reality: expected value is essentially constant.

Slide 34

Slide 34 text

Wilks theorem predictions for the distribution of the LLRS do not agree with simulation. Wilks: distribution independent of the true state. Reality: distribution depends on the true state.

Slide 35

Slide 35 text

Wilks theorem predictions for the distribution of the LLRS do not agree with simulation. Wilks: distribution depends strongly on dimension. Reality: distribution depends weakly on dimension.

Slide 36

Slide 36 text

Theorems are not “wrong”,
 only “not applicable”. Why does the Wilks Theorem not apply?

Slide 37

Slide 37 text

State tomography is
 on the edge. Let’s see why.

Slide 38

Slide 38 text

The first edge is the positivity constraint, $\hat{\rho} \geq 0$. This shows up a lot in quantum information. (Figure: estimates satisfying $\hat{\rho} \geq 0$ versus estimates with $\hat{\rho} \not\geq 0$.)

Slide 39

Slide 39 text

The first edge is the positivity constraint, $\hat{\rho} \geq 0$. This shows up a lot in quantum information. Positivity “piles up” estimates on the boundary → fluctuations normal to the boundary are diminished → the estimator is biased.

Slide 40

Slide 40 text

The second edge is that the models I use nest on the boundary of one another: a state of $M_d$, written as $\hat{\rho} = \begin{pmatrix} \rho_{00} & \rho_{01} & \rho_{02} & \cdots \\ \rho_{10} & \rho_{11} & \rho_{12} & \cdots \\ \rho_{20} & \rho_{21} & \rho_{22} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}$ inside $M_{d+1}$, has its added row and column equal to zero, so it is rank-deficient and sits on the boundary of the larger model.

Slide 41

Slide 41 text

When the true state is mixed, you avoid the first edge, but still run right into the second: boundaries are unavoidable in state tomography.

Slide 42

Slide 42 text

The Wilks Theorem cannot be applied on boundaries: they introduce constraints. (Wilks would predict $\lambda \sim \chi^2_1$ here.)

Slide 43

Slide 43 text

The Wilks Theorem cannot be applied on boundaries: they introduce constraints. Boundaries change the distribution of MLEs, which changes the distribution of the LLRS: instead of $\chi^2_1$, we get something like $\tfrac{1}{2}\chi^2_1$.
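A purely classical toy model (not the quantum simulation from the talk) that reproduces this boundary effect: one extra parameter $\mu$ constrained to $\mu \geq 0$, with the truth sitting on the boundary at $\mu = 0$. Wilks would predict $\chi^2_1$ (mean 1); the constraint gives a point mass at zero mixed with $\chi^2_1$ (mean 1/2).

```python
# Classical toy of a boundary: LLRS between {mu = 0} and {mu >= 0}, Gaussian
# data with known unit variance, truth on the boundary at mu = 0.
import numpy as np

rng = np.random.default_rng(1)
n, trials = 1_000, 20_000
llrs = np.empty(trials)

for t in range(trials):
    x = rng.normal(loc=0.0, scale=1.0, size=n)        # truth: mu = 0
    xbar = x.mean()
    mu_hat = max(xbar, 0.0)                           # MLE under the constraint mu >= 0
    llrs[t] = n * (xbar**2 - (xbar - mu_hat)**2)      # 2*(logL(mu_hat) - logL(0))

print("mean LLRS:", llrs.mean())                      # ~0.5, not the Wilks value of 1
print("fraction exactly 0:", np.mean(llrs == 0.0))    # ~0.5: estimates piled on the boundary
```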

Slide 44

Slide 44 text

State tomography is
 on the edge. So must our
 model selection be.

Slide 45

Slide 45 text

Proving a “qWilks theorem” would be hard, in general: the distribution of the LLRS depends on the true state, it depends on the dimension, and quantum state space is hard to reason about.

Slide 46

Slide 46 text

Can we find a replacement for the Wilks theorem which respects boundaries? Let’s talk about work in progress.

Slide 47

Slide 47 text

Can we find a replacement for the Wilks theorem which respects boundaries? The Wilks Theorem says: Gaussian distribution of estimates; one fluctuating parameter = one unit of LLRS. We model the LLRS as: quantum states = unitary degrees of freedom + classical simplexes.

Slide 48

Slide 48 text

Can we find a replacement for the Wilks theorem which respects boundaries? Quantum states = unitary degrees of freedom + classical simplexes. The LLRS depends on the rank of the true state: the unitary degrees of freedom contribute 2 units of LLRS per unit of rank.

Slide 49

Slide 49 text

Can we find a replacement for the Wilks theorem which respects boundaries? Quantum states = unitary degrees of freedom + classical simplexes. The LLRS also depends on spectral fluctuations, which requires Monte Carlo to simulate the effect of boundaries: $\langle\lambda(M_d, M_{d+1})\rangle = 2\,\mathrm{rank}(\rho_{\mathrm{true}}) + (\text{simplex result})$.

Slide 50

Slide 50 text

How well does this replacement work? Not as accurate as we expected… what is going on?

Slide 51

Slide 51 text

How does the LLRS behave when the truth and the estimates are close? Wilks and our model both rely on the existence of a second-order Taylor series expansion. Is a Taylor expansion a good predictor of the LLRS? Helpful trick: $\lambda(M_d, M_{d'}) = \lambda(\rho_{\mathrm{true}}, M_{d'}) - \lambda(\rho_{\mathrm{true}}, M_d)$, so expand the LLRS as a function of the true state to second order.

Slide 52

Slide 52 text

"Curvature error" addresses the question: "If the estimate is close to the truth, what does the LLRS do?" The Taylor series allows us to calculate an approximate expected value:
$\lambda(\rho_{\mathrm{true}}, M_d) = \lambda(\rho_{\mathrm{true}}, \hat{\rho}) \approx 0 + \left.\frac{\partial \lambda}{\partial \rho}\right|_{\hat{\rho}} (\rho_{\mathrm{true}} - \hat{\rho}) + \frac{1}{2}\left.\frac{\partial^2 \lambda}{\partial \rho^2}\right|_{\hat{\rho}} (\rho_{\mathrm{true}} - \hat{\rho})^2$
$\langle \lambda(\rho_{\mathrm{true}}, \hat{\rho}) \rangle \approx \mathrm{tr}\left( \left\langle \left.\frac{\partial^2 L}{\partial \rho^2}\right|_{\hat{\rho}} |\rho_{\mathrm{true}} - \hat{\rho}\rangle\rangle\langle\langle \rho_{\mathrm{true}} - \hat{\rho}| \right\rangle \right)$

Slide 53

Slide 53 text

In the absence of any boundaries, the expected value equals that of Wilks:
$\langle \lambda(\rho_0, \hat{\rho}) \rangle \approx \mathrm{tr}\left( \left\langle \left.\frac{\partial^2 L}{\partial \rho^2}\right|_{\hat{\rho}} |\rho_0 - \hat{\rho}\rangle\rangle\langle\langle \rho_0 - \hat{\rho}| \right\rangle \right) \approx \mathrm{tr}\left( \hat{I}(\hat{\rho})\,\mathrm{Cov}(\hat{\rho}) \right) \approx d^2 - 1$
No boundaries = no bias in the MLE. It does not predict the distribution, however.
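A classical sanity check of this boundary-free statement (again, not the quantum calculation): for a $k$-outcome multinomial with an interior true distribution, no constraints are active, so the Monte Carlo average of $\lambda(\text{truth}, \text{MLE})$ should approach the parameter count $k - 1$, the analogue of $d^2 - 1$. The true distribution and sample sizes below are arbitrary choices.

```python
# Boundary-free check: <lambda(truth, MLE)> -> number of free parameters.
import numpy as np

rng = np.random.default_rng(7)
p_true = np.array([0.4, 0.3, 0.2, 0.1])     # k = 4 outcomes, interior point
n, trials = 10_000, 5_000

llrs = np.empty(trials)
for t in range(trials):
    counts = rng.multinomial(n, p_true)
    p_hat = counts / n                                       # unconstrained MLE
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = np.where(counts > 0, counts * np.log(p_hat / p_true), 0.0)
    llrs[t] = 2 * terms.sum()                                # LLRS vs. the truth

print("mean LLRS:", llrs.mean(), " (k - 1 =", len(p_true) - 1, ")")
```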

Slide 54

Slide 54 text

Taylor series seems inaccurate… is something wrong? Most likely. Asymptotic limit? Error in code? We shall see!

Slide 55

Slide 55 text

What was the point of
 all that math?

Slide 56

Slide 56 text

We are going to have a way to do nested model selection in quantum tomography! Models have boundaries, so we cannot use Wilks (its prediction is too high). We devise a quantum replacement for Wilks: unitary degrees of freedom + simplex (still too high). We construct a Taylor series (which reduces to Wilks without boundaries) and use it to determine when our model will work.

Slide 57

Slide 57 text

What is next?

Slide 58

Slide 58 text

There are many ways forward: use these results to create an estimator of the expected value; make a quantum information criterion; a model selection rule for displaced/squeezed states; apply active subspace methods to speed up optimization; and what’s with compressed sensing and model selection?

Slide 59

Slide 59 text

A year ago, I thought
 model selection using
 the LLRS was easy…
 Today, I am certain it is
 vastly harder than I
 (and others!) thought.

Slide 60

Slide 60 text

Model selection in
 quantum state tomography
 is hard because we
 have to deal with boundaries.