

An Effective State Space Dimension for a Quantum System

A talk I gave at the ARC Centre of Excellence for Engineered Quantum Systems, at the University of Sydney.

Travis Scholten

November 17, 2016

Transcript

  1. An Effective State Space Dimension for a Quantum System

    Travis L Scholten, @Travis_Sch. Center for Quantum Information and Control, University of New Mexico, USA
  2. Originally, this talk was going to be about this… arXiv:1609.04385

    (Go check it out on GitHub!) "Behavior of the Maximum Likelihood in Quantum State Tomography," Travis L Scholten and Robin Blume-Kohout, Center for Computing Research (CCR), Sandia National Labs and University of New Mexico (dated September 22, 2016). Abstract: Quantum state tomography on a d-dimensional system demands resources that grow rapidly with d. Model selection can be used to tailor the number of fit parameters to the data, but quantum tomography violates some common assumptions that underlie canonical model selection techniques based on ratios of maximum likelihoods (loglikelihood ratio statistics), due to the nature of the state space boundaries. Here, we study the behavior of the maximum likelihood in different Hilbert space dimensions, and derive an expression for a complexity penalty (the expected value of the loglikelihood ratio statistic; roughly, the logarithm of the maximum likelihood) that can be used to make an appropriate choice for d. [Slide shows the first page of the paper.]
  3. …but instead, it’s more about this. Hi Travis! I just

    read your new paper on the arXiv, which I think is pretty cool. Nice work! It also might have some wider application than “just” model selection: a mathematician would call your expectation value of gamma the Gaussian width (or statistical dimension) of something I would call the tangential cone at rho_0… These Gaussian widths have quite some applications in other fields, see e.g. https://arxiv.org/abs/1303.6672 who use them to explain many results in Compressed Sensing and related fields.
  4. Our work originally focused on doing model selection in state

    tomography. (How would we handle tomography of large quantum systems?)
  5. There are many techniques we use to characterize quantum systems.

    Things we characterize: states, processes (gates), detectors (measurements), gate sets. How we characterize: maximum likelihood, least squares fitting, nuclear norm minimization, Bayesian inference, randomized benchmarking, tomography, direct fidelity estimation, robust phase estimation.
  6. There are many techniques we use to characterize quantum systems.

    Things we characterize: states, processes (gates), detectors (measurements), gate sets. How we characterize: maximum likelihood, least squares fitting, nuclear norm minimization, Bayesian inference, randomized benchmarking, tomography, direct fidelity estimation, robust phase estimation.
  7. Sets of density matrices are statistical models.

    Density matrices (the set of quantum states): $\{M \mid M \in \mathcal{B}(\mathcal{H}),\ \mathrm{Tr}(M) = 1,\ M \geq 0\}$. A fixed POVM $\{E_j\}$ plus the Born rule gives a probability distribution: $\mathrm{Pr}_j(\rho) = \mathrm{Tr}(E_j \rho)$.
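
A minimal numpy sketch of this Born-rule mapping (the state and POVM below are made-up examples, not from the talk):

```python
# A density matrix plus a fixed POVM defines a probability distribution
# via the Born rule, Pr_j(rho) = Tr(E_j rho).
import numpy as np

# A slightly mixed single-qubit state (made-up example).
rho = np.array([[0.8, 0.1], [0.1, 0.2]], dtype=complex)

# A 4-outcome qubit POVM built from projectors onto the Z and X bases,
# each weighted by 1/2 so the elements sum to the identity.
z0 = np.array([[1, 0], [0, 0]], dtype=complex)
z1 = np.array([[0, 0], [0, 1]], dtype=complex)
x0 = 0.5 * np.array([[1, 1], [1, 1]], dtype=complex)
x1 = 0.5 * np.array([[1, -1], [-1, 1]], dtype=complex)
povm = [0.5 * E for E in (z0, z1, x0, x1)]

probs = np.array([np.trace(E @ rho).real for E in povm])
print(probs, probs.sum())  # four outcome probabilities, summing to 1
```
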
  8. Quantum state tomography maps data to density matrices. Data sets

    (from experiments, e.g., POVM counts): $\{(E_j, n_j)\}$
  9. Quantum state tomography maps data to density matrices. Data sets

    (from experiments, e.g., POVM counts) $\{(E_j, n_j)\}$ map to an estimate $\hat\rho$ in the density matrices (set of quantum states) $\{M \mid M \in \mathcal{B}(\mathcal{H}),\ \mathrm{Tr}(M) = 1,\ M \geq 0\}$
  10. $\{(E_j, n_j)\}$ One family of models we considered was based

    on Hilbert space dimension. “Qubit model”: $\{M \mid M \in \mathcal{B}(\mathcal{H}_2),\ \mathrm{Tr}(M) = 1,\ M \geq 0\}$. “Qutrit model”: $\{M \mid M \in \mathcal{B}(\mathcal{H}_3),\ \mathrm{Tr}(M) = 1,\ M \geq 0\}$
  11. $\{(E_j, n_j)\}$ Different models = different estimates $\hat\rho$.

    “Qubit model”: $\hat\rho \in \{M \mid M \in \mathcal{B}(\mathcal{H}_2),\ \mathrm{Tr}(M) = 1,\ M \geq 0\}$. “Qutrit model”: $\hat\rho \in \{M \mid M \in \mathcal{B}(\mathcal{H}_3),\ \mathrm{Tr}(M) = 1,\ M \geq 0\}$. One family of models we considered was based on Hilbert space dimension.
  12. Model selection is necessary for large quantum systems, such as

    optical modes. Formally, infinite-dimensional! Truncation of the Hilbert space from physical motivation ($\langle n|\rho|n\rangle$, $n \le N$); smoothing of the Wigner function. G Breitenbach, Nature 387, 1997
  13. Model selection is necessary for large quantum systems, such as

    optical modes. Formally, infinite-dimensional! Truncation of the Hilbert space from physical motivation ($\langle n|\rho|n\rangle$, $n \le N$); smoothing of the Wigner function. How do we choose a good Hilbert space dimension for optical systems?? G Breitenbach, Nature 387, 1997
  14. $\{(E_j, n_j)\}$ The model selection techniques we considered use maximum

    likelihood estimation to map data to states: $\hat\rho_{\mathrm{MLE}} = \operatorname{argmax}_{\rho}\ \mathcal{L}(\rho) = \operatorname{argmax}_{\rho}\ \Pr(\{(E_j, n_j)\} \mid \rho)$. Z Hradil, 1998
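
One standard way to actually compute such an estimate from counts is the iterative "R rho R" fixed-point update often attributed to Hradil and co-workers. A minimal sketch follows; the POVM and counts are invented for illustration, and in general the iteration may need damping ("dilution") to guarantee convergence:

```python
# Minimal sketch of iterative maximum-likelihood tomography (the "R rho R"
# fixed-point update). POVM and counts below are made-up illustrations.
import numpy as np

def rrhor_mle(povm, counts, dim, iters=500):
    """Iterate rho <- R rho R / Tr(R rho R), with R = sum_j (n_j / p_j) E_j."""
    rho = np.eye(dim, dtype=complex) / dim          # start from the maximally mixed state
    counts = np.asarray(counts, dtype=float)
    for _ in range(iters):
        probs = np.array([np.trace(E @ rho).real for E in povm])
        R = sum((n / max(p, 1e-12)) * E for n, p, E in zip(counts, probs, povm))
        rho = R @ rho @ R
        rho = rho / np.trace(rho).real              # renormalize to trace 1
    return rho

# Example: a Z/X qubit POVM with some fake counts.
z0 = np.diag([1.0, 0.0]); z1 = np.diag([0.0, 1.0])
x0 = 0.5 * np.ones((2, 2)); x1 = np.array([[0.5, -0.5], [-0.5, 0.5]])
povm = [0.5 * E.astype(complex) for E in (z0, z1, x0, x1)]
counts = [430, 70, 320, 180]
print(np.round(rrhor_mle(povm, counts, dim=2), 3))
```
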
  15. $\{(E_j, n_j)\}$ For each model, we can compute its maximum

    likelihood estimate: $\hat\rho_{\mathrm{MLE},d} = \operatorname{argmax}_{\rho \in \mathcal{D}(\mathcal{H}_d)}\ \mathcal{L}(\rho)$. “Qubit model”: $\hat\rho_{\mathrm{MLE},2}$; “Qutrit model”: $\hat\rho_{\mathrm{MLE},3}$; “d=4 model”: $\hat\rho_{\mathrm{MLE},4}$. There’s just one problem, though…
  16. $\{(E_j, n_j)\}$ $\hat\rho_{\mathrm{MLE},2}$, $\hat\rho_{\mathrm{MLE},3}$, $\hat\rho_{\mathrm{MLE},4}$, … Our models

    were nested, meaning that picking the model with the highest likelihood was a bad idea! $\mathcal{L}(\hat\rho_{\mathrm{MLE},2}) \le \mathcal{L}(\hat\rho_{\mathrm{MLE},3}) \le \mathcal{L}(\hat\rho_{\mathrm{MLE},4}) \le \cdots$ So $d = \infty$?
  17. If we knew the average increase in likelihood that occurs even when a larger

    model is not actually better, we would know when to stop.
  18. (How do we know that??) If we knew the average increase in

    likelihood that occurs even when a larger model is not actually better, we would know when to stop.
  19. $\{(E_j, n_j)\}$ One frequentist model selection technique uses the loglikelihood

    ratio statistic: $\lambda(M_1, M_2) = -2\log\!\left(\frac{\mathcal{L}(\hat\rho_{M_1})}{\mathcal{L}(\hat\rho_{M_2})}\right)$, where $\hat\rho_{M_j} = \operatorname{argmax}_{\rho \in M_j}\ \mathcal{L}(\rho)$
  20. $\hat\rho_{M_j} \sim N(\rho_0, \epsilon^2 I_{d^2})$ (simplifying assumption), so $\mathcal{L}(\rho)$

    $\propto \exp\!\left(-\mathrm{Tr}[(\rho - \hat\rho_{M_j})^2]/2\epsilon^2\right)$. Under the assumptions of local asymptotic normality, we can say useful things about this statistic.
  21. $\mathcal{L}(\rho) \propto \exp\!\left(-\mathrm{Tr}[(\rho - \hat\rho_{M_j})^2]/2\epsilon^2\right)$, with $\hat\rho_{M_j} \sim N(\rho_0, \epsilon^2 I_{d^2})$.

    Under the assumptions of local asymptotic normality, we can say useful things about this statistic: $\lambda(M_1, M_2) \sim \chi^2_K$, so $\langle\lambda\rangle = K$ and $\Delta\lambda = \sqrt{2K}$. S Wilks, 1938
  22. $\mathcal{L}(\rho) \propto \exp\!\left(-\mathrm{Tr}[(\rho - \hat\rho_{M_j})^2]/2\epsilon^2\right)$, with $\hat\rho_{M_j} \sim N(\rho_0, \epsilon^2 I_{d^2})$.

    Under the assumptions of local asymptotic normality, we can say useful things about this statistic: $\lambda(M_1, M_2) \sim \chi^2_K$, so $\langle\lambda\rangle = K$ and $\Delta\lambda = \sqrt{2K}$. S Wilks, 1938. Not true in quantum, for the models we were considering!
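
Before seeing how the quantum case breaks this, here is a minimal Monte Carlo sketch of the classical Wilks picture for two nested flat (unconstrained) Gaussian models, where lambda really is chi-squared with K degrees of freedom; all numbers are made-up illustrations:

```python
# With a Gaussian likelihood L(theta) ∝ exp(-||theta - theta_hat||^2 / 2 eps^2)
# and nested *flat* models, lambda = -2 log[L(small)/L(big)] ~ chi^2_K,
# where K is the difference in dimension.
import numpy as np

rng = np.random.default_rng(0)
eps, p, K, trials = 0.1, 3, 4, 20000

# Truth lies in the smaller model: its last K coordinates are exactly zero.
theta0 = np.concatenate([rng.normal(size=p), np.zeros(K)])

lam = np.empty(trials)
for t in range(trials):
    theta_hat = theta0 + eps * rng.normal(size=p + K)
    # MLE in the small model = projection onto the first p coordinates;
    # MLE in the big model = theta_hat itself.  Then
    # lambda = -2 log[L(small)/L(big)] = ||last K components||^2 / eps^2.
    lam[t] = np.sum(theta_hat[p:] ** 2) / eps**2

print(lam.mean(), lam.std())   # close to K and sqrt(2K), as Wilks predicts
```
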
  23. Local asymptotic normality breaks down because the truth is always

    on some boundary. Higher-dimensional models contain lower-dimensional ones… as boundaries! $\rho_0 = I_2/2 = \begin{pmatrix} .5 & 0 \\ 0 & .5 \end{pmatrix} = \begin{pmatrix} .5 & 0 & 0 \\ 0 & .5 & 0 \\ 0 & 0 & 0 \end{pmatrix} = \cdots$ The Gaussian distribution of estimates no longer applies!
  24. We studied how imposing positivity breaks local asymptotic normality.

    Case 1: full-rank true state $\rho_0$. With $\hat\rho \sim N(\rho_0, \epsilon^2 I_{d^2})$ and $\mathcal{L}(\rho) \propto \exp\!\left(-\mathrm{Tr}[(\rho - \hat\rho)^2]/2\epsilon^2\right)$, we have $\hat\rho \geq 0$ with high probability, so $\hat\rho_{\mathrm{MLE}} = \hat\rho$.
  25. We studied how imposing positivity breaks local asymptotic normality.

    Case 2: rank-deficient true state. With $\hat\rho \sim N(\rho_0, \epsilon^2 I_{d^2})$ and $\mathcal{L}(\rho) \propto \exp\!\left(-\mathrm{Tr}[(\rho - \hat\rho)^2]/2\epsilon^2\right)$, now $\hat\rho_{\mathrm{MLE}} = \hat\rho$ if $\hat\rho \geq 0$, but $\hat\rho_{\mathrm{MLE}} = \operatorname{argmin}_{\rho \geq 0} \mathrm{Tr}[(\rho - \hat\rho)^2]$ if $\hat\rho \not\geq 0$.
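
A minimal sketch of that truncation: under the Gaussian-likelihood assumption, the constrained MLE is the Hilbert-Schmidt projection of rho_hat onto the density matrices, which reduces to projecting the eigenvalues of rho_hat onto the probability simplex. The dimension, rank, and noise level below are illustrative:

```python
# When the unconstrained Gaussian estimate rho_hat fails to be positive,
# the MLE is the closest density matrix in Hilbert-Schmidt distance. Because
# the state set is unitarily invariant, the projection reduces to projecting
# the eigenvalues of rho_hat onto the probability simplex.
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of a real vector onto {p : p >= 0, sum p = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    idx = np.nonzero(u - css / k > 0)[0][-1]
    return np.maximum(v - css[idx] / (idx + 1.0), 0.0)

def project_to_state(rho_hat):
    """argmin over density matrices of Tr[(rho - rho_hat)^2]."""
    evals, evecs = np.linalg.eigh((rho_hat + rho_hat.conj().T) / 2)
    return (evecs * project_to_simplex(evals)) @ evecs.conj().T

# Example: a rank-1 truth plus Gaussian noise usually leaves rho_hat indefinite.
rng = np.random.default_rng(1)
d, eps = 5, 0.05
rho0 = np.zeros((d, d), dtype=complex); rho0[0, 0] = 1.0
G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho_hat = rho0 + eps * (G + G.conj().T) / 2       # Hermitian Gaussian perturbation
rho_mle = rho_hat if np.all(np.linalg.eigvalsh(rho_hat) >= 0) else project_to_state(rho_hat)
print(np.linalg.eigvalsh(rho_mle).round(3))       # nonnegative, summing to 1
```
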
  26. What’s the expected value of the loglikelihood ratio statistic?

    $\langle\lambda\rangle = \langle\mathrm{Tr}[(\hat\rho_{\mathrm{MLE}} - \rho_0)^2]\rangle / 2\epsilon^2$ (Computing the exact distribution is hard)
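
Since the exact distribution is hard, one can at least estimate the expectation by Monte Carlo directly from this definition. A minimal sketch under the same Gaussian approximation, with the projection helpers from the previous sketch repeated so the snippet runs on its own (all sizes are illustrative):

```python
# Monte Carlo estimate of <lambda> = <Tr[(rho_MLE - rho0)^2]> / (2 eps^2)
# under the Gaussian approximation used on the previous slides.
import numpy as np

def project_to_simplex(v):
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    idx = np.nonzero(u - css / k > 0)[0][-1]
    return np.maximum(v - css[idx] / (idx + 1.0), 0.0)

def project_to_state(rho_hat):
    evals, evecs = np.linalg.eigh(rho_hat)
    return (evecs * project_to_simplex(evals)) @ evecs.conj().T

def mean_lambda(rho0, eps, trials=2000, seed=0):
    rng = np.random.default_rng(seed)
    d = rho0.shape[0]
    total = 0.0
    for _ in range(trials):
        G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
        rho_hat = rho0 + eps * (G + G.conj().T) / 2   # Hermitian Gaussian perturbation
        rho_mle = project_to_state(rho_hat)
        total += np.trace((rho_mle - rho0) @ (rho_mle - rho0)).real
    return total / (trials * 2 * eps**2)

d = 5
rho0 = np.zeros((d, d), dtype=complex); rho0[0, 0] = 1.0   # rank-1 truth
print(mean_lambda(rho0, eps=0.01))    # Monte Carlo estimate of <lambda>
```
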
  27. What’s the expected value of the loglikelihood ratio statistic?

    $\langle\lambda\rangle = \langle\mathrm{Tr}[(\hat\rho_{\mathrm{MLE}} - \rho_0)^2]\rangle / 2\epsilon^2$. The paper derives the approximation $\langle\lambda(\rho_0, M_d)\rangle \approx 2rd - r^2 + rq^2 + \frac{N(N + q^2)}{\pi}\left(\frac{\pi}{2} - \sin^{-1}\!\frac{q}{2\sqrt{N}}\right) - \frac{q(q^2 + 26N)}{24\pi}\sqrt{4N - q^2}$ (Eq. 10), where $N = d - r$, $r = \mathrm{Rank}(\rho_0)$, and $q \approx 2\sqrt{N} - \frac{(240 r \pi)^{2/5}}{4} N^{1/10} + \frac{(240 r \pi)^{4/5}}{80} N^{-3/10}$ (Eq. 8). [Slide shows these equations alongside excerpts of the paper; the expression is checked against MLEs computed numerically over simulated heterodyne datasets for each dimension.]
  28. Hi Travis! I just read your new paper on the

    arXiv, which I think is pretty cool. Nice work! It also might have some wider application than “just” model selection: a mathematician would call your expectation value of gamma the Gaussian width (or statistical dimension) of something I would call the tangential cone at rho_0… These Gaussian widths have quite some applications in other fields, see e.g. https://arxiv.org/abs/1303.6672 who use them to explain many results in Compressed Sensing and related fields. What’s the expected value of the loglikelihood ratio statistic? $\langle\lambda\rangle = \langle\mathrm{Tr}[(\hat\rho_{\mathrm{MLE}} - \rho_0)^2]\rangle / 2\epsilon^2$. It’s the statistical dimension of the quantum state space in the neighborhood of the true state??!
  29. If we zoom in on classical simplices, or the quantum

    state space, we find cones. 2-simplex (aka, 3-sided die). A cone: $C = \{x \mid \tau x \in C\ \ \forall\, \tau \geq 0\}$
  30. If we zoom in on classical simplices, or the quantum

    state space, we find cones. Depending on where $\rho_0$ sits, the local cone ranges from a half-space to the full space: $C = \{\Delta \mid \exists\, r > 0 \text{ s.t. } (\rho_0 + r\Delta) \geq 0\}$
  31. Let’s take a cone, put a

    Gaussian distribution of vectors on top of it… $g \sim N(0, I_d)$; in the quantum case, $\hat\rho \sim N(\rho_0, \epsilon^2 I_{d^2})$
  32. …and truncate the distribution back to the cone:

    $\Pi_C(g) \equiv \operatorname{argmin}_{x \in C} \|x - g\|_2$, and for states $\Pi_C(\hat\rho) \equiv \operatorname{argmin}_{\rho \geq 0} \mathrm{Tr}[(\rho - \hat\rho)^2]$. Properties of truncated vectors tell us about the local geometry of the cone.
  33. Let’s build intuition about the truncation process. Where do these

    vectors get truncated to? $\Pi_C(g) \equiv \operatorname{argmin}_{x \in C} \|x - g\|_2$
  34. Some move to a vertex, others to an edge, and some don’t

    change! Let’s build intuition about the truncation process. $\langle \|\Pi_C(g)\|_2^2 \rangle = k$ on a $k$-facet
  35. We will want to know the probability the truncated vector

    ends up on some face of the cone: $v_k \equiv \Pr(\Pi_C(g) \text{ is on a } k\text{-dimensional face})$, the “intrinsic volumes” of the cone
  36. We will want to know the probability the truncated vector

    ends up on some face of the cone: $v_k \equiv \Pr(\Pi_C(g) \text{ is on a } k\text{-dimensional face})$, the “intrinsic volumes” of the cone. [Slide labels $v_0$, $v_1$, $v_2$ on a planar cone.]
  37. Depending on the geometry of the cone, the intrinsic volumes

    change. $v_k \equiv \Pr(\Pi_C(g) \text{ on a } k\text{-dimensional face})$. [Plot: $v_0$, $v_1$, $v_2$ for a planar cone of half-angle $\alpha$ (opening angle $2\alpha$), as $\alpha$ runs from $0$ to $\pi/2$.]
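
A minimal Monte Carlo sketch of these intrinsic volumes for a planar cone of half-angle alpha (sample sizes are arbitrary):

```python
# Sample g ~ N(0, I_2), project onto a planar cone of half-angle alpha, and
# record whether the projection lands on the apex (k=0), an edge (k=1),
# or in the interior (k=2). The fractions estimate the intrinsic volumes v_k.
import numpy as np

def intrinsic_volumes(alpha, trials=200_000, seed=0):
    rng = np.random.default_rng(seed)
    g = rng.normal(size=(trials, 2))
    theta = np.abs(np.arctan2(g[:, 1], g[:, 0]))   # angle from the cone's axis
    v2 = np.mean(theta <= alpha)                                  # already inside the cone
    v1 = np.mean((theta > alpha) & (theta < alpha + np.pi / 2))   # projects onto an edge
    v0 = np.mean(theta >= alpha + np.pi / 2)                      # pulled back to the apex
    return v0, v1, v2

for alpha in (np.pi / 8, np.pi / 4, 3 * np.pi / 8):
    print(round(alpha, 3), [round(v, 3) for v in intrinsic_volumes(alpha)])
```
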
  38. The statistical dimension is the expected value of the distribution

    of intrinsic volumes. What’s the average dimension of the truncated vector? $\delta(C) \equiv \sum_k k \cdot v_k = \langle k \rangle$, with $\tfrac{1}{2} \leq \delta(C) \leq d - \tfrac{1}{2}$
  39. Depending on the geometry of the cone, the statistical dimension changes.

    $\delta(C) \equiv \sum_k k \cdot v_k = \langle k \rangle$. [Plot: $\delta(C)$ vs. half-angle $\alpha$ from $0$ to $\pi/2$ for a needle, an ice-cream cone, and a half-space.]
  40. The statistical dimension is equal to the average squared norm of

    the truncated vector: $\delta(C) \equiv \sum_k k \cdot v_k = \langle \|\Pi_C(g)\|^2 \rangle$
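
A minimal sanity check of this identity for a cone whose statistical dimension is known exactly, the nonnegative orthant in R^d (its projection just clips negative coordinates, and delta = d/2):

```python
# delta(C) = <||Pi_C(g)||^2> for the nonnegative orthant in R^d.
import numpy as np

rng = np.random.default_rng(0)
d, trials = 10, 200_000
g = rng.normal(size=(trials, d))
proj = np.clip(g, 0.0, None)                 # Pi_C(g) for the orthant: clip negatives
delta_mc = np.mean(np.sum(proj**2, axis=1))
print(delta_mc, d / 2)                       # Monte Carlo estimate vs. exact d/2
```
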
  41. For the quantum state space, the statistical dimension is the

    (scaled) average squared Hilbert-Schmidt distance! Center at $\rho_0$ and rescale: $\delta(C) = \langle \|\Pi_C(\hat\rho) - \rho_0\|_2^2 \rangle / \epsilon^2$
  42. For the quantum state space, the statistical dimension is the

    (scaled) average squared Hilbert-Schmidt distance! $\delta(C) = \langle \|\Pi_C(\hat\rho) - \rho_0\|_2^2 \rangle / \epsilon^2 = \langle \mathrm{Tr}[(\Pi_C(\hat\rho) - \rho_0)^2] \rangle / \epsilon^2$
  43. Under local asymptotic normality, we can further simplify our expression…

    $\delta(C) = \langle \|\Pi_C(\hat\rho) - \rho_0\|_2^2 \rangle / \epsilon^2 = \langle \mathrm{Tr}[(\Pi_C(\hat\rho) - \rho_0)^2] \rangle / \epsilon^2 = \langle \mathrm{Tr}[(\hat\rho_{\mathrm{MLE}} - \rho_0)^2] \rangle / \epsilon^2$
  44. $\delta(C) = \langle \|\Pi_C(\hat\rho) - \rho_0\|_2^2 \rangle / \epsilon^2 = \langle \mathrm{Tr}[(\Pi_C(\hat\rho) - \rho_0)^2] \rangle / \epsilon^2$

    $= \langle \mathrm{Tr}[(\hat\rho_{\mathrm{MLE}} - \rho_0)^2] \rangle / \epsilon^2 = 2\langle \log \mathcal{L}(\hat\rho_{\mathrm{MLE}}) \rangle$ …and discover something we already saw: $\langle\lambda\rangle$!
  45. Classically, the statistical dimension tells us what kind of real

    vector space a cone “looks like”. Related to stability of solutions to convex optimization problems. If $C$ is an $L$-dimensional subspace of $\mathbb{R}^d$, then $\delta(C) = L$.
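
A minimal numerical check of the subspace case (the subspace here is a random L-dimensional one, chosen only for illustration):

```python
# If C is an L-dimensional subspace of R^d, the projection is linear and
# <||Pi_C(g)||^2> = L, since the projected coordinates are L standard normals.
import numpy as np

rng = np.random.default_rng(0)
d, L, trials = 12, 5, 100_000
Q, _ = np.linalg.qr(rng.normal(size=(d, L)))   # orthonormal basis of a random L-dim subspace
g = rng.normal(size=(trials, d))
proj = (g @ Q) @ Q.T                           # orthogonal projection onto the subspace
print(np.mean(np.sum(proj**2, axis=1)), L)     # Monte Carlo estimate vs. exact L
```
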
  46. Does the statistical dimension of the quantum cone offer a

    similar interpretation? Yes, and no.
  47. Yes, because it lies between the bounds we would expect.

    [Plot: $\delta_d(C)$ vs. Hilbert space dimension $d$ from 2 to 20, lying between the curves $2(d-1)$ and $d^2 - 1$.]
  48. Yes, because it lies between the bounds we would expect.

    (Theory breaks down when the rank of the truth is comparable to the dimension.) [Plot: $\delta_d(C)$ vs. $d$ from 2 to 20, between $2(d-1)$ and $d^2 - 1$, for $\mathrm{Rank}(\rho_0) = 1$ through $10$.]
  49. $\delta(C \mid \text{on surface}) = ?$ No, because it

    is geometrical in nature, not topological.
  50. The geometrical nature of the statistical dimension makes it hard

    to connect it to ranks of estimates. For the classical simplex, $\delta(C) \equiv \sum_k k \cdot v_k = \langle \mathrm{Rank}(\Pi_C(g)) \rangle - 1$, i.e., $k \to r - 1$.
  51. The geometrical nature of the statistical dimension makes it hard

    to connect it to ranks of estimates. For the quantum state space, $\delta(C) \equiv \sum_k k \cdot v_k \neq \langle \mathrm{Rank}(\hat\rho_{\mathrm{MLE}}) \rangle - 1$; that is, $k \not\to r - 1$.
  52. The geometrical nature of the statistical dimension makes it hard

    to connect it to ranks of estimates. $\delta(C) \equiv \sum_k k \cdot v_k \neq \langle \mathrm{Rank}(\hat\rho_{\mathrm{MLE}}) \rangle - 1$; $k \not\to r - 1$, and $k \not\to r(2d - r) - 1$ (the dimension of the manifold of rank-$r$ states in $d$ dimensions).
  53. Numerical results indicate there’s interesting behavior to be explained…

    [Plot: average rank of maximum likelihood estimates, $\langle \mathrm{Rank}(\hat\rho_{\mathrm{MLE}}) \rangle$, vs. Hilbert space dimension $d$ from 2 to 16, for $\mathrm{Rank}(\rho_0) = 1$ through $9$.]
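
A minimal sketch of how such rank statistics can be generated, using the same Gaussian-plus-projection model as before with a rank-1 truth (the noise level, rank threshold, and trial counts are made-up):

```python
# Average rank of the projected MLE as a function of Hilbert space dimension d.
import numpy as np

def project_to_simplex(v):
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    idx = np.nonzero(u - css / k > 0)[0][-1]
    return np.maximum(v - css[idx] / (idx + 1.0), 0.0)

def average_rank(d, eps=0.02, trials=500, tol=1e-10, seed=0):
    rng = np.random.default_rng(seed)
    rho0 = np.zeros((d, d), dtype=complex); rho0[0, 0] = 1.0   # rank-1 truth
    ranks = []
    for _ in range(trials):
        G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
        rho_hat = rho0 + eps * (G + G.conj().T) / 2
        evals = np.linalg.eigvalsh(rho_hat)
        ranks.append(np.count_nonzero(project_to_simplex(evals) > tol))
    return np.mean(ranks)

for d in (2, 4, 8, 16):
    print(d, average_rank(d))
```
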
  54. …and maybe some concentration results?

    [Plot: standard deviation of the distribution of $\mathrm{Rank}(\hat\rho_{\mathrm{MLE}})$ vs. $d$ from 2 to 16, for $\mathrm{Rank}(\rho_0) = 1$ through $9$.] Concentration of the statistical dimension -> concentration of ranks?
  55. Studying model selection in state tomography has turned

    into studying the geometry of quantum states. [Slide shows $\langle\lambda\rangle$.]
  56. Studying model selection in state tomography has

    turned into studying the geometry of quantum states. [Slide shows $\langle\lambda\rangle$ and the cone at $\rho_0$.]
  57. Studying model selection in state tomography has turned into

    studying the geometry of quantum states. [Slide shows $\langle\lambda\rangle$, the cone at $\rho_0$, and the plot of $\delta_d(C)$ vs. $d$ between $2(d-1)$ and $d^2-1$ for $\mathrm{Rank}(\rho_0) = 1$ through $10$.]
  58. [Plot: $\delta_d(C)$ vs. Hilbert space dimension $d$ from 2 to 20, between $2(d-1)$ and $d^2-1$, for $\mathrm{Rank}(\rho_0) = 1$ through $10$.]
  59. Unfortunately, this result depends on knowing the rank of the

    true state! [Plot: $\delta_d(C)$ vs. $d$ from 2 to 20, between $2(d-1)$ and $d^2-1$, for $\mathrm{Rank}(\rho_0) = 1$ through $10$.]
  60. Classically, the statistical dimension tells us how many measurements are

    necessary for stable recovery. Consider convex optimization for a linear system $y = \Phi x_0 + \epsilon$, with $x_0 \in \mathbb{R}^d$ and $\Phi$ an $m \times d$ matrix: $\hat{x} = \operatorname{argmin}_x f(x) \text{ s.t. } \|\Phi x - y\| \leq \eta$
  61. Classically, the statistical dimension tells us how many measurements are

    necessary for stable recovery. Suppose the rows are Gaussian, $\Phi_{j\cdot} = (\Phi_{j,1}, \Phi_{j,2}, \cdots) \sim N(0, I_d)$. If $m \gtrsim \delta(D_f) + A\sqrt{\delta(D_f)}$ (with $D_f$ the descent cone of $f$ at $x_0$), then the error in the estimate is bounded. Joel Tropp, arXiv:1405.1102
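
To make the recovery statement concrete, here is a minimal sketch of the standard sparse-recovery instance $f(x) = \|x\|_1$, assuming the cvxpy package is available (all sizes, sparsity levels, and noise scales are made-up illustrations):

```python
# Minimize f(x) = ||x||_1 subject to ||Phi x - y|| <= eta, with Gaussian rows.
# With enough measurements m (roughly the statistical dimension of the descent
# cone of f at x0, plus a margin), the sparse signal is recovered stably.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
d, s, m, sigma = 100, 5, 60, 0.01

x0 = np.zeros(d); x0[rng.choice(d, s, replace=False)] = rng.normal(size=s)  # s-sparse truth
Phi = rng.normal(size=(m, d))                     # Gaussian measurement matrix
y = Phi @ x0 + sigma * rng.normal(size=m)
eta = sigma * np.sqrt(m) * 1.2                    # crude bound on the noise norm

x = cp.Variable(d)
prob = cp.Problem(cp.Minimize(cp.norm(x, 1)), [cp.norm(Phi @ x - y, 2) <= eta])
prob.solve()
print(np.linalg.norm(x.value - x0))               # small when m is large enough
```
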