Kernel Approximation for Offline Quantum-Enhanced Machine Learning

A talk I gave at the Quantum Artificial Intelligence (QAI) workshop, co-located with IEEE Quantum Week 2021 (QCE21).

Abstract:
Classical machine learning algorithms can be enhanced by access to quantum computers. One such enhancement, based on quantum kernel matrices, defines novel similarity measures between pieces of data in terms of the transition probability of a parameterized quantum circuit (PQC). Using quantum kernels in practice runs into the problem that extending a kernel matrix to accommodate new data incurs a data-transfer cost linear in the number of original data points. Although efficient from a complexity standpoint, in practice this data transfer introduces inefficiencies into the overall quantum-enhanced machine learning workflow. This work shows (a) that given access to kernel values involving a sample of the old data (along with the new), classical matrix completion algorithms can reconstruct the unsampled quantum kernel matrix entries with zero error (in a zero-noise regime), provided the sampling exceeds a quantity that depends on the rank of the extended kernel matrix; (b) that in the presence of shot noise, the reconstruction error degrades gracefully; and (c) that the rank of quantum kernel matrices can be predicted given information about the PQC.

Travis Scholten

October 20, 2021

Transcript

  1. Kernel Matrix Completion for Offline Quantum-Enhanced Machine Learning. Travis L. Scholten (@Travis_Sch), Quantum Applications Architect, IBM Quantum, T.J. Watson Research Center, USA. 2021/10/20. In collaboration with: Annie Naveh, Imogen Fitzgerald, & Andrew Lockwood (Woodside Energy) and Anna Phan (IBM Quantum Melbourne).
  2. Quantum-enhanced ML: a near-term application of QML? [Figure: 2×2 grid of Data (classical/quantum) vs. Algorithm (classical/quantum), with quadrants "quantum-inspired", "quantum-enhanced", "quantum learning", and "ML for physics"; example techniques include kernels/support vector machines, neural networks, quantum linear algebra, clustering, and data analysis. Based on a graphic by Maria Schuld, own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=55676381]
  3. What are quantum kernels? A quantum kernel is a similarity measure between two pieces of data $x_j, x_k$, evaluated using a parameterized quantum circuit $U = U(\theta)$: with $|\psi_j\rangle = U(x_j)|0\rangle$,
$$K_{jk} = |\langle \psi_k | \psi_j \rangle|^2 = |\langle 0 | U^\dagger(x_k) U(x_j) | 0 \rangle|^2,$$
i.e., the probability $\mathrm{Pr}(0)$ of measuring $|0\rangle$ after running $U^\dagger(x_k) U(x_j)$ on $|0\rangle$. Example: a classifier $y(x) = \sum_k w_k \, y(x_k) \, K(x, x_k)$. References: A Rigorous and Robust Quantum Speed-up in Supervised Machine Learning (arXiv:2010.02174); Covariant Quantum Kernels for Data with Group Structure (arXiv:2105.03406); Supervised Quantum Machine Learning Models are Kernel Methods (arXiv:2101.11020); Power of Data in Quantum Machine Learning (arXiv:2011.01938).
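A minimal statevector sketch of a single kernel entry, in plain numpy. It assumes the feature-map unitaries are available as dense matrices; on hardware one would instead run the circuit $U^\dagger(x_k)U(x_j)$ and estimate $\mathrm{Pr}(0)$ from measurement shots.

```python
import numpy as np

def quantum_kernel_entry(U_xj: np.ndarray, U_xk: np.ndarray) -> float:
    """K_jk = |<0| U^dag(x_k) U(x_j) |0>|^2 via statevector simulation.

    U_xj, U_xk: the feature-map unitaries U(x_j), U(x_k) as dense matrices.
    """
    dim = U_xj.shape[0]
    zero = np.zeros(dim, dtype=complex)
    zero[0] = 1.0                        # |0...0>
    psi_j = U_xj @ zero                  # |psi_j> = U(x_j)|0>
    psi_k = U_xk @ zero                  # |psi_k> = U(x_k)|0>
    return float(np.abs(np.vdot(psi_k, psi_j)) ** 2)
```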
  4. North West Shelf LNG Facility, Western Australia. [Image: https://files.woodside/images/default-source/v2-media-cards/woodside2020---kgp-image-by-jarrad-seng--107-web.jpeg?sfvrsn=3c07be36_4]
  5. Streaming data: a unique challenge for quantum kernels. Given a data set $\{x_1, x_2, \ldots, x_N\}$, the kernel matrix is
$$K = \begin{pmatrix} K_{11} & \cdots & K_{1N} \\ \vdots & \ddots & \vdots \\ K_{N1} & \cdots & K_{NN} \end{pmatrix}.$$
But! New data appears: $\{x_{N+1}, x_{N+2}, \ldots, x_{N+m}\}$. The kernel matrix needs to be extended to include the new data:
$$K = \begin{pmatrix}
K_{11} & \cdots & K_{1N} & K_{1,N+1} & \cdots & K_{1,N+m} \\
\vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\
K_{N1} & \cdots & K_{NN} & K_{N,N+1} & \cdots & K_{N,N+m} \\
K_{N+1,1} & \cdots & K_{N+1,N} & K_{N+1,N+1} & \cdots & K_{N+1,N+m} \\
\vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\
K_{N+m,1} & \cdots & K_{N+m,N} & K_{N+m,N+1} & \cdots & K_{N+m,N+m}
\end{pmatrix}.$$
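To make the cost of this extension concrete, here is a small counting sketch (the function name is mine, not from the talk). By symmetry, only the old-vs-new cross block and the upper triangle of the new-vs-new block must be evaluated, and every cross entry requires an old data point to roundtrip to the quantum computer; this is the linear data-transfer cost the abstract refers to.

```python
# Kernel entries that must be evaluated when m new points arrive,
# using symmetry (K_kj = K_jk) and K_jj = 1 for pure-state kernels.
def entries_to_compute(N: int, m: int) -> list[tuple[int, int]]:
    cross = [(j, k) for j in range(N) for k in range(N, N + m)]       # old vs new
    fresh = [(j, k) for j in range(N, N + m) for k in range(j + 1, N + m)]
    return cross + fresh

N, m = 450, 50
print(len(entries_to_compute(N, m)))   # N*m + m*(m-1)//2 = 23725 circuit runs
```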
  6. Extending quantum kernel matrices: necessary for processing streaming data with quantum kernels. New data generation means the matrix must be extended. Now, the "all data" approach: all old and new data must roundtrip; quantum computer calls slow the workflow; exact results for $K$. Future, the "some data" approach: some old and all new data roundtrips; matrix completion fills in the rest; approximate results, i.e., an estimate $\hat{K}$ of $K$. What's the minimal amount of data transfer needed?
  7. Extending quantum kernel matrices as a matrix completion problem. Desiderata: deterministic (non-random) sampling of the entries to be computed; "offline" with respect to the quantum computer (no adaptive, back-and-forth approaches); respects the natural block-diagonal structure (old + new data). Graph-theory-based completion algorithms satisfy these!
  8. Extending quantum kernel matrices as a chordal-graph-based matrix completion problem. The incomplete matrix is represented by a graph: an edge between nodes $j$ and $k$ exists iff $K_{jk}$ is known. If the graph is chordal, then $K$ can be completed using graph-theory-inspired algorithms. (A chordal graph is one in which every cycle of length greater than three has a chord.)
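The chordality condition is easy to check numerically. A sketch using networkx, where the block mask mimics the old-plus-new pattern from the previous slides (the mask layout and sizes are illustrative):

```python
import networkx as nx
import numpy as np

def sparsity_graph(mask: np.ndarray) -> nx.Graph:
    """Graph with an edge (j, k) whenever the entry K_jk has been computed."""
    n = mask.shape[0]
    G = nx.Graph()
    G.add_nodes_from(range(n))
    G.add_edges_from((j, k)
                     for j in range(n)
                     for k in range(j + 1, n) if mask[j, k])
    return G

# Two overlapping diagonal blocks of known entries (old data + new batch).
mask = np.zeros((8, 8), dtype=bool)
mask[:5, :5] = True    # old 5x5 kernel matrix
mask[3:, 3:] = True    # new batch, overlapping the old block on 2 points
print(nx.is_chordal(sparsity_graph(mask)))   # True: completion algorithms apply
```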
  9. Extending quantum kernel matrices via chordal-graph-based matrix completion. (Chordal Graphs and Semidefinite Optimization, Vandenberghe & Andersen, Chapter 10.) Completing $K$ means solving
$$\text{maximise } \log\det(\hat{K}) \quad \text{s.t.} \quad \hat{K}_{lm} = K_{lm} \;\; \forall\, (l, m) \in S, \quad \hat{K} \succeq 0,$$
where $S$ is the set of indices of the computed entries. The matrix completion follows a "perfect elimination ordering" (a walk) on the sparsity graph; in each step, an update to the completion can be computed using the already-computed entries. Output: the Cholesky decomposition of the inverse of the completion.
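The talk uses the specialized elimination-ordering algorithm from Vandenberghe & Andersen; as a reference point, the same maximum-determinant completion can also be written directly as a convex program, e.g. with cvxpy. This is a sketch under that assumption, not the implementation used in the talk, and it is far less efficient than the chordal algorithm.

```python
import cvxpy as cp
import numpy as np

def maxdet_completion(K_partial: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Maximise log det(K_hat) subject to K_hat matching the computed
    entries (mask == True) and K_hat being positive semidefinite."""
    n = K_partial.shape[0]
    K_hat = cp.Variable((n, n), PSD=True)   # PSD=True also enforces symmetry
    constraints = [K_hat[j, k] == K_partial[j, k]
                   for j in range(n) for k in range(n) if mask[j, k]]
    cp.Problem(cp.Maximize(cp.log_det(K_hat)), constraints).solve()
    return K_hat.value
```

On small instances this gives a reference completion $\hat{K}$ whose quality can be scored with the relative error $\|\hat{K} - K\|/\|K\|$ used on the numerical-results slides; the chordal algorithm computes the same maximum-determinant completion entry-by-entry along a perfect elimination ordering.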
  10. Chordal graph completion algorithms are well-suited for extending quantum kernel matrices. As new data is acquired, the kernel matrix is extended by sending only a "batch" of data to the quantum computer, with a controllable amount of old data set by the overlap. No overlap: no old data transferred. Some overlap: some old data transferred. Block-diagonal patterns induce chordal graphs!
  11. The overlap between the blocks dictates the sampling complexity. A larger overlap means more redundancy, i.e., more old data that needs to be sent back. [Plot: data transferred as the overlap increases.] Once the overlap exceeds the rank of the extended matrix, perfect completion is possible, in principle.
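A sketch of how the overlap parameter trades off data transfer (the function name and layout are mine): the old block is already known, the freshly computed block covers the last `overlap` old points plus all $m$ new ones, so the number of old points that roundtrip equals the overlap.

```python
import numpy as np

def known_entry_mask(N: int, m: int, overlap: int) -> np.ndarray:
    """Known entries when the last `overlap` old points are re-sent
    to the quantum computer together with the m new points."""
    mask = np.zeros((N + m, N + m), dtype=bool)
    mask[:N, :N] = True                        # old kernel matrix, already known
    mask[N - overlap:, N - overlap:] = True    # freshly computed block
    return mask

for overlap in (0, 10, 50):
    mask = known_entry_mask(450, 50, overlap)
    # known old-vs-new cross entries grow linearly with the overlap
    print(overlap, int(mask[:450, 450:].sum()))   # 0, 500, 2500
```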
  12. Numerical results: extending quantum kernel matrices using chordal graph completion algorithms. Setup: extend a 450×450 matrix with 50 new data points; 19 circuits* (with multiple template repetitions); exact (statevector) simulation. Error metric: $\|\hat{K} - K\| / \|K\|$. (* S. Sim et al., Expressibility and entangling capability of parameterized quantum circuits for hybrid quantum-classical algorithms, https://arxiv.org/abs/1905.10876)
  13. In the presence of shot noise, the reconstruction error degrades gracefully. [Plot: reconstruction error vs. overlap, for an increasing number of shots.] Setup: extend a 450×450 matrix with 50 new data points; 19 PQCs (1 template repetition); QASM (shot-noise) simulation. With small overlap, the algorithm fails in both the noise-free and noisy cases. We can reconstruct quantum kernel matrices using matrix completion!
  14. Once the overlap exceeds the rank of the extended matrix, perfect completion is possible. How would the rank be known a priori? Can it be predicted given the PQC? 🤔
  15. We can derive some simple upper bounds on the rank of quantum kernel matrices. Since $K_{jk} = \mathrm{Tr}(\rho_j \rho_k)$ for $j, k \in [1, 2, \ldots, N]$, we immediately have $\mathrm{rank}(K) \leq N$. Expanding each state in the Pauli basis on $w$ qubits, $\rho_j = \sum_{p=1}^{4^w} c_{jp}\,\sigma_p$, gives a second bound:
$$K_{jk} = \mathrm{Tr}(\rho_j \rho_k) = \mathrm{Tr}\!\left[\left(\sum_{p=1}^{4^w} c_{jp}\,\sigma_p\right)\!\left(\sum_{q=1}^{4^w} c_{kq}\,\sigma_q\right)\right] = \sum_{p,q=1}^{4^w} c_{jp}\, c_{kq} \underbrace{\mathrm{Tr}(\sigma_p \sigma_q)}_{\propto\, \delta_{pq}} \propto \sum_{p=1}^{4^w} c_{jp}\, c_{kp} = \vec{c}_j \cdot \vec{c}_k \;\implies\; \mathrm{rank}(K) \leq 4^w.$$
  16. Numerical evidence suggests Haar-random states saturate this upper bound. For $K_{jk} = \mathrm{Tr}(\rho_j \rho_k)$ with each $\rho_\alpha$ Haar-random on $w$ qubits, is $\mathrm{rank}(K) = \min(N, 4^w)$? [Plot: empirically, yes.]
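A small numpy check of both the bound and its saturation; Haar-random pure states are drawn as normalised complex Gaussian vectors, and the parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def haar_state(dim: int) -> np.ndarray:
    """Haar-random pure state: a normalised complex Gaussian vector."""
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return v / np.linalg.norm(v)

w, N = 2, 40                  # 2 qubits, 40 data points, so 4**w = 16 < N
dim = 2 ** w
states = [haar_state(dim) for _ in range(N)]
# K_jk = |<psi_j|psi_k>|^2 = Tr(rho_j rho_k) for pure states
K = np.array([[abs(np.vdot(a, b)) ** 2 for b in states] for a in states])
print(np.linalg.matrix_rank(K), min(N, 4 ** w))   # 16 16: bound saturated
```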
  17. Numerical evidence shows some PQCs do not saturate this upper bound. For $K_{jk} = \mathrm{Tr}(\rho_j \rho_k)$ with each $\rho_\alpha$ generated by a PQC on $w$ qubits, $\mathrm{rank}(K) \neq \min(N, 4^w)$; rather, $\mathrm{rank}(K) \leq \min(N, u)$ for some $u < 4^w$.
  18. We find a weak relationship between circuit expressibility and Haar-like rank behavior. Expressibility measures how much of Hilbert space a PQC covers (with respect to Haar-random unitaries)*:
$$e(U) = \mathrm{KL}\big(\mathrm{Pr}_U(F)\,\|\,\mathrm{Pr}_{\mathrm{Haar}}(F)\big), \quad F = |\langle \psi | \phi \rangle|^2.$$
A lower value of $e(U)$ means the PQC more accurately approximates a Haar-random unitary. (* S. Sim et al., Expressibility and entangling capability of parameterized quantum circuits for hybrid quantum-classical algorithms, https://arxiv.org/abs/1905.10876)
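A sketch of how $e(U)$ can be estimated by Monte Carlo, under my own binning choices; `sample_state` is a placeholder that returns one output state of the PQC at uniformly random parameters.

```python
import numpy as np

def expressibility(sample_state, dim: int, n_pairs: int = 2000,
                   bins: int = 50) -> float:
    """Estimate e(U) = KL( Pr_U(F) || Pr_Haar(F) ) from sampled fidelities
    F = |<psi|phi>|^2 between pairs of randomly parameterized outputs."""
    fids = np.array([abs(np.vdot(sample_state(), sample_state())) ** 2
                     for _ in range(n_pairs)])
    edges = np.linspace(0.0, 1.0, bins + 1)
    p_u = np.histogram(fids, bins=edges)[0] / n_pairs
    # Haar fidelity law Pr(F) = (D-1)(1-F)^(D-2), integrated over each bin
    p_haar = (1 - edges[:-1]) ** (dim - 1) - (1 - edges[1:]) ** (dim - 1)
    keep = p_u > 0
    return float(np.sum(p_u[keep] * np.log(p_u[keep] / p_haar[keep])))
```

Feeding in Haar-random states drives the estimate toward zero, matching the slide's reading that a lower $e(U)$ means more Haar-like behavior.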
  19. Where does this leave us? Streaming data generates a data-transfer overhead for using quantum kernels in practice. Matrix completion can extend quantum kernel matrices with a block-diagonal sparsity pattern. Perfect reconstruction requires the overlap between blocks to exceed the rank of the extended matrix.
  20. Where does this leave us? Haar-random circuits (may) saturate an upper bound on the rank: $\min(N, 4^w)$. PQCs with low expressibility $e(U)$ (i.e., those closer to Haar-random) follow a similar scaling with respect to the rank of the quantum kernel matrix they generate.
  21. Future work. Application to a real-world, Woodside-relevant data set: real-world data has structure; would this be reflected in the rank of the quantum kernel matrix? Quantify the impact of matrix completion on Woodside-relevant algorithms: do approximated kernel matrices introduce unacceptable errors in ML workflows? Develop a theory of the scaling of matrix rank with respect to circuit expressibility: can we derive bounds (or exact results) relating expressibility to quantum kernel matrix rank?