Learning the Fermion sign structure in path-integral Monte Carlo
Jarvist Moore Frost
Department of Chemistry / Department of Physics, Imperial College London
Email: [email protected]
https://frost-group.github.io/
problem workshop, April 2026
Motivation: the Fermionic sign problem
- Ease the Fermionic sign problem!
- Quantum Monte Carlo: sample observables with an arithmetic ratio estimator.
- BUT! Fermions have a fluctuating sign, which exponentially destroys convergence (versus the usual 1/sqrt(N) convergence of N MC samples).
Probabilistic Numerics
"I was talking to David [MacKay] about research projects and things and he sort of handed me a copy of Numerical Recipes, and he said 'I bet that every one of these things could be interpreted as some kind of estimation problem, and that there would be cleverer things to do... if you made clear what your assumptions were.'"
- Philipp Hennig & Ryan Adams, Remembering David MacKay, Talking Machines Podcast, 21st April 2016.
Simple models can give enormous complexity… Can we fit such a model to a real system?
Krüger, F., Zaanen, J., 2008. Fermionic quantum criticality and the fractal nodal surface. Phys. Rev. B 78, 035104. https://doi.org/10.1103/PhysRevB.78.035104
PIMC Code: Halcyon.jl
- I followed Gabriele Spada et al.'s excellent article: Spada, G., et al., 2022. Path-Integral Monte Carlo Worm Algorithm for Bose Systems with Periodic Boundary Conditions. Condensed Matter 7, 30.
- Centroid virial estimator
- Primitive action (P = 200, typically)
- UEG: Yakub-Ronchi spherically averaged potential (essentially pairwise → much cheaper and easier than Ewald); Kelbg softening.
- Jackknife estimators
Can we block-sum the permutations? Can we organise things so that the error is reduced in the summation?
"Fermionic sign problem: an exaggerated myth" - Nikolay V. Prokof'ev (APS 2022 abstract)
'S4 conjugacy classes'
N! = 4! = 24 (4-particle exchanges); p(N) = p(4) = 5 conjugacy classes.
All exchanges within the same conjugacy class are equivalent: same <E>, same sign parity.
Figure: By LightbulbMEOW, own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=174267377
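The counting on this slide can be checked directly: conjugacy classes of S_N are labelled by integer partitions of N (the cycle types), class sizes follow from N!/∏(k^m_k · m_k!), and the sign is shared by the whole class. A minimal Python sketch (my own illustration, not part of Halcyon.jl):

```python
from math import factorial

def partitions(n, max_part=None):
    """Generate integer partitions of n (= cycle types of S_n)."""
    max_part = n if max_part is None else max_part
    if n == 0:
        yield ()
        return
    for k in range(min(n, max_part), 0, -1):
        for rest in partitions(n - k, k):
            yield (k,) + rest

def class_size(cycle_type):
    """Number of permutations with this cycle type: N!/prod(k^m_k * m_k!)."""
    n, denom = sum(cycle_type), 1
    for k in set(cycle_type):
        m = cycle_type.count(k)
        denom *= k**m * factorial(m)
    return factorial(n) // denom

def parity(cycle_type):
    """Sign shared by every permutation in the class: (-1)^(N - #cycles)."""
    return (-1) ** (sum(cycle_type) - len(cycle_type))

classes = list(partitions(4))
print(len(classes))                          # 5 conjugacy classes = p(4)
print(sum(class_size(c) for c in classes))   # 24 = 4!
```

For S_4 the class sizes come out as 6, 8, 3, 6, 1 for cycle types (4), (3,1), (2,2), (2,1,1), (1,1,1,1), summing to 24 as required.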
- Homogeneous systems only (spin-polarised UEG, 3He)
- Inhomogeneous systems are 'future work'.
- Not entirely clear how they built their models.
- (I can't reproduce UEG r_s = 1, T = 0.125.)
DuBois, J.L., Brown, E.W., Alder, B.J., 2014. arXiv:1409.3262
Histogram object
In our experiments we then create a dense (instantiated) histogram object. The object is p(N) × (count::Int64 + estimator::Float64 + estimator^2::Float64), so <4 GB for N < 100. (Everything should work if kept as a sparse object.)
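The memory estimate follows from p(N) bins of 24 bytes each. A sketch, computing p(N) with Euler's pentagonal-number recurrence (the ~4 GB figure on the slide is for bins of one Int64 plus two Float64):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def npartitions(n):
    """p(n), the number of integer partitions (= permutation families of
    S_n), via Euler's pentagonal-number recurrence."""
    if n == 0:
        return 1
    total, k = 0, 1
    while k * (3 * k - 1) // 2 <= n:
        sign = (-1) ** (k + 1)
        total += sign * npartitions(n - k * (3 * k - 1) // 2)
        if k * (3 * k + 1) // 2 <= n:
            total += sign * npartitions(n - k * (3 * k + 1) // 2)
        k += 1
    return total

BYTES_PER_BIN = 8 + 8 + 8  # count::Int64 + estimator::Float64 + estimator^2::Float64
print(npartitions(100))                           # 190569292 families
print(npartitions(100) * BYTES_PER_BIN / 2**30)   # ~4.3 GiB dense
```

The super-polynomial (but sub-exponential) growth of p(N) is what eventually forces the sparse representation mentioned above.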
Multiplicity / Degeneracy
We treat particles as distinguishable. (But they are not!)
Combinatorics gives us the multiplicity (degeneracy) of each permutation family: the number of distinct permutations which are equivalent. (For instance, a single fully connected cycle: N!/N = (N-1)!.)
⇒ Infinite-temperature (exchange energy irrelevant) occupancy.
Models of permutation-family probability
Independent cycles: the partition function factorises into a product over cycles.
We define the 'theta family' of probability models, with a per-cycle free energy.
If all thetas = 0, this reduces to a multiplicity M(C) model, which is the infinite-temperature limit.
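A toy sketch of such a theta-family model (the parameterisation theta[l], a free-energy cost per cycle of length l, is my assumption about the form; it is an illustration, not the Halcyon.jl code):

```python
import math
from collections import Counter

def partitions(n, max_part=None):
    """Integer partitions of n = cycle types = permutation families."""
    max_part = n if max_part is None else max_part
    if n == 0:
        yield ()
        return
    for k in range(min(n, max_part), 0, -1):
        for rest in partitions(n - k, k):
            yield (k,) + rest

def multiplicity(cycle_type):
    """M(C): number of distinct permutations sharing this cycle type."""
    n, denom = sum(cycle_type), 1
    for k, m in Counter(cycle_type).items():
        denom *= k**m * math.factorial(m)
    return math.factorial(n) // denom

def theta_family_probs(n, theta):
    """p(C) proportional to M(C) * exp(-sum over cycles of theta[length]);
    theta = 0 recovers the infinite-temperature multiplicity model."""
    logw = {c: math.log(multiplicity(c)) - sum(theta.get(l, 0.0) for l in c)
            for c in partitions(n)}
    z = sum(math.exp(v) for v in logw.values())
    return {c: math.exp(v) / z for c, v in logw.items()}
```

With theta = 0 the family weights are just M(C)/N!, e.g. 1/24 for the identity family of S_4; a positive theta[4] suppresses the 4-cycle family.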
Feynman K / DuBois p2 model
• Assume loops are only formed by pair interactions (Feynman 1953, lambda-transition paper).
• A free energy is associated with this exchange.
N = 33, θ = 0.125, r_s = 1
Maximum a Posteriori (MAP)
Learn corrections to the Feynman/DuBois exchange penalty.
Where there is more data, it will pull the solution away from the prior.
Long short-term memory (LSTM)
A ~1997 improvement to recurrent neural networks (RNNs), solving the 'vanishing gradients' problem.
Short-term memory + long-term memory (with forgetting).
⇒ Naturally think in terms of logits (log-odds), as a distribution over the next character.
⇒ At the end of a word ⇒ telescopes the probability ⇒ p(C)
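The 'telescoping' is just the chain rule: log p(C) = Σ_t log p(c_t | c_<t). A toy sketch, with hand-written logit vectors standing in for the LSTM's real hidden-state dynamics:

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a logit vector."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def sequence_logprob(logits_per_step, tokens):
    """Chain rule: log p(C) = sum_t log p(c_t | c_<t).
    One (hypothetical) logit vector per step; an LSTM would produce
    these from its hidden state as it reads the sequence."""
    return sum(log_softmax(step)[tok]
               for step, tok in zip(logits_per_step, tokens))

# Two steps over a 3-symbol vocabulary:
lp = sequence_logprob([[0.0, 1.0, -1.0], [2.0, 0.0, 0.0]], [1, 0])
```

Summing log-probabilities per character and exponentiating at the end of the 'word' yields the family probability p(C).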
What is the next word, given the previous words? I saw a cat sat on the…
https://lena-voita.github.io/nlp_course/language_modeling.html
Impose reality
Impose the canonical ensemble:
• If the next character exceeds the number of particles left, multiply by zero probability (+ log(0) on the logit).
• Therefore we don't waste any expressive power learning what is permitted.
• Also very naturally provides a route to include prior knowledge (i.e. our theta family of independent models), just by putting it into the prior.
• (The LSTM then just learns the correlations as a delta on top of the independent-cycle probability model.)
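The masking step can be sketched as follows (the vocabulary indexing, cycle length = index + 1, is my hypothetical choice for illustration):

```python
import math

NEG_INF = float("-inf")

def mask_logits(logits, particles_left):
    """Canonical-ensemble mask: a next cycle longer than the particles
    remaining is impossible, so its logit gets + log(0) = -inf.
    (Hypothetical indexing: entry i is cycle length i + 1.)"""
    return [x if (i + 1) <= particles_left else NEG_INF
            for i, x in enumerate(logits)]

def softmax(logits):
    """Softmax that maps -inf logits to exactly zero probability."""
    m = max(x for x in logits if x > NEG_INF)
    exps = [math.exp(x - m) if x > NEG_INF else 0.0 for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

# 5 candidate cycle lengths, but only 2 particles left:
p = softmax(mask_logits([0.1, 0.2, 0.3, 0.4, 0.5], particles_left=2))
```

The network never has to learn that lengths 3 to 5 are forbidden here; the mask enforces it, and all capacity goes into the allowed outcomes.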
So now we have models for the probability of the sector… (and therefore the average sign). But we still have to rely on noisy MC estimators for the energy. DBA assumes an entirely separable energy.
General linear models with priors
Feynman-like exchange cost: impose a smoothness prior, with a discretised Laplacian.
Ridge regression on all but the first (mean-field) parameter: try to put as much support as possible on the mean-field energy.
Apply the priors, fit under this imposition, and then extract the effective degrees of freedom (trace of the hat matrix) of the fit.
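A minimal sketch of such a fit, assuming a design matrix X whose first column is the mean-field term and whose remaining columns carry the exchange-cost coefficients (an illustration of the recipe, not the actual Halcyon.jl code):

```python
import numpy as np

def smoothness_ridge_fit(X, y, lam_smooth=1.0, lam_ridge=1e-3):
    """MAP / generalised ridge fit, sketching the slide's recipe:
    - discrete-Laplacian (second-difference) smoothness penalty on the
      exchange-cost coefficients (columns 1..p-1, by assumption; p >= 4),
    - small ridge on everything except the first (mean-field) parameter,
    - effective degrees of freedom = tr(H), with H = X (X'X + P)^-1 X'."""
    n, p = X.shape
    D = np.zeros((p - 3, p))           # Laplacian over coefficients 1..p-1
    for i in range(p - 3):
        D[i, i + 1:i + 4] = [1.0, -2.0, 1.0]
    P = lam_smooth * D.T @ D + lam_ridge * np.eye(p)
    P[0, 0] = 0.0                      # leave the mean-field term unpenalised
    A = X.T @ X + P
    beta = np.linalg.solve(A, X.T @ y)
    dof = np.trace(X @ np.linalg.solve(A, X.T))
    return beta, dof
```

With the penalties switched off, tr(H) equals the number of parameters; as the smoothness prior tightens, the effective DoF shrinks towards the mean-field term plus the Laplacian's null space, which is what makes it a useful diagnostic of how much the data (rather than the prior) is determining the fit.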
NN energy models for correlations
Again, a delta-machine-learning approach: learn the corrections to the 'best' linear model.
An MLP with a feature vector of the LSTM hidden state (at the end of the read), and the raw C vector for permutations.
Protect against noise with empirical-Bayes variance shrinkage and a Huber loss.
(Very fast to train: just a few layers of a small MLP, with only p(N) data points.)
Importance sampling by bias
The vast majority of time is spent in a very restricted set of permutation families.
A ~bosonic simulation (i.e. large cycles at low T) won't necessarily have much overlap with the fermionic simulation (no statistical support for reweighting).
Only one move in the worm algorithm changes the permutation: SWAP. So simply multiply the Metropolis criterion by a ratio of probability models.
Like Wang-Landau sampling in classical statistical mechanics, reweight the simulation to a flat histogram in permutation family.
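In log space the modified SWAP acceptance might look like the sketch below. Here `log_phys_ratio`, the log of the usual physical acceptance ratio, is a hypothetical stand-in for the worm-algorithm weights; only the bias factor q(C_old)/q(C_new) is the point:

```python
import math
import random

def swap_accept(log_phys_ratio, log_q_old, log_q_new, rng=random):
    """Reweighted Metropolis test for the permutation-changing SWAP move.
    Multiplying the physical acceptance ratio by q(C_old)/q(C_new) means
    families the probability model q says are common are visited less
    often, flattening the histogram over permutation families
    (Wang-Landau style)."""
    log_ratio = log_phys_ratio + (log_q_old - log_q_new)
    return math.log(rng.random()) < min(0.0, log_ratio)
```

Because only SWAP changes the permutation, this single modification is enough to bias the whole family histogram; all other worm moves are untouched.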
Debiasing & variance minimisation
We should be able to de-bias by removing the weight from the observed density (the empirical permutation-family count). But! Ergodicity? Detailed balance?
Currently we just use the biased simulations to get lower-variance energy models.
⇒ Block-jackknife resampling to estimate variance.
Sample in proportion to p × s.d.(E_k)
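Block-jackknife error bars on a correlated time series can be sketched as:

```python
import math

def block_jackknife(samples, nblocks=10):
    """Block-jackknife mean and standard error: chop the (autocorrelated)
    MC time series into blocks, recompute the mean leaving one block out
    at a time, and use the spread of the leave-one-out means."""
    bs = len(samples) // nblocks          # discard any ragged tail
    blocks = [samples[i * bs:(i + 1) * bs] for i in range(nblocks)]
    total = sum(sum(b) for b in blocks)
    count = nblocks * bs
    mean = total / count
    loo = [(total - sum(b)) / (count - bs) for b in blocks]
    var = (nblocks - 1) / nblocks * sum((m - mean) ** 2 for m in loo)
    return mean, math.sqrt(var)
```

Blocking matters because raw MC samples are autocorrelated; with blocks longer than the autocorrelation time, the leave-one-out spread gives an honest error bar.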
Does it work? Sort of!
• UEG, N = 7.
• For a system which you can oversample (small, no sign problem), all the estimators agree; the bias does not break anything, etc.
• Absolute values? Yakub-Ronchi potential; Kelbg smearing. (Most reference results use Ewald summation.) Some reference data from Tobias Dornheim; I don't get exact agreement. (But also the background correction for absolute energies…
2D harmonic trap (Lambda = 0.5)
In the 2D harmonic trap the linear models start to break down. (Data shown evaluated on the LSTM probability model; 10^7 MC steps.)
Inhomogeneous ⇔ spatial structure, dependencies between cycles ⇒ need to fit the correlation.
Discussion
• Models of permutation-family probability
◦ For the UEG, linear models suffice.
◦ A Bayesian approach to the fits allows you to impose physical priors in a well-motivated and automatic manner.
◦ DoF of fits gives insight into
◦ Physically motivated neural networks by imposing Bayesian priors: spend all expressive power on learning the correlations.
• Models of permutation-family energy
◦ Learn in the 'bosonic' sector.
◦ My estimators are incredibly noisy (!?)
• The estimator becomes a sum over Z, with models that describe the simulation as samples → infinity.
• Importance sampling
◦ What we care about is the variance!
◦ Biasing the simulation allows you to spend your computational budget on minimising it.
◦ The sign problem lives in the probability: combine the Q model with empirical estimators?
What does the fictitious sign do? Is the Xiong 'Xi' model imposing a Feynman-1953 exchange penalty by biasing the simulation?
Future work
• DuBois et al.'s ideas have merit.
• We should revisit them now that we have much stronger and better computational statistical tools.
• Probabilistic numerics
◦ Simulating reality ⇒ an agent gathering information
◦ Calculating an observable ⇒ inferring a latent quantity (the answer) given data (the simulation)
• Use the Xiong^2 method to bias the simulation to different points, and thereby learn the exp(Theta) model?
• The algorithms are all O(N) scaling… but we would need to introduce a (distributed?) sparse histogramming method for N > 120 Fermions.
Halcyon.jl