Learning the Fermion sign structure in path-integral Monte Carlo
Jarvist Moore Frost
Department of Chemistry / Department of Physics, Imperial College London
Email: [email protected]
https://frost-group.github.io/
problem workshop, April 2026
Motivation: the Fermionic sign problem
- Ease the Fermionic sign problem!
- Quantum Monte Carlo: sample observables with an arithmetic ratio estimator.
- BUT! Fermions have a fluctuating sign, which exponentially destroys convergence (versus the usual 1/sqrt(N) convergence of N MC samples).
Probabilistic Numerics
"I was talking to David [MacKay] about research projects and things and he sort of handed me a copy of Numerical Recipes, and he said 'I bet that every one of these things could be interpreted as some kind of estimation problem, and that there would be cleverer things to do... if you made clear what your assumptions were.'"
- Philipp Hennig & Ryan Adams, Remembering David MacKay, Talking Machines Podcast, 21st April 2016.
Simple models can give enormous complexity… Can we fit such a model to a real system?
Krüger, F., Zaanen, J., 2008. Fermionic quantum criticality and the fractal nodal surface. Phys. Rev. B 78, 035104. https://doi.org/10.1103/PhysRevB.78.035104
PIMC Code: Halcyon.jl
- I followed Gabriele Spada et al.'s excellent article: Spada, G., et al., 2022. Path-Integral Monte Carlo Worm Algorithm for Bose Systems with Periodic Boundary Conditions. Condensed Matter 7, 30.
- Centroid virial estimator
- Primitive action (P = 200, typically)
- UEG: Yakub-Ronchi spherically averaged potential (essentially pairwise → much cheaper and easier than Ewald); Kelbg softening.
- Jackknife estimators
Can we block-sum the permutations? Can we organise things so that the error is reduced in the summation?
"Fermionic sign problem: an exaggerated myth" - Nikolay V. Prokof'ev (APS 2022 abstract)
'S4 conjugacy classes'
N! = 4! = 24 (4-particle exchanges); p(N) = p(4) = 5 conjugacy classes.
All exchanges within the same conjugacy class are equivalent: same <E>, same sign parity.
Figure: By LightbulbMEOW, own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=174267377
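The counting on this slide can be checked directly: conjugacy classes of S_N are labelled by integer partitions of N (the cycle types), class sizes follow from N!/∏(k^m_k · m_k!), and the sign is shared by the whole class. A minimal Python sketch (my own illustration, not part of Halcyon.jl):

```python
from math import factorial

def partitions(n, max_part=None):
    """Generate integer partitions of n (= cycle types of S_n)."""
    max_part = n if max_part is None else max_part
    if n == 0:
        yield ()
        return
    for k in range(min(n, max_part), 0, -1):
        for rest in partitions(n - k, k):
            yield (k,) + rest

def class_size(cycle_type):
    """Number of permutations with this cycle type: N!/prod(k^m_k * m_k!)."""
    n, denom = sum(cycle_type), 1
    for k in set(cycle_type):
        m = cycle_type.count(k)
        denom *= k**m * factorial(m)
    return factorial(n) // denom

def parity(cycle_type):
    """Sign shared by every permutation in the class: (-1)^(N - #cycles)."""
    return (-1) ** (sum(cycle_type) - len(cycle_type))

classes = list(partitions(4))
print(len(classes))                          # 5 conjugacy classes = p(4)
print(sum(class_size(c) for c in classes))   # 24 = 4!
```

For S_4 the class sizes come out as 6, 8, 3, 6, 1 for cycle types (4), (3,1), (2,2), (2,1,1), (1,1,1,1), summing to 24 as required.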
- Homogeneous systems only (spin-polarised UEG, 3He)
- Inhomogeneous systems are 'future work'.
- Not entirely clear how they built their models.
- (I can't reproduce UEG r_s = 1, T = 0.125.)
DuBois, J.L., Brown, E.W., Alder, B.J., 2014. arXiv:1409.3262
Histogram object
In our experiments we then create a dense (instantiated) histogram object. The object is p(N) × (count::Int64 + estimator::Float64 + estimator^2::Float64), so <4 GB for N < 100. (Everything should work if kept as a sparse object.)
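The memory estimate follows from p(N) bins of 24 bytes each. A sketch, computing p(N) with Euler's pentagonal-number recurrence (the ~4 GB figure on the slide is for bins of one Int64 plus two Float64):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def npartitions(n):
    """p(n), the number of integer partitions (= permutation families of
    S_n), via Euler's pentagonal-number recurrence."""
    if n == 0:
        return 1
    total, k = 0, 1
    while k * (3 * k - 1) // 2 <= n:
        sign = (-1) ** (k + 1)
        total += sign * npartitions(n - k * (3 * k - 1) // 2)
        if k * (3 * k + 1) // 2 <= n:
            total += sign * npartitions(n - k * (3 * k + 1) // 2)
        k += 1
    return total

BYTES_PER_BIN = 8 + 8 + 8  # count::Int64 + estimator::Float64 + estimator^2::Float64
print(npartitions(100))                           # 190569292 families
print(npartitions(100) * BYTES_PER_BIN / 2**30)   # ~4.3 GiB dense
```

The super-polynomial (but sub-exponential) growth of p(N) is what eventually forces the sparse representation mentioned above.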
Multiplicity / Degeneracy
We treat particles as distinguishable. (But they are not!)
Combinatorics gives us the multiplicity (degeneracy) of each permutation family: the number of distinct permutations which are equivalent. (For instance, a single fully connected cycle: N!/N = (N-1)!.)
⇒ Infinite-temperature (exchange energy irrelevant) occupancy.
Models of permutation-family probability
Independent cycles: the partition function factorises into a product over cycles.
We define the 'theta family' of probability models, with a per-cycle free energy.
If all thetas = 0, this reduces to a multiplicity M(C) model, which is the infinite-temperature limit.
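A toy sketch of such a theta-family model (the parameterisation theta[l], a free-energy cost per cycle of length l, is my assumption about the form; it is an illustration, not the Halcyon.jl code):

```python
import math
from collections import Counter

def partitions(n, max_part=None):
    """Integer partitions of n = cycle types = permutation families."""
    max_part = n if max_part is None else max_part
    if n == 0:
        yield ()
        return
    for k in range(min(n, max_part), 0, -1):
        for rest in partitions(n - k, k):
            yield (k,) + rest

def multiplicity(cycle_type):
    """M(C): number of distinct permutations sharing this cycle type."""
    n, denom = sum(cycle_type), 1
    for k, m in Counter(cycle_type).items():
        denom *= k**m * math.factorial(m)
    return math.factorial(n) // denom

def theta_family_probs(n, theta):
    """p(C) proportional to M(C) * exp(-sum over cycles of theta[length]);
    theta = 0 recovers the infinite-temperature multiplicity model."""
    logw = {c: math.log(multiplicity(c)) - sum(theta.get(l, 0.0) for l in c)
            for c in partitions(n)}
    z = sum(math.exp(v) for v in logw.values())
    return {c: math.exp(v) / z for c, v in logw.items()}
```

With theta = 0 the family weights are just M(C)/N!, e.g. 1/24 for the identity family of S_4; a positive theta[4] suppresses the 4-cycle family.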
Feynman K / DuBois p2 model
• Assume loops are only formed by pair interactions (Feynman 1953, lambda-transition paper).
• A free energy is associated with this exchange.
N = 33, θ = 0.125, r_s = 1
Maximum a Posteriori (MAP)
Learn corrections to the Feynman/DuBois exchange penalty.
Where there is more data, it will pull the solution away from the prior.
Long short-term memory (LSTM)
A ~1997 improvement to recurrent neural networks (RNNs), solving the 'vanishing gradients' problem.
Short-term memory + long-term memory (with forgetting).
⇒ Naturally think in terms of logits (log-odds), as a distribution over the next character.
⇒ At the end of a word ⇒ telescopes the probability ⇒ p(C)
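The 'telescoping' is just the chain rule: log p(C) = Σ_t log p(c_t | c_<t). A toy sketch, with hand-written logit vectors standing in for the LSTM's real hidden-state dynamics:

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a logit vector."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def sequence_logprob(logits_per_step, tokens):
    """Chain rule: log p(C) = sum_t log p(c_t | c_<t).
    One (hypothetical) logit vector per step; an LSTM would produce
    these from its hidden state as it reads the sequence."""
    return sum(log_softmax(step)[tok]
               for step, tok in zip(logits_per_step, tokens))

# Two steps over a 3-symbol vocabulary:
lp = sequence_logprob([[0.0, 1.0, -1.0], [2.0, 0.0, 0.0]], [1, 0])
```

Summing log-probabilities per character and exponentiating at the end of the 'word' yields the family probability p(C).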
What is the next word, given the previous words? I saw a cat sat on the…
https://lena-voita.github.io/nlp_course/language_modeling.html
Impose reality
Impose the canonical ensemble:
• If the next character exceeds the number of particles left, multiply by zero probability (+ log(0) on the logit).
• Therefore we don't waste any expressive power learning what is permitted.
• Also very naturally provides a route to include prior knowledge (i.e. our theta family of independent models), just by putting it into the prior.
• (The LSTM then just learns the correlations as a delta on top of the independent-cycle probability model.)
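The masking step can be sketched as follows (the vocabulary indexing, cycle length = index + 1, is my hypothetical choice for illustration):

```python
import math

NEG_INF = float("-inf")

def mask_logits(logits, particles_left):
    """Canonical-ensemble mask: a next cycle longer than the particles
    remaining is impossible, so its logit gets + log(0) = -inf.
    (Hypothetical indexing: entry i is cycle length i + 1.)"""
    return [x if (i + 1) <= particles_left else NEG_INF
            for i, x in enumerate(logits)]

def softmax(logits):
    """Softmax that maps -inf logits to exactly zero probability."""
    m = max(x for x in logits if x > NEG_INF)
    exps = [math.exp(x - m) if x > NEG_INF else 0.0 for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

# 5 candidate cycle lengths, but only 2 particles left:
p = softmax(mask_logits([0.1, 0.2, 0.3, 0.4, 0.5], particles_left=2))
```

The network never has to learn that lengths 3 to 5 are forbidden here; the mask enforces it, and all capacity goes into the allowed outcomes.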
So now we have models for the probability of the sector… (and therefore the average sign). But we still have to rely on noisy MC estimators for the energy. DBA assumes an entirely separable energy.
General linear models with priors
Feynman-like exchange cost: impose a smoothness prior, with a discretised Laplacian.
Ridge regression on all but the first (mean-field) parameter: try to put as much support as possible on the mean-field energy.
Apply the priors, fit under this imposition, and then extract the effective degrees of freedom (trace of the hat matrix) of the fit.
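A minimal sketch of such a fit, assuming a design matrix X whose first column is the mean-field term and whose remaining columns carry the exchange-cost coefficients (an illustration of the recipe, not the actual Halcyon.jl code):

```python
import numpy as np

def smoothness_ridge_fit(X, y, lam_smooth=1.0, lam_ridge=1e-3):
    """MAP / generalised ridge fit, sketching the slide's recipe:
    - discrete-Laplacian (second-difference) smoothness penalty on the
      exchange-cost coefficients (columns 1..p-1, by assumption; p >= 4),
    - small ridge on everything except the first (mean-field) parameter,
    - effective degrees of freedom = tr(H), with H = X (X'X + P)^-1 X'."""
    n, p = X.shape
    D = np.zeros((p - 3, p))           # Laplacian over coefficients 1..p-1
    for i in range(p - 3):
        D[i, i + 1:i + 4] = [1.0, -2.0, 1.0]
    P = lam_smooth * D.T @ D + lam_ridge * np.eye(p)
    P[0, 0] = 0.0                      # leave the mean-field term unpenalised
    A = X.T @ X + P
    beta = np.linalg.solve(A, X.T @ y)
    dof = np.trace(X @ np.linalg.solve(A, X.T))
    return beta, dof
```

With the penalties switched off, tr(H) equals the number of parameters; as the smoothness prior tightens, the effective DoF shrinks towards the mean-field term plus the Laplacian's null space, which is what makes it a useful diagnostic of how much the data (rather than the prior) is determining the fit.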
NN energy models for correlations
Again, a delta-machine-learning approach: learn the corrections to the 'best' linear model.
An MLP with a feature vector of the LSTM hidden state (at the end of the read), and the raw C vector for permutations.
Protect against noise with empirical-Bayes variance shrinkage and a Huber loss.
(Very fast to train: just a few layers of a small MLP, with only p(N) data points.)
Importance sampling by bias
The vast majority of time is spent in a very restricted set of permutation families.
A ~bosonic simulation (i.e. large cycles at low T) won't necessarily have much overlap with the fermionic simulation (no statistical support for reweighting).
Only one move in the worm algorithm changes the permutation: SWAP. So simply multiply the Metropolis criterion by a ratio of probability models.
Like Wang-Landau sampling in classical statistical mechanics, reweight the simulation to a flat histogram in permutation family.
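In log space the modified SWAP acceptance might look like the sketch below. Here `log_phys_ratio`, the log of the usual physical acceptance ratio, is a hypothetical stand-in for the worm-algorithm weights; only the bias factor q(C_old)/q(C_new) is the point:

```python
import math
import random

def swap_accept(log_phys_ratio, log_q_old, log_q_new, rng=random):
    """Reweighted Metropolis test for the permutation-changing SWAP move.
    Multiplying the physical acceptance ratio by q(C_old)/q(C_new) means
    families the probability model q says are common are visited less
    often, flattening the histogram over permutation families
    (Wang-Landau style)."""
    log_ratio = log_phys_ratio + (log_q_old - log_q_new)
    return math.log(rng.random()) < min(0.0, log_ratio)
```

Because only SWAP changes the permutation, this single modification is enough to bias the whole family histogram; all other worm moves are untouched.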
Debiasing & variance minimisation
We should be able to de-bias by removing the weight from the observed density (the empirical permutation-family count). But! Ergodicity? Detailed balance?
Currently we just use the biased simulations to get lower-variance energy models.
⇒ Block-jackknife resampling to estimate variance.
Sample in proportion to p × s.d.(E_k)
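Block-jackknife error bars on a correlated time series can be sketched as:

```python
import math

def block_jackknife(samples, nblocks=10):
    """Block-jackknife mean and standard error: chop the (autocorrelated)
    MC time series into blocks, recompute the mean leaving one block out
    at a time, and use the spread of the leave-one-out means."""
    bs = len(samples) // nblocks          # discard any ragged tail
    blocks = [samples[i * bs:(i + 1) * bs] for i in range(nblocks)]
    total = sum(sum(b) for b in blocks)
    count = nblocks * bs
    mean = total / count
    loo = [(total - sum(b)) / (count - bs) for b in blocks]
    var = (nblocks - 1) / nblocks * sum((m - mean) ** 2 for m in loo)
    return mean, math.sqrt(var)
```

Blocking matters because raw MC samples are autocorrelated; with blocks longer than the autocorrelation time, the leave-one-out spread gives an honest error bar.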
Does it work? Sort of!
• UEG, N = 7.
• For a system which you can oversample (small, no sign problem), all the estimators agree; the bias does not break anything, etc.
• Absolute values? Yakub-Ronchi potential; Kelbg smearing. (Most reference results use Ewald summation.) Some reference data from Tobias Dornheim; I don't get exact agreement. (But also the background correction for absolute energies…
2D harmonic trap (Lambda = 0.5)
In the 2D harmonic trap the linear models start to break down. (Data shown evaluated on the LSTM probability model; 10^7 MC steps.)
Inhomogeneous ⇔ spatial structure, dependencies between cycles ⇒ need to fit the correlation.
Discussion
• Models of permutation-family probability
◦ For the UEG, linear models suffice.
◦ A Bayesian approach to the fits allows you to impose physical priors in a well-motivated and automatic manner.
◦ DoF of fits gives insight into
◦ Physically motivated neural networks by imposing Bayesian priors: spend all expressive power on learning the correlations.
• Models of permutation-family energy
◦ Learn in the 'bosonic' sector.
◦ My estimators are incredibly noisy (!?)
• The estimator becomes a sum over Z, with models that describe the simulation as samples → infinity.
• Importance sampling
◦ What we care about is the variance!
◦ Biasing the simulation allows you to spend your computational budget on minimising it.
◦ The sign problem lives in the probability: combine the Q model with empirical estimators?
What does the fictitious sign do? Is the Xiong 'Xi' model imposing a Feynman-1953 exchange penalty by biasing the simulation?
Future work
• DuBois et al.'s ideas have merit.
• We should revisit them now that we have much stronger and better computational statistical tools.
• Probabilistic numerics
◦ Simulating reality ⇒ an agent gathering information
◦ Calculating an observable ⇒ inferring a latent quantity (the answer) given data (the simulation)
• Use the Xiong^2 method to bias the simulation to different points, and thereby learn the exp(Theta) model?
• The algorithms are all O(N) scaling… but we would need to introduce a (distributed?) sparse histogramming method for N > 120 Fermions.
Halcyon.jl