Semi-Parametric Inducing Point Networks and Neural Processes

We introduce semi-parametric inducing point networks (SPIN), a general-purpose architecture that can query the training set at inference time in a compute-efficient manner. Semi-parametric architectures are typically more compact than parametric models, but their computational complexity is often quadratic. In contrast, SPIN attains linear complexity via a cross-attention mechanism between datapoints inspired by inducing point methods. Querying large training sets can be particularly useful in meta-learning, as it unlocks additional training signal, but often exceeds the scaling limits of existing models. We use SPIN as the basis of the Inducing Point Neural Process, a probabilistic model which supports large contexts in meta-learning and achieves high accuracy where existing models fail. In our experiments, SPIN reduces memory requirements, improves accuracy across a range of meta-learning tasks, and improves state-of-the-art performance on an important practical problem, genotype imputation.

Richa Rastogi

November 10, 2025

Transcript

  1. Semi-Parametric Inducing Point Networks and Neural Processes, May 2023.

    Richa Rastogi, Yair Schiff, Alon Hacohen, Zhaozhi Li, Ian Lee, Yuntian Deng, Mert R. Sabuncu, Volodymyr Kuleshov.
  2. Semi-parametric setup: we have access to the training set at inference time,

    𝒟_train = {(x^(i), y^(i))}_{i=1}^n, and the goal is to learn a parametric mapping conditioned on this dataset: y = f_θ(x; 𝒟_train). (A minimal interface sketch follows this item.)
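As a minimal sketch of this interface, assuming a PyTorch setting (the names `SemiParametricModel`, `context_x`, and `context_y` are placeholders for illustration, not the paper's code), the model receives the training set alongside the query input:

```python
import torch
import torch.nn as nn

class SemiParametricModel(nn.Module):
    """y = f_theta(x; D_train): predictions are conditioned on the training set."""
    def __init__(self, encoder: nn.Module, predictor: nn.Module):
        super().__init__()
        self.encoder = encoder      # summarizes D_train into a representation
        self.predictor = predictor  # maps (x, summary) -> y

    def forward(self, x, context_x, context_y):
        # (context_x, context_y) play the role of D_train and are available at inference time.
        summary = self.encoder(context_x, context_y)
        return self.predictor(x, summary)
```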
  3. Meta-learning setup: 𝒟_c → f(x; 𝒟_c)

    Image credit: Dubois, Yann; Gordon, Jonathan; and Foong, Andrew YK. "Neural Process Family" (2020). http://yanndubs.github.io/Neural-Process-Family
  4. Neural Processes: 𝒟_c → p(y | x, 𝒟_c), a predictive distribution rather than the point prediction 𝒟_c → f(x; 𝒟_c). (A sketch of such a predictive head follows this item.)

    Image credit: Dubois, Yann; Gordon, Jonathan; and Foong, Andrew YK. "Neural Process Family" (2020). http://yanndubs.github.io/Neural-Process-Family
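To make the distributional output concrete, here is a hedged sketch of the kind of Gaussian predictive head that neural-process-style models commonly use; the layer sizes and names below are illustrative assumptions, not the paper's architecture:

```python
import torch
import torch.nn as nn

class GaussianPredictiveHead(nn.Module):
    """Maps a per-query representation to p(y | x, D_c) as a Normal distribution."""
    def __init__(self, in_dim: int, y_dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, 2 * y_dim))

    def forward(self, h: torch.Tensor) -> torch.distributions.Normal:
        mean, raw_scale = self.net(h).chunk(2, dim=-1)
        scale = nn.functional.softplus(raw_scale) + 1e-4  # keep the std strictly positive
        return torch.distributions.Normal(mean, scale)    # p(y | x, D_c)
```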
  5. Most parametric models scale superlinearly in the size of the dataset (e.g.,

    attention between attributes scales quadratically), while meta-learning tasks benefit from conditioning on larger contexts.
  6. Motivating example: a long sequence, such as a time series, biological sequence, or text

    sequence, where missing chunks of information need to be retrieved from a reference dataset. Parametric models are a poor fit for long-sequence imputation and cannot scale to larger reference datasets.
  7. Semi-Parametric Inducing Point Networks (SPIN): inducing points for attention between

    datapoints, in addition to attention between attributes; linear time and space complexity in the size and the dimension of the data during training; a Neural Process architecture that supports larger context sizes. (A sketch of the inducing-point cross-attention follows this item.)
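A minimal sketch of the inducing-point idea as described on this slide: a small set of m learned inducing vectors H cross-attends to the n datapoint embeddings, so the cost is O(n·m) rather than the O(n²) of full self-attention between datapoints. Module names and hyperparameters here are illustrative assumptions, not the paper's exact implementation:

```python
import torch
import torch.nn as nn

class InducingCrossAttention(nn.Module):
    def __init__(self, embed_dim: int = 64, num_inducing: int = 16, num_heads: int = 4):
        super().__init__()
        # Learned inducing points H, updated during training.
        self.H = nn.Parameter(torch.randn(num_inducing, embed_dim) * 0.02)
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)

    def forward(self, datapoints: torch.Tensor) -> torch.Tensor:
        # datapoints: (batch, n, embed_dim) -- one embedding per training example.
        batch = datapoints.shape[0]
        queries = self.H.unsqueeze(0).expand(batch, -1, -1)   # (batch, m, embed_dim)
        # Cross-attention: m inducing queries attend over n datapoints -> O(n * m).
        out, _ = self.attn(queries, datapoints, datapoints)
        return out  # (batch, m, embed_dim): a fixed-size summary of the dataset
```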
  8. SPIN Overview • During training, learn the inducing points H. • The encoder module

    maps 𝒟 → H. • At inference, discard 𝒟 and keep only H. • The predictor module is parametric and maps (X_query, H) → Y_query. (An end-to-end sketch follows this item.)
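A rough end-to-end sketch of this pipeline, reusing `InducingCrossAttention` from the previous block; the embedding and prediction layers are simplified placeholders, and mean-pooling H before prediction is a simplification of how the queries would typically interact with H (e.g., via cross-attention):

```python
import torch
import torch.nn as nn

class SpinSketch(nn.Module):
    def __init__(self, x_dim: int, y_dim: int, embed_dim: int = 64):
        super().__init__()
        self.embed = nn.Linear(x_dim + y_dim, embed_dim)   # embed each (x, y) datapoint
        self.encoder = InducingCrossAttention(embed_dim)   # D -> H (fixed-size summary)
        self.predict = nn.Sequential(
            nn.Linear(x_dim + embed_dim, 128), nn.ReLU(), nn.Linear(128, y_dim))

    def encode(self, context_x, context_y):
        d = self.embed(torch.cat([context_x, context_y], dim=-1))  # (batch, n, embed_dim)
        return self.encoder(d)                                     # (batch, m, embed_dim)

    def forward(self, x_query, H):
        # At inference only H is needed; the raw context D can be discarded.
        summary = H.mean(dim=1, keepdim=True).expand(-1, x_query.shape[1], -1)
        return self.predict(torch.cat([x_query, summary], dim=-1))  # (X_query, H) -> Y_query
```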
  9. Applying SPIN to Neural Processes…

    Image credit: Dubois, Yann; Gordon, Jonathan; and Foong, Andrew YK. "Neural Process Family" (2020). http://yanndubs.github.io/Neural-Process-Family
  10. SOTA results on genotype imputation: SPIN outperforms the state of the art and is

    more efficient than alternative Transformer-based approaches (Non-Parametric Transformers, Set Transformers).
  11. Summary: SPIN has linear time and space complexity in the

    size and the dimension of the data. SPIN learns a compact encoding of the training set for downstream applications. At inference time, computational complexity does not depend on the training set size. IPNP is an uncertainty-aware meta-learning algorithm that scales to larger context sizes. (A usage sketch follows this item.)
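For completeness, a usage sketch tying the blocks above together, with illustrative shapes only; an IPNP-style model would additionally return a predictive distribution (e.g., via the Gaussian head sketched earlier) rather than a point estimate:

```python
import torch

# Assumes SpinSketch (and InducingCrossAttention) from the earlier blocks are in scope.
model = SpinSketch(x_dim=8, y_dim=1)
context_x, context_y = torch.randn(2, 500, 8), torch.randn(2, 500, 1)  # the context D
H = model.encode(context_x, context_y)     # (2, 16, 64): fixed size, independent of n
del context_x, context_y                   # at inference, only H needs to be kept
y_pred = model(torch.randn(2, 10, 8), H)   # (2, 10, 1): (X_query, H) -> Y_query
```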